Hi guys, I'm parsing some data from newsgroups.tar.gz with Scala.
This is the text I'm trying to process:
val inputData = """xref: cantaloupe.srv.cs.cmu.edu alt.atheism:51121 soc.motss:139944 rec.scouting:5318
newsgroups: alt.atheism,soc.motss,rec.scouting
path: cantaloupe.srv.cs.cmu.edu!crabapple.srv.cs.cmu.edu!fs7.ece.cmu.edu!europa.eng.gtefsd.com!howland.reston.ans.net!wupost!uunet!newsgate.watson.ibm.com!yktnews.watson.ibm.com!watson!watson.ibm.com!strom
from: strom@watson.ibm.com (rob strom)
subject: re: [soc.motss, et al.] "princeton axes matching funds for boy scouts"
sender: @watson.ibm.com
message-id: <1993apr05.180116.43346@watson.ibm.com>
date: mon, 05 apr 93 18:01:16 gmt
distribution: usa
references: <c47efs.3q47@austin.ibm.com> <1993mar22.033150.17345@cbnewsl.cb.att.com> <n4hy.93apr5120934@harder.ccr-p.ida.org>
organization: ibm research
lines: 15
in article <n4hy.93apr5120934@harder.ccr-p.ida.org>, n4hy@harder.ccr-p.ida.org (bob mcgwier) writes:
|> [1] however, i hate economic terrorism and political correctness
|> worse than i hate this policy.
|> [2] a more effective approach is to stop donating
|> to any organizating that directly or indirectly supports gay rights issues
|> until they end the boycott on funding of scouts.
can somebody reconcile the apparent contradiction between [1] and [2]?
--
rob strom, strom@watson.ibm.com, (914) 784-7641
ibm research, 30 saw mill river road, p.o. box 704, yorktown heights, ny 10598"""
This is the output I need:
in article <n4hy.93apr5120934@harder.ccr-p.ida.org>, n4hy@harder.ccr-p.ida.org (bob mcgwier) writes:
|> [1] however, i hate economic terrorism and political correctness
|> worse than i hate this policy.
|> [2] a more effective approach is to stop donating
|> to any organizating that directly or indirectly supports gay rights issues
|> until they end the boycott on funding of scouts.
can somebody reconcile the apparent contradiction between [1] and [2]?
This is what I tried:
val docParser = """([\\s\\S]+\\lines: \\d*)([\\s\\S]*\\n\\n)([\\s\\S]*)""".r
val docParser(metadata, content, footer) = inputText
But I get the following error:
scala.MatchError: [Ljava.lang.String;@62f8fff1 (of class [Ljava.lang.String;)
However, the same regex seems to work in an online regex builder.

Any ideas?
Posted on 2015-10-13 06:42:04
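I've never programmed in Scala before, but from what I can see in expressions.htm, you have to escape things like the digit class twice.
So \d becomes \\d in Scala, and so on, at least inside an ordinary double-quoted string literal. Inside a triple-quoted string a single backslash should already be enough.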
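To make the escaping point concrete, here is a minimal sketch rather than a fix for the exact code in the question. It compares the same pattern written as an ordinary string literal and as a triple-quoted literal, and shows what doubling the backslash inside triple quotes does. The names `EscapingDemo`, `digitsEscaped`, `digitsRaw`, `doubledInRaw` and `lineCount` are made up for illustration and do not come from the question.

```scala
import scala.util.matching.Regex

object EscapingDemo {
  // Ordinary string literal: the compiler turns "\\d" into \d
  // before the regex engine ever sees it.
  val digitsEscaped: Regex = "lines: (\\d+)".r

  // Triple-quoted literal: backslashes reach the regex engine unchanged,
  // so a single backslash is enough here.
  val digitsRaw: Regex = """lines: (\d+)""".r

  // Doubling the backslash inside triple quotes asks the engine for a
  // literal backslash followed by one or more 'd' characters.
  val doubledInRaw: Regex = """lines: (\\d+)""".r

  // Hypothetical helper: pulls the number out of a "lines: N" header.
  def lineCount(header: String): Option[Int] = header match {
    case digitsRaw(n) => Some(n.toInt)
    case _            => None // no MatchError when the text does not match
  }

  def main(args: Array[String]): Unit = {
    println(digitsEscaped.regex == digitsRaw.regex)
    println(lineCount("lines: 15"))
    println(doubledInRaw.findFirstIn("lines: 15").isDefined)
  }
}
```

Matching with `match` and a fallback case avoids the `MatchError` that `val docParser(a, b, c) = ...` throws whenever the input does not match. As a side note, `[Ljava.lang.String;` in the error message is the JVM's name for `Array[String]`, so it may also be worth checking that the value being matched is a single `String` and not an array of them.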
https://stackoverflow.com/questions/33095773