I have to analyze an email corpus to see how many sentences are dominated by leet speak (e.g. lol, brb, etc.).
For each sentence, I do the following:
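var score = 0 // running score for the current sentence
val words = sentence.split(" ")
for (word <- words) {
  if (validWords.contains(word)) {
    score += 1 // ordinary word
  } else if (leetWords.contains(word)) {
    score -= 1 // leet word
  }
}
Is there a better way to compute the score using a fold?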
Posted on 2016-09-06 21:37:24
If you are not restricted to a fold, using sum is more concise.
sentence.split(" ")
  .iterator
  .map(word =>
    if (validWords.contains(word)) 1
    else if (leetWords.contains(word)) -1
    else 0
  ).sum
Posted on 2016-09-06 07:48:31
Not much different, but here is another option.
val words = List("one", "two", "three")
val valid = List("one", "two")
val leet = List("three")
// Curried so the word lists can be fixed first and the sentence supplied later.
def check(valid: List[String], invalid: List[String])(words: List[String]): Int =
  words.foldLeft(0) {
    case (x, word) if valid.contains(word) => x + 1
    case (x, word) if invalid.contains(word) => x - 1
    case (x, _) => x
  }
val checkValidOrLeet = check(valid, leet)(_)
val count = checkValidOrLeet(words)
Posted on 2016-09-06 07:00:58
Here is an approach that uses a fold and partial application. It could be made more elegant; I will keep thinking about it.
val sentence = // ...your data....
val validWords = // ... your valid words...
val leetWords = // ... your leet words...
// The first parameter list fixes the word lists; the second is the fold step.
def checkWord(goodList: List[String], badList: List[String])(c: Int, w: String): Int = {
  if (goodList.contains(w)) c + 1
  else if (badList.contains(w)) c - 1
  else c
}
val count = sentence.split(" ").foldLeft(0)(checkWord(validWords, leetWords))
print(count)
https://stackoverflow.com/questions/39338524
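To tie the answers back to the original goal (counting how many sentences are dominated by leet speak), the per-sentence fold can be wrapped in a function and applied across the corpus. The following is only a minimal sketch under stated assumptions: the object name `LeetScore`, the helpers `score` and `countLeetDominated`, and the sample vocabularies are all hypothetical, and "dominated" is taken to mean a negative score, which the question does not define. The word lists are held in a `Set` rather than a `List` so that each `contains` lookup does not scan the whole collection.

```scala
object LeetScore {
  // Hypothetical sample vocabularies, not taken from the question.
  val validWords: Set[String] = Set("see", "you", "later", "thanks")
  val leetWords: Set[String] = Set("lol", "brb", "thx")

  // Score one sentence: +1 per valid word, -1 per leet word, 0 for anything else.
  def score(sentence: String): Int =
    sentence.split(" ").foldLeft(0) { (acc, word) =>
      if (validWords.contains(word)) acc + 1
      else if (leetWords.contains(word)) acc - 1
      else acc
    }

  // Assumption: a sentence is "dominated" by leet speak when its score is negative.
  def countLeetDominated(sentences: Seq[String]): Int =
    sentences.count(s => score(s) < 0)
}
```

With this sketch, `LeetScore.countLeetDominated(corpus)` would give the number of leet-dominated sentences for any `corpus: Seq[String]`; a different threshold, or a ratio instead of a raw score, could be swapped into the predicate if that matches the intended definition better.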