Q: Tokenizing a string merges some words together

Stack Overflow user

Asked 2013-07-29 17:46:42

Answers 3 · Views 48 · Followers 0 · Votes 2

I use the code below to tokenize a string read from stdin.

d=[]
cur = ''
for i in sys.stdin.readline():
    if i in ' .':
        if cur not in d and (cur != ''):
            d.append(cur)
            cur = ''
    else:
        cur = cur + i.lower()

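This gives me an array of words without repetition. However, in my output, some words do not get split.

My input is

Dan went to the north pole to lead an expedition during summer.

The output array d is

['dan', 'went', 'to', 'the', 'north', 'pole', 'tolead', 'an', 'expedition', 'during', 'summer']

Why are "to" and "lead" joined into "tolead"?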

python

stdin

3 Answers

Stack Overflow user

Accepted answer

Posted 2013-07-29 17:50:20

Try this:

import sys

d = []
cur = ''
for i in sys.stdin.readline():
    if i in ' .':
        if cur not in d and (cur != ''):
            d.append(cur)
        cur = '' # note the different indentation
    else:
        cur = cur + i.lower()

Votes 3

Stack Overflow user

Posted 2013-07-29 17:55:33

Try this:

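import sys

for line in sys.stdin:  # iterate over lines, not over the characters of one line
    # drop the trailing newline and the final period before splitting
    res = set(word.lower() for word in line.strip().rstrip(".").split(" "))
    print res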

Example:

line = "Dan went to the north pole to lead an expedition during summer."
res = set(word.lower() for word in line[:-1].split(" "))
print res

set(['north', 'lead', 'expedition', 'dan', 'an', 'to', 'pole', 'during', 'went', 'summer', 'the'])

Edit, following the comments: this solution preserves the input order and filters out the separators.

import re
from collections import OrderedDict
line = "Dan went to the north pole to lead an expedition during summer."
list(OrderedDict.fromkeys(re.findall(r"[\w']+", line)))
# ['Dan', 'went', 'to', 'the', 'north', 'pole', 'lead', 'an', 'expedition', 'during', 'summer']
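
The question lowercases each word, while the snippet above keeps the original case. A minimal sketch of a lowercased variant follows, assuming Python 3.7 or later, where a plain `dict` preserves insertion order and can stand in for `OrderedDict`. The variable name `words` is introduced here only for illustration.

```python
import re

line = "Dan went to the north pole to lead an expedition during summer."

# Lowercase first so "Dan" and "dan" count as the same word, then
# deduplicate while keeping first-seen order.
words = list(dict.fromkeys(re.findall(r"[\w']+", line.lower())))
print(words)
```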

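Votes 1

Stack Overflow user

Posted 2013-07-29 18:01:32

"to" is already in d. Because cur is only reset when a word is appended, your loop passes over the space between "to" and "lead" without clearing cur and keeps concatenating. Once it reaches the next space, it sees that "tolead" is not in d, so it appends it.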
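
The behaviour can be reproduced with a small side-by-side sketch. The function names `tokenize_buggy` and `tokenize_fixed` are hypothetical, introduced only for illustration. The code is written for Python 3 and takes the line as an argument instead of reading stdin.

```python
def tokenize_buggy(line):
    """Reset `cur` only when a new word is appended (the question's logic)."""
    d = []
    cur = ''
    for ch in line:
        if ch in ' .':
            if cur not in d and cur != '':
                d.append(cur)
                cur = ''  # skipped when `cur` is already in d
        else:
            cur = cur + ch.lower()
    return d


def tokenize_fixed(line):
    """Reset `cur` at every separator, whether or not the word was new."""
    d = []
    cur = ''
    for ch in line:
        if ch in ' .':
            if cur not in d and cur != '':
                d.append(cur)
            cur = ''  # always start a fresh word after a separator
        else:
            cur = cur + ch.lower()
    return d


line = "Dan went to the north pole to lead an expedition during summer."
print(tokenize_buggy(line))  # contains 'tolead'
print(tokenize_fixed(line))  # contains 'lead', with 'to' listed once
```

The only difference between the two functions is the indentation of the `cur = ''` reset.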

An easier solution, which also strips all forms of punctuation:

>>> import string
>>> set("Dan went to the north pole to lead an expedition during summer.".translate(None, string.punctuation).lower().split())
set(['summer', 'north', 'lead', 'expedition', 'dan', 'an', 'to', 'pole', 'during', 'went', 'the'])
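
The two-argument call `translate(None, string.punctuation)` works only on Python 2 byte strings. On Python 3, `str.translate` takes a single table argument, so a rough equivalent might look like the sketch below. The variable names `table` and `words` are introduced here only for illustration.

```python
import string

line = "Dan went to the north pole to lead an expedition during summer."

# str.maketrans('', '', chars) builds a table that deletes every character in `chars`.
table = str.maketrans('', '', string.punctuation)
words = set(line.translate(table).lower().split())
print(sorted(words))
```

As in the original one-liner, the result is a set, so the input order is not preserved.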

Votes 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link:

https://stackoverflow.com/questions/17930734
