Q: Unicode substitution using Regex, Python

Stack Overflow user

Asked 2014-11-05 08:54:17

Answers: 2 | Views: 182 | Followers: 0 | Votes: 0

I have a string as follows:

str1 = "heylisten\uff08there is something\uff09to say \uffa9"

I need to surround each unicode value detected by the regex with a space on both sides.

Desired output string:

out = "heylisten \uff08 there is something \uff09 to say  \uffa9 "

I used re.findall to get all the matches and then replaced them one by one. It looks like this:

p1 = re.findall(r'\uff[0-9a-e][0-9]', str1, flags = re.U)  
out = str1
for item in p1:
    print item
    print out
    out= re.sub(item, r" " + item + r" ", out)

And its output:

'heylisten\\ uff08 there is something\\ uff09 to say \\ uffa9 '

What is wrong with the above? Why does it print an extra "\" and separate it from uff? I even tried re.search, but that only seems to separate \uff08. Is there a better way?

python

regex

unicode

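2 Answers

Stack Overflow user

Accepted answer

Posted 2014-11-05 09:03:30

I have a string as follows: str1 = "heylisten\uff08there is something\uff09to say \uffa9" I need to replace the unicode values...

You don't have any unicode values. You have a bytestring.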

str1 = u"heylisten\uff08there is something\uff09to say \uffa9"
 ...
p1 = re.sub(ur'([\uff00-\uffe9])', r' \1 ', str1)

Votes: 1

Stack Overflow user

Posted 2014-11-05 08:58:25

print re.sub(r"(\\uff[0-9a-e][0-9])", r" \1 ", x)

You can use this re.sub directly. See the demo.

http://regex101.com/r/sU3fA2/67

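import re

# Match the escape as literal text: a backslash, "uff", then two more characters
p = re.compile(r'(\\uff[0-9a-e][0-9])')
# Raw string, so each \uffXX stays as six literal characters (as in a byte string)
test_str = r"heylisten\uff08there is something\uff09to say \uffa9"
# Raw string, so \1 is a backreference rather than the control character \x01
subst = r" \1 "

result = re.sub(p, subst, test_str)

Output:

heylisten \uff08 there is something \uff09 to say  \uffa9 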
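
As for why the loop in the question split the backslash from uff08: in Python 2's re module an unescaped \u in a pattern appears to match a plain "u", so findall would have returned "uff08" without its backslash and the spaces landed between the two. The doubled backslash in the printed result is just the repr of a single backslash. A minimal Python 3 sketch of two ways to handle text that holds the escapes as literal characters is below; the variable names and the choice to decode with unicode_escape are assumptions for illustration, not part of either answer.

```python
import re

# Text that holds the escapes as literal characters (backslash, 'u', hex
# digits), which is what a Python 2 byte string literal produces.
escaped = r"heylisten\uff08there is something\uff09to say \uffa9"

# Option 1: keep the text escaped and match the literal backslash explicitly.
# [0-9a-f]{2} is an assumption that widens the original [0-9a-e][0-9].
ESCAPE_RE = re.compile(r"(\\uff[0-9a-f]{2})")
padded_escaped = ESCAPE_RE.sub(r" \1 ", escaped)

# Option 2: decode the escapes into real characters first, then match a
# character range. This assumes the text is pure ASCII before decoding.
decoded = escaped.encode("ascii").decode("unicode_escape")
padded_decoded = re.sub(r"([\uff00-\uffe9])", r" \1 ", decoded)

print(padded_escaped)
print(padded_decoded)
```

Option 1 leaves the data in escaped form; option 2 is usually the better fit when the escapes were only ever meant to stand for real characters.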

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/26752989
