首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >Regex:编写使用逗号拆分的模式

Regex:编写使用逗号拆分的模式
EN

Stack Overflow用户
提问于 2021-08-05 12:23:14
回答 3查看 47关注 0票数 0

我需要创建一个tokenizer,它将用逗号分割字符串。

使用split可以做到这一点

代码语言:javascript
复制
re.split(',+', str)

但我需要使用编译。我试过了

代码语言:javascript
复制
text = "5g, dynamic vision sensor (dvs), 3-d reconstruction, neuromorphic engineering, neural networks, humanoid robots, neuromorphics, closed loop systems, field programmable gate arrays, spiking motor controller, neuromorphic implementation, icub, relation neural network"
pattern = re.compile(r'[a-z0-9\(\)-]+')
re.findall(pattern, text)

输出结果是

代码语言:javascript
复制
['5g', 'dynamic', 'vision', 'sensor', '(dvs)', '3-d', 'reconstruction', 'neuromorphic', 'engineering', 'neural', 'networks', 'humanoid', 'robots', 'neuromorphics', 'closed', 'loop', 'systems', 'field', 'programmable', 'gate', 'arrays', 'spiking', 'motor', 'controller', 'neuromorphic', 'implementation', 'icub', 'relation', 'neural', 'network']

所需输出为

代码语言:javascript
复制
['5g', 'dynamic vision sensor (dvs)', '3-d reconstruction', 'neuromorphic engineering', 'neural networks', 'humanoid robots', 'neuromorphics', 'closed loop systems', 'field programmable gate arrays', 'spiking motor controller', 'neuromorphic implementation', 'icub', 'relation neural network']
EN

回答 3

Stack Overflow用户

回答已采纳

发布于 2021-08-05 12:34:01

正如@mama所说,你不需要为此使用正则表达式,但是如果你特别想使用re.compile,你可以用下面的代码来实现:

代码语言:javascript
复制
text = "5g, dynamic vision sensor (dvs), 3-d reconstruction, neuromorphic engineering, neural networks, humanoid robots, neuromorphics, closed loop systems, field programmable gate arrays, spiking motor controller, neuromorphic implementation, icub, relation neural network"
pattern = re.compile(r'([\sa-z0-9\(\)-]+)')
L=re.findall(pattern, text)
L=[l.lstrip(" ") for l in L]

票数 1
EN

Stack Overflow用户

发布于 2021-08-05 12:27:14

不要为此使用正则表达式。只需使用python内置的split()函数

代码语言:javascript
复制
text = "5g, dynamic vision sensor (dvs), 3-d reconstruction, neuromorphic engineering, neural networks, humanoid robots, neuromorphics, closed loop systems, field programmable gate arrays, spiking motor controller, neuromorphic implementation, icub, relation neural network"

print(text.split(', '))
票数 1
EN

Stack Overflow用户

发布于 2021-08-05 12:36:32

试试这个模式:[a-z0-9() -]+(?=,|$)

代码:

代码语言:javascript
复制
text = "5g, dynamic vision sensor (dvs), 3-d reconstruction, neuromorphic engineering, neural networks, humanoid robots, neuromorphics, closed loop systems, field programmable gate arrays, spiking motor controller, neuromorphic implementation, icub, relation neural network"
pattern = re.compile(r'[a-z0-9() -]+(?=,|$)')
print([x.strip() for x in re.findall(pattern, text)])

输出:

代码语言:javascript
复制
['5g', 'dynamic vision sensor (dvs)', '3-d reconstruction', 'neuromorphic engineering', 'neural networks', 'humanoid robots', 'neuromorphics', 'closed loop systems', 'field programmable gate arrays', 'spiking motor controller', 'neuromorphic implementation', 'icub', 'relation neural network']
票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/68666529

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档