
Q: Python plain-text regex parsing

Stack Overflow user

Asked 2014-03-22 12:12:23

3 answers, 380 views, 0 followers, 0 votes

I need to write a small parser to extract data from a form.

The data will always be posted in a consistent pattern, as shown below:

Panakamanana
104412=Trident of Corrupted Waters
104411=Immerseus' Crystalline Eye
104435=Stonetoe's Tormented Treads
104455=Reality Ripper Ring
99716=Chest of the Cursed Protector
104509=Laser Burn Bracers
104531=Haromm's Talisman
99722=Gauntlets of the Cursed Protector
104562=Ring of Restless Energy
104606=Gleaming Eye of the Devilsaur
99725=Helm of the Cursed Protector
99719=Shoulders of the Cursed Protector
104616=Ticking Ebon Detonator
105686=Hellscream's Pig Sticker

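The only data I'm interested in is the integer before each = sign. I then want to be able to iterate over these, so putting them in a list, an array, or something similar would be fine.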

python

regex

3 Answers

Stack Overflow user

Accepted answer

Posted 2014-03-22 12:19:46

Here is one way to do it:

with open('somefile.txt') as f:
    next(f)  # Skip the first line, which doesn't contain '='
    # Keep the text before '=' on every non-blank line (values are strings)
    numbers = [line.split('=')[0] for line in f if line.strip()]

print(numbers)

If you want to use a regular expression:

>>> import re
>>> s = "104412=Trident of Corrupted Waters"
>>> re.findall(r'^(\d+)', s)[0]
'104412'
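
To apply the same pattern to the whole block of text at once, rather than one line at a time, a minimal sketch might look like the following. The `text` variable and its abbreviated sample contents are assumptions for illustration. `re.MULTILINE` makes `^` match at the start of every line, so the header line, which has no digits followed by `=`, is simply not matched.

```python
import re

# Sample input, abbreviated from the question (assumed to be held in one string).
text = """Panakamanana
104412=Trident of Corrupted Waters
104411=Immerseus' Crystalline Eye
99716=Chest of the Cursed Protector
"""

# With re.MULTILINE, ^ anchors at the start of each line, not just the string.
item_ids = [int(m) for m in re.findall(r'^(\d+)=', text, flags=re.MULTILINE)]

# The result is an ordinary list, so it can be iterated directly.
for item_id in item_ids:
    print(item_id)
```

Requiring the trailing `=` in the pattern is a design choice: it avoids picking up a line that merely starts with digits but is not an `id=name` pair.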

Votes: 2

Stack Overflow user

Posted 2014-03-22 12:18:47

Just split each string using '=' as the delimiter. The most concise way to do this is a list comprehension:

>>> [int(line.split('=')[0]) for line in your_lines[1:]]
[104412, 104411, ..., 105686]

Here, your_lines is a list of the lines shown in your question.
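
If the input is not guaranteed to start with exactly one header line, a slightly more defensive variant can skip any line that lacks an `=` or a numeric prefix. This is only a sketch: the `raw` sample and the `extract_ids` helper are hypothetical names introduced here, not part of the answer above.

```python
# Sample input, abbreviated from the question; note the blank line.
raw = """Panakamanana
104412=Trident of Corrupted Waters
104411=Immerseus' Crystalline Eye

105686=Hellscream's Pig Sticker"""

your_lines = raw.splitlines()


def extract_ids(lines):
    """Return the integer before '=' on each line that has one."""
    ids = []
    for line in lines:
        head, sep, _ = line.partition('=')
        # sep is '' when the line has no '='; isdecimal() guards int().
        if sep and head.strip().isdecimal():
            ids.append(int(head))
    return ids


print(extract_ids(your_lines))
```

`str.partition` splits only at the first `=`, so an item name that itself contains `=` would not affect the extracted ID.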

Votes: 1

Stack Overflow user

Posted 2014-03-22 12:14:14

You can do it like this:

# Skip lines without '=' (e.g. the header), where find() would return -1.
new_list = [int(line[:line.find('=')]) for line in your_list if '=' in line]

print(new_list)
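
One thing to watch with this approach: `str.find` returns -1 when the substring is absent, so for a line with no `=` (such as the header line in the question) the slice becomes `line[:-1]`, which silently drops the last character, and `int()` then raises `ValueError`. The sketch below shows that behavior and one guarded alternative; `sample_lines` and `safe_ids` are hypothetical names used only for illustration.

```python
line = "Panakamanana"        # header line from the question, no '='
idx = line.find('=')         # -1 because '=' is absent
truncated = line[:idx]       # slicing with -1 drops the last character
print(idx, repr(truncated))

# Guarded variant: only slice lines that actually contain '='.
sample_lines = ["Panakamanana", "104412=Trident of Corrupted Waters"]
safe_ids = [int(l[:l.find('=')]) for l in sample_lines if '=' in l]
print(safe_ids)
```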

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/22577428
