
Q: Extract numeric values of at most 6 digits, with an optional decimal part of up to 2 digits, from a string

Stack Overflow user

Asked on 2019-10-31 06:35:19

2 answers, 125 views, 0 followers, 1 vote

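I have a task that requires extracting numeric values from text. However, I am only interested in values with at most 6 digits, where a decimal part is optional.

For example, given the following text:

Total compensation for Mr. XYZ was $5,123,456 and other salary which was $650,000 in fiscal 2018, was determined to be approximately 8.78 times the median annual compensation for all of the firm's other employees, which was approximately $74,000. Some other salaries are 56000.

I need to extract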

["650,000", "2018", "8.78", "74,000", "56000"]

from this text.

The regex I am using:

((\d{1,3})(?:,[0-9]{3}){0,1}|(\d{1,6}))(\.\d{1,2})?

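It correctly identifies 650,000 and 74,000, but not the other values.

I found a regex for currency values of up to 7 digits and tried to adapt it to 6 digits, but without success. How can I correct my regex?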
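A quick way to see what goes wrong is to print the full matches of that pattern on the sample sentence. The sketch below uses the English sample text from the second answer; the variable names are illustrative only.

```python
import re

# Sample sentence, as used in the answers below.
text = ("Total compensation for Mr. XYZ was $5,123,456 and other salary which "
        "was $650,000 in fiscal 2018, was determined to be approximately 8.78 "
        "times the median annual compensation for all of the firm's other "
        "employees, which was approximately $74,000. Some other salaries are 56000. ")

# The pattern from the question, unchanged.
question_pattern = re.compile(r'((\d{1,3})(?:,[0-9]{3}){0,1}|(\d{1,6}))(\.\d{1,2})?')

# finditer + group(0) yields whole matches; findall would return tuples of groups.
matches = [m.group(0) for m in question_pattern.finditer(text)]
print(matches)
# ['5,123', '456', '650,000', '201', '8', '8.78', '74,000', '560', '00']
```

Two things stand out. Nothing ties a match to the boundaries of a number, so 5,123,456 is cut into pieces instead of being skipped. And because the `\d{1,3}` alternative is tried first and succeeds, the `\d{1,6}` alternative is never reached, which splits 2018 and 56000.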

python

regex

2 Answers

Stack Overflow user

Accepted answer

Posted on 2019-10-31 07:28:56

Try this: (?<![\d,.])(?:\d,?){0,5}\d(?:\.\d+)?(?!,?\d)

Here is a detailed explanation:

(?x) # flag for verbose mode: whitespace and comments in the pattern are ignored

# Do not start in the middle of a number: no digit, comma or dot directly before the match
(?<![\d,.])

# Up to k-1 digits (here 5), each optionally followed by a comma. For simplicity this also accepts odd groupings such as 5,4,3,2; be aware of that
(?:\d,?){0,5}

# The k-th (last) digit
\d

# Optional dot and decimal part
(?:\.\d+)?

# Do not stop in the middle of a longer number: no digit directly after. A trailing comma is allowed only as punctuation, so comma followed by a digit is forbidden
(?!,?\d)
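As a minimal sketch, the pattern above can be run in Python with `re.VERBOSE`, which serves the same purpose as the inline `(?x)` flag. The sample text is taken from the second answer; the name `accepted` is only illustrative.

```python
import re

text = ("Total compensation for Mr. XYZ was $5,123,456 and other salary which "
        "was $650,000 in fiscal 2018, was determined to be approximately 8.78 "
        "times the median annual compensation for all of the firm's other "
        "employees, which was approximately $74,000. Some other salaries are 56000. ")

accepted = re.compile(r"""
    (?<![\d,.])      # not preceded by a digit, comma or dot
    (?:\d,?){0,5}    # up to five digits, each optionally followed by a comma
    \d               # the last digit, so six digits at most in total
    (?:\.\d+)?       # optional decimal part
    (?!,?\d)         # not followed by a digit, or by a comma plus a digit
""", re.VERBOSE)

# No capturing groups, so findall returns the whole matches.
print(accepted.findall(text))
# ['650,000', '2018', '8.78', '74,000', '56000']
```

The seven-digit value 5,123,456 is skipped entirely because every candidate match inside it is either preceded or followed by another digit or a comma plus digit.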

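There may be room for improvement, but I think this does what you want. Some cases may not be handled; let me know if you find any. You can test it here: https://regex101.com/r/Wxi5Sj/2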
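One possible tightening, offered only as a sketch and not as part of the answer: the pattern above accepts any number of decimals and any comma placement. The variant below assumes that "up to 2 decimals" means values with a longer decimal part should be skipped, and that a comma is valid only as a single thousands separator. The name `strict` and the sample string are hypothetical.

```python
import re

strict = re.compile(r"""
    (?<![\d,.])                  # do not start inside a number
    (?:\d{1,3},\d{3}|\d{1,6})    # one thousands comma, or 1 to 6 plain digits
    (?:\.\d{1,2})?               # optional decimal part of one or two digits
    (?!\d|,\d|\.\d)              # do not stop inside a longer number
""", re.VERBOSE)

# Assumed test input: valid values mixed with ones that should be rejected.
samples = "5,123,456 650,000 2018, 8.78 74,000. 56000 512.3456 12,34 1234567"

print(strict.findall(samples))
# ['650,000', '2018', '8.78', '74,000', '56000']
```

Here 512.3456, 12,34 and 1234567 are rejected as a whole rather than truncated; whether that is the desired behaviour depends on the requirements.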

Votes: 1

Stack Overflow user

Posted on 2019-10-31 07:05:46

Try the following code:

import re
# Named `text` rather than `input` to avoid shadowing the built-in function
text = "Total compensation for Mr. XYZ was $5,123,456 and other salary which was $650,000 in fiscal 2018, was determined to be approximately 8.78 times the median annual compensation for all of the firm's other employees, which was approximately $74,000. Some other salaries are 56000. "
# After whitespace: optional "$", up to six digits with an optional thousands comma, optional two-digit decimals
print(re.findall(r'(?<=\s)\$?\d{0,3}\,?\d{1,3}(?:\.\d{2})?(?!,?\d)', text))

Output:

['$650,000', '2018', '8.78', '$74,000', '56000']
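The matches above are still strings and keep the `$` prefix. If actual numeric values are needed, a small post-processing step along these lines could be used; `to_number` is a hypothetical helper, not part of the answer.

```python
from decimal import Decimal

def to_number(token: str) -> Decimal:
    """Hypothetical helper: drop '$' and thousands commas, then parse."""
    return Decimal(token.replace("$", "").replace(",", ""))

# The list printed by the snippet above.
tokens = ['$650,000', '2018', '8.78', '$74,000', '56000']

values = [to_number(t) for t in tokens]
print(values)
# [Decimal('650000'), Decimal('2018'), Decimal('8.78'), Decimal('74000'), Decimal('56000')]
```

`Decimal` is used here instead of `float` so that amounts such as 8.78 are kept exactly as written.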

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/58637991
