
Replacing one string with another using a regular expression in Python: Error: re.error: bad escape \w at position 0

Stack Overflow user
Asked on 2019-05-07 00:12:09
Answers: 1 · Views: 777 · Followers: 0 · Votes: 1

I am trying to replace occurrences such as 'word one' with 'word_one', i.e. replace the whitespace between two words with '_'.

Here is my code:

import re

labels_ls = ['word <= 0.01', 'word_two <= 0.23', 'word three <= 0.01']

regex_whitespace = r'\w+\s+\w+\b'
new_regex = r'\w+\_+\w+\b'
pattern = re.compile(regex_whitespace) # this I just added after reviewing other related questions

# Loop through labels_ls to find any ngrams whitespace separated labels (i.e gilt maximal)

for i in labels_ls:
    if re.match(regex_whitespace, i):
        # replace the whitespace with a '_' to form gilt_maximal
        new_string = re.sub(pattern, new_regex, i)
        print('new string: ', new_string)

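I have tested my regex at https://pythex.org and it works as required, but when I run this code I get the following error:

re.error: bad escape \w at position 0

I have looked through all the related answered questions: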

how to fix - error: bad escape \u at position 0

Regex: Replace one pattern with another

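I have tried removing the r prefix before the regex, as suggested in the questions above, but it still does not work.

I have also tried using compile(), but that did not solve the problem either. Here is the full list of labels: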

labels_ls = ['internal_punctuation <= 0.042', 'darf <= 0.717', 'formal_global_yes <= 0.5', 'wert <= 0.272', 'signal <= 0.5', 'Flesch_Index <= 0.813', 'zulass <= 0.379', 'polarity <= 0.713', 'Nb_of_auxiliary <= 0.071', 'gini = 0.0', 'polarity <= 0.375', 'gini = 0.0', 'Nb_of_verbs <= 0.094', 'weakwords_nb <= 0.143', 'passive_global_yes <= 0.5', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'Nb_of_verbs <= 0.094', 'passive_global_yes <= 0.5', 'WPS <= 0.062', 'measurement_values_no <= 0.5', 'gini = 0.0', 'SPW <= 0.575', 'weird_words <= 0.042', 'weakwords_nb <= 0.036', 'SPW <= 0.272', 'gini = 0.0', 'words_nb <= 0.033', 'gini = 0.5', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'Flesch_Index <= 0.774', 'SPW <= 0.331', 'gini = 0.0', 'gini = 0.0', 'Comp_conj <= 0.375', 'SPW <= 0.111', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'Sub_Conj <= 0.25', 'weird_words <= 0.208', 'zsdf <= 0.5', 'signal <= 0.297', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'words_nb <= 0.164', 'Aux_Start_no <= 0.5', 'gini = 0.0', 'Nb_of_Umsetzbarkeit_conj <= 0.167', 'werden <= 0.125', 'darf <= 0.297', 'polarity <= 0.925', 'SPW <= 0.376', 'WPS <= 0.11', 'numerical_values <= 0.091', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'WPS <= 0.11', 'gini = 0.0', 'gini = 0.0', 'polarity <= 0.25', 'gini = 0.0', 'Flesch_Index <= 0.663', 'words_nb <= 0.033', 'SPW <= 0.475', 'gini = 0.0', 'gini = 0.0', 'Comp_conj <= 0.125', 'gini = 0.56', 'gini = 0.0', 'Flesch_Index <= 0.75', 'gini = 0.444', 'gini = 0.0', 'Aux_Start_yes <= 0.5', 'darf <= 0.241', 'Nb_of_verbs <= 0.156', 'gini = 0.0', 'SPW <= 0.246', 'polarity <= 0.675', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'Sub_Conj <= 0.25', 'numerical_values <= 0.227', 'funktion <= 0.348', 'internal_punctuation <= 0.458', 'polarity <= 0.375', 'gini = 0.0', 'Nb_of_verbs <= 0.031', 
'gini = 0.0', 'Flesch_Index <= 0.409', 'gini = 0.0', 'numerical_values <= 0.136', 'WPS <= 0.065', 'darf <= 0.359', 'Nb_of_Umsetzbarkeit_conj <= 0.167', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'formal_global_no <= 0.5', 'WPS <= 0.164', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gini = 0.0', 'gilt randbeding <= 0.181', 'fahrzeug <= 0.352', 'gini = 0.0', 'zulass <= 0.082', 'gini = 0.0', 'gini = 0.0', 'fur <= 0.194', 'weakwords_nb <= 0.321', 'gini = 0.444', 'gini = 0.0', 'gini = 0.0', 'Nb_of_Umsetzbarkeit_conj <= 0.167', 'Nb_of_verbs <= 0.344', 'gini = 0.0', 'gini = 0.0', 'words_nb <= 0.178', 'gini = 0.0', 'words_nb <= 0.224', 'gini = 0.0', 'gini = 0.0']

1 Answer

Stack Overflow user

Accepted answer

Posted on 2019-05-07 00:20:29

You need to use

regex_whitespace = r'(\w+)\s+(\w+)\b'

Then:

new_string = re.sub(pattern, r'\1_\2', i)

See the Python demo online.
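Putting the two changes together, the question's loop could look roughly like the sketch below. It reuses the short three-item `labels_ls` from the question; the `results` list is only added here for illustration so the output can be inspected, and the `re.match` guard is dropped on the assumption that labels without a match should simply pass through unchanged.

```python
import re

labels_ls = ['word <= 0.01', 'word_two <= 0.23', 'word three <= 0.01']

# Two capturing groups: the word before the whitespace and the word after it.
pattern = re.compile(r'(\w+)\s+(\w+)\b')

results = []  # illustrative only: collects the rewritten labels
for label in labels_ls:
    # \1 and \2 refer back to the captured words; labels that do not
    # match the pattern are returned unchanged by sub().
    new_string = pattern.sub(r'\1_\2', label)
    results.append(new_string)
    print('new string:', new_string)
```

Only 'word three <= 0.01' contains two word-character runs separated by whitespace, so it is the only label that changes (to 'word_three <= 0.01'). Note that with this pattern a label with three or more space-separated words would only have its first pair joined.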

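The point is that you need to capture the word characters matched by the first regex into capturing groups, and then use backreferences to those group values in the replacement. new_regex = r'\w+\_+\w+\b' is redundant, because you cannot use a regex pattern as the replacement: the replacement pattern may only contain backreferences and escape sequences (a literal backslash must be escaped there).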
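To make the difference between a pattern and a replacement template concrete, here is a small sketch. It assumes Python 3.7 or later, where an unknown letter escape such as `\w` in the replacement raises `re.error`; the sample label is taken from the question's full list, and the variable names are illustrative.

```python
import re

pattern = re.compile(r'(\w+)\s+(\w+)\b')
label = 'gilt randbeding <= 0.181'

# A regex pattern used as the replacement is rejected: \w is not a
# valid escape inside a replacement template (assumes Python 3.7+).
try:
    pattern.sub(r'\w+\_+\w+\b', label)
    error_message = None
except re.error as exc:
    error_message = str(exc)
print(error_message)

# Valid replacement templates: numbered backreferences ...
numbered = pattern.sub(r'\1_\2', label)
# ... or the equivalent \g<n> syntax.
explicit = pattern.sub(r'\g<1>_\g<2>', label)
# A function can also be passed instead of a template string.
via_function = pattern.sub(lambda m: m.group(1) + '_' + m.group(2), label)

print(numbered)
```

All three valid forms produce the same string, 'gilt_randbeding <= 0.181'. The function form is mainly useful when the replacement needs logic that a template cannot express.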

Votes: 5
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/56008877
