Hello, I'm new to Python and regex. I have a string that I want to reformat/replace:
string = '1John Radcliffe Hospital/Oxford/United Kingdom, 11Ruhr-Universität
3/Bochum/Bochum/Germany, 3University of British Columbia/Vancouver/Canada, 4National
Institute of Neuroscience, National Center of Neurology and Psychiatry/Tokyo/Japan,
5University of Catania/Catania/Italy, 6F. Hoffmann-La Roche Ltd/Basel/Switzerland, 7
University of Colorado School of Medicine/Aurora/United States of America'
I did try:
re.sub('(, \d+()?)', r'\1=', string).strip()
Expected output:
string = '1=John Radcliffe Hospital/Oxford/United Kingdom, 11=Ruhr-Universität
3/Bochum/Bochum/Germany, 3=University of British Columbia/Vancouver/Canada, 4=National
Institute of Neuroscience, National Center of Neurology and Psychiatry/Tokyo/Japan,
5=University of Catania/Catania/Italy, 6=F. Hoffmann-La Roche Ltd/Basel/Switzerland,
7=University of Colorado School of Medicine/Aurora/United States of America'
Posted on 2021-05-11 20:16:19
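You can match either the start of the string or a comma and a space without using a capture group, and assert that there is no / directly after the matched digits.
(?:^|, )\d+(?!/)
The pattern matches:
(?:^|, ) Non-capturing group: assert the start of the string, or match a comma followed by a space.
\d+(?!/) Match 1+ digits, then assert that there is no / directly to the right.
With re.MULTILINE, as in the example, ^ also matches at the start of every line, which is why the lookahead is needed to skip the 3 in 3/Bochum.
In the replacement, use the full match followed by an equals sign:
\g<0>=
Example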
import re
string = ("1John Radcliffe Hospital/Oxford/United Kingdom, 11Ruhr-Universität \n"
"3/Bochum/Bochum/Germany, 3University of British Columbia/Vancouver/Canada, 4National \n"
"Institute of Neuroscience, National Center of Neurology and Psychiatry/Tokyo/Japan, \n"
"5University of Catania/Catania/Italy, 6F. Hoffmann-La Roche Ltd/Basel/Switzerland, 7 \n"
"University of Colorado School of Medicine/Aurora/United States of America")
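# Pass count and flags by keyword; passing them positionally is deprecated in recent Python versions.
result = re.sub(r'(?:^|, )\d+(?!/)', r'\g<0>=', string, count=0, flags=re.MULTILINE).strip()
print(result)
Output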
1=John Radcliffe Hospital/Oxford/United Kingdom, 11=Ruhr-Universität
3/Bochum/Bochum/Germany, 3=University of British Columbia/Vancouver/Canada, 4=National
Institute of Neuroscience, National Center of Neurology and Psychiatry/Tokyo/Japan,
5=University of Catania/Catania/Italy, 6=F. Hoffmann-La Roche Ltd/Basel/Switzerland, 7=
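University of Colorado School of Medicine/Aurora/United States of America
Another option is to use a positive lookahead after matching the digits to assert that an uppercase character [A-Z] follows, optionally preceded by whitespace.
(?:^|, )\d+(?=\s*[A-Z])
https://stackoverflow.com/questions/67493741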
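The positive-lookahead variant can be tried the same way as the first pattern. The following is a minimal sketch, not part of the original answer: the variable names text and pattern are made up for illustration, and the sample is a shortened version of the question's string, chosen to keep the line break after 11Ruhr-Universität and after the trailing 7.

```python
import re

# Shortened sample in the same shape as the question's string
# (an assumption made for brevity, not the full original data).
text = (
    "1John Radcliffe Hospital/Oxford/United Kingdom, 11Ruhr-Universität \n"
    "3/Bochum/Bochum/Germany, 3University of British Columbia/Vancouver/Canada, 7 \n"
    "University of Colorado School of Medicine/Aurora/United States of America"
)

# Match digits at the start of a line or after ", ", but only when they are
# followed by optional whitespace and then an uppercase ASCII letter.
pattern = re.compile(r'(?:^|, )\d+(?=\s*[A-Z])', flags=re.MULTILINE)

# \g<0> re-inserts the whole match; "=" is appended after it.
result = pattern.sub(r'\g<0>=', text)
print(result)
```

Because \s* in the lookahead also matches a newline, the 7 at the end of the second line still gets its equals sign, while the 3 in 3/Bochum is skipped since / is not an uppercase letter. Note that [A-Z] only covers ASCII letters, so a name starting with a character such as Ö would presumably need a wider character class.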