So, I have the following string:
__cfduid=dc3c9f85f65d39a5947d5f4850618237f1520566503; expires=Sat, 09-Mar-19 03:35:03 GMT; path=/; domain=.coinmarketcap.com; HttpOnly, _version=a90f44e909c03fdad3caed1ec676a98472deb0f6; path=/, __session=NTgybXJTVFdKcjlrbG5JKsnaVm9V6SBhUWtxV0oxc3JZNTZUekRGb3RvYjFpZDF5WHNab2N0T3VxTDdzY1JnOGR0ZzdtUzdRZDQ3NjVwU2Lod93GG9lalMwMGNheUUybm45Q20rWWlSRUZ5YUlzNVZmd3h3b200TmR2cnRHUWY4OUxrVml3T2hMMUdrdXZOc0V6TnBxOHFBPT0tLTMyV0R3emYxME9OeDQ3cDJ4LzJycmc9PQ%3D%3D--67cb39476896467f47bdd13bb638fd5479883974; domain=.coinmarketcap.com path=/
However, I need to strip the junk out of it, such as
expires=Sat, 09-Mar-19 03:35:03 GMT
or
domain=.coinmarketcap.com path=/
so that I am left with only three values:
__cfduid=dc3c9f85f65d39a5947d5f4850618237f1520566503; _version=a90f44e909c03fdad3caed1ec676a98472deb0f6; __session=NTgybXJTVFdKcjlrbG5JKsnaVm9V6SBhUWtxV0oxc3JZNTZUekRGb3RvYjFpZDF5WHNab2N0T3VxTDdzY1JnOGR0ZzdtUzdRZDQ3NjVwU2Lod93GG9lalMwMGNheUUybm45Q20rWWlSRUZ5YUlzNVZmd3h3b200TmR2cnRHUWY4OUxrVml3T2hMMUdrdXZOc0V6TnBxOHFBPT0tLTMyV0R3emYxME9OeDQ3cDJ4LzJycmc9PQ%3D%3D--67cb39476896467f47bdd13bb638fd5479883974
Posted on 2018-03-09 03:48:43
Specify the keys you want to keep:
In [193]: keys = ['__cfduid', '_version', '__session']
Now call re.findall (import re first):
In [194]: ' '.join(re.findall(r'(?:{}).*?;'.format('|'.join(keys)), text))
Out[194]: '__cfduid=dc3c9f85f65d39a5947d5f4850618237f1520566503; _version=a90f44e909c03fdad3caed1ec676a98472deb0f6; __session=NTgybXJTVFdKcjlrbG5JKsnaVm9V6SBhUWtxV0oxc3JZNTZUekRGb3RvYjFpZDF5WHNab2N0T3VxTDdzY1JnOGR0ZzdtUzdRZDQ3NjVwU2Lod93GG9lalMwMGNheUUybm45Q20rWWlSRUZ5YUlzNVZmd3h3b200TmR2cnRHUWY4OUxrVml3T2hMMUdrdXZOc0V6TnBxOHFBPT0tLTMyV0R3emYxME9OeDQ3cDJ4LzJycmc9PQ%3D%3D--67cb39476896467f47bdd13bb638fd5479883974;'
The regex (?:{}).*?; matches key-value pairs only for the selected keys; everything else is discarded. This works as long as your string has a consistent structure ((key=value;)+).
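To try the session above outside IPython, here is a minimal sketch of the same idea wrapped in a function. The name keep_cookies and the shortened sample string are assumptions made for illustration and do not come from the original answer. re.escape is an addition so that keys containing regex metacharacters are matched literally.

```python
import re

def keep_cookies(text, keys):
    """Return the 'key=value;' pairs for the given keys, joined by spaces."""
    # Build an alternation such as __cfduid|_version|__session;
    # re.escape guards against keys containing regex metacharacters.
    alternation = '|'.join(re.escape(k) for k in keys)
    # Lazily match from a key up to and including the next ';'.
    pattern = r'(?:{}).*?;'.format(alternation)
    return ' '.join(re.findall(pattern, text))

# Shortened, made-up sample in the same shape as the question's string.
sample = ('__cfduid=abc123; expires=Sat, 09-Mar-19 03:35:03 GMT; path=/; '
          'domain=.example.com; HttpOnly, _version=def456; path=/, '
          '__session=xyz%3D%3D--789; domain=.example.com path=/')

result = keep_cookies(sample, ['__cfduid', '_version', '__session'])
print(result)
```

Because the pattern requires a closing ;, a selected pair that is the very last item in the string with no ; after it would not be matched.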
Posted on 2018-03-09 04:09:35
This is a more general solution that works for any key starting with an underscore.
import re
str_list = re.findall(r"_\w+=\w+", your_string)
Output:
['__cfduid=dc3c9f85f65d39a5947d5f4850618237f1520566503',
'_version=a90f44e909c03fdad3caed1ec676a98472deb0f6',
'__session=NTgybXJTVFdKcjlrbG5JKsnaVm9V6SBhUWtxV0oxc3JZNTZUekRGb3RvYjFpZDF5WHNab2N0T3VxTDdzY1JnOGR0ZzdtUzdRZDQ3NjVwU2Lod93GG9lalMwMGNheUUybm45Q20rWWlSRUZ5YUlzNVZmd3h3b200TmR2cnRHUWY4OUxrVml3T2hMMUdrdXZOc0V6TnBxOHFBPT0tLTMyV0R3emYxME9OeDQ3cDJ4LzJycmc9PQ']
Note that \w does not match %, so the __session value above stops just before the first %3D.
re.findall returns a list, which you can join to get the desired output:
"; ".join(str_list)
Posted on 2018-03-09 04:52:53
Another approach:
keys = ('__cfduid', '_version', '__session')
' '.join([x for x in text.split() if x.startswith(keys)])
https://stackoverflow.com/questions/49186267
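The whitespace-splitting approach keeps the trailing ; on each token, whereas the output requested in the question has no ; after the last value. Below is a minimal sketch that strips it, assuming the string is whitespace-separated as in the question. The function name keep_cookie_tokens and the shortened sample string are hypothetical, and appending = to each prefix is an added precaution so that a longer key sharing the same prefix is not picked up.

```python
def keep_cookie_tokens(text, keys):
    """Keep whitespace-separated tokens that start with one of the keys."""
    # str.startswith accepts a tuple of prefixes; '=' is appended so that
    # only exact key names match (an assumption, not in the original answer).
    prefixes = tuple(k + '=' for k in keys)
    # Strip the ';' or ',' that trails each token before re-joining.
    kept = [tok.rstrip(';,') for tok in text.split() if tok.startswith(prefixes)]
    return '; '.join(kept)

# Shortened, made-up sample in the same shape as the question's string.
sample = ('__cfduid=abc123; expires=Sat, 09-Mar-19 03:35:03 GMT; path=/; '
          'domain=.example.com; HttpOnly, _version=def456; path=/, '
          '__session=xyz%3D%3D--789; domain=.example.com path=/')

result = keep_cookie_tokens(sample, ('__cfduid', '_version', '__session'))
print(result)
```

This relies on every cookie pair being separated from its neighbours by whitespace; a value that itself contains spaces would be split across tokens and truncated.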