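Hi, I'm trying to search a file for a specific list of words. If one of the words is found, I want to add a new line below it containing the phrase /colour = 1 (I don't want to delete the original word I'm searching for).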
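For context, here is an extract from the file showing its format:
LOCUS       contig_2_pilon_pilon 5558986 bp   DNA linear BCT 16-JUN-2020
DEFINITION  Escherichia coli O157:H7 strain (270078)
ACCESSION
VERSION
KEYWORDS
SOURCE      Escherichia coli 270078
  ORGANISM  Escherichia coli 270078
            Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae; Escherichia.
COMMENT     Annotated using prokka 1.14.6 from https://github.com/tseemann/prokka.
     source          1..5558986
                     /organism="Escherichia coli 270078"
                     /mol_type=""
                     /strain="strain"
                     /db_xref="taxon:562"
     CDS             61523..61744
                     /gene="pspD"
                     /locus_tag="JCCJNNLA_00057"
                     /inference="ab initio prediction:Prodigal:002006"
                     /inference="similar to AA sequence:RefSeq:EG10779-MONOMER"
                     /codon_start=1
                     /transl_table=11
                     /product="peripheral membrane heat shock protein"
                     /translation="MNTRWQQAGQKVKPGFKLAGKLVLLTALRYGPAGVAGWAIKSVA
                     RRPLKMLLAVALEPLLSRAANKLAQRYKR"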
Here is one of the lists of words I'm looking for in the file:
regulation_list=["anti-repressor","anti-termination","antirepressor","antitermination","antiterminator","anti-terminator","cold-shock","cold shock","heat-shock","heat shock","regulation","regulator","regulatory","helicase","antibiotic resistance","repressor","zinc","sensor","dipeptidase","deacetylase","5-dehydrogenase","glucosamine kinase","glucosamine-kinase","dna-binding","dna binding","methylase","sulfurtransferase","acetyltransferase","control","ATP-binding","ATP binding","Cro","Ren protein","CII","inhibitor","activator","derepression","protein Sxy","sensing","sensor","Tir chaperone","Tir-cytoskeleton","Tir cytoskeleton","Tir protein","EspD"]
As you can see, the extract contains one of the phrases I'm looking for, and I'd like to add a new line below it with the phrase /colour = 1.
Any help would be great!
Posted on 2020-08-21 15:59:57
# Create simple input file for testing:
cat > foo.txt <<EOF
foo
foo anti-termination
bar anti-repressor anti-termination
baz
EOF
python -c '
import re
# Using a shortened version of your list:
regulation_list = ["anti-repressor", "anti-termination", "etc"]
# For speed and simplicity, compile the regular expression once, then reuse it later:
regulation_re = re.compile("|".join(regulation_list))
with open("foo.txt", "r") as in_file:
    for line in in_file:
        # Drop the trailing newline (and surrounding whitespace) before printing:
        line = line.strip()
        print(line)
        # Emit the extra line directly below any line that contains a listed term:
        if regulation_re.search(line):
            print("/colour = 1")
' > bar.txt
cat bar.txt
Prints:
foo
foo anti-termination
/colour = 1
bar anti-repressor anti-termination
/colour = 1
baz
You may need to add an extra newline and spaces to the /colour = 1 string so that it lines up (your question is not clear on this), like so:
print("\n /colour = 1")

https://stackoverflow.com/questions/63522909
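
As a possible refinement of the approach above, here is a minimal sketch that reuses the matched line's own indentation for the new line and only searches /product lines. It rests on several assumptions that are not stated in the question: that matching should ignore case (the list has "dna-binding" in lower case), that short terms such as "Cro" should match whole words only, and that lines other than /product (for example /translation) should never trigger a match. The function name `add_colour` and the shortened term list are illustrative only.

```python
import re

# Shortened, illustrative subset of the list from the question.
regulation_list = ["anti-repressor", "heat shock", "dna-binding", "Cro"]

# re.escape protects against regex metacharacters in a term; the \b anchors
# keep short terms such as "Cro" from matching inside longer words.
regulation_re = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in regulation_list) + r")\b",
    re.IGNORECASE,  # assumption: case should not matter
)


def add_colour(lines, qualifier="/colour = 1"):
    """Yield every input line; after a matching /product line, also yield
    the qualifier with the same leading whitespace."""
    for line in lines:
        line = line.rstrip("\n")
        yield line
        # Assumption: only /product qualifiers are searched.
        if "/product=" in line and regulation_re.search(line):
            indent = line[: len(line) - len(line.lstrip())]
            yield indent + qualifier


# Small in-memory demonstration; a file object can be passed in the same way.
sample = [
    "     CDS             61523..61744",
    '                     /product="peripheral membrane heat shock protein"',
    "                     /codon_start=1",
]
for out_line in add_colour(sample):
    print(out_line)
```

To process a real file, the same generator could be fed an open file handle and each yielded line written out with a trailing newline. One limitation of this sketch is that a /product value wrapped across several lines is only checked on its first line.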