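I am writing a program that prompts for a file name, then opens that file and reads through it, looking for lines of the form:
X-DSPAM-Confidence: 0.8475
I want to count these lines, extract the floating-point value from each one, and compute the average of those values. Could I get some help? I am just starting to program, so I need something very simple. Here is the code I have written so far.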
fname = raw_input("Enter file name: ")
if len(fname) == 0:
    fname = 'mbox-short.txt'
fh = open(fname,'r')
count = 0
total = 0
#Average = total/num of lines
for line in fh:
    if not line.startswith("X-DSPAM-Confidence:"): continue
    count = count+1
    print line

Posted on 2016-02-19 23:24:43
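Iterate over the file (using a context manager, "with", which closes the file automatically), look for those lines as you already do, and then read the values like this: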
fname = raw_input("Enter file name:")
if not fname:
    fname = "mbox-short.txt"
scores = []
with open(fname) as f:
    for line in f:
        if not line.startswith("X-DSPAM-Confidence:"):
            continue
        # Each matching line has two whitespace-separated fields: header and value
        _, score = line.split()
        scores.append(float(score))
print sum(scores)/len(scores)

Or, more compactly:
mean = lambda x: sum(x)/len(x)
with open(fname) as f:
    # Keep only the matching lines and convert their second field to float
    result = mean([float(l.split()[1]) for l in f if l.startswith("X-DSPAM-Confidence:")])

Posted on 2016-02-19 23:25:44
Try adding this line inside the loop, after the count is incremented:
total += float(line.split(' ')[1])
That way, total / count will give you the answer.
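For illustration, here is a minimal sketch of how that line might fit into the counting loop from the question. It is written in Python 3 syntax (print() as a function) rather than the Python 2 syntax used in the question. The function name average_confidence and the inline sample lines are hypothetical, added only so the sketch runs without a file; the guard for "no matching lines" is also an addition, not part of the original suggestion.

```python
PREFIX = "X-DSPAM-Confidence:"


def average_confidence(lines):
    """Return the mean of the values on matching lines, or None if none match.

    Hypothetical helper for illustration; `lines` can be an open file object
    or any iterable of strings.
    """
    count = 0
    total = 0.0
    for line in lines:
        if not line.startswith(PREFIX):
            continue
        count += 1
        # Second space-separated field is the number; float() tolerates
        # the trailing newline.
        total += float(line.split(' ')[1])
    # Avoid dividing by zero when the file has no matching lines.
    return total / count if count else None


# Made-up sample data standing in for the contents of a mail file.
sample = [
    "From: someone@example.com\n",
    "X-DSPAM-Confidence: 0.8475\n",
    "X-DSPAM-Confidence: 0.6178\n",
]
print(average_confidence(sample))
```

To use it on a real file, something like `with open(fname) as fh: print(average_confidence(fh))` should work, since a file object yields its lines one at a time.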
Posted on 2016-02-20 01:37:37
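A program like the one below should do what you need. If you need to change what the program looks for, just change the PATTERN variable to describe what you are trying to match. The code is written for Python 3.x, but it can easily be adapted to Python 2.x if needed.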
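As a quick way to see what PATTERN accepts before changing it, the sketch below applies the same regular expression used in this answer to a few made-up lines. The helper name extract and the sample strings are hypothetical and only serve the illustration.

```python
import re

# Same pattern as in this answer; the named group "float" captures the number.
PATTERN = r'X-DSPAM-Confidence:\s*(?P<float>[+-]?\d*\.\d+)'


def extract(line):
    """Return the captured value as a float, or None if the line does not match."""
    match = re.search(PATTERN, line, re.IGNORECASE)
    return float(match.group('float')) if match else None


print(extract('X-DSPAM-Confidence: 0.8475'))   # 0.8475
print(extract('x-dspam-confidence:.5'))        # 0.5 (case-insensitive, space optional)
print(extract('X-DSPAM-Probability: 0.0000'))  # None (different header)
print(extract('X-DSPAM-Confidence: 1'))        # None (pattern requires a decimal point)
```

Note that the pattern only matches numbers containing a decimal point, so a bare integer such as 1 is skipped; that is fine for this data but worth keeping in mind when adapting PATTERN.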
Program:
#! /usr/bin/env python3
import re
import statistics
import sys

PATTERN = r'X-DSPAM-Confidence:\s*(?P<float>[+-]?\d*\.\d+)'


def main(argv):
    """Calculate the average X-DSPAM-Confidence from a file."""
    filename = argv[1] if len(argv) > 1 else input('Filename: ')
    if filename in {'', 'default'}:
        filename = 'mbox-short.txt'
    print('Average:', statistics.mean(get_numbers(filename)))
    return 0


def get_numbers(filename):
    """Extract all X-DSPAM-Confidence values from the named file."""
    with open(filename) as file:
        for line in file:
            for match in re.finditer(PATTERN, line, re.IGNORECASE):
                yield float(match.groupdict()['float'])


if __name__ == '__main__':
    sys.exit(main(sys.argv))

If you like, you can also implement the get_numbers generator as follows.
Alternative:
def get_numbers(filename):
    """Extract all X-DSPAM-Confidence values from the named file."""
    with open(filename) as file:
        yield from (float(match.groupdict()['float'])
                    for line in file
                    for match in re.finditer(PATTERN, line, re.IGNORECASE))

https://stackoverflow.com/questions/35508764