Q: CSV reading and writing: output CSV is empty

Stack Overflow user

Asked 2021-11-11 16:55:43

Answers: 3, Views: 139, Followers: 0, Votes: 1

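My program needs a function that reads data from a csv file ("all.csv"), extracts all the data related to 'Virginia' (every row that contains 'Virginia'), and then writes the extracted data to another csv file named "Virginia.csv". The program runs without errors; however, when I open the "Virginia.csv" file, it is blank.

Here is the data in the all.csv file:

https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv

Here is my code: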

import csv

input_file = 'all.csv'
output_file = 'Virginia.csv'
state = 'Virginia'
mylist = []

def extract_records_for_state (input_file, output_file, state):
    with open(input_file, 'r') as infile:
        contents = infile.readlines()
        
        with open(output_file, 'w') as outfile:
            writer = csv.writer(outfile)
        
            for row in range(len(contents)):
                contents[row] = contents[row].split(',') #split elements
            
            for row in range(len(contents)):
                for word in range(len(contents[row])):
                
                    if contents[row][2] == state:
                        writer.writerow(row)
                
                
extract_records_for_state(input_file,output_file,state)

python

csv

3 Answers

Stack Overflow user

Answered 2021-11-11 17:05:58

I ran your code and it gave me an error:

Traceback (most recent call last):
  File "c:\Users\Dolimight\Desktop\Stack Overflow\Geraldo\main.py", line 27, in <module>
    extract_records_for_state(input_file, output_file, state)
  File "c:\Users\Dolimight\Desktop\Stack Overflow\Geraldo\main.py", line 24, in extract_records_for_state
    writer.writerow(row)
_csv.Error: iterable expected, not int

I fixed the error by passing the contents of the row, contents[row], to the writerow() function and ran it again; the data showed up in Virginia.csv. It gave me duplicate rows, so I also removed the "for word" loop.

import csv

input_file = 'all.csv'
output_file = 'Virginia.csv'
state = 'Virginia'
mylist = []


def extract_records_for_state(input_file, output_file, state):
    with open(input_file, 'r') as infile:
        contents = infile.readlines()

        with open(output_file, 'w') as outfile:
            writer = csv.writer(outfile)

            for row in range(len(contents)):
                contents[row] = contents[row].split(',')  # split elements

            print(contents)

            for row in range(len(contents)):
                if contents[row][2] == state:
                    writer.writerow(contents[row]) # this is what I changed


extract_records_for_state(input_file, output_file, state)
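
To see the error from the traceback in isolation, here is a minimal sketch that uses an in-memory buffer instead of Virginia.csv. The names buffer, rows, index_error and written are made up for this illustration and are not part of the original code.

```python
import csv
import io

# In-memory stand-in for the output file (an assumption for this sketch).
buffer = io.StringIO()
writer = csv.writer(buffer)

# One sample row, already split into fields.
rows = [["2020-03-09", "Fairfax", "Virginia", "51059", "4", "0"]]

try:
    # Passing a bare index, as the question's code does with `row`.
    writer.writerow(0)
    index_error = None
except csv.Error as exc:
    # csv.writer expects an iterable of fields, not an int.
    index_error = str(exc)

# Passing the row itself, as in the fix above.
writer.writerow(rows[0])
written = buffer.getvalue()

print(index_error)
print(repr(written))
```

Only the second writerow() call puts anything into the buffer; the first one fails before writing.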

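Votes: 2

Stack Overflow user

Answered 2021-11-11 17:14:14

You have two errors. The first is that you try to write the row index at writer.writerow(row); the row itself is contents[row]. The second is that you leave the newline in the last column when reading but don't strip it when writing. Instead, you could make fuller use of the csv module. Let the reader parse the rows. And rather than reading everything into a list, which uses a lot of memory, filter and write row by row.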

import csv

input_file = 'all.csv'
output_file = 'Virginia.csv'
state = 'Virginia'
mylist = []

def extract_records_for_state (input_file, output_file, state):
    with open(input_file, 'r', newline='') as infile, \
            open(output_file, 'w', newline="") as outfile:
        reader = csv.reader(infile)
        writer = csv.writer(outfile)
        # add header
        writer.writerow(next(reader))
        # filter for state
        writer.writerows(row for row in reader if row[2] == state)

extract_records_for_state(input_file,output_file,state)
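
The second error (the newline left in the last column) is easy to miss, so here is a small sketch of it on a two-line in-memory sample. The sample text and the variable names are assumptions for illustration only.

```python
import csv
import io

# Tiny stand-in for all.csv (assumed sample, same column layout).
sample = (
    "date,county,state,fips,cases,deaths\n"
    "2020-03-09,Fairfax,Virginia,51059,4,0\n"
)

# Manual splitting keeps the line ending inside the last field.
manual_rows = [line.split(",") for line in io.StringIO(sample).readlines()]

# csv.reader consumes the line ending while parsing.
parsed_rows = list(csv.reader(io.StringIO(sample, newline="")))

# Writing a manually split row: the writer quotes the field that
# contains "\n" and then appends its own line terminator.
out = io.StringIO()
csv.writer(out).writerow(manual_rows[1])
manual_output = out.getvalue()

print(repr(manual_rows[1][-1]))   # last field from split(',')
print(repr(parsed_rows[1][-1]))   # last field from csv.reader
print(repr(manual_output))
```

With the manual split, the last column ends up as a quoted field with an embedded newline in the output file, which is why letting csv.reader do the parsing is the cleaner route.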

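Votes: 1

Stack Overflow user

Answered 2021-11-11 20:38:47

Two things jumped out at me when I looked at your code:

I see a lot of nested statements (logic).
I see that you read the CSV as plain text and then interpret it as CSV yourself (contents[row] = contents[row].split(',')).

I recommend two things:

Break the logic into distinct chunks: all that nesting can be hard to interpret and debug; do one thing and prove it works; do another thing and prove it works.
Use the CSV API to its fullest: use it to both read and write your CSV.

Rather than try to replicate/fix your code, I'm offering this general approach that achieves both goals: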

import csv

# Read in
all_rows = []
with open('all.csv', 'r', newline='') as f:
    reader = csv.reader(f)
    next(reader)  # discard header (I didn't see you keep it)

    for row in reader:
        all_rows.append(row)

# Process
filtered_rows = []
for row in all_rows:
    if row[2] == 'Virginia':
        filtered_rows.append(row)

# Write out
with open('filtered.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(filtered_rows)

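Once you understand the logic and the API of those discrete steps, you can move on to writing something a bit more sophisticated, like the following, which reads a row, decides whether it should be written, and if so, writes it: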

import csv

with open('filtered.csv', 'w', newline='') as f_out:
    writer = csv.writer(f_out)

    with open('all.csv', 'r', newline='') as f_in:
        reader = csv.reader(f_in)
        next(reader) # discard header

        for row in reader:
            if row[2] == 'Virginia':
                writer.writerow(row)

Using either of these two pieces of code on this (really scaled-down) sample of all.csv:

date,county,state,fips,cases,deaths
2020-03-09,Fairfax,Virginia,51059,4,0
2020-03-09,Virginia Beach city,Virginia,51810,1,0
2020-03-09,Chelan,Washington,53007,1,1
2020-03-09,Clark,Washington,53011,1,0

gives me a filtered.csv that looks like this:

2020-03-09,Fairfax,Virginia,51059,4,0
2020-03-09,Virginia Beach city,Virginia,51810,1,0

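Given the size of this data set, the second approach, writing on demand inside the read loop, is both faster (about 5x faster on my machine) and uses significantly less memory (about 40x less on my machine), because there is no intermediate storage in all_rows.

But please take the time to run both, read them closely, and see how they work.

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link: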
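
If you would rather keep the header and filter by column name instead of by position, the same row-at-a-time idea could be sketched with csv.DictReader and csv.DictWriter. This is only a sketch under assumptions: filter_by_column is a hypothetical helper name, and the in-memory sample stands in for the real files.

```python
import csv
import io


def filter_by_column(infile, outfile, column, value):
    """Copy the header plus every row whose `column` equals `value`.

    Hypothetical helper for illustration; returns the number of rows kept.
    """
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames)
    writer.writeheader()
    kept = 0
    for row in reader:          # one row in memory at a time
        if row[column] == value:
            writer.writerow(row)
            kept += 1
    return kept


# Scaled-down sample from above, held in memory instead of all.csv.
sample = io.StringIO(
    "date,county,state,fips,cases,deaths\n"
    "2020-03-09,Fairfax,Virginia,51059,4,0\n"
    "2020-03-09,Virginia Beach city,Virginia,51810,1,0\n"
    "2020-03-09,Chelan,Washington,53007,1,1\n"
    "2020-03-09,Clark,Washington,53011,1,0\n",
    newline="",
)
result = io.StringIO(newline="")
count = filter_by_column(sample, result, "state", "Virginia")
print(count)
print(result.getvalue())
```

With real files you would pass the two handles opened with newline='' in place of the StringIO objects.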

https://stackoverflow.com/questions/69932065
