Filtering data by keyword from a csv file

Stack Overflow user
Asked on 2018-04-15 05:49:03
Answers: 1 · Views: 87 · Followers: 0 · Votes: 0

I'm trying to filter data out of a csv file and organize it like this:

0 AIG,10,,,,Yes,,,Jr,,,MS,,
1 Baylor College of Medicine,19,Yes,Yes,,,,,,,,,,Recent
2 CGG,17,Yes,Yes,,,,,,,,MS,PhD,Recent
3 Citi,27/28,Yes,,,Yes,,,Jr,Sr,,,,
4 ExxonMobil,11,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,,,
5 Flow-Cal Inc.,16,Yes,,,Yes,,,Jr,Sr,,,,All
6 Global Shop Solutions,18,Yes,,,Yes,,,,Sr,PB,,,All
7 Harris County CTS,22,Yes,,,Yes,,,Jr,Sr,PB,MS,PhD,All
8 HCSS,29,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,Recent
9 Hitachi Consulting,13,Yes,,,,,,,Sr,,MS,,
10 HP Inc.,1,Yes,,,Yes,,,Jr,,,MS,,Recent
11 INT Inc.,20,Yes,Yes,,Yes,,,Jr,Sr,,MS,PhD,
12 JPMorgan Chase & Co,3,Yes,,,Yes,,,Jr,Sr,,,,
13 Leidos,390,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,
14 McKesson,26,Yes,,,,,,,Sr,,,,
15 MRE Consulting Ltd.,2,Yes,,,,,,,Sr,PB,MS,,All
16 NetIQ,7,,,,Yes,,Soph,Jr,Sr,PB,,,
17 PROS,21,Yes,,,,,,,Sr,,MS,PhD,All
18 San Jacinto College ,14,,,,Yes,,Soph,Jr,Sr,PB,MS,,
19 SAS,4,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,Recent
20 Smartbridge,8,Yes,,,,,,,Sr,PB,MS,,
21 Sogeti USA,15,Yes,,,,,,,Sr,PB,MS,,
22 Southwest Research Institute,12,Yes,,,Yes,,,Jr,Sr,PB,MS,PhD,All
23 The Reynolds and Reynolds Company,23,Yes,Yes,,Yes,Fr,Soph,Jr,Sr,PB,,,All
24 UH Enterprise Systems,9,Yes,Yes,Yes,Yes,Fr,Soph,Jr,Sr,PB,MS,PhD,All
25 U.S. Marine Corps,25,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,All
26 ValuD Consuting LLC,5,Yes,,,,,,,Sr,PB,,,All
27 Wipro,24,Yes,,,,,,,Sr,PB,,,

However, my code currently gives me this:

0 AIG,10,,,,Yes,,,Jr,,,MS,,
1 Baylor�College�of�Medicine,19,Yes,Yes,,,,,,,,,,Recent
2 CGG,17,Yes,Yes,,,,,,,,MS,PhD,Recent
3 Citi,27/28,Yes,,,Yes,,,Jr,Sr,,,,
4 ExxonMobil,11,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,,,
5 HCSS,29,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,Recent
6 Leidos,390,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,
7 McKesson,26,Yes,,,,,,,Sr,,,,
8 NetIQ,7,,,,Yes,,Soph,Jr,Sr,PB,,,
9 PROS,21,Yes,,,,,,,Sr,,MS,PhD,All
10 SAS,4,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,Recent
11 Smartbridge,8,Yes,,,,,,,Sr,PB,MS,,
12 Wipro,24,Yes,,,,,,,Sr,PB,,,
13 SAS,4,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,Recent
14 NetIQ,7,,,,Yes,,Soph,Jr,Sr,PB,,,
15 Smartbridge,8,Yes,,,,,,,Sr,PB,MS,,
16 AIG,10,,,,Yes,,,Jr,,,MS,,
17 ExxonMobil,11,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,,,
18 CGG,17,Yes,Yes,,,,,,,,MS,PhD,Recent
19 Baylor�College�of�Medicine,19,Yes,Yes,,,,,,,,,,Recent
20 PROS,21,Yes,,,,,,,Sr,,MS,PhD,All
21 Wipro,24,Yes,,,,,,,Sr,PB,,,
22 McKesson,26,Yes,,,,,,,Sr,,,,
23 Citi,27/28,Yes,,,Yes,,,Jr,Sr,,,,
24 HCSS,29,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,Recent
25 Leidos,30,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,

As you can see, it seems to repeat some of the keywords I used. I'll post my code below.

# I made a dictionary of the problem stated
company_dict = {0:"Company", 1:"Booth",
                2:"Full-Time", 3:"Full-Time Visa Sponsor",
                4:"Part-Time", 5:"Internship",
                6:"Freshman", 7:"Sophomore",
                8:"Junior", 9:"Senior",
                10:"Post-Bacs", 11:"MS",
                12:"PhD", 13:"Alumni"}

#Loop to organize the company_dict
for lines in company_dict:
    print(repr(lines),company_dict[lines])

keywords = ("AIG","Baylor","CGG","Citi","ExxonMobil","Flow-Cal Inc.",
           "Global SHop Solutions","Harris Count CTS","HCSS",
           "Hitachi Consulting", "HP Inc.","INT Inc.","JPMorgan Chase & Co",
           "Leidos","McKesson","MRE Consulting Ltd.","NetIQ","PROS",
           "San Jacinto College","SAS","Smartbridge","Sogeti USA",
           "Southwest Research Institute","The Reynolds and Reynolds Company",
           "UH Enterprise Systems","U.S. Marine Corps","ValuD Consuting LLC","Wipro")

with f as filterf:
    output_line_counter = 0
    for line in filterf:
        if any(keyword in line for keyword in keywords):
            print(output_line_counter, line.strip())
            output_line_counter += 1

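All of this comes from a csv file included with the assignment. I think I'm on the right track, but I don't understand why my code produces duplicates and also misses some of the "keywords" I asked it to search for.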

I'll include the csv data below.

ALPHABETICAL ORDER,,,,,,,,,,,,,
,,Positions,,,,Classifications,,,,,,,
Company,Booth,Full-Time,"Full-Time Visa Sponsor",Part-Time,Internship,Freshman,Sophomore,Junior,Senior,Post-Bacs,MS,PhD,Alumni
AIG,10,,,,Yes,,,Jr,,,MS,,
Baylor�College�of�Medicine,19,Yes,Yes,,,,,,,,,,Recent
CGG,17,Yes,Yes,,,,,,,,MS,PhD,Recent
Citi,27/28,Yes,,,Yes,,,Jr,Sr,,,,
ExxonMobil,11,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,,,
,...
Flow-Cal�Inc.,16,Yes,,,Yes,,,Jr,Sr,,,,All
Global�Shop�Solutions,18,Yes,,,Yes,,,,Sr,PB,,,All
Harris�County�CTS,22,Yes,,,Yes,,,Jr,Sr,PB,MS,PhD,All
HCSS,29,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,Recent
Hitachi�Consulting,13,Yes,,,,,,,Sr,,MS,,
HP�Inc.,1,Yes,,,Yes,,,Jr,,,MS,,Recent
INT�Inc.,20,Yes,Yes,,Yes,,,Jr,Sr,,MS,PhD,
JPMorgan�Chase�&�Co,3,Yes,,,Yes,,,Jr,Sr,,,,
Leidos,390,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,
McKesson,26,Yes,,,,,,,Sr,,,,
,,,,,,,,,,,,,
MRE�Consulting�Ltd.,2,Yes,,,,,,,Sr,PB,MS,,All
NetIQ,7,,,,Yes,,Soph,Jr,Sr,PB,,,
PROS,21,Yes,,,,,,,Sr,,MS,PhD,All
San�Jacinto�College��,14,,,,Yes,,Soph,Jr,Sr,PB,MS,,
SAS,4,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,Recent
Smartbridge,8,Yes,,,,,,,Sr,PB,MS,,
Sogeti�USA,15,Yes,,,,,,,Sr,PB,MS,,
Southwest�Research�Institute,12,Yes,,,Yes,,,Jr,Sr,PB,MS,PhD,All
The�Reynolds�and�Reynolds�Company,23,Yes,Yes,,Yes,Fr,Soph,Jr,Sr,PB,,,All
UH�Enterprise�Systems,9,Yes,Yes,Yes,Yes,Fr,Soph,Jr,Sr,PB,MS,PhD,All
U.S.�Marine�Corps,25,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,All
ValuD�Consuting�LLC,5,Yes,,,,,,,Sr,PB,,,All
Wipro,24,Yes,,,,,,,Sr,PB,,,
BOOTH ORDER,,,,,,,,,,,,,
,Booth,Positions,,,,Classifications,,,,,,,
Company,#,Full-Time,"Full-Time
Visa Sponsor",Part-Time,Internship,Freshman,Sophomore,Junior,Senior,Post-Bacs,MS,PhD,Alumni
HP�Inc.,1,Yes,,,Yes,,,Jr,,,MS,,Recent
"MRE�Consulting,�Ltd.",2,Yes,,,,,,,Sr,PB,MS,,All
JPMorgan�Chase�&�Co,3,Yes,,,Yes,,,Jr,Sr,,,,
SAS,4,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,Recent
ValuD�Consuting�LLC,5,Yes,,,,,,,Sr,PB,,,All
NetIQ,7,,,,Yes,,Soph,Jr,Sr,PB,,,
Smartbridge,8,Yes,,,,,,,Sr,PB,MS,,
UH�Enterprise�Systems,9,Yes,Yes,Yes,Yes,Fr,Soph,Jr,Sr,PB,MS,PhD,All
AIG,10,,,,Yes,,,Jr,,,MS,,
ExxonMobil,11,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,,,
Southwest�Research�Institute,12,Yes,,,Yes,,,Jr,Sr,PB,MS,PhD,All
Hitachi�Consulting,13,Yes,,,,,,,Sr,,MS,,
San�Jacinto�College��,14,,,,Yes,,Soph,Jr,Sr,PB,MS,,
Sogeti�USA,15,Yes,,,,,,,Sr,PB,MS,,
"Flow-Cal,�Inc.",16,Yes,,,Yes,,,Jr,Sr,,,,All
CGG,17,Yes,Yes,,,,,,,,MS,PhD,Recent
Global�Shop�Solutions,18,Yes,,,Yes,,,,Sr,PB,,,All
Baylor�College�of�Medicine,19,Yes,Yes,,,,,,,,,,Recent
"INT,�Inc.",20,Yes,Yes,,Yes,,,Jr,Sr,,MS,PhD,
PROS,21,Yes,,,,,,,Sr,,MS,PhD,All
Harris�County�CTS,22,Yes,,,Yes,,,Jr,Sr,PB,MS,PhD,All
The�Reynolds�and�Reynolds�Company,23,Yes,Yes,,Yes,Fr,Soph,Jr,Sr,PB,,,All
Wipro,24,Yes,,,,,,,Sr,PB,,,
U.S.�Marine�Corps,25,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,All
McKesson,26,Yes,,,,,,,Sr,,,,
Citi,27/28,Yes,,,Yes,,,Jr,Sr,,,,
HCSS,29,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,Recent
Leidos,30,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,

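I think it has something to do with the question marks in boxes in the csv file, but I'm not sure. I want to search the csv file for the given keywords and print each matching row. Any comments or suggestions are greatly appreciated :)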


1 Answer

Stack Overflow user

Posted on 2018-04-15 09:25:07

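The answer is that the csv file itself just needed to be modified (hopefully that is acceptable for the project; it had strange UTF errors).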
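If editing the file by hand is not an option, the same cleanup could probably be done in code. The sketch below assumes the boxed question marks are non-breaking spaces (or replacement characters left behind by a failed decode), which is a guess based on where they appear in the data, not something the question confirms. The helper names `normalize_line` and `filter_lines` are made up for this illustration.

```python
import io


def normalize_line(line: str) -> str:
    """Turn non-breaking spaces and U+FFFD replacement characters into plain spaces."""
    return line.replace("\u00a0", " ").replace("\ufffd", " ")


def filter_lines(lines, keywords):
    """Yield each normalized line that contains at least one keyword."""
    for line in lines:
        clean = normalize_line(line)
        if any(keyword in clean for keyword in keywords):
            yield clean.rstrip("\n")


# In-memory stand-in for the real file. With the real file, something like
# open(path, encoding="latin-1") might be needed; that encoding is an assumption.
sample = io.StringIO(
    "Baylor\u00a0College\u00a0of\u00a0Medicine,19,Yes\n"
    "HP\ufffdInc.,1,Yes\n"
    ",,,\n"
)
matches = list(filter_lines(sample, ("Baylor College", "HP Inc.")))
for number, line in enumerate(matches):
    print(number, line)
```

With the odd characters normalized first, keywords that contain ordinary spaces, such as "HP Inc.", can match rows that were previously skipped.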

I also added the following code:

# f is the already-opened csv file and keywords is the tuple from the question.
DataList = []
with f as filterf:
    for line in filterf:
        if any(keyword in line for keyword in keywords):
            DataList.append(line)

# set() drops lines that are exact duplicates; sorted() puts them back in order.
CleanerData = sorted(set(DataList))
for line_counter, i in enumerate(CleanerData, start=1):
    print(line_counter, i, end='')
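One caveat with `sorted(set(DataList))` is that it only removes lines that are identical character for character. In the posted data, Leidos appears with booth 390 in the alphabetical section and booth 30 in the booth-order section, so both lines would survive. A possible alternative, sketched here under the assumption that the company name in the first column identifies a row, is to parse with the standard `csv` module and keep the first row seen for each name. The function name `unique_company_rows` is hypothetical.

```python
import csv
import io


def unique_company_rows(lines, keywords):
    """Return the first parsed row for each company whose name contains a keyword."""
    seen = set()
    rows = []
    for row in csv.reader(lines):
        if not row:
            continue  # skip completely empty lines
        name = row[0].strip()
        if name in seen:
            continue  # this company was already kept from an earlier section
        if any(keyword in name for keyword in keywords):
            seen.add(name)
            rows.append(row)
    return rows


# Small in-memory sample that mimics the two sections of the posted file.
sample = io.StringIO(
    "ALPHABETICAL ORDER,,\n"
    "Leidos,390,Yes\n"
    "SAS,4,Yes\n"
    "BOOTH ORDER,,\n"
    "SAS,4,Yes\n"
    "Leidos,30,Yes\n"
)
rows = unique_company_rows(sample, ("Leidos", "SAS"))
for number, row in enumerate(rows):
    print(number, ",".join(row))
```

Matching only against the first column also avoids accidental hits when a keyword happens to appear in another field. Names that are spelled differently between the two sections, such as "Flow-Cal Inc." and "Flow-Cal, Inc.", would still count as separate companies in this sketch.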
Votes: 0
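The original content of this page is provided by Stack Overflow. Translation support is provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link: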

https://stackoverflow.com/questions/49836505
