I am working on a requirement that involves two CSV files, shown below.
CSV1.csv
Short Description                                           Category
Device is DOWN!                                             Server Down
CPU Warning Monitoron XSSXSXSXSXSX.com                      CPU Utilization
CPU Warning Monitoron XSSXSXSXSXSX.com                      CPU Utilization
CPU Warning Monitoron XSSXSXSXSXSX.com                      CPU Utilization
CPU Warning Monitoron XSSXSXSXSXSX.com                      CPU Utilization
Device Performance Alerts was triggered on Physical memory  Memory Utilization
Device Performance Alerts was triggered on Physical memory  Memory Utilization
Device Performance Alerts was triggered on Physical memory  Memory Utilization
Disk Space Is Lowon ;E:                                     Disk Space Utilization
Disk Space Is Lowon;C:                                      Disk Space Utilization
Network Interface Down                                      Interface Down
and reference.csv
Category                     Complexity
Server Down                  Simple
Network Interface down       Complex
Drive Cleanup Windows        Medium
CPU Utilization              Medium
Memory Utilization           Medium
Disk Space Utilization Unix  Simple
Windows Service Restart      Medium
UNIX Service Restart         Medium
Web Tomcat Instance Restart  Simple
Expected Output
Short Description                                           Category                Complexity
Device is DOWN!                                             Server Down             Simple
CPU Warning Monitoron XSSXSXSXSXSX.com                      CPU Utilization         Medium
CPU Warning Monitoron XSSXSXSXSXSX.com                      CPU Utilization         Medium
CPU Warning Monitoron XSSXSXSXSXSX.com                      CPU Utilization         Medium
CPU Warning Monitoron XSSXSXSXSXSX.com                      CPU Utilization         Medium
Device Performance Alerts was triggered on Physical memory  Memory Utilization      Medium
Device Performance Alerts was triggered on Physical memory  Memory Utilization      Medium
Device Performance Alerts was triggered on Physical memory  Memory Utilization      Medium
Disk Space Is Lowon ;E:                                     Disk Space Utilization  Medium
Disk Space Is Lowon;C:                                      Disk Space Utilization  Medium
Network Interface Down                                      Interface Down          Complex

Now I need to take each 'Category' value from CSV1.csv, look for all possible matches in the Category column of reference.csv, fetch the corresponding 'Complexity' from reference.csv, and place it next to the matching category in CSV1.csv.
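I am using find.all to achieve this, but I cannot get it to work as expected. Is there a better way to achieve the same thing?
I also tried using the dict function, but it did not give the expected result.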
Posted on 2021-03-25 10:44:30
One possible approach:
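# reference_df and CSV1_df are assumed to be pandas DataFrames already
# loaded from reference.csv and CSV1.csv.
# Map each reference category to its complexity.
my_dict = dict(zip(reference_df['Category'].values, reference_df['Complexity'].values))

def match_key(key, default_value):
    # Return the complexity of the first reference category that contains,
    # or is contained in, the given category (case-sensitive).
    for d_key in my_dict.keys():
        if key in d_key or d_key in key:
            return my_dict[d_key]
    return default_value

CSV1_df['Complexity'] = CSV1_df['Category'].apply(lambda x: match_key(x, 'default'))

Explanation: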
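1. Build a dict by zipping the Category and Complexity columns of the reference DataFrame, i.e. {'Server Down': 'Simple', 'Network Interface down': 'Complex', ...}.
2. Use apply with a lambda function to pass each Category value from the CSV1 DataFrame as the key.
3. Define a function that checks whether that Category value is a substring of any key in the dictionary, or vice versa, and use it in apply.
4. Store the result in a new column of the CSV1 DataFrame.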
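
The `in` test on strings is case-sensitive, so 'Interface Down' from CSV1.csv would not match 'Network Interface down' in reference.csv as written. Below is a minimal, self-contained sketch of the same substring idea with both sides lower-cased first. The sample rows are a subset of the tables in the question, and `lookup` and `match_complexity` are hypothetical names introduced only for this illustration.

```python
import pandas as pd

# A few rows copied from the question's tables (subset, for brevity).
CSV1_df = pd.DataFrame({
    "Short Description": [
        "Device is DOWN!",
        "CPU Warning Monitoron XSSXSXSXSXSX.com",
        "Network Interface Down",
    ],
    "Category": ["Server Down", "CPU Utilization", "Interface Down"],
})
reference_df = pd.DataFrame({
    "Category": ["Server Down", "Network Interface down", "CPU Utilization"],
    "Complexity": ["Simple", "Complex", "Medium"],
})

# Lower-cased, stripped reference categories mapped to their complexity.
lookup = {
    str(cat).strip().lower(): complexity
    for cat, complexity in zip(reference_df["Category"], reference_df["Complexity"])
}

def match_complexity(category, default_value="default"):
    """Return the complexity of the first reference category that contains,
    or is contained in, `category`, ignoring case (hypothetical helper)."""
    key = str(category).strip().lower()
    for ref_key, complexity in lookup.items():
        if key in ref_key or ref_key in key:
            return complexity
    return default_value

CSV1_df["Complexity"] = CSV1_df["Category"].apply(match_complexity)
print(CSV1_df)
```

The first matching reference row wins, so the order of reference.csv matters. For instance, 'Disk Space Utilization' would pick up 'Disk Space Utilization Unix' (Simple) from the full reference table, whereas the expected output in the question lists Medium for those rows, so that mapping may need an explicit rule.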
https://stackoverflow.com/questions/66796929