Situation
I want to find the rows and columns of a .csv file that contain '?', and show their positions (in a df or a list).
Sample code
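import pandas as pd

df = pd.read_csv('../test.csv', sep='|', skiprows=1)
# Select the text (object-dtype) columns
find_non_ascii = df.select_dtypes(object)
# Replace every non-ASCII character with '?'
df[find_non_ascii.columns] = find_non_ascii.apply(
    lambda x: x.str.encode("ascii", "replace").str.decode("ascii")
)
df2 = df[find_non_ascii.columns]
cols = df2.columns  # assumed: `cols` was not defined in the original snippet
quest = '\\?'  # escaped, because str.contains treats the pattern as a regex
lster = []
try:
    for col in cols:
        # Rows whose value in this column contains '?'
        df3 = df2.loc[df2[col].str.contains(quest, na=False)]
        # Note: items() returns a generator, which is always truthy,
        # so empty frames are appended as well
        if df3.items():
            lster.append(df3)
except Exception as e:
    print(e)
print(lster)
Output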
[    NAME  EARPHONES  MODEL_NUMBER   ID  CAR
0  d?fgh       ?g?s           s-s  d?d  asd,
     NAME  EARPHONES  MODEL_NUMBER   ID  CAR
0  d?fgh       ?g?s           s-s  d?d  asd
1    dfg         A?           NaN   af    a,
 Empty DataFrame
Columns: [NAME, EARPHONES, MODEL_NUMBER, ID, CAR]
Index: [],
     NAME  EARPHONES  MODEL_NUMBER   ID  CAR
0  d?fgh       ?g?s           s-s  d?d  asd,
 Empty DataFrame
Columns: [NAME, EARPHONES, MODEL_NUMBER, ID, CAR]
Index: []]

Posted on 2021-11-24 22:53:50
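You can use df.apply to create a mask (True/False) for each row. In this case df would most likely be df2, but I have kept it as df for simplicity.
# the isinstance check skips NaN and other non-string cells
df['mask'] = df.apply(lambda x: any(isinstance(x[col], str) and '?' in x[col] for col in df.columns), axis=1)
You can then use this mask to filter the data and show only the rows where the mask is True (rows that contain a '?').
df.loc[df['mask']]
And to drop the mask column from the displayed result: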
df.loc[df['mask'], df.columns[:-1]]
https://stackoverflow.com/questions/70102769
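The answer above returns the matching rows, while the question also asks where each '?' sits. A minimal sketch of one way to get both, without adding a helper column to the frame, might look like the following. The sample data and the function name `find_marker` are hypothetical, made up to resemble the question's output; they do not come from the original CSV or from pandas itself.

```python
import pandas as pd

# Hypothetical sample data resembling the question's output (not the real CSV).
df = pd.DataFrame(
    {
        "NAME": ["d?fgh", "dfg", "abc"],
        "EARPHONES": ["?g?s", "A?", "xyz"],
        "MODEL_NUMBER": ["s-s", None, "m-1"],
        "ID": ["d?d", "af", "id3"],
        "CAR": ["asd", "a", "bmw"],
    }
)


def find_marker(frame, marker="?"):
    """Return (rows containing marker, list of (row label, column name) hits).

    Hypothetical helper for illustration only.
    """
    # Cell-wise boolean frame: True where the cell is a string containing
    # the marker. The isinstance check makes NaN/None cells count as False.
    hits = frame.apply(
        lambda col: col.map(lambda v: isinstance(v, str) and marker in v)
    ).astype(bool)

    # Rows with at least one hit, selected without a temporary 'mask' column.
    rows = frame.loc[hits.any(axis=1)]

    # Walk the boolean frame row by row and collect the position of each hit.
    positions = [
        (row_label, col_name)
        for row_label, row in hits.iterrows()
        for col_name, hit in row.items()
        if hit
    ]
    return rows, positions


rows, positions = find_marker(df)
print(rows)
print(positions)
```

Because the check uses the plain `in` operator rather than a regular expression, the '?' does not need escaping here. For the real data, `df` would be the frame read from the CSV after the non-ASCII replacement step.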