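I want to loop through a DataFrame column and convert it into a processed Series.
Here is an example DataFrame with two rows and four columns: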
import pandas as pd
from rapidfuzz import process as process_rapid
from rapidfuzz import utils as rapid_utils
data = [
    ['r/o ac. nephritis. /. nephrotic syndrome', ' ac. nephritis. /. nephrotic syndrome', 1, 'ac nephritis nephrotic syndrome'],
    ['sternocleidomastoid contracture', 'sternocleidomastoid contracture', 0, 'NA'],
]
# Create the pandas DataFrame
df_diagnosis = pd.DataFrame(data, columns=['diagnosis_name', 'diagnosis_name_edited', 'is_spell_corrected', 'spell_corrected_value'])
I would like to use the spell_corrected_value column if the is_spell_corrected column is greater than 1.
Currently I have the following code, which uses the diagnosis_name_edited column directly. How can I add an if-else/lambda check on the is_spell_corrected column?
# Generator of normalized strings, one per diagnosis
unmapped_diag_series = (rapid_utils.default_process(d) for d in df_diagnosis['diagnosis_name_edited'].astype(str))
unmapped_processed_diagnosis = pd.Series(unmapped_diag_series)
Thanks.
Posted on 2022-04-07 07:32:57
If I understand correctly, you can try this quick solution using numpy.where:
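import numpy as np
# Take spell_corrected_value where the flag is greater than 1, otherwise diagnosis_name_edited
df_diagnosis['new_column'] = np.where(df_diagnosis['is_spell_corrected'] > 1, df_diagnosis['spell_corrected_value'], df_diagnosis['diagnosis_name_edited'])
https://stackoverflow.com/questions/71778017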
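For reference, here is a minimal end-to-end sketch of how the numpy.where selection could feed the normalization step from the question. Two things in it are assumptions rather than part of the original posts. First, the condition is written as `== 1`, because the sample data only contains the flags 0 and 1, so a `> 1` check would never pick the corrected value for these rows; adjust it to whatever the flag means in your data. Second, the `normalize` fallback is a hypothetical, simplified stand-in for `rapid_utils.default_process`, included only so the sketch still runs when rapidfuzz is not installed.

```python
import re

import numpy as np
import pandas as pd

try:
    from rapidfuzz import utils as rapid_utils
    normalize = rapid_utils.default_process
except ImportError:
    # Hypothetical simplified stand-in (not rapidfuzz's implementation):
    # lowercase, turn runs of other characters into one space, trim.
    def normalize(text):
        return re.sub(r"[^0-9a-z]+", " ", text.lower()).strip()

data = [
    ['r/o ac. nephritis. /. nephrotic syndrome', ' ac. nephritis. /. nephrotic syndrome', 1, 'ac nephritis nephrotic syndrome'],
    ['sternocleidomastoid contracture', 'sternocleidomastoid contracture', 0, 'NA'],
]
df_diagnosis = pd.DataFrame(
    data,
    columns=['diagnosis_name', 'diagnosis_name_edited', 'is_spell_corrected', 'spell_corrected_value'],
)

# Assumption: a flag of 1 marks a spell-corrected row (the answer above uses > 1).
use_corrected = df_diagnosis['is_spell_corrected'] == 1

# Pick the corrected value where flagged, otherwise the edited name.
selected = pd.Series(
    np.where(
        use_corrected,
        df_diagnosis['spell_corrected_value'],
        df_diagnosis['diagnosis_name_edited'],
    ),
    index=df_diagnosis.index,
)

# Normalize every selected string, as the generator in the question does.
unmapped_processed_diagnosis = selected.astype(str).map(normalize)
```

Building the selected Series first keeps the row-wise choice vectorized, so no per-row if-else or lambda is needed before the strings are normalized.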