I have a function that processes a string, and I apply it to a DataFrame column.

import pandas as pd
import numpy as np

def test_upper(d):
    return d.upper()

def mainfunc():
    df = pd.read_csv("file.csv", sep='\t', encoding='utf-8')
    print(df.head())
    lambdafunc = lambda x: test_upper(x)
    df['upper_cols'] = df['cols'].apply(lambdafunc)
    print(df.head())

mainfunc()

Now I want to do the same thing, but using multiprocessing.Pool. I searched Stack Overflow for how to do this, and this is what I came up with:

import pandas as pd
import numpy as np
import multiprocessing as mp

def test_upper(d):
    return d.upper()

def mainfunc():
    df = pd.read_csv("file.csv", sep='\t', encoding='utf-8')
    print(df.head())
    lambdafunc = lambda x: test_upper(x)
    list_results = pd.Series()

    def log_result(result):
        list_results.append(result)

    pool = mp.Pool(processes=4)
    pool.apply_async(lambdafunc, (df['cols'], ), callback=log_result)
    pool.close()
    pool.join()
    print(list_results)

mainfunc()

The result is an empty Series (or an empty list; I have tried both). What am I doing wrong here? Thanks!

Posted on 2021-08-12 06:56:37
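Finally figured it out. The worker has to be a module-level function, because a lambda cannot be pickled and sent to a worker process, and it has to handle the whole Series, because apply_async passes the column as a single argument: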
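
import multiprocessing as mp
import pandas as pd

def test_upper(d):
    # d is the whole column (a pandas Series), so upper-case each element
    output = d.apply(lambda x: x.upper())
    return output

def mainfunc():
    df = pd.read_csv("file.csv", sep='\t', encoding='utf-8')
    print(df.head())
    pool = mp.Pool(processes=4)
    # pass a module-level function; the Series is pickled and sent to one worker
    result = pool.apply_async(test_upper, (df['cols'], ))
    pool.close()
    pool.join()
    # get() returns the worker's result, or re-raises its exception
    print(result.get())

# guard needed on platforms that spawn workers (Windows, macOS)
if __name__ == "__main__":
    mainfunc()

https://stackoverflow.com/questions/68710097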
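
One caveat about the answer above: a single apply_async call hands the entire column to one worker, so the other three processes in the pool stay idle. If the goal is to actually spread the work out, one possible approach is to split the column into chunks and map a worker function over them. The following is only a rough sketch under some assumptions: the names upper_chunk and parallel_upper are made up for illustration, the sample DataFrame stands in for file.csv, and Series.str.upper() is swapped in for the element-wise apply.

```python
import multiprocessing as mp

import numpy as np
import pandas as pd


def upper_chunk(chunk):
    """Upper-case every string in one chunk (a pandas Series)."""
    # Defined at module level so it can be pickled and sent to a worker.
    return chunk.str.upper()


def parallel_upper(series, processes=4):
    """Split `series` into chunks, upper-case them in parallel, reassemble."""
    # Slice by position so every chunk keeps its original index labels.
    bounds = np.linspace(0, len(series), processes + 1, dtype=int)
    chunks = [series.iloc[a:b] for a, b in zip(bounds[:-1], bounds[1:])]
    with mp.Pool(processes=processes) as pool:
        # map() blocks until all chunks are done and preserves their order.
        results = pool.map(upper_chunk, chunks)
    return pd.concat(results)


if __name__ == "__main__":
    # Hypothetical sample data standing in for df['cols'] read from file.csv.
    df = pd.DataFrame({"cols": ["apple", "banana", "cherry", "date", "elder"]})
    df["upper_cols"] = parallel_upper(df["cols"], processes=2)
    print(df)
```

Because each chunk keeps its index labels, the concatenated result lines up with the original DataFrame when it is assigned back as a column. For something as cheap as upper-casing, the cost of pickling the chunks may well outweigh any gain, so this probably only pays off when the per-row work is expensive.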