I have a DataFrame in which I want to rearrange the data of a given column.

What I have:
    text                                                 KEYWORD
0   Fetch.ai will transform economies, healthcare,...    supplies chain issues
1                                                        self
2                                                        secured key partnership
3                                                        real world challenge
4                                                        autonomous economic agent
5                                                        learning traffic signal
6                                                        autonomous machine learning
7                                                        disruptive ai tech
8                                                        parking issues
9                                                        traffic reduction
10
11
12  The two most popular cryptocurrencies on the p...    bitcoin
13                                                       limited supplies
14                                                       ethereum

What I want:
    text                                                 KEYWORD
0   Fetch.ai will transform economies, healthcare,...    supplies chain issues, self, secured key partnership, real world challenge, autonomous economic agent, learning traffic signal, autonomous machine learning, disruptive ai tech, parking issues, traffic reduction
1   The two most popular cryptocurrencies on the p...    bitcoin, limited supplies, ethereum

Each row that contains a text shows it in the "text" column. The "text" column was analysed, keywords were extracted from it, and those keywords are shown in the "KEYWORD" column. The annoying part is that if 10 keywords are extracted from a text, the software creates 10 rows and puts one keyword in each. I want to merge all of these keywords into a single row (the one holding the corresponding text).
Unfortunately, I have no access to the keyword extraction process, which is done by the software.
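For anyone who wants to reproduce this, here is a minimal sketch of a frame with the same shape. The two text values are the truncated strings from the display above used as placeholders, not the real texts, and I am assuming the blank cells are empty strings (in the real export they may be NaN or whitespace instead).

```python
import pandas as pd

# Placeholders: the real texts are truncated in the display above.
text_a = "Fetch.ai will transform economies, healthcare,..."
text_b = "The two most popular cryptocurrencies on the p..."

keywords_a = [
    "supplies chain issues", "self", "secured key partnership",
    "real world challenge", "autonomous economic agent",
    "learning traffic signal", "autonomous machine learning",
    "disruptive ai tech", "parking issues", "traffic reduction",
]
keywords_b = ["bitcoin", "limited supplies", "ethereum"]

# Assumption: blank cells are empty strings.
# Rows 10 and 11 are completely blank separator rows.
df = pd.DataFrame({
    "text": [text_a] + [""] * 9 + ["", ""] + [text_b] + [""] * 2,
    "KEYWORD": keywords_a + ["", ""] + keywords_b,
})
print(df.shape)
```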
Posted on 2021-11-11 15:09:47
Try using groupby:
import numpy as np

# replace blank cells with NaN
df = df.replace(r"^\s*$", np.nan, regex=True)
# drop rows that are all NaN, then forward-fill each text down to its keyword rows
df = df.dropna(how="all").ffill()
# group by text and join the keywords of each group
output = df.groupby("text", as_index=False)["KEYWORD"].agg(", ".join)
>>> output
    text                                                 KEYWORD
0   Fetch.ai will transform economies, healthcare,...    supplies chain issues, self, secured key partn...
1   The two most popular cryptocurrencies on the p...    bitcoin, limited supplies, ethereum

https://stackoverflow.com/questions/69930509
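
The steps in the answer can also be wrapped into a self-contained function. This is only a sketch, not the exact code from the answer: the name `merge_keywords` and its parameters are made up for illustration. It differs in two ways, both of which are assumptions about what is wanted: only the text column is forward-filled, so a text row with an empty keyword does not inherit the previous keyword, and `sort=False` keeps the texts in their original order instead of sorting them alphabetically.

```python
import numpy as np
import pandas as pd


def merge_keywords(df, text_col="text", kw_col="KEYWORD", sep=", "):
    """Collapse one-keyword-per-row data into one row per text (hypothetical helper)."""
    # Treat empty or whitespace-only cells as missing, then drop fully blank rows.
    cleaned = df.replace(r"^\s*$", np.nan, regex=True).dropna(how="all")
    # Carry each text down to the keyword rows that belong to it.
    cleaned = cleaned.assign(**{text_col: cleaned[text_col].ffill()})
    # Skip rows that have a text but no keyword.
    cleaned = cleaned.dropna(subset=[kw_col])
    # sort=False keeps the groups in order of first appearance.
    grouped = cleaned.groupby(text_col, as_index=False, sort=False)
    return grouped[kw_col].agg(sep.join)


# Small made-up sample: "Text B" comes first to show the order is preserved.
sample = pd.DataFrame({
    "text": ["Text B", "", "", "Text A", ""],
    "KEYWORD": ["bitcoin", "limited supplies", "", "self", "parking issues"],
})
result = merge_keywords(sample)
print(result)
```

One caveat: grouping by the text value merges rows with identical text even when they are not adjacent. If the same text can legitimately appear twice, a group key derived from the row position would be needed instead.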