
How to remove NaN values from a CSV file? (Python)

Stack Overflow user
Asked on 2020-02-26 17:52:58
2 answers, 2.2K views, 0 followers, 2 votes

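First of all, I know there are existing answers on this topic, but none of them have worked for me so far. In any case, I would like to hear your answers, even though I have already settled on a solution.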

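I have a CSV file named mbti_datasets.csv. The first column is labeled type and the second is called description. Each row represents a different personality type (with its respective type and description).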

TYPE        | DESCRIPTION
 a          | This personality likes to eat apples...\nThey look like monkeys...\nIn fact, are strong people...
 b          | b.description
 c          | c.description
 d          | d.description
...16 types | ...

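In the code below, I try to duplicate the row of each personality type whenever its description contains \n, so that every line of the description gets its own row.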

Code:

import pandas as pd

# Reading the file
path_root = 'gdrive/My Drive/Colab Notebooks/MBTI/'
root_fn = path_root + 'mbti_datasets.csv'
df = pd.read_csv(root_fn, sep = ',', quotechar = '"', usecols = [0, 1])

# split the column where there are new lines and turn it into a series
serie = df['description'].str.split('\n').apply(pd.Series, 1).stack()

# remove the second index for the DataFrame and the series to share indexes
serie.index = serie.index.droplevel(1)

# give it a name to join it to the DataFrame
serie.name = 'description'

# remove original column
del df['description']

# join the series with the DataFrame, based on the shared index
df = df.join(serie)

# New file name and writing the new csv file
root_new_fn = path_root + 'mbti_new.csv'

df.to_csv(root_new_fn, sep = ',', quotechar = '"', encoding = 'utf-8', index = False)
new_df = pd.read_csv(root_new_fn)

print(new_df)

Expected output:

TYPE | DESCRIPTION
 a   | This personality likes to eat apples... 
 a   | They look like monkeys...
 a   | In fact, are strong people...
 b   | b.description
 b   | b.description
 c   | c.description
...  | ...

Current output:

TYPE | DESCRIPTION
 a   | This personality likes to eat apples...
 a   | They look like monkeys...NaN
 a   | NaN
 a   | In fact, are strong people...NaN
 b   | b.description...NaN
 b   | NaN
 b   | b.description
 c   | c.description
...  | ...

I'm not 100% sure, but I think the NaN values come from \r characters.
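One way to test that suspicion before changing anything is to count which cells contain a carriage return and which contain blank lines between text. The sketch below is only an illustration: `newline_report` is a hypothetical helper, and the sample strings are made up rather than taken from the real file.

```python
import pandas as pd

def newline_report(series):
    """Count cells with carriage returns and with blank lines (hypothetical helper)."""
    text = series.dropna().astype(str)
    return {
        # cells that contain at least one \r character
        "carriage_return": int(text.str.contains("\r", regex=False).sum()),
        # cells where two newlines are separated only by whitespace (including \r)
        "consecutive_newlines": int(text.str.contains(r"\n\s*\n").sum()),
    }

# Made-up sample data, not the contents of mbti_datasets.csv
sample = pd.Series(["a\r\nb", "c\n\nd", "e", None])
report = newline_report(sample)
print(report)
```

If the first count is zero on the real column, \r is probably not the cause and the blank lines are the more likely suspect.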

File uploaded to GitHub on request: CSV file

CSV file produced with @YOLO's solution, with examples of where it fails:

2 INTJ  Existe soledad en la cima y-- siendo # adds -- in blank random blank spaces
3 INTJ  -- y las mujeres # adds -- in the beginning
3 INTJ  (...) el 0--8-- de la poblaci # doesnt end the word 'población'
10 INTJ icos-- un conflicto que parecer--a imposible. # starts letters randomly
12 INTJ c #adds just 1 letter

Translated for full understanding:

2 INTJ There is loneliness at the top and-- being # adds -- in blank spaces
3 INTJ -- and women # adds - in the beginning
3 INTJ (...) on 0--8-- of the popula-- # doesnt end the word 'population'
10 INTJ icos-- a conflict that seems--to impossible. # starts letters randomly
12 INTJ c #adds just 1 letter

When I check whether there are NaN values and what type they are:

print(type(new_df.iloc[7, 1]))  # a cell that holds a NaN value
print(new_df['descripcion'].isnull())

<class 'float'>
0     False
1     False
2     False
3     False
4     False
5     False
6     False
7      True
8     False
9      True
10    False
11     True
continue...

2 Answers

Stack Overflow user

Accepted answer

Posted on 2020-02-27 00:14:20

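The problem can be traced to the description cells: some of them contain two consecutive newlines with nothing between them, so splitting on \n produces an empty piece.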
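A minimal sketch of that effect, using made-up sample text rather than the real file: splitting on \n leaves an empty string between two consecutive newlines, and pandas reads an empty CSV field back as NaN by default. An in-memory buffer stands in for the file on disk.

```python
import io
import pandas as pd

# Made-up sample: the first description has two consecutive newlines
df = pd.DataFrame({
    "type": ["a", "b"],
    "description": ["first line\n\nsecond line", "only line"],
})

# One row per piece; the empty piece between the two newlines is kept
pieces = df["description"].str.split("\n").explode()
exploded = df.drop(columns="description").join(pieces)

# Write the CSV and read it back, as the question does with a file on disk
buffer = io.StringIO()
exploded.to_csv(buffer, index=False)
buffer.seek(0)
round_trip = pd.read_csv(buffer)

print(exploded["description"].tolist())
print(round_trip["description"].isnull().tolist())
```

Before writing, the column holds an empty string; only after the round trip through CSV does it show up as NaN.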

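I simply read the newly created CSV back with .dropna() and rewrote it without the NaN values. I don't think repeating the write/read cycle like this is the best approach, but it is a straightforward fix.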

df.to_csv(root_new_fn, sep = ',', quotechar = '"', encoding = 'utf-8', index = False)
new_df = pd.read_csv(root_new_fn).dropna()

new_df.to_csv(root_new_fn, sep = ',', quotechar = '"', encoding = 'utf-8', index = False)
new_df = pd.read_csv(root_new_fn)

print(type(new_df.iloc[7, 1]))# where was a NaN value
print(new_df['descripcion'].isnull())

<class 'str'>
0     False
1     False
2     False
3     False
4     False
5     False
6     False
7     False
8     False
and continues...
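As a possible alternative to writing, re-reading and rewriting the file, the empty pieces could be dropped before the CSV is written at all. This is only a sketch under assumptions: `split_description` is a hypothetical helper, the sample data is made up, and `DataFrame.explode` requires pandas 0.25 or later.

```python
import pandas as pd

def split_description(df, column="description"):
    """Return one row per non-empty line of `column` (hypothetical helper)."""
    out = df.copy()
    # str.splitlines() breaks on \n, \r\n and \r (among other line boundaries),
    # so no stray carriage return is left at the end of a piece
    out[column] = out[column].map(
        lambda text: text.splitlines() if isinstance(text, str) else []
    )
    out = out.explode(column).reset_index(drop=True)
    out[column] = out[column].str.strip()
    # Blank lines become empty strings and missing cells become NaN; drop both
    keep = out[column].notna() & (out[column] != "")
    return out[keep].reset_index(drop=True)

# Made-up sample data, not the contents of the real file
sample = pd.DataFrame({
    "type": ["a", "b"],
    "description": ["likes apples\r\n\r\nlooks strong\n", "b.description"],
})
result = split_description(sample)
print(result)
```

The result can then be written once with to_csv, without a second read and dropna pass.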
Votes: 0

Stack Overflow user

Posted on 2020-02-26 19:10:39

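Here is one way to do it. I had to find a workaround for replacing the \n character, because for some reason it did not work in a straightforward way: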

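# regex=True is needed on recent pandas versions, where str.replace treats the pattern literally by default
df['DESCRIPTION'] = df['DESCRIPTION'].str.replace(r'[^a-zA-Z0-9\s.]', '--', regex=True).str.split('--n')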

df = df.explode('DESCRIPTION')

print(df)

           TYPE                               DESCRIPTION
0   a             This personality likes to eat apples...
0   a                           They look like monkeys...
0   a                      In fact-- are strong people...
1   b                                       b.description
2   c                                       c.description
3   d                                       d.description
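The output above shows the side effect reported in the question: every character outside the allowed set (commas, accented letters, and so on) is replaced by --. A sketch that avoids this, assuming the text may contain either the literal two characters backslash + n or real line breaks, is to split on those sequences only. `split_on_newline_markers` is a hypothetical helper and the sample text is made up.

```python
import re
import pandas as pd

def split_on_newline_markers(df, column="DESCRIPTION"):
    """Split `column` on literal backslash-n or real newlines (hypothetical helper)."""
    out = df.copy()
    # r"\\n" matches the two characters backslash + n; r"\r?\n" matches a real line break
    pattern = re.compile(r"\\n|\r?\n")
    out[column] = out[column].map(
        lambda text: pattern.split(text) if isinstance(text, str) else text
    )
    # One row per piece; punctuation and accented letters are left untouched
    return out.explode(column).reset_index(drop=True)

# Made-up sample; the raw string keeps backslash + n as two literal characters
sample = pd.DataFrame({
    "TYPE": ["a", "b"],
    "DESCRIPTION": [r"Likes apples, pears...\nEl 0,8% de la población", "b.description"],
})
result = split_on_newline_markers(sample)
print(result)
```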
Votes: 2
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/60419744
