AttributeError when running lower_case.translate with string.punctuation on a Pandas DataFrame

Stack Overflow user
Asked 2020-08-25 18:36:04
Answers 1 · Views 152 · Followers 0 · Votes 0

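I get an AttributeError when I run lower_case.translate with string.punctuation on a Pandas DataFrame that contains reviews. The input data is messy. The error is AttributeError: 'DataFrame' object has no attribute 'translate'; the full traceback is below.

I tried several variations, shown in the comments: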

# cleaned_text = lower_case.translate(str.maketrans(string.punctuation, ' '*len(string.punctuation)))
# cleaned_text = lower_case.translator = str.maketrans('', '', string.punctuation)

cleaned_text = lower_case.translate(str.maketrans('', '', string.punctuation))

I also tried this, adding a fillna above it in the hope that it would fix the problem.

#checking for nulls if present any
print("Number of rows with null values:")
print(lower_case.isnull().sum().sum())

lower_case.fillna("")

A small sample Excel file for the DataFrame: https://github.com/taylorjohn/Simple_RecSys/blob/master/sample-data.xlsx

Code:

import string
from collections import Counter
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from nltk.corpus import stopwords
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

# data is in excel formatted ugly and unclean  columns are Artist Names rows are reviews for said Artist
df = pd.read_excel('sample-data.xlsx',encoding='utf8', errors='ignore')

lower_case = df.apply(lambda x: x.astype(str).str.lower())

#checking for nulls if present any
print("Number of rows with null values:")
print(lower_case.isnull().sum().sum())

lower_case.fillna("")


#cleaned_text = lower_case.translate(str.maketrans(string.punctuation, ' '*len(string.punctuation)))
# cleaned_text = lower_case.translator = str.maketrans('', '', string.punctuation)

cleaned_text = lower_case.translate(str.maketrans('', '', string.punctuation))

The error I get is:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-78-9f23b8a5e8e0> in <module>
      2 # cleaned_text = lower_case.translator = str.maketrans('', '', string.punctuation)
      3 
----> 4 cleaned_text = lower_case.translate(str.maketrans('', '', string.punctuation))

~\anaconda3\envs\nlp_course\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   5272             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5273                 return self[name]
-> 5274             return object.__getattribute__(self, name)
   5275 
   5276     def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'translate'

1 Answer

Stack Overflow user

Accepted answer

Posted 2020-08-25 18:45:23

Pandas DataFrames do not have a .translate() method, but Python strings do. For example:

import string

my_str = "hello world!"
my_str.translate(str.maketrans('', '', string.punctuation))
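
To make the difference between the two translation tables tried in the question concrete, here is a minimal sketch on a plain string. The variable names (`text`, `delete_table`, `space_table`) are illustrative assumptions, not part of the original post.

```python
import string

text = "hello, world!"

# Three-argument form: every character in the third argument is deleted.
delete_table = str.maketrans("", "", string.punctuation)

# Two-argument form: each punctuation character is mapped to a space instead.
space_table = str.maketrans(string.punctuation, " " * len(string.punctuation))

deleted = text.translate(delete_table)
spaced = text.translate(space_table)

print(repr(deleted))  # 'hello world'
print(repr(spaced))   # 'hello  world '
```

Replacing with spaces keeps word boundaries where punctuation joined two words, at the cost of extra whitespace that may need collapsing later.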

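If you want to apply that translation to every value in a DataFrame column, you can use .map() on the column. The .map() method accepts a function that takes a column value as its argument and returns the transformed value: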

def remove_punctuation(value):
    return value.translate(str.maketrans('', '', string.punctuation))

df["my_cleaned_column"] = df["my_dirty_column"].map(remove_punctuation)
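
For something that runs end to end, the following is a minimal sketch of the same .map() approach on a tiny made-up DataFrame. The sample rows, the `PUNCT_TABLE` name, and the `str()` guard are assumptions added for illustration; they are not in the original answer.

```python
import string

import pandas as pd

# Hypothetical sample data; the column name follows the answer's placeholder.
df = pd.DataFrame({"my_dirty_column": ["Hello, World!", "Great show!!!", "so-so..."]})

# Build the translation table once instead of once per value.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def remove_punctuation(value):
    # str() converts non-string cells (numbers, NaN) to text first,
    # so a missing value would come out as the string "nan".
    return str(value).translate(PUNCT_TABLE)


df["my_cleaned_column"] = df["my_dirty_column"].map(remove_punctuation)
print(df["my_cleaned_column"].tolist())  # ['Hello World', 'Great show', 'soso']
```

If missing values should stay missing rather than become "nan", the guard would need to be replaced by an explicit check such as `pd.isna(value)`.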

You can also use a lambda function instead of defining a new function:

df["my_cleaned_column"] = df["my_dirty_column"].map(
    lambda x: x.translate(str.maketrans('', '', string.punctuation))
)

If you have many columns that need this applied, you can do it like this:

for column_name in df.columns:
    df[column_name] = df[column_name].map(
        lambda x: x.translate(str.maketrans('', '', string.punctuation))
    )
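
As an alternative to the loop, and not the method from the answer above, pandas also exposes a vectorized `Series.str.translate`. The sketch below applies it column by column through `DataFrame.apply`; the artist column names and review texts are made up for illustration.

```python
import string

import pandas as pd

# Hypothetical data shaped like the question's sheet: one column per artist.
df = pd.DataFrame(
    {
        "artist_a": ["Loved it!", "Meh...", None],
        "artist_b": ["Wow, just wow.", "Not bad?", "10/10"],
    }
)

# One table, reused for every column.
table = str.maketrans("", "", string.punctuation)

# Lowercase and strip punctuation per column via the .str accessor.
# Missing values stay missing instead of raising an AttributeError.
cleaned = df.apply(lambda col: col.str.lower().str.translate(table))

print(cleaned["artist_b"].tolist())  # ['wow just wow', 'not bad', '1010']
```

This assumes every column holds text. A numeric column has no .str accessor, so it would need `astype(str)` first, as in the question's own lowercase step.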
Votes 1
Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/63585138
