
'Series' object has no attribute 'lower' with tfidf

Stack Overflow user
Asked 2022-03-30 13:35:14
Answers 1 · Views 422 · Followers 0 · Votes 0

I'm trying to use tfidf to prepare my data, but I keep getting the same error.

X = df['Description'], df['Type']
y =df['Description'], df['Type']
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size=0.33, random_state=42)


df['Description']=[" ".join(Description) for Description in df['Description'].values]

tfidf = TfidfVectorizer(stop_words='english')
t_x_train = tfidf.fit_transform(X_train)
t_x_test = tfidf.transform(y_test)

When I run it, I get AttributeError: 'Series' object has no attribute 'lower'


1 Answer

Stack Overflow user

Accepted answer

Posted 2022-03-30 20:51:55

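Sklearn tries to apply str.lower() to each element of the data passed to the vectorizer (here, y_test). However, the data type appears to be incompatible: the elements are not strings.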
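A minimal sketch of how this error can arise, assuming a toy DataFrame with made-up rows (the data and the names `bad_input`, `error_message`, and `good_matrix` are illustrative, not from the question). With its default settings, the vectorizer lowercases each item it iterates over, so an input whose items are whole Series rather than strings fails:

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy data standing in for the asker's DataFrame.
df = pd.DataFrame({
    "Description": ["first document", "second document", "third one"],
    "Type": [1, 2, 3],
})

# A tuple of two Series: iterating over it yields whole Series, not strings.
bad_input = (df["Description"], df["Type"])

try:
    TfidfVectorizer().fit_transform(bad_input)
    error_message = None
except AttributeError as exc:
    # The default preprocessor calls .lower() on each item it receives.
    error_message = str(exc)

print(error_message)

# A single Series of strings iterates string by string and works.
good_matrix = TfidfVectorizer().fit_transform(df["Description"])
print(good_matrix.shape)
```

In the question, `X = df['Description'], df['Type']` builds exactly this kind of tuple, which is one likely source of the error.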

Please check:

- the data type going into tfidf, or convert it to strings as shown below
- whether y_test should be replaced with X_test when passing it to tfidf

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
corpus = [
 ('This is the first document.',4),
 ('This document is the second document.',3),
 ('And this is the third one.',2),
 ('Is this the first document?',1)
]

df= pd.DataFrame(corpus, columns = ['Description', 'Type'])


X = df['Description']
# make sure your target is also a series of strings if not already
y = df['Type'].astype('str')

X_train, X_test, y_train, y_test = train_test_split(X,y, test_size=0.33, random_state=42)
# df['Description']=[" ".join(Description) for Description in df['Description'].values]

tfidf = TfidfVectorizer(stop_words='english')
t_x_train = tfidf.fit_transform(X_train)
# transform the held-out text (X_test), not the labels (y_test)
t_x_test = tfidf.transform(X_test)
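Applying both checks to the code in the question might look like the sketch below. It assumes, based on the `" ".join(...)` line in the question, that each Description is a list of tokens; the rows and labels here are hypothetical. The join is done before the split so that the vectorizer only ever sees plain strings.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

# Hypothetical data: each Description is a list of tokens, each Type a label.
df = pd.DataFrame({
    "Description": [
        ["fresh", "apple", "pie"],
        ["apple", "juice", "recipe"],
        ["banana", "bread", "recipe"],
        ["banana", "smoothie", "bowl"],
    ],
    "Type": ["dessert", "drink", "dessert", "drink"],
})

# Join the token lists into plain strings before splitting.
df["Description"] = df["Description"].apply(" ".join)

# Features are the text column only; the target is the label column only.
X = df["Description"]
y = df["Type"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

tfidf = TfidfVectorizer(stop_words="english")
t_x_train = tfidf.fit_transform(X_train)  # learn the vocabulary on training text
t_x_test = tfidf.transform(X_test)        # reuse that vocabulary on test text

print(t_x_train.shape, t_x_test.shape)
```

The labels in `y_train` and `y_test` are left untouched here, since they go to the model rather than to the vectorizer.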
Votes 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/71678256
