Hi, I'm using TF-IDF for my movie recommender, but I'm getting a shape error.
from sklearn.feature_extraction.text import TfidfVectorizer
tfidf = TfidfVectorizer(analyzer='word', ngram_range=(1, 2), min_df=0, stop_words='english')
tfidf_matrix = tfidf.fit_transform(X_train.summary)
tfidf_matrix.shape
Output:
(3933, 56162)
Then:
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import linear_kernel
# Compute the cosine similarity matrix
cosine_sim = linear_kernel(tfidf_matrix, tfidf_matrix)
def get_recommendations(title):
    # Get the index of the movie that matches the title
    idx = indices[title.lower()]
    summary = data.summary[idx]
    tfidf_vect = tfidf.transform([summary])
    cosine_sim = linear_kernel(tfidf_matrix, tfidf_vect)
    dis = cosine_sim - (np.abs(data.release_date -
        data.release_date[idx]/56000)).to_numpy().reshape((-1,1))
    movie_indices = dis.argsort(axis=0)[-5:][::-1].reshape((-1))
    return pd.DataFrame(data[['title','release_date']].iloc[movie_indices])
Error output:
ValueError: operands could not be broadcast together with shapes (3933,1) (4917,1)
Posted on 2020-04-27 03:13:03
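You can't use NumPy broadcasting with arbitrary shapes. Two dimensions are compatible when they are equal, or when one of them is 1.
If these conditions are not met, a ValueError: operands could not be broadcast together exception is thrown, indicating that the arrays have incompatible shapes. The size of the resulting array is the maximum size along each dimension of the input arrays. In your case the first dimensions are 3933 and 4917, which are neither equal nor 1, so the subtraction fails.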
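As a quick illustration of the rule, here is a minimal NumPy sketch. The array names are made up for the example and the contents are placeholders; only the shapes (3933, 1) and (4917, 1) come from the error message in the question.

```python
import numpy as np

# Placeholder arrays with the shapes from the error message.
sims = np.zeros((3933, 1))
dates = np.zeros((4917, 1))

# (3, 1) and (1, 4): in each dimension one side is 1, so they are compatible
# and the result takes the larger size along each dimension.
ok_shape = (np.zeros((3, 1)) + np.zeros((1, 4))).shape

# 3933 vs 4917: not equal and neither is 1, so the subtraction raises.
try:
    sims - dates
    failed = False
    message = ""
except ValueError as exc:
    failed = True
    message = str(exc)

print(ok_shape)
print(failed, message)
```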
If you are not familiar with NumPy broadcasting, see the documentation for details.
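Reading the question's code, the two operands appear to come from different tables: `cosine_sim` has one row per row of `tfidf_matrix` (3933, built from `X_train.summary`), while the date term has one row per row of `data` (4917). One possible way around this, sketched below with NumPy only, is to compute both terms over the same set of rows. The arrays and the `train_rows` index are hypothetical stand-ins, not the asker's data, and applying the constant 56000 to the whole date difference is an assumption about the intent.

```python
import numpy as np

rng = np.random.default_rng(0)

n_all, n_train = 10, 6  # small stand-ins for 4917 and 3933
release_date = rng.integers(1980, 2020, size=n_all).astype(float)

# Hypothetical: positions of the training rows inside the full table.
train_rows = np.sort(rng.choice(n_all, size=n_train, replace=False))

# One similarity score per training row, shaped like
# linear_kernel(tfidf_matrix, tfidf_vect) in the question.
cosine_sim = rng.random((n_train, 1))

query_date = release_date[0]

# Use only the dates of the training rows, so both operands have n_train rows.
date_penalty = np.abs(release_date[train_rows] - query_date).reshape(-1, 1) / 56000

score = cosine_sim - date_penalty  # (6, 1) - (6, 1): equal shapes
top = score.argsort(axis=0)[-5:][::-1].reshape(-1)  # positions within train_rows
top_rows = train_rows[top]  # map back to rows of the full table
print(score.shape, top_rows)
```

Alternatively, fitting the vectorizer on the summaries of the same table that supplies `release_date` would give matching row counts without the extra index.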
https://stackoverflow.com/questions/61446516