
Q: Low accuracy in a TensorFlow project (multiclass classification)

Stack Overflow user

Asked 2022-06-22 15:06:03

2 answers · 38 views · 0 followers · 0 votes

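The original dataframe has 20,000 rows; for demonstration I prepared a df with 20 rows. My model reaches an accuracy of about 0.03 (on the 20k-row df), and the value does not change between epochs. If I have made a mistake, I would appreciate your advice. My code: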

import keras
import tensorflow as tf  # needed for the tf.keras calls below
from keras.models import Sequential
from keras.layers import Dense
from tensorflow.keras.optimizers import Adam
import pandas as pd

df = pd.DataFrame({'lettr': ['T','I','D','N','G','S','B','A','J','M','X','O','G','M','R','F','O','C','T', 'J'],
        'x-box': [2, 5, 4, 7, 2, 4, 4, 1, 2, 11, 3, 6, 4, 6, 5, 6, 3, 7, 6, 2],
        'y-box': [8, 12, 11, 11, 1, 11, 2, 1, 2, 15, 9, 13, 9, 9, 9, 9, 4, 10, 11, 2],
        'width': [3, 3, 6, 6, 3, 5, 5, 3, 4, 13, 5, 4, 6, 8, 5, 5, 4, 5, 6, 3],
        'high': [5, 7, 8, 6, 1, 8, 4, 2, 4, 9, 7, 7, 7, 6, 7, 4, 3, 5, 8, 3],
        'onpix':[1, 2, 6, 3, 1, 3, 4, 1, 2, 7, 4, 4, 6, 9, 6, 3, 2, 2, 5, 1],
        'x-bar':[8, 10, 10, 5, 8, 8, 8, 8, 10, 13, 8, 6, 7, 7, 6, 10, 8, 6, 6, 10],
        'y-bar':[13, 5, 6, 9, 6, 8, 7, 2, 6, 2, 7, 7, 8, 8, 11, 6, 7, 8, 11, 6],
        'x2bar':[0, 5, 2, 4, 6, 6, 6, 2, 2, 6, 3, 6, 6, 6, 7, 3, 7, 6, 5, 3],
        'y2bar':[6, 4, 6, 6, 6, 9, 6, 2, 6, 2, 8, 3, 2, 5, 3, 5, 5, 8, 6, 6],
        'xybar':[6, 13, 10, 4, 6, 5, 7, 8, 12, 12, 5, 10, 6, 7, 7, 10, 7, 11, 11, 12],
        'x2ybr':[10, 3, 3, 4, 5, 6, 6, 2, 4, 1, 6, 7, 5, 5, 3, 5, 6, 7, 9, 4],
        'xy2br':[8, 9, 7, 10, 9, 6, 6, 8, 8, 9, 8, 9, 11, 8, 9, 7, 8, 11, 4, 9],
        'x-ege':[0, 2, 3, 6, 1, 0, 2, 1, 1, 8, 2, 5, 4, 8, 2, 3, 2, 2, 3, 0],
        'xegvy':[8, 8, 7, 10, 7, 8, 8, 6, 6, 1, 8, 9, 8, 9, 7, 9, 8, 8, 12, 7],
        'y-ege':[0, 4, 3, 2, 5, 9, 7, 2, 1, 1, 6, 5, 7, 8, 5, 6, 3, 5, 2, 1],
        'yegvx':[8, 10, 9, 8, 10, 7, 10, 7, 7, 8, 7, 8, 8, 6, 11, 9, 8, 9, 4, 7],
        })
 
def naiveEncode(col):
  values = list(col.unique())
  return col.apply(lambda x: values.index(x))
df["lettr"] = naiveEncode(df["lettr"])

X = df.iloc[:,1:].values
y = df.iloc[:, 0].values

from sklearn.preprocessing import LabelEncoder
encoder =  LabelEncoder()
y1 = encoder.fit_transform(y)
Y = pd.get_dummies(y1).values

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = tf.keras.models.Sequential([
  tf.keras.layers.Dense(128, activation="relu", input_shape=(16,)),                            
  tf.keras.layers.Dense(64, activation="relu",),                                    
  tf.keras.layers.Dense(32, activation="relu",),                                    
  tf.keras.layers.Dense(26, activation="softmax",)                                    
])
model.compile(Adam(lr=0.04),'categorical_crossentropy',metrics=['accuracy'])
model.summary()

model.fit(X_train,y_train,epochs=10)

tensorflow

machine-learning

multiclass-classification

python

2 Answers

Stack Overflow user

Accepted answer

Posted 2022-06-22 15:43:01

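Sorry about my first answer: I completely missed that you are using the categorical_crossentropy loss in the wrong situation.

In your case you should use sparse_categorical_crossentropy instead of categorical_crossentropy, because the y you pass to fit holds integer class indices rather than one-hot vectors. See: https://stats.stackexchange.com/questions/326065/cross-entropy-vs-sparse-cross-entropy-when-to-use-one-over-the-other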
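To make the difference concrete, here is a minimal NumPy sketch (not the Keras implementation) of the two losses. It assumes random stand-in probabilities in place of real model output, and the helper names are illustrative only. The point it shows is that both losses compute the same quantity; they differ only in the label format they expect, integer indices versus one-hot rows.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(one_hot_targets, probs):
    # Expects targets of shape (batch, n_classes) with a single 1 per row.
    return -np.sum(one_hot_targets * np.log(probs), axis=1)

def sparse_categorical_crossentropy(int_targets, probs):
    # Expects targets of shape (batch,), each an integer class index.
    rows = np.arange(len(int_targets))
    return -np.log(probs[rows, int_targets])

rng = np.random.default_rng(0)
n_classes = 26
probs = softmax(rng.normal(size=(4, n_classes)))  # stand-in for model output

int_labels = np.array([0, 3, 25, 7])              # same format as y in the question
one_hot_labels = np.eye(n_classes)[int_labels]    # same format as the unused Y

loss_sparse = sparse_categorical_crossentropy(int_labels, probs)
loss_onehot = categorical_crossentropy(one_hot_labels, probs)
```

In the question, the one-hot array Y is built but the integer array y is what goes into train_test_split, so the sparse variant is the one that matches the data actually fed to the model.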

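In addition, the output layer should be a Dense layer with a softmax activation whose number of units matches the number of classes (26 if you have the whole alphabet).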
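A quick sanity check along these lines, sketched in plain Python: it re-creates the first-appearance encoding that naiveEncode performs on the 20 sample letters from the question and confirms that every label is a valid index into a 26-unit softmax output. The variable names are illustrative, not from the original code.

```python
letters = ['T', 'I', 'D', 'N', 'G', 'S', 'B', 'A', 'J', 'M',
           'X', 'O', 'G', 'M', 'R', 'F', 'O', 'C', 'T', 'J']

# Same idea as naiveEncode in the question: each letter maps to the
# index of its first appearance.
seen = []
for letter in letters:
    if letter not in seen:
        seen.append(letter)
encoded = [seen.index(letter) for letter in letters]

n_output_units = 26                 # width of the final Dense layer
n_classes_in_sample = len(seen)     # distinct letters in the 20-row sample

# With a sparse loss, every label must satisfy 0 <= label < n_output_units.
labels_fit = 0 <= min(encoded) and max(encoded) < n_output_units
```

The 20-row sample contains fewer distinct letters than the full alphabet, so sizing the layer from the full dataset (or fixing it at 26) is the safer choice.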

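As a side note, I suggest passing the test data as validation data during training, so that you get validation metrics for each epoch.

Modified code: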

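import tensorflow as tf
from tensorflow.keras.optimizers import Adam
from sklearn.model_selection import train_test_split

# y holds integer class indices, which is what the sparse loss expects
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = tf.keras.models.Sequential([
  tf.keras.layers.Dense(128, activation="relu", input_shape=(16,)),
  tf.keras.layers.Dense(64, activation="relu"),
  tf.keras.layers.Dense(32, activation="relu"),
  tf.keras.layers.Dense(26, activation="softmax"),  # one unit per letter
])
# learning_rate replaces the deprecated lr argument
model.compile(Adam(learning_rate=0.04), 'sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()

# Report loss and accuracy on the held-out split after every epoch
model.fit(X_train, y_train,
  validation_data=(X_test, y_test),
  epochs=10)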

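0 votes

Stack Overflow user

Posted 2022-06-22 15:19:58

The problem comes from the fact that the y variable does not contain the desired target column.

y = df.iloc[:, 0].values takes your first column as the target, whereas you want df["lettr"] as the target. You should replace it with: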

y = df["lettr"].values

Then you need to adjust your input X like this:

X = df.loc[:, df.columns != "lettr"].values
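The effect of selecting by name rather than by position can be seen in a small pandas sketch. The three-row frame below is hypothetical and deliberately puts the target in the second column. In the 20-row sample from the question, lettr happens to be column 0, so both selections would coincide there; selecting by name simply guards against a different column order.

```python
import pandas as pd

# Tiny illustrative frame (hypothetical values) with the target NOT in column 0.
df = pd.DataFrame({
    "x-box": [2, 5, 4],
    "lettr": ["T", "I", "D"],
    "y-box": [8, 12, 11],
})

y_by_position = df.iloc[:, 0].values          # picks "x-box" here, not the target
y_by_name = df["lettr"].values                # always the target column
X = df.loc[:, df.columns != "lettr"].values   # every column except the target
```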

0 votes

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/72717915
