I'm building a CNN with TensorFlow in Python 3 that performs multi-class classification (i.e. the expected output is 3 of 92 probabilities) from a photon-energy vector of shape (20, 1). The model below is the result of many iterations of gradually increasing complexity.
However, no matter what I add (or remove), the model consistently plateaus at a certain loss value.
The code below is my model, with some of the hyperparameters tuned using Keras Tuner.
hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-5, 3e-4, 5e-5, 5e-6])
hp_activation_C_1 = hp.Choice('activation_c1', values=["relu", "swish"])
hp_activation_C_2 = hp.Choice('activation_c2', values=["relu", "swish"])
hp_activation_D_1 = hp.Choice('activation_d1', values=["softsign", "relu", "swish"])
hp_activation_D_2 = hp.Choice('activation_d2', values=["softsign", "relu", "swish"])
hp_drop = hp.Choice('dropout_%', values=[0.05, 0.04, 0.03, 0.02])
hp_filters_1 = hp.Choice('num_filters_1', values=[32, 64, 96])
hp_filters_2 = hp.Choice('num_filters_2', values=[64, 96, 128, 256])
hp_filters_3 = hp.Choice('num_filters_3', values=[96, 128, 256, 384])
hp_kernel_size_1 = hp.Choice('kernel_size_1', values=[3, 5])
hp_units_1 = hp.Int('units_1', min_value=64, max_value=2624, step=128)
hp_units_2 = hp.Int('units_2', min_value=64, max_value=2624, step=128)
# hp_pool_size_1 = hp.Choice('pool_size_1', values=[2, 3, 4])
model = Sequential()
model.add(Conv1D(filters=hp_filters_1, kernel_size=hp_kernel_size_1,
                 activation=hp_activation_C_1, input_shape=(20, 1)))
model.add(BatchNormalization())
model.add(Dropout(hp_drop))
model.add(Conv1D(filters=hp_filters_2, kernel_size=3, activation=hp_activation_C_2))
model.add(BatchNormalization())
model.add(Dropout(hp_drop))
model.add(AveragePooling1D(pool_size=3, strides=2))
# model.add(MaxPooling1D(pool_size=hp_pool_size_1, strides=3))
model.add(Conv1D(filters=hp_filters_3, kernel_size=3, activation=hp_activation_C_2))
model.add(BatchNormalization())
model.add(Dropout(hp_drop))
model.add(AveragePooling1D(pool_size=3, strides=2))
# model.add(MaxPooling1D(pool_size=2, strides=2))
model.add(BatchNormalization())
model.add(Dropout(hp_drop))
model.add(Flatten())
model.add(Dense(hp_units_1, activation=hp_activation_D_1))
model.add(BatchNormalization())
model.add(Dropout(hp_drop))
model.add(Dense(hp_units_2, activation=hp_activation_D_2))
model.add(Dense(92, activation='softmax'))
early_stop = EarlyStopping(monitor='val_mse',
                           patience=5,
                           restore_best_weights=True,
                           min_delta=0.00005)
reduce_lr = ReduceLROnPlateau(monitor="val_mse",
                              factor=0.5,
                              patience=3,
                              min_lr=1e-6,
                              min_delta=0.00008)

So my question is: have I over-complicated the model for what I'm trying to achieve? And how can I improve performance so as to reduce the loss further?
Posted on 2021-01-18 11:26:39
You could try using an adjustable learning rate. The Keras callback ReduceLROnPlateau makes this easy; the documentation is here. Set the callback to monitor the validation loss. My recommended code is below:

red_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5,
                                              patience=2, verbose=1, mode="auto",
                                              min_delta=0.0001, cooldown=0, min_lr=0)

Then add callbacks=[red_lr] in model.fit.
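The rule this callback applies can be sketched in plain Python. This is a simplified illustration of the plateau logic (improvement means the monitored loss drops by more than min_delta; after `patience` stalled epochs the learning rate is multiplied by `factor`); the real implementation in tf.keras.callbacks also handles cooldown and other bookkeeping:

```python
# Simplified sketch of the ReduceLROnPlateau reduction rule
# (illustrative only; use the real Keras callback in training code).

def reduce_lr_on_plateau(val_losses, lr, factor=0.5, patience=2,
                         min_delta=0.0001, min_lr=0.0):
    """Return the per-epoch learning rate produced for a sequence of
    validation losses."""
    best = float("inf")
    wait = 0
    schedule = []
    for loss in val_losses:
        if loss < best - min_delta:      # improvement: reset the counter
            best = loss
            wait = 0
        else:                            # stalled epoch
            wait += 1
            if wait >= patience:         # plateau: cut the LR, keep going
                lr = max(lr * factor, min_lr)
                wait = 0
        schedule.append(lr)
    return schedule

# Loss stalls from epoch 2 onward, so the LR is halved once
# patience (2 epochs) is exhausted.
print(reduce_lr_on_plateau([0.9, 0.7, 0.7, 0.7], lr=1.0))
# → [1.0, 1.0, 1.0, 0.5]
```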
https://stackoverflow.com/questions/65768237