OpenAI Gym and Keras-RL: DQN expects a model that has one dimension for each action

Stack Overflow user

Asked 2021-12-07 13:52:47

Answers: 2, Views: 1.3K, Followers: 0, Votes: 2

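I am trying to set up a deep Q-learning agent with a custom environment in OpenAI Gym. I have 4 continuous state variables and 3 integer action variables, each with its own bounds.

Here is the code: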

#%% import 
from gym import Env
from gym.spaces import Discrete, Box, Tuple
import numpy as np


#%%
class Custom_Env(Env):

    def __init__(self):
        
       # Define the state space
       
       #State variables
       self.state_1 = 0
       self.state_2 =  0
       self.state_3 = 0
       self.state_4_currentTimeSlots = 0
       
       #Define the gym components
       self.action_space = Box(low=np.array([0, 0, 0]), high=np.array([10, 20, 27]), dtype=np.int)    
                                                                             
       self.observation_space = Box(low=np.array([20, -20, 0, 0]), high=np.array([22, 250, 100, 287]),dtype=np.float16)

    def step(self, action ):

        # Update state variables
        self.state_1 = self.state_1 + action [0]
        self.state_2 = self.state_2 + action [1]
        self.state_3 = self.state_3 + action [2]

        #Calculate reward
        reward = self.state_1 + self.state_2 + self.state_3
       
        #Set placeholder for info
        info = {}    
        
        #Check if it's the end of the day
        if self.state_4_currentTimeSlots >= 287:
            done = True
        if self.state_4_currentTimeSlots < 287:
            done = False       
        
        #Move to the next timeslot 
        self.state_4_currentTimeSlots +=1

        state = np.array([self.state_1,self.state_2, self.state_3, self.state_4_currentTimeSlots ])

        #Return step information
        return state, reward, done, info
        
    def render (self):
        pass
    
    def reset (self):
       self.state_1 = 0
       self.state_2 =  0
       self.state_3 = 0
       self.state_4_currentTimeSlots = 0
       state = np.array([self.state_1,self.state_2, self.state_3, self.state_4_currentTimeSlots ])
       return state

#%% Set up the environment
env = Custom_Env()

#%% Create a deep learning model with keras


from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.optimizers import Adam

def build_model(states, actions):
    model = Sequential()
    model.add(Dense(24, activation='relu', input_shape=states))
    model.add(Dense(24, activation='relu'))
    model.add(Dense(actions[0] , activation='linear'))
    return model

states = env.observation_space.shape 
actions = env.action_space.shape 
print("env.observation_space: ", env.observation_space)
print("env.observation_space.shape : ", env.observation_space.shape )
print("action_space: ", env.action_space)
print("action_space.shape : ", env.action_space.shape )


model = build_model(states, actions)
print(model.summary())

#%% Build Agent with Keras-RL
from rl.agents import DQNAgent
from rl.policy import BoltzmannQPolicy
from rl.memory import SequentialMemory

def build_agent (model, actions):
    policy = BoltzmannQPolicy()
    memory = SequentialMemory(limit = 50000, window_length=1)
    dqn = DQNAgent (model = model, memory = memory, policy=policy,
                    nb_actions=actions, nb_steps_warmup=10, target_model_update= 1e-2)
    return dqn

dqn = build_agent(model, actions)
dqn.compile(Adam(lr=1e-3), metrics = ['mae'])
dqn.fit (env, nb_steps = 4000, visualize=False, verbose = 1)

Running this code produces the following error message:

ValueError: Model output "Tensor("dense_23/BiasAdd:0", shape=(None, 3), dtype=float32)" has invalid shape. DQN expects a model that has one dimension for each action, in this case (3,).

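It is raised by the line dqn = DQNAgent (model = model, memory = memory, policy=policy, nb_actions=actions, nb_steps_warmup=10, target_model_update= 1e-2)

Can anyone tell me why this problem occurs and how to solve it? I assume it has to do with the model I built, and therefore with the action and state spaces, but I cannot work out what exactly is wrong.

Bounty reminder: my bounty expires soon and I have not received any answer yet. If you have even a guess at how to solve this problem, I would really appreciate it if you shared your thoughts.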

keras

reinforcement-learning

openai-gym

python

2 Answers

Stack Overflow user

Accepted answer

Posted 2021-12-23 11:19:51

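As we discussed in the comments, the Keras-rl library no longer seems to be maintained (the last update to the repository was in 2019), so everything has probably moved into Keras itself by now. I had a look at the Keras documentation: there are no high-level functions for building a reinforcement learning model, but it can be done with lower-level functions.

Here is an example of how to use deep Q-learning with Keras: link

Another solution may be to downgrade to TensorFlow 1.0, since the compatibility problem seems to be caused by some changes in version 2.0. I have not tested this, but Keras + TensorFlow 1.0 may work.

There is also a branch of Keras-rl that supports TensorFlow 2.0. The repository is archived, but it may work for you.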
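
To give a rough idea of what the lower-level route involves: the core of deep Q-learning is the one-step target, reward plus the discounted maximum Q-value of the next state, that the network is trained to predict. The sketch below computes that target for a small batch in plain Python. The function name td_targets and the sample numbers are illustrative assumptions and are not part of Keras or Keras-rl; in a real agent, next_q_values would come from a target network's predictions.

```python
def td_targets(rewards, next_q_values, dones, gamma=0.99):
    """Return the one-step Q-learning target for each transition in a batch."""
    targets = []
    for reward, next_q, done in zip(rewards, next_q_values, dones):
        # Terminal transitions have no future value to bootstrap from.
        bootstrap = 0.0 if done else gamma * max(next_q)
        targets.append(reward + bootstrap)
    return targets


# Hypothetical batch of two transitions with two possible actions each.
rewards = [1.0, 1.0]
next_q_values = [[1.0, 3.0], [2.0, 0.0]]
dones = [False, True]
print(td_targets(rewards, next_q_values, dones, gamma=0.5))
```

The network's output for the action actually taken is then regressed toward this target, typically with a mean squared error or Huber loss.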

Votes: 1

Stack Overflow user

Posted 2022-03-02 10:55:37

Adding a Flatten layer before the final output layer resolves this error. Example:

def build_model(states, actions):
    model = Sequential()
    model.add(Dense(24, activation='relu', input_shape=states))
    model.add(Dense(24, activation='relu'))
    model.add(Flatten())
    model.add(Dense(actions[0] , activation='linear'))
    return model
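
Two further points may be worth checking, offered as a hedged reading of the error rather than a confirmed diagnosis. The message ends with "in this case (3,)", which suggests that nb_actions received the tuple env.action_space.shape instead of an integer. Assuming the agent compares the model's output shape with (None, nb_actions), that comparison fails for a tuple even when the last layer has 3 units. In addition, DQN selects one action index from a finite set, so the three bounded integers in the question would have to be flattened into a single index, for example with a Discrete space of that size. The helper names below are hypothetical.

```python
# Inclusive upper bounds of the three integer actions from the question.
ACTION_HIGHS = (10, 20, 27)
ACTION_SIZES = tuple(high + 1 for high in ACTION_HIGHS)  # 11, 21, 28 choices

# Total number of joint actions: this integer is what nb_actions would be.
NB_ACTIONS = 1
for size in ACTION_SIZES:
    NB_ACTIONS *= size


def decode_action(index):
    """Map a flat index in [0, NB_ACTIONS) to the three integer actions."""
    if not 0 <= index < NB_ACTIONS:
        raise ValueError("action index out of range")
    parts = []
    for size in reversed(ACTION_SIZES):
        index, part = divmod(index, size)
        parts.append(part)
    return tuple(reversed(parts))


def encode_action(action):
    """Inverse of decode_action: pack three integer actions into one index."""
    index = 0
    for part, size in zip(action, ACTION_SIZES):
        index = index * size + part
    return index


# Assumed shape check: a tuple nb_actions never equals the int output width.
output_shape = (None, 3)
print(output_shape == (None, (3,)))  # False: tuple passed as nb_actions
print(output_shape == (None, 3))     # True: integer passed as nb_actions

print(NB_ACTIONS)
print(decode_action(NB_ACTIONS - 1))
```

With this encoding, the final Dense layer would need NB_ACTIONS units and step would call decode_action on the index it receives. Whether a set of 6468 joint actions is practical to learn with DQN is a separate question.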

Votes: 1

Original page content provided by Stack Overflow; translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/70261352
