我已经用tf-agents创建了一个用于强化学习的自定义环境(不需要回答这个问题),如果我通过将num_parallel_environments设置为1来实例化一个线程,那么它工作得很好,但是当我将num_parallel_environments增加到50时,它会抛出不常见的和看似随机的错误,比如random.shuffle()内部的IndexError。代码如下:
train.py内幕
tf_env = tf_py_environment.TFPyEnvironment(
batched_py_environment.BatchedPyEnvironment(
[environment.CardGameEnv()] * num_parallel_environments))在我的环境中,这是在线程中运行的
self.cardStack = getFullDeck()
random.shuffle(self.cardStack)这是一个普通的函数,在每个线程类中导入
def getFullDeck():
deck = []
for rank in Ranks:
for suit in Suits:
deck.append(Card(rank, suit))
return deck这是一个可能的错误:
Traceback (most recent call last):
File "e:\Users\tmp\.vscode\extensions\ms-python.python-2019.1.0\pythonFiles\ptvsd_launcher.py", line 45, in <module>
main(ptvsdArgs)
File "e:\Users\tmp\.vscode\extensions\ms-python.python-2019.1.0\pythonFiles\lib\python\ptvsd\__main__.py", line 348, in main
run()
File "e:\Users\tmp\.vscode\extensions\ms-python.python-2019.1.0\pythonFiles\lib\python\ptvsd\__main__.py", line 253, in run_file
runpy.run_path(target, run_name='__main__')
File "C:\Python37\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:\Python37\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:\Python37\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "e:\Users\tmp\Documents\Programming\Neural Nets\Poker_AI\train_v2.py", line 320, in <module>
app.run(main)
File "C:\Python37\lib\site-packages\absl\app.py", line 300, in run
_run_main(main, args)
File "C:\Python37\lib\site-packages\absl\app.py", line 251, in _run_main
sys.exit(main(argv))
File "e:\Users\tmp\Documents\Programming\Neural Nets\Poker_AI\train_v2.py", line 315, in main
num_eval_episodes=FLAGS.num_eval_episodes)
File "E:\Users\tmp\AppData\Roaming\Python\Python37\site-packages\gin\config.py", line 1032, in wrapper
utils.augment_exception_message_and_reraise(e, err_str)
File "E:\Users\tmp\AppData\Roaming\Python\Python37\site-packages\gin\utils.py", line 49, in augment_exception_message_and_reraise
six.raise_from(proxy.with_traceback(exception.__traceback__), None)
File "<string>", line 3, in raise_from
File "E:\Users\tmp\AppData\Roaming\Python\Python37\site-packages\gin\config.py", line 1009, in wrapper
return fn(*new_args, **new_kwargs)
File "e:\Users\tmp\Documents\Programming\Neural Nets\Poker_AI\train_v2.py", line 251, in train_eval
collect_driver.run()
File "C:\Python37\lib\site-packages\tf_agents\drivers\dynamic_episode_driver.py", line 149, in run
maximum_iterations=maximum_iterations)
File "C:\Python37\lib\site-packages\tf_agents\utils\common.py", line 111, in with_check_resource_vars
return fn(*fn_args, **fn_kwargs)
File "C:\Python37\lib\site-packages\tf_agents\drivers\dynamic_episode_driver.py", line 180, in _run
name='driver_loop'
File "C:\Python37\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 2457, in while_loop_v2
return_same_structure=True)
File "C:\Python37\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 2689, in while_loop
loop_vars = body(*loop_vars)
File "C:\Python37\lib\site-packages\tf_agents\drivers\dynamic_episode_driver.py", line 103, in loop_body
next_time_step = self.env.step(action_step.action)
File "C:\Python37\lib\site-packages\tf_agents\environments\tf_environment.py", line 232, in step
return self._step(action)
File "C:\Python37\lib\site-packages\tensorflow\python\autograph\impl\api.py", line 232, in graph_wrapper
return func(*args, **kwargs)
File "C:\Python37\lib\site-packages\tf_agents\environments\tf_py_environment.py", line 218, in _step
_step_py, flat_actions, self._time_step_dtypes, name='step_py_func')
File "C:\Python37\lib\site-packages\tensorflow\python\ops\script_ops.py", line 488, in numpy_function
return py_func_common(func, inp, Tout, stateful=True, name=name)
File "C:\Python37\lib\site-packages\tensorflow\python\ops\script_ops.py", line 452, in py_func_common
result = func(*[x.numpy() for x in inp])
File "C:\Python37\lib\site-packages\tf_agents\environments\tf_py_environment.py", line 203, in _step_py
self._time_step = self._env.step(packed)
File "C:\Python37\lib\site-packages\tf_agents\environments\py_environment.py", line 174, in step
self._current_time_step = self._step(action)
File "C:\Python37\lib\site-packages\tf_agents\environments\batched_py_environment.py", line 140, in _step
zip(self._envs, unstacked_actions))
File "C:\Python37\lib\multiprocessing\pool.py", line 268, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "C:\Python37\lib\multiprocessing\pool.py", line 657, in get
raise self._value
File "C:\Python37\lib\multiprocessing\pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "C:\Python37\lib\multiprocessing\pool.py", line 44, in mapstar
return list(map(*args))
File "C:\Python37\lib\site-packages\tf_agents\environments\batched_py_environment.py", line 139, in <lambda>
lambda env_action: env_action[0].step(env_action[1]),
File "C:\Python37\lib\site-packages\tf_agents\environments\py_environment.py", line 174, in step
self._current_time_step = self._step(action)
File "e:\Users\tmp\Documents\Programming\Neural Nets\Poker_AI\environment.py", line 116, in _step
canRoundContinue = self._table.runUntilChoice(action)
File "e:\Users\tmp\Documents\Programming\Neural Nets\Poker_AI\table.py", line 326, in runUntilChoice
random.shuffle(self.cardStack)
File "C:\Python37\lib\random.py", line 278, in shuffle
x[i], x[j] = x[j], x[i]
IndexError: list index out of range
In call to configurable 'train_eval' (<function train_eval at 0x000002722713A158>)我怀疑这个错误的发生是因为线程同时改变了数组,但我不明白为什么会这样:
所有的事情都发生在一个类实例中,getFullDeck()返回的数组在每次调用函数时都会被重新创建,所以应该不会有多个线程可以访问同一个引用,对吧?
发布于 2019-10-23 04:51:29
tf_env = tf_py_environment.TFPyEnvironment(
batched_py_environment.BatchedPyEnvironment(
[environment.CardGameEnv()] * num_parallel_environments))您将为每个并行实例重用相同的环境,而不是为每个实例创建新的环境。您可能想尝试一下,比如
tf_env = tf_py_environment.TFPyEnvironment(
batched_py_environment.BatchedPyEnvironment(
[environment.CardGameEnv() for _ in range(num_parallel_environments)]))https://stackoverflow.com/questions/57793505
复制相似问题