Q: How can I use a GPU to train a deep learning neural network that contains an embedding layer?

Stack Overflow user

Asked 2021-02-23 19:06:52

Answers: 1 · Views: 147 · Followers: 0 · Votes: 0

I am getting an InvalidArgumentError on the embedding layer:

Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=2 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
GatherV2: GPU CPU 
Cast: GPU CPU 
Const: GPU CPU 
ResourceSparseApplyAdagradV2: CPU 
_Arg: GPU CPU 
ReadVariableOp: GPU CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  model_6_user_embedding_embedding_lookup_readvariableop_resource (_Arg)  framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
  adagrad_adagrad_update_1_update_0_resourcesparseapplyadagradv2_accum (_Arg)  framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
  model_6/User-Embedding/embedding_lookup/ReadVariableOp (ReadVariableOp) 
  model_6/User-Embedding/embedding_lookup/axis (Const) 
  model_6/User-Embedding/embedding_lookup (GatherV2) 
  gradient_tape/model_6/User-Embedding/embedding_lookup/Shape (Const) 
  gradient_tape/model_6/User-Embedding/embedding_lookup/Cast (Cast) 
  Adagrad/Adagrad/update_1/update_0/ResourceSparseApplyAdagradV2 (ResourceSparseApplyAdagradV2) /job:localhost/replica:0/task:0/device:GPU:0

     [[{{node model_6/User-Embedding/embedding_lookup/ReadVariableOp}}]] [Op:__inference_train_function_2997]

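Link to Google: zstuI-EsKjw7Max1f73v?usp=sharing

This is a very simple neural network, and the data can be downloaded from Kaggle; you can just drag it into Colab to work with it.

I have also tried enabling soft device placement with tf.config.set_soft_device_placement(True), but that does not seem to work.

From the error log, it looks like MirroredStrategy has assigned the embedding lookup operation to the GPU (which is not GPU-compatible, and I can see why). I was hoping tf.config.set_soft_device_placement(True) would make TensorFlow fall back to the CPU, but it seems to be ignored.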
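The colocation debug info above lists, for each op type in the group, the device types that have a kernel for it. As a rough aid for reading such logs, the sketch below scans that list for op types with no GPU entry. The helper names `supported_devices` and `ops_without` are hypothetical, and the log excerpt is copied from the error above. On this log it flags the sparse Adagrad update rather than the GatherV2 lookup itself.

```python
import re

# Excerpt of the "supported devices" list from the colocation debug info above.
COLOCATION_LOG = """\
GatherV2: GPU CPU
Cast: GPU CPU
Const: GPU CPU
ResourceSparseApplyAdagradV2: CPU
_Arg: GPU CPU
ReadVariableOp: GPU CPU
"""


def supported_devices(log_text):
    """Map each op type to the set of device types listed for it."""
    ops = {}
    for line in log_text.splitlines():
        # Lines look like "OpType: GPU CPU" (device names in upper case).
        match = re.match(r"^\s*(\w+):\s+((?:[A-Z]+\s*)+)$", line)
        if match:
            ops[match.group(1)] = set(match.group(2).split())
    return ops


def ops_without(device, ops):
    """Return the op types that do not list the given device type."""
    return sorted(op for op, devices in ops.items() if device not in devices)


ops = supported_devices(COLOCATION_LOG)
print(ops_without("GPU", ops))  # ['ResourceSparseApplyAdagradV2']
```

Because every op in a colocation group must run on the same device, a single CPU-only member is enough to make a GPU placement for the whole group fail.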

Has anyone seen this problem before and know of a workaround?

embedding

tensorflow

keras

gpu

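1 Answer

Stack Overflow user

Accepted answer

Posted 2021-02-24 12:36:28

A similar issue was reported for TF 1.14: https://github.com/tensorflow/tensorflow/issues/31318

It looks like MirroredStrategy does not support training embedding layers with momentum-based optimizers.

Cloning the notebook above and using RMSprop (with momentum=0) seems to work: M7vmQfclL59eRj?usp=sharing

Until this issue is resolved, I will use RMSprop without any momentum for now. The error message certainly doesn't help!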
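The workaround can be sketched roughly as follows. This is a minimal illustration, not the notebook from the post: the model, the sizes `NUM_USERS` and `EMBEDDING_DIM`, and the random training data are all assumptions made for the example. Only the combination of `tf.distribute.MirroredStrategy` with `RMSprop(momentum=0.0)` comes from the answer.

```python
import numpy as np
import tensorflow as tf

# Hypothetical sizes; the values used in the original notebook are not shown.
NUM_USERS = 100
EMBEDDING_DIM = 8


def build_model():
    """A tiny model with one embedding layer, standing in for the real one."""
    user_id = tf.keras.Input(shape=(1,), dtype="int32", name="user_id")
    x = tf.keras.layers.Embedding(
        NUM_USERS, EMBEDDING_DIM, name="User-Embedding")(user_id)
    x = tf.keras.layers.Flatten()(x)
    rating = tf.keras.layers.Dense(1)(x)
    return tf.keras.Model(user_id, rating)


# MirroredStrategy uses all visible GPUs and falls back to CPU if none exist.
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = build_model()
    # RMSprop with momentum disabled, as suggested in the answer.
    optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.01, momentum=0.0)
    model.compile(optimizer=optimizer, loss="mse")

# Random stand-in data (assumption): user ids and ratings in [0, 1).
users = np.random.randint(0, NUM_USERS, size=(256, 1)).astype("int32")
ratings = np.random.rand(256, 1).astype("float32")
history = model.fit(users, ratings, batch_size=32, epochs=1, verbose=0)
print(history.history["loss"])
```

Whether a given optimizer hits the same colocation error may depend on the TensorFlow version, so it is worth re-testing the original optimizer after upgrading.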

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/66339524
