How do I use GPUs on AI Platform Pipelines? My pipeline calls set_gpu_limit(1) on one of its ops, but the step ends up with this error: This step is in Pending state with this message: Unschedulable: 0/3 nodes are available: 3 Insufficient nvidia.com/gpu.
Posted on 2020-12-17 02:15:31
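Figured it out a few minutes later. I followed the normal Kubeflow on GPU instructions, which add an autoscaling GPU node pool to the cluster and then install the NVIDIA driver daemonset: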
export GPU_POOL_NAME=gpu-pool
export CLUSTER_NAME=cluster-1
gcloud container node-pools create ${GPU_POOL_NAME} \
  --accelerator type=nvidia-tesla-k80,count=1 \
  --zone us-central1-a --cluster ${CLUSTER_NAME} \
  --num-nodes=0 --machine-type=n1-standard-4 --min-nodes=0 --max-nodes=1 --enable-autoscaling
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded.yaml
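To illustrate why the original error appeared and why the GPU node pool fixes it, here is a minimal sketch of the resource-fit idea behind the "Insufficient nvidia.com/gpu" message. It is a simplified model, not the real Kubernetes scheduler (which also weighs taints, affinity, and autoscaler state). It assumes set_gpu_limit(1) amounts to a limit of 1 on the nvidia.com/gpu resource, which is the resource named in the error. The node names and the helpers gpu_limit and fitting_nodes are hypothetical.

```python
# Simplified model of a resource-fit check; NOT the real kube-scheduler.

GPU_RESOURCE = "nvidia.com/gpu"  # resource name taken from the error message


def gpu_limit(count):
    """Limits fragment that a request for `count` GPUs is assumed to produce."""
    return {"limits": {GPU_RESOURCE: count}}


def fitting_nodes(nodes, resources):
    """Return the names of nodes whose allocatable GPUs cover the limit."""
    wanted = resources["limits"].get(GPU_RESOURCE, 0)
    return [
        name
        for name, allocatable in nodes.items()
        if allocatable.get(GPU_RESOURCE, 0) >= wanted
    ]


# Hypothetical cluster before the GPU pool exists: three CPU-only nodes.
before = {"cpu-node-1": {}, "cpu-node-2": {}, "cpu-node-3": {}}

# Same cluster once a node from the GPU pool has joined (hypothetical name).
after = dict(before)
after["gpu-pool-node-1"] = {GPU_RESOURCE: 1}

request = gpu_limit(1)
print(fitting_nodes(before, request))  # no node fits: the "0/3 nodes" case
print(fitting_nodes(after, request))   # the GPU node fits
```

In this model none of the three CPU-only nodes advertises nvidia.com/gpu, so the step has nowhere to run until a node from the GPU pool is available.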
https://stackoverflow.com/questions/65328649