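I am following the official AWS EKS tutorial to set up a distributed GPU cluster for TensorFlow model training, but I have run into a problem.
After creating a new cluster with eksctl and verifying that the corresponding ~/.kube/config file exists on the gateway node, the tutorial instructs me to download ksonnet on the gateway node and use it to initialize a new application:
$ ks init <app-name>
However, when I try to run this command, I get the following error: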
INFO Using context "arn:aws:eks:us-west-2:131397771409:cluster/<cluster name>" from kubeconfig file "/home/ubuntu/.kube/config"
INFO Creating environment "default" with namespace "default", pointing to "version:v1.18.9" cluster at address <cluster address>
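ERROR No Major.Minor.Patch elements found
I searched GitHub and SO but did not find a solution to this problem. I suspect the real answer is to stop using ksonnet, since it is no longer maintained (and apparently has not been for the last two years), but for now I just want to get through this tutorial :)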
Any insight is appreciated!
Contents of my ~/.kube/config:
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: <certificate>
    server: <server>
  name: arn:aws:eks:us-west-2:131397771409:cluster/<name>
contexts:
- context:
    cluster: arn:aws:eks:us-west-2:131397771409:cluster/<name>
    user: arn:aws:eks:us-west-2:131397771409:cluster/<name>
  name: arn:aws:eks:us-west-2:131397771409:cluster/<name>
current-context: arn:aws:eks:us-west-2:131397771409:cluster/<name>
kind: Config
preferences: {}
users:
- name: arn:aws:eks:us-west-2:131397771409:cluster/<name>
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      args:
      - --region
      - us-west-2
      - eks
      - get-token
      - --cluster-name
      - <name>
      command: aws
https://stackoverflow.com/questions/64724380