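I am building a node.js service with Jenkins on Kubernetes, using Jenkins agent pods that are specified by the node.js project itself. I am trying to eliminate manual touches of the Jenkins UI. Everything runs in a single Kubernetes cluster.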
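I am following this blog with slight modifications, but ran into a few problems:
- ‘Jenkins’ doesn’t have label ‘test-pod’ is reported even though the build agents are created successfully in Kubernetes. The test-pod label is specified by the Jenkinsfile, so I don't know why this error occurs.
- Why does it loop forever?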
    podTemplate(
        name: 'test-pod',
        label: 'test-pod',
        containers: [
            containerTemplate(name: 'node14', image: 'node:14-alpine'),
            containerTemplate(name: 'docker', image: 'trion/jenkins-docker-client'),
        ],
        {
            node('test-pod') {
                stage('Build') {
                    container('node14') {
                        // do nothing just yet
                    }
                }
            }
        }
    )

Here is part of the Jenkins console output:
Started by user admin
Obtained Jenkinsfile from git ssh://git@kube-master.cluster.dev/git/hello.git
Running in Durability level: MAX_SURVIVABILITY
[Pipeline] Start of Pipeline
[Pipeline] podTemplate
[Pipeline] {
[Pipeline] node
Created Pod: kubernetes jenkins/test-pod-2hdfp-9kcjj
[Normal][jenkins/test-pod-2hdfp-9kcjj][Scheduled] Successfully assigned jenkins/test-pod-2hdfp-9kcjj to kube-worker2.cluster.dev
[Normal][jenkins/test-pod-2hdfp-9kcjj][Pulled] Container image "node:14-alpine" already present on machine
[Normal][jenkins/test-pod-2hdfp-9kcjj][Created] Created container node14
[Normal][jenkins/test-pod-2hdfp-9kcjj][Started] Started container node14
[Normal][jenkins/test-pod-2hdfp-9kcjj][Pulled] Container image "trion/jenkins-docker-client" already present on machine
[Normal][jenkins/test-pod-2hdfp-9kcjj][Created] Created container docker
[Normal][jenkins/test-pod-2hdfp-9kcjj][Started] Started container docker
[Normal][jenkins/test-pod-2hdfp-9kcjj][Pulled] Container image "jenkins/inbound-agent:4.3-4" already present on machine
[Normal][jenkins/test-pod-2hdfp-9kcjj][Created] Created container jnlp
[Normal][jenkins/test-pod-2hdfp-9kcjj][Started] Started container jnlp
jenkins/test-pod-2hdfp-9kcjj Container node14 was terminated (Exit Code: 0, Reason: Completed)
[Normal][jenkins/test-pod-2hdfp-9kcjj][Killing] Stopping container docker
Created Pod: kubernetes jenkins/test-pod-2hdfp-gc2qb
[Normal][jenkins/test-pod-2hdfp-gc2qb][Scheduled] Successfully assigned jenkins/test-pod-2hdfp-gc2qb to kube-worker2.cluster.dev
[Normal][jenkins/test-pod-2hdfp-gc2qb][Pulled] Container image "node:14-alpine" already present on machine
[Normal][jenkins/test-pod-2hdfp-gc2qb][Created] Created container node14
[Normal][jenkins/test-pod-2hdfp-gc2qb][Started] Started container node14
[Normal][jenkins/test-pod-2hdfp-gc2qb][Pulled] Container image "trion/jenkins-docker-client" already present on machine
[Normal][jenkins/test-pod-2hdfp-gc2qb][Created] Created container docker
[Normal][jenkins/test-pod-2hdfp-gc2qb][Started] Started container docker
[Normal][jenkins/test-pod-2hdfp-gc2qb][Pulled] Container image "jenkins/inbound-agent:4.3-4" already present on machine
[Normal][jenkins/test-pod-2hdfp-gc2qb][Created] Created container jnlp
[Normal][jenkins/test-pod-2hdfp-gc2qb][Started] Started container jnlp
jenkins/test-pod-2hdfp-gc2qb Container node14 was terminated (Exit Code: 0, Reason: Completed)
Still waiting to schedule task
‘Jenkins’ doesn’t have label ‘test-pod’
[Normal][jenkins/test-pod-2hdfp-gc2qb][Killing] Stopping container docker
Created Pod: kubernetes jenkins/test-pod-2hdfp-xwkm2
[Normal][jenkins/test-pod-2hdfp-xwkm2][Scheduled] Successfully assigned jenkins/test-pod-2hdfp-xwkm2 to kube-worker2.cluster.dev
[Normal][jenkins/test-pod-2hdfp-xwkm2][Pulled] Container image "node:14-alpine" already present on machine
[Normal][jenkins/test-pod-2hdfp-xwkm2][Created] Created container node14
[Normal][jenkins/test-pod-2hdfp-xwkm2][Started] Started container node14
[Normal][jenkins/test-pod-2hdfp-xwkm2][Pulled] Container image "trion/jenkins-docker-client" already present on machine
[Normal][jenkins/test-pod-2hdfp-xwkm2][Created] Created container docker
[Normal][jenkins/test-pod-2hdfp-xwkm2][Started] Started container docker
[Normal][jenkins/test-pod-2hdfp-xwkm2][Pulled] Container image "jenkins/inbound-agent:4.3-4" already present on machine
[Normal][jenkins/test-pod-2hdfp-xwkm2][Created] Created container jnlp
[Normal][jenkins/test-pod-2hdfp-xwkm2][Started] Started container jnlp
jenkins/test-pod-2hdfp-xwkm2 Container node14 was terminated (Exit Code: 0, Reason: Completed)
[Normal][jenkins/test-pod-2hdfp-xwkm2][Killing] Stopping container docker
Created Pod: kubernetes jenkins/test-pod-2hdfp-4ltq3
[Normal][jenkins/test-pod-2hdfp-4ltq3][Scheduled] Successfully assigned jenkins/test-pod-2hdfp-4ltq3 to kube-worker2.cluster.dev
[Normal][jenkins/test-pod-2hdfp-4ltq3][Pulled] Container image "node:14-alpine" already present on machine
[Normal][jenkins/test-pod-2hdfp-4ltq3][Created] Created container node14
[Normal][jenkins/test-pod-2hdfp-4ltq3][Started] Started container node14
[Normal][jenkins/test-pod-2hdfp-4ltq3][Pulled] Container image "trion/jenkins-docker-client" already present on machine
[Normal][jenkins/test-pod-2hdfp-4ltq3][Created] Created container docker
[Normal][jenkins/test-pod-2hdfp-4ltq3][Started] Started container docker
[Normal][jenkins/test-pod-2hdfp-4ltq3][Pulled] Container image "jenkins/inbound-agent:4.3-4" already present on machine
[Normal][jenkins/test-pod-2hdfp-4ltq3][Created] Created container jnlp
[Normal][jenkins/test-pod-2hdfp-4ltq3][Started] Started container jnlp
jenkins/test-pod-2hdfp-4ltq3 Container node14 was terminated (Exit Code: 0, Reason: Completed)
[Normal][jenkins/test-pod-2hdfp-4ltq3][Killing] Stopping container docker
Created Pod: kubernetes jenkins/test-pod-2hdfp-0216w
...

Update: latest findings

The master log (see debugging) does not offer much more:
...
2021-04-30 11:52:42.715+0000 [id=4660] INFO hudson.slaves.NodeProvisioner#lambda$update$6: test-pod-gb4vq-hf3d4 provisioning successfully completed. We have now 2 computer(s)
2021-04-30 11:52:42.741+0000 [id=4659] INFO o.c.j.p.k.KubernetesLauncher#launch: Created Pod: kubernetes jenkins/test-pod-gb4vq-hf3d4
2021-04-30 11:52:42.847+0000 [id=4680] WARNING o.c.j.p.k.KubernetesLauncher#launch: Error in provisioning; agent=KubernetesSlave name: test-pod-gb4vq-pdd69, template=PodTemplate{id='f29ecbdd-9c1d-468f-86ff-dd46ff40f306', name='test-pod-gb4vq', namespace='jenkins', label='test-pod', containers=[ContainerTemplate{name='node14', image='node:14-alpine'}, ContainerTemplate{name='docker', image='trion/jenkins-docker-client'}], annotations=[PodAnnotation{key='buildUrl', value='http://172.16.1.12/job/hello/14/'}, PodAnnotation{key='runUrl', value='job/hello/14/'}]}
java.lang.IllegalStateException: Pod is no longer available: jenkins/test-pod-gb4vq-pdd69
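...

It does show, however, that the containers start and then fail. The loop seems to happen because the error handling in the Kubernetes plugin does not catch this properly and fail the job.
By watching the build pods terminate (using k9s), I was able to capture a pod's log. The Unknown client name error also sounds like it is caused by the containers terminating so quickly: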
jnlp INFO: [JNLP4-connect connection to 172.16.1.12/172.16.1.12:50000] Local headers refused by remote: Unknown client name: test-pod-34sd7-5xhs2
jnlp Apr 29, 2021 10:42:15 PM hudson.remoting.jnlp.Main$CuiListener status
jnlp INFO: Protocol JNLP4-connect encountered an unexpected exception
jnlp java.util.concurrent.ExecutionException: org.jenkinsci.remoting.protocol.impl.ConnectionRefusalException: Unknown client name: test-pod-34sd7-5xhs2
jnlp at org.jenkinsci.remoting.util.SettableFuture.get(SettableFuture.java:223)
jnlp at hudson.remoting.Engine.innerRun(Engine.java:743)
jnlp at hudson.remoting.Engine.run(Engine.java:518)
jnlp Caused by: org.jenkinsci.remoting.protocol.impl.ConnectionRefusalException: Unknown client name: test-pod-34sd7-5xhs2
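jnlp at org.jenkinsci.remoting.protocol.impl.ConnectionHeadersFilterLayer.newAbortCause(ConnectionHeadersFilterLayer.java:378)

I just found a similar question.
It was helpful: I added podRetention: always(), after label in podTemplate(), so that the agent pods are no longer removed when they terminate and instead remain visible with status Error.

Good find

Because the failed pods are now retained, I could find /var/log/containers/<failed pod>.log, which led me to one root cause.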
2021-04-30T08:59:36.047989534-04:00 stderr F java.net.UnknownHostException: updates.jenkins.io
This happens because the dnsPolicy restricts DNS to cluster-only lookups. The fix was to add hostNetwork: true next to label in podTemplate().
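For reference, a sketch of where the two options mentioned so far would sit in the template from the question. Everything else is unchanged from the original Jenkinsfile, so this only illustrates placement; it is not a working build and still shows the node14 termination described in this post. The closure is written after the parentheses, which is equivalent to passing it as the last argument.

```groovy
// Sketch only: the question's template plus the two debugging options.
podTemplate(
    name: 'test-pod',
    label: 'test-pod',
    podRetention: always(),   // keep terminated agent pods so their logs can be inspected
    hostNetwork: true,        // workaround used here for the failed updates.jenkins.io lookup
    containers: [
        containerTemplate(name: 'node14', image: 'node:14-alpine'),
        containerTemplate(name: 'docker', image: 'trion/jenkins-docker-client'),
    ]
) {
    node('test-pod') {
        stage('Build') {
            container('node14') {
                // do nothing just yet
            }
        }
    }
}
```

Note that hostNetwork: true puts the agent pod on the node's network, so it is probably best treated as a diagnostic workaround rather than a permanent setting.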
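Next, the image the blog recommends, trion/jenkins-docker-client, is both client and server, so it is the wrong image.
Switching to jenkins/agent produces a new problem: the pod now goes up and down without doing anything, not even logging. I suspect a launch parameter problem.
It is now clear that I should not have a Jenkins container in the Jenkinsfile at all, because the Kubernetes plugin launches a JNLP container automatically.
That leaves the node14 container as the problem: it either errors out immediately, or immediately finds that it has nothing to do and terminates.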
Answered 2021-04-30 15:36:24

The error handling is hard to understand and troubleshoot, and the blog is wrong as well.
Start from a minimal working agent Jenkinsfile:
    podTemplate(
        name: 'build-pod',
        namespace: 'jenkins',
        podRetention: always(), // for debugging
        {
            node(POD_LABEL) {
                stage('Build') {
                    sh "echo hello"
                }
            }
        }
    )

From there, extend it step by step with containers, volumes, container build sections, and so on.
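As an illustration of one such step, here is a sketch that adds the node14 container back to the minimal Jenkinsfile. It is not part of the original answer. It assumes the container exited with "Completed" because the image's default command has nothing to do without a terminal, so it overrides the command with a long sleep (the command and args values are my assumption, following the usual Kubernetes plugin convention of keeping side containers alive). No JNLP container is declared, since the plugin adds one automatically.

```groovy
// Sketch under stated assumptions: one extra container, kept alive with sleep.
podTemplate(
    name: 'build-pod',
    namespace: 'jenkins',
    podRetention: always(), // for debugging
    containers: [
        // Assumption: without a long-running command the container exits at once.
        containerTemplate(name: 'node14', image: 'node:14-alpine',
                          command: 'sleep', args: '99d'),
    ]
) {
    node(POD_LABEL) {
        stage('Build') {
            // Run the step inside the node14 container instead of the default jnlp one.
            container('node14') {
                sh 'node --version'
            }
        }
    }
}
```

If this stage runs, the next increment (volumes, a Docker client container, real build steps) can be added and tested the same way, one change at a time.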
Use the logs to troubleshoot. List the pod names, then follow the log of the pod in question:

    kubectl get pods -n jenkins
    kubectl logs -f <jenkins-pod> -n jenkins

(assuming jenkins is your Kubernetes namespace)
https://stackoverflow.com/questions/67325765