首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >如何从持久卷向elasticSearch摄取.json文件

如何从持久卷向elasticSearch摄取.json文件
EN

Stack Overflow用户
提问于 2021-09-21 13:40:59
回答 2查看 89关注 0票数 0

我有一个网络爬虫,抓取数以千计的网站,并将其存储在Kubernetes上的永久卷中。

在这个pod终止之后,我想将PV中的.json文件摄取到ECK中。我已经成功地运行了elastic search和Kibana。另外,我正在遵循Quickstart指南。

EN

回答 2

Stack Overflow用户

发布于 2021-09-22 04:32:08

票数 0
EN

Stack Overflow用户

发布于 2021-09-22 10:55:09

您可以使用filebeat守护程序来实现这一点,可以从以下位置下载:curl -L -O https://raw.githubusercontent.com/elastic/beats/7.14/deploy/kubernetes/filebeat-kubernetes.yaml

代码语言:javascript
复制
filebeat.autodiscover:
     providers:
       - type: kubernetes
         node: ${NODE_NAME}
         hints.enabled: true
         hints.default_config:
           type: container
           paths:
             - /var/log/containers/*${data.kubernetes.container.id}.log

Filebeat守护程序从stderr读取,您可以遵循以下两种方法中的任何一种:

  1. 您可以这样创建dockerfile,它将日志写入stderr,这类似于Nginx人员所做的事情:

代码语言:javascript
复制
# forward request and error logs to docker log collector
RUN ln -sf /dev/stdout /var/log/nginx/access.log \
    && ln -sf /dev/stderr /var/log/nginx/error.log

回购信息(https://github.com/nginxinc/docker-nginx/blob/8921999083def7ba43a06fabd5f80e4406651353/mainline/jessie/Dockerfile#L21-L23)

2)使用流式侧车,如Kubernetes docs:https://kubernetes.io/docs/concepts/cluster-administration/logging/#streaming-sidecar-container中所述。

示例Dockerfile

代码语言:javascript
复制
FROM  node:13.3.0-alpine
WORKDIR /app
COPY package.json .
RUN npm install --only=prod
COPY . .
RUN ln -sf /dev/stdout /app/path/jsonfile.json
CMD ["npm","start"]

文件节拍守护进程示例

代码语言:javascript
复制
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    # filebeat.inputs:
    # - type: container
    #   paths:
    #     - /var/log/containers/*.log
    #   json.keys_under_root: true
    #   json.add_error_key: true
    #   json.message_key: message
    #   processors:
    #     - add_kubernetes_metadata:
    #         host: ${NODE_NAME}
    #         matchers:
    #         - logs_path:
    #             logs_path: "/var/log/containers/"
            
    # To enable hints based autodiscover, remove `filebeat.inputs` configuration and uncomment this:
    filebeat.autodiscover:
     providers:
       - type: kubernetes
         node: ${NODE_NAME}
         hints.enabled: true
         hints.default_config:
           type: container
           paths:
             - /var/log/containers/*${data.kubernetes.container.id}.log

    processors:
      - add_cloud_metadata:
      - add_host_metadata:
      # - add_kubernetes_metadata:

    cloud.id: ${ELASTIC_CLOUD_ID}
    cloud.auth: ${ELASTIC_CLOUD_AUTH}

    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
      username: ${ELASTICSEARCH_USERNAME}
      password: ${ELASTICSEARCH_PASSWORD}
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
spec:
  selector:
    matchLabels:
      k8s-app: filebeat
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
        - name: filebeat
          image: docker.elastic.co/beats/filebeat:7.13.3
          args: ["-c", "/etc/filebeat.yml", "-e"]
          env:
            - name: ELASTICSEARCH_HOST
              value: "elasticsearch:9200"
            - name: ELASTICSEARCH_PORT
              value: "9200"
            - name: ELASTICSEARCH_USERNAME
              value: "elastic"
            - name: ELASTICSEARCH_PASSWORD
              value: "password"
            - name: ELASTIC_CLOUD_ID
              value:
            - name: ELASTIC_CLOUD_AUTH
              value:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          securityContext:
            runAsUser: 0
            # If using Red Hat OpenShift uncomment this:
            #privileged: true
          resources:
            limits:
              memory: 200Mi
            requests:
              cpu: 100m
              memory: 100Mi
          volumeMounts:
            - name: config
              mountPath: /etc/filebeat.yml
              readOnly: true
              subPath: filebeat.yml
            - name: data
              mountPath: /usr/share/filebeat/data
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: varlog
              mountPath: /var/log
              readOnly: true
      volumes:
        - name: config
          configMap:
            defaultMode: 0640
            name: filebeat-config
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: varlog
          hostPath:
            path: /var/log
        # data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
        - name: data
          hostPath:
            # When filebeat runs as non-root user, this directory needs to be writable by group (g+w).
            path: /var/lib/filebeat-data
            type: DirectoryOrCreate
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
  - kind: ServiceAccount
    name: filebeat
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    k8s-app: filebeat
rules:
  - apiGroups: [""] # "" indicates the core API group
    resources:
      - namespaces
      - pods
      - nodes
    verbs:
      - get
      - watch
      - list
  - apiGroups: ["apps"]
    resources:
      - replicasets
    verbs: ["get", "list", "watch"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
---

Elasticsearch讨论类似问题的论坛:(https://discuss.elastic.co/t/kubernetes-autodiscovery-pod-with-multiple-log-files/207405)

票数 -1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/69269889

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档