
Q: Configuring AWS logging for s3distcp

Stack Overflow user

Asked 2016-06-01 17:26:54

1 answer · 760 views · 0 followers · 0 votes

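I want to change s3distcp and other hadoop commands so that they log only messages at WARN level or worse; currently they log everything at INFO level or worse.

How can I configure this on the head node of an AWS EMR cluster?

Here is an example of the output I am trying to hide: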

$ hadoop jar ~hadoop/lib/emr-s3distcp-1.0.jar --src /user/myusername/test --dest s3://some-bucket/myusername/data/test
16/06/01 17:18:03 INFO s3distcp.S3DistCp: Running with args: --src /user/myusername/test --dest s3://some-bucket/myusername/data/test
16/06/01 17:18:03 INFO s3distcp.S3DistCp: S3DistCp args: --src /user/myusername/test --dest s3://some-bucket/myusername/data/test 
16/06/01 17:18:06 INFO s3distcp.S3DistCp: Using output path 'hdfs:/tmp/97139b69-ea86-400e-9ce4-f0718ff2b669/output'
16/06/01 17:18:06 INFO s3distcp.S3DistCp: GET http://x.x.x.x/latest/meta-data/placement/availability-zone result: us-east-1b
16/06/01 17:18:06 INFO s3distcp.FileInfoListing: Opening new file: hdfs:/tmp/97139b69-ea86-400e-9ce4-f0718ff2b669/files/1
16/06/01 17:18:06 INFO s3distcp.S3DistCp: Created 1 files to copy 88 files 
16/06/01 17:18:06 INFO s3distcp.S3DistCp: Reducer number: 15
16/06/01 17:18:06 INFO client.RMProxy: Connecting to ResourceManager at /x.x.x.x:9022
16/06/01 17:18:07 INFO input.FileInputFormat: Total input paths to process : 1
16/06/01 17:18:07 INFO mapreduce.JobSubmitter: number of splits:1
16/06/01 17:18:07 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1464201102672_0019
16/06/01 17:18:07 INFO impl.YarnClientImpl: Submitted application application_1464201102672_0019
16/06/01 17:18:07 INFO mapreduce.Job: The url to track the job: http://x.x.x.x:9046/proxy/application_1464201102672_0019/
16/06/01 17:18:07 INFO mapreduce.Job: Running job: job_1464201102672_0019
16/06/01 17:18:13 INFO mapreduce.Job: Job job_1464201102672_0019 running in uber mode : false
16/06/01 17:18:13 INFO mapreduce.Job:  map 0% reduce 0%
16/06/01 17:18:19 INFO mapreduce.Job:  map 100% reduce 0%
16/06/01 17:18:30 INFO mapreduce.Job:  map 100% reduce 5%
16/06/01 17:18:31 INFO mapreduce.Job:  map 100% reduce 10%
16/06/01 17:18:32 INFO mapreduce.Job:  map 100% reduce 22%
16/06/01 17:18:33 INFO mapreduce.Job:  map 100% reduce 23%
16/06/01 17:18:34 INFO mapreduce.Job:  map 100% reduce 33%
16/06/01 17:18:35 INFO mapreduce.Job:  map 100% reduce 40%
16/06/01 17:18:36 INFO mapreduce.Job:  map 100% reduce 50%
16/06/01 17:18:37 INFO mapreduce.Job:  map 100% reduce 57%
16/06/01 17:18:38 INFO mapreduce.Job:  map 100% reduce 77%
16/06/01 17:18:39 INFO mapreduce.Job:  map 100% reduce 85%
16/06/01 17:18:40 INFO mapreduce.Job:  map 100% reduce 90%
16/06/01 17:18:41 INFO mapreduce.Job:  map 100% reduce 95%
16/06/01 17:18:42 INFO mapreduce.Job:  map 100% reduce 98%
16/06/01 17:18:43 INFO mapreduce.Job:  map 100% reduce 100%
16/06/01 17:18:43 INFO mapreduce.Job: Job job_1464201102672_0019 completed successfully
16/06/01 17:18:43 INFO mapreduce.Job: Counters: 54
    File System Counters
        FILE: Number of bytes read=5447
        FILE: Number of bytes written=1640535
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=113570708
        HDFS: Number of bytes written=56776676
        HDFS: Number of read operations=401
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=206
        S3: Number of bytes read=0
        S3: Number of bytes written=0
        S3: Number of read operations=0
        S3: Number of large read operations=0
        S3: Number of write operations=0
    Job Counters 
        Launched map tasks=1
        Launched reduce tasks=15
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=166005
        Total time spent by all reduces in occupied slots (ms)=18351000
        Total time spent by all map tasks (ms)=3689
        Total time spent by all reduce tasks (ms)=203900
        Total vcore-seconds taken by all map tasks=3689
        Total vcore-seconds taken by all reduce tasks=203900
        Total megabyte-seconds taken by all map tasks=5312160
        Total megabyte-seconds taken by all reduce tasks=587232000
    Map-Reduce Framework
        Map input records=88
        Map output records=88
        Map output bytes=20500
        Map output materialized bytes=5387
        Input split bytes=138
        Combine input records=0
        Combine output records=0
        Reduce input groups=88
        Reduce shuffle bytes=5387
        Reduce input records=88
        Reduce output records=0
        Spilled Records=176
        Shuffled Maps =15
        Failed Shuffles=0
        Merged Map outputs=15
        GC time elapsed (ms)=2658
        CPU time spent (ms)=98620
        Physical memory (bytes) snapshot=5777489920
        Virtual memory (bytes) snapshot=50741022720
        Total committed heap usage (bytes)=9051308032
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters 
        Bytes Read=17218
    File Output Format Counters 
        Bytes Written=0
16/06/01 17:18:43 INFO s3distcp.S3DistCp: Try to recursively delete hdfs:/tmp/97139b69-ea86-400e-9ce4-f0718ff2b669/tempspace

amazon-web-services

emr

s3distcp

1 Answer

Stack Overflow user

Accepted answer

Answered 2016-06-02 16:15:20

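The best way to do this seems to be to change the HADOOP_ROOT_LOGGER environment variable. You can run the following line on the Linux command line to apply it to the current session, or add it to the hadoop-env.sh script if it should always apply.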

export HADOOP_ROOT_LOGGER="WARN,console"

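WARN specifies that only messages at WARN level or worse should be logged, and console selects the console appender, so those messages are still printed to the command line.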
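
To make the difference between a one-off setting and a session-wide setting concrete, here is a minimal sketch. It assumes the hadoop launcher on your EMR release picks up HADOOP_ROOT_LOGGER as described above and that nothing in hadoop-env.sh overrides it later. The jar path and arguments are copied from the question, and the `command -v` guard is only there so the snippet is harmless on a machine without Hadoop.

```shell
# One-off: a variable assignment placed in front of a command applies to
# that single invocation only and leaves the rest of the session untouched.
if command -v hadoop >/dev/null 2>&1; then
    HADOOP_ROOT_LOGGER="WARN,console" hadoop jar ~hadoop/lib/emr-s3distcp-1.0.jar \
        --src /user/myusername/test --dest s3://some-bucket/myusername/data/test
fi

# Session-wide: every hadoop command started later from this shell
# inherits the exported value.
export HADOOP_ROOT_LOGGER="WARN,console"

# Show the value that child processes will see.
echo "HADOOP_ROOT_LOGGER=${HADOOP_ROOT_LOGGER}"
```

The prefix form is handy for scripts where only one noisy command should be quieted; the export form matches what the answer suggests for an interactive session.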

Note: if you want to modify the hadoop-env.sh file, you can find it at /etc/hadoop/conf/hadoop-env.sh, or at /home/hadoop/conf/hadoop-env.sh on older EMR clusters.
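
If the setting should be permanent, one way to add it without creating duplicate lines on repeated runs is sketched below. The helper name `add_root_logger` is hypothetical (it is not part of Hadoop or EMR), and the sketch assumes the file is writable by the current user; on a real cluster you may need elevated permissions to edit it.

```shell
# Hypothetical helper: append the export line to the given hadoop-env.sh
# unless an identical line is already present.
add_root_logger() {
    env_file="$1"
    line='export HADOOP_ROOT_LOGGER="WARN,console"'

    # Refuse to continue if the file is missing or not writable.
    if [ ! -w "$env_file" ]; then
        echo "cannot write $env_file (check the path and permissions)" >&2
        return 1
    fi

    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF -- "$line" "$env_file" || printf '%s\n' "$line" >> "$env_file"
}

# Example call, using the path mentioned in the note above:
# add_root_logger /etc/hadoop/conf/hadoop-env.sh
```

Because the check is an exact whole-line match, running the helper twice leaves a single copy of the export in the file.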

Votes: 0

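Original page content provided by Stack Overflow. Translation support for the source page was provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/37575183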
