
How to rename files when writing to HDFS from a Flume spooling directory

Stack Overflow user
Asked on 2016-05-30 17:29:45
1 answer, 742 views, 0 followers, 1 vote

I am writing to HDFS using Flume's spooling directory source. Here is my configuration:

# Initialize the agent's source, channel and sink
agent.sources = test
agent.channels = memoryChannel
agent.sinks = flumeHDFS

# Setting the source to spool directory where the file exists
agent.sources.test.type = spooldir
agent.sources.test.spoolDir = /johir
agent.sources.test.fileHeader = false
agent.sources.test.fileSuffix = .COMPLETED

# Setting the channel to memory
agent.channels.memoryChannel.type = memory
# Max number of events stored in the memory channel
agent.channels.memoryChannel.capacity = 10000
# agent.channels.memoryChannel.batchSize = 15000
# Max number of events per transaction; must not exceed the channel capacity
agent.channels.memoryChannel.transactionCapacity = 1000

# Setting the sink to HDFS
agent.sinks.flumeHDFS.type = hdfs
agent.sinks.flumeHDFS.hdfs.path =/user/root/
agent.sinks.flumeHDFS.hdfs.fileType = DataStream

# Write format can be text or writable
agent.sinks.flumeHDFS.hdfs.writeFormat = Text

# use a single csv file at a time
agent.sinks.flumeHDFS.hdfs.maxOpenFiles = 1

# Roll over based on file size only (1,000,000 bytes, roughly 1 MB)
agent.sinks.flumeHDFS.hdfs.rollSize = 1000000
agent.sinks.flumeHDFS.hdfs.batchSize = 1000

# Never roll over based on the number of events
agent.sinks.flumeHDFS.hdfs.rollCount = 0

# Never roll over based on elapsed time
agent.sinks.flumeHDFS.hdfs.rollInterval = 0
# agent.sinks.flumeHDFS.hdfs.idleTimeout = 600

# Connect source and sink with channel
agent.sources.test.channels = memoryChannel
agent.sinks.flumeHDFS.channel = memoryChannel

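The problem is that the data is written to a file that gets a seemingly random temporary name. How can I give the files in HDFS the same names as the original files in the source directory? For example, I have the files day1.txt, day2.txt and day3.txt, which hold data for different days. I want them stored in HDFS as day1.txt, day2.txt and day3.txt, but the three files are merged and stored in HDFS as a single file named FlumeData.1464629158164.tmp. Is there a way to do this?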


1 Answer

Stack Overflow user

Answered on 2016-06-01 21:42:58

If you want to keep the original file name, attach the file name to each event as a header.

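  1. Set the basenameHeader property to true. This adds a header with the key basename to every event, unless a different key is set with the basenameHeaderKey property.
  2. Use the hdfs.filePrefix property to build the HDFS file name from the value of that header.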
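
As an aside on step 1, here is a minimal sketch of the variant with a custom header key. The key name origName is a made-up example, not something from the question; whatever key is chosen, the escape sequence in hdfs.filePrefix has to reference the same key.

```properties
# Hypothetical variant: store the file name under a custom header key.
# "origName" is an illustrative name only.
agent.sources.test.basenameHeader = true
agent.sources.test.basenameHeaderKey = origName

# The sink must reference the same key in its escape sequence.
agent.sinks.flumeHDFS.hdfs.filePrefix = %{origName}
```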

Add the following properties to the configuration file:

#source properties
agent.sources.test.basenameHeader = true

#sink properties
agent.sinks.flumeHDFS.type = hdfs
agent.sinks.flumeHDFS.hdfs.filePrefix = %{basename}
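
Note that hdfs.filePrefix sets only the first part of the name. The HDFS sink still appends a numeric counter, and an in-use suffix (.tmp by default) while the file is open, so the result looks like day1.txt.1464629158164 rather than exactly day1.txt. The sketch below shows the related naming properties; the values chosen for fileSuffix and idleTimeout are assumptions for illustration, not part of the original answer.

```properties
# Prefix taken from the basename header set by the spooling directory source.
agent.sinks.flumeHDFS.hdfs.filePrefix = %{basename}

# Optional: suffix appended after the counter, giving day1.txt.<counter>.txt
# (illustrative value).
agent.sinks.flumeHDFS.hdfs.fileSuffix = .txt

# Marker appended while a file is still being written; .tmp is the default.
agent.sinks.flumeHDFS.hdfs.inUseSuffix = .tmp

# Optional: close a file after 60 seconds without writes so that it loses
# the in-use suffix even if it never reaches rollSize (illustrative value).
agent.sinks.flumeHDFS.hdfs.idleTimeout = 60
```

With rollInterval and rollCount both set to 0, a file smaller than rollSize would otherwise stay open, and keep its .tmp suffix, until the sink closes it for another reason.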
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/37531021
