
Cannot cast org.apache.hadoop.io.Text to org.apache.hadoop.io.NullWritable
Stack Overflow user
Asked on 2015-06-02 06:39:55
1 answer · 3.4K views · 0 followers · 0 votes

I want to convert a sequence file to an ORC file in MapReduce. The input key/value types are Text/Text.

My program looks like this:

public class ANR extends Configured implements Tool {

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new Configuration(), new ANR(), args);
        System.exit(res);
    }

    public int run(String[] args) throws Exception {
        Logger log = Logger.getLogger(ANRmap.class.getName());
        Configuration conf = getConf();

        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();

        conf.set("orc.create.index", "true");

        Job job = Job.getInstance(conf);
        job.setJobName("ORC Output");
        job.setJarByClass(ANR.class);
        job.setInputFormatClass(SequenceFileInputFormat.class);
        SequenceFileInputFormat.addInputPath(job, new Path(args[0]));
        job.setMapperClass(ANRmap.class);
        job.setNumReduceTasks(0);
        job.setOutputFormatClass(OrcNewOutputFormat.class);
        OrcNewOutputFormat.setCompressOutput(job, true);
        OrcNewOutputFormat.setOutputPath(job, new Path(args[1]));

        return job.waitForCompletion(true) ? 0 : 1;
    }
}

The mapper:

public class ANRmap extends Mapper<Text, Text, NullWritable, Writable> {
    private final OrcSerde serde = new OrcSerde();

    public void map(Text key, Text value,
                    OutputCollector<NullWritable, Writable> output)
            throws IOException {
        output.collect(NullWritable.get(), serde.serialize(value, null));
    }
}

This is the exception:

Error: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.io.NullWritable
    at org.apache.hadoop.hive.ql.io.orc.OrcNewOutputFormat$OrcRecordWriter.write(OrcNewOutputFormat.java:37)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:635)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
    at org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

The output key type of OrcNewOutputFormat is NullWritable. How can I convert the Text key to NullWritable some other way, or fix this exception?


1 Answer

Stack Overflow user

Accepted answer

Answered on 2015-06-02 12:30:32

Try using Context instead of OutputCollector. In the new mapreduce API, Mapper.map takes a Context parameter; a map method declared with an OutputCollector never overrides it, so Hadoop runs the default identity mapper and passes your Text key straight through to the ORC record writer, which expects NullWritable.

public class ReduceTask extends Reducer<Text, Text, Text, NullWritable> {

    public void reduce(Text key, Iterable<Text> values, Context context) {
        for (Text value : values) {
            try {
                context.write(key, NullWritable.get());
            } catch (IOException e) {
                e.printStackTrace();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}
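Applied to the question's mapper, the same idea might look roughly like this (a sketch assembled from the question's code, not part of the original answer). The @Override annotation makes the compiler reject a wrong signature instead of silently falling back to the identity mapper. Note that whether `serde.serialize(value, null)` produces a valid ORC row is a separate issue; OrcSerde normally needs a real ObjectInspector, and the null inspector is carried over from the question as-is.

```java
import java.io.IOException;

import org.apache.hadoop.hive.ql.io.orc.OrcSerde;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Mapper;

public class ANRmap extends Mapper<Text, Text, NullWritable, Writable> {
    private final OrcSerde serde = new OrcSerde();

    // New-API signature: (KEYIN, VALUEIN, Context). With @Override, a
    // mismatched signature (e.g. one taking an OutputCollector) becomes a
    // compile error rather than a silently-unused method.
    @Override
    protected void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        // Emit a NullWritable key, matching OrcNewOutputFormat's expectation.
        context.write(NullWritable.get(), serde.serialize(value, null));
    }
}
```

With this signature the method actually overrides Mapper.map, so only NullWritable keys ever reach OrcRecordWriter.write and the ClassCastException disappears.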
Votes: 0
Original page content provided by Stack Overflow; translation supported by Tencent Cloud's engine.
Original link: https://stackoverflow.com/questions/30589073