MapReduce Programming: Sorting Numbers

Published: 2019-06-09

Problem description

Sort a set of unordered numbers in ascending order.

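Approach

Rely on MapReduce's default sorting. The mapper emits each number as an IntWritable key, the framework sorts the map output keys in ascending order before they reach the reducer, and the reducer simply writes the keys out in the order it receives them. The result is globally ordered as long as the job runs with a single reduce task, which is the default.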
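The idea can be tried out without a cluster. The following is a minimal sketch in plain Java, not Hadoop code: the class name LocalSortSketch and the method rankNumbers are hypothetical, and a TreeMap stands in for the framework's sort-by-key step. It mirrors the reducer in the listing of this post, where duplicates of a number share one rank.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local simulation of map -> sort by key -> reduce.
public class LocalSortSketch {

    // Returns "rank<TAB>number" lines, one per input number.
    public static List<String> rankNumbers(List<Integer> input) {
        // "Map" phase: every number becomes a key with a count of 1.
        // The TreeMap plays the role of the shuffle, which hands keys
        // to the reducer in ascending order (an assumption of this sketch).
        TreeMap<Integer, Integer> grouped = new TreeMap<>();
        for (int n : input) {
            grouped.merge(n, 1, Integer::sum);
        }

        // "Reduce" phase: keys arrive sorted. Write one line per occurrence
        // and advance the rank once per distinct key.
        List<String> output = new ArrayList<>();
        int rank = 1;
        for (Map.Entry<Integer, Integer> entry : grouped.entrySet()) {
            for (int i = 0; i < entry.getValue(); i++) {
                output.add(rank + "\t" + entry.getKey());
            }
            rank++;
        }
        return output;
    }

    public static void main(String[] args) {
        for (String line : rankNumbers(Arrays.asList(5, 3, 8, 3, 1))) {
            System.out.println(line);
        }
    }
}
```

For the input 5, 3, 8, 3, 1 the sketch prints the numbers in the order 1, 3, 3, 5, 8, with both occurrences of 3 carrying rank 2.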

Code

package org.apache.hadoop.examples;

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class sort {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Input and output paths are hard-coded against a local HDFS instance.
        String fileAddress = "hdfs://localhost:9000/user/hadoop/";

        //String[] otherArgs = (new GenericOptionsParser(conf, args)).getRemainingArgs();
        String[] otherArgs = new String[]{fileAddress + "number.txt", fileAddress + "output"};
        if (otherArgs.length < 2) {
            System.err.println("Usage: sort <in> [<in>...] <out>");
            System.exit(2);
        }

        Job job = Job.getInstance(conf, "sort");
        job.setJarByClass(sort.class);
        job.setMapperClass(sort.TokenizerMapper.class);
        // Do not enable the combiner; see the note after the listing.
        //job.setCombinerClass(sort.SortReducer.class);
        job.setReducerClass(sort.SortReducer.class);
        // Map output and final output both use IntWritable keys and values.
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(IntWritable.class);

        // Every argument except the last is an input path; the last is the output path.
        for (int i = 0; i < otherArgs.length - 1; ++i) {
            FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
        }

        FileOutputFormat.setOutputPath(job, new Path(otherArgs[otherArgs.length - 1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

    // Emits (number, 1) for every whitespace-separated token in the input.
    public static class TokenizerMapper extends Mapper<Object, Text, IntWritable, IntWritable> {

        @Override
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());

            while (itr.hasMoreTokens()) {
                context.write(new IntWritable(Integer.parseInt(itr.nextToken())), new IntWritable(1));
            }
        }
    }

    // Receives the keys in ascending order and writes (rank, number).
    public static class SortReducer extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {

        // Rank of the current key; starts at 1 and advances once per distinct key.
        private static IntWritable num = new IntWritable(1);

        @Override
        public void reduce(IntWritable key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            // One output line per occurrence, so duplicates share the same rank.
            for (IntWritable ignored : values) {
                context.write(num, key);
            }
            num = new IntWritable(num.get() + 1);
        }
    }
}

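Note: do not set a combiner. SortReducer writes (rank, number) pairs, so using it as a combiner replaces the map output keys with ranks before the shuffle. The reducer then sorts and ranks those ranks instead of the original numbers, and the output is no longer the sorted input.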
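The effect of the combiner can be reproduced locally. The sketch below is a plain-Java simulation under simplifying assumptions: a single map task, and a combiner that runs exactly once over all map output (real Hadoop may run a combiner zero or more times). The class CombinerPitfallSketch and its methods are hypothetical names, and rankPass only imitates what SortReducer does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation of running the ranking reducer as a combiner.
public class CombinerPitfallSketch {

    // One pass of SortReducer-style logic: group pairs by key in ascending
    // order, then emit (rank, key) once per grouped value.
    static List<int[]> rankPass(List<int[]> pairs) {
        TreeMap<Integer, Integer> grouped = new TreeMap<>();
        for (int[] pair : pairs) {
            grouped.merge(pair[0], 1, Integer::sum);
        }
        List<int[]> out = new ArrayList<>();
        int rank = 1;
        for (Map.Entry<Integer, Integer> entry : grouped.entrySet()) {
            for (int i = 0; i < entry.getValue(); i++) {
                out.add(new int[]{rank, entry.getKey()});
            }
            rank++;
        }
        return out;
    }

    // Runs the simulated job and returns "key<TAB>value" output lines.
    public static List<String> run(List<Integer> numbers, boolean withCombiner) {
        // Map phase: emit (number, 1).
        List<int[]> mapOutput = new ArrayList<>();
        for (int n : numbers) {
            mapOutput.add(new int[]{n, 1});
        }
        // Optional combiner: the same ranking logic applied to the map output.
        List<int[]> reducerInput = withCombiner ? rankPass(mapOutput) : mapOutput;

        List<String> lines = new ArrayList<>();
        for (int[] pair : rankPass(reducerInput)) {
            lines.add(pair[0] + "\t" + pair[1]);
        }
        return lines;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(5, 3, 8, 3, 1);
        System.out.println("without combiner: " + run(numbers, false));
        System.out.println("with combiner:    " + run(numbers, true));
    }
}
```

Without the combiner the second column holds the sorted numbers 1, 3, 3, 5, 8. With it, the second column holds only the ranks produced by the combiner pass (1, 2, 2, 3, 4), and the values 5 and 8 never reach the output.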

Reposted from: https://www.cnblogs.com/zyb993963526/p/10469521.html
