Preface
An index of the articles in this series, together with the baseline environment they assume, is available at the series index.
step1 Access Hadoop from Eclipse
First, locate your Eclipse installation directory, then place the hadoop-eclipse-plugin-<version>.jar plugin into its dropins subdirectory.
The plugin can be obtained from the free resource the author uploaded to CSDN: link
Choose the version that matches your own setup:
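As a concrete sketch of the copy step (the Eclipse path and the plugin version below are examples only; substitute your actual install location and jar name):

```shell
# Example paths only: adjust ECLIPSE_HOME and the jar name/version to your setup.
ECLIPSE_HOME="${ECLIPSE_HOME:-$HOME/eclipse}"
JAR="hadoop-eclipse-plugin-2.7.1.jar"   # example version

# dropins is scanned by Eclipse at startup; create it if it does not exist yet.
mkdir -p "$ECLIPSE_HOME/dropins"
if [ -f "$JAR" ]; then
    cp "$JAR" "$ECLIPSE_HOME/dropins/"
fi
ls "$ECLIPSE_HOME/dropins"
```

Eclipse picks up jars in dropins on the next (re)start, which is why the next step restarts it.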
step2 Restart and configure Eclipse
In Eclipse, select Window→Show View→Other…→MapReduce Tools→Map/Reduce Locations in turn.
An elephant icon then appears in the view, as shown below:
Click the elephant icon to configure a new DFS Location.
The configuration parameters are:
After filling them in, click Finish.
The following then appears:
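For reference, the fields in the Location dialog above typically look like the following. These values are assumptions for a pseudo-distributed setup whose fs.defaultFS is hdfs://localhost:9000; match them against your own core-site.xml and mapred-site.xml rather than copying them verbatim:

```
Location name:           my-hadoop      (any label you like)
Map/Reduce(V2) Master:   Host: localhost   Port: 9001   (example value)
DFS Master:              Host: localhost   Port: 9000   (must match fs.defaultFS)
User name:               hadoop         (the user that runs the cluster)
```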
step3 Create a Map/Reduce project
In Eclipse, select File→New→Other…→Map/Reduce Project→Next in turn.
Enter the project name TestWordCount, browse to and select the Hadoop path /usr/local/hadoop, and click the Finish button; the new project TestWordCount is created.
Create a package and the corresponding classes under it; the project structure looks like this:
step4 Paste in the code
package wordCount;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static class WordCountMapper extends Mapper<Object, Text, Text, IntWritable> {
        /**
         * A static final IntWritable holding the constant 1, used as the value
         * of every emitted <word, 1> pair; the Text field holds the current word.
         */
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                // Emit the <word, 1> key-value pair.
                context.write(word, one);
            }
        }
    }

    public static class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Sum the 1s emitted for this word.
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        if (args.length != 2) {
            System.err.println("Usage: WordCount <in> <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(WordCountMapper.class);
        job.setCombinerClass(WordCountReducer.class);
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
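To see what the mapper and reducer above compute without starting a cluster, here is a plain-Java sketch of the same logic. The class name LocalWordCount is invented for this illustration; the real job distributes the tokenize/emit and sum steps across the cluster, with the shuffle grouping equal keys between them:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringTokenizer;

public class LocalWordCount {
    public static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        // "map" phase: tokenize and emit (word, 1); the merge() call stands in
        // for the shuffle grouping plus the "reduce" phase summing the 1s.
        StringTokenizer itr = new StringTokenizer(text);
        while (itr.hasMoreTokens()) {
            counts.merge(itr.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("hello hadoop hello eclipse"));
        // {hello=2, hadoop=1, eclipse=1}
    }
}
```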
step5 Configure the run parameters
After starting the Hadoop cluster, run the following in a terminal:
hadoop fs -mkdir /tmp
hadoop fs -chmod -R 777 /tmp
Open Run→Run Configurations to set the input and output arguments.
Roughly:
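For reference, the two program arguments are the HDFS input directory and an output directory that must not yet exist. Assuming fs.defaultFS is hdfs://localhost:9000 and the input sits under /tmp/input (both are assumptions; use your own paths), they might look like:

```
hdfs://localhost:9000/tmp/input
hdfs://localhost:9000/tmp/output
```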
Click Run to launch the job.
A new folder then appears in the directory tree on the left:
Opening its part-r-00000 file shows the stored result:
The input was:
The result matches expectations, and the project succeeds.
This wraps up the whole Hadoop series.
This concludes the article on Hadoop Cluster Setup Notes | Cloud Computing [CentOS7] | Pseudo-Distributed Cluster: Running WordCount on the Master.