JRuby on Hadoop¶ ↑

Description¶ ↑

JRuby on Hadoop is a thin wrapper for Hadoop Mapper / Reducer by JRuby. This is a fork of the original project and works with Hadoop 0.20.1, whilst the original works with the deprecated Hadoop API for the latest version of Hadoop.

Install¶ ↑

Required gems are all on GemCutter.

Upgrade your rubygem to 1.3.5
Clone this project, build the gem and install it

Usage¶ ↑

Run Hadoop cluster on your machines and set HADOOP_HOME env variable.
put files into your hdfs. ex) test/inputs/file1
Now you can run ‘joh’ like below:

$ joh examples/wordcount.rb test/inputs/file1 test/outputs

You can get Hadoop job results in your hdfs test/outputs/part-*

Example ¶ ↑

see also examples/wordcount.rb

def setup(job)
  #configure the MapReduce Job
end

def map(key, value, context)
  value.split.each do |word|
    context.collect(word, 1)
  end
end

def reduce(key, values, context)
  sum = 0
  values.each {|v| sum += v }
  context.collect(key, sum)
end

Build¶ ↑

You can build hadoop-ruby.jar by “ant”.

ant

Required to set env HADOOP_HOME for your system. Assumed Hadoop version is 0.20.1.

Authors¶ ↑

Koichi Fujikawa <fujibee@gmail.com> Abhinay Mehta <abhinay.mehta@gmail.com>

Copyright¶ ↑

License: Apache License

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.rdoc

README.rdoc

JRuby on Hadoop¶ ↑

Description¶ ↑

Install¶ ↑

Usage¶ ↑

Example ¶ ↑

Build¶ ↑

Authors¶ ↑

Copyright¶ ↑

Files

README.rdoc

Latest commit

History

README.rdoc

File metadata and controls

JRuby on Hadoop¶ ↑

Description¶ ↑

Install¶ ↑

Usage¶ ↑

Example ¶ ↑

Build¶ ↑

Authors¶ ↑

Copyright¶ ↑