I wasn't able to find this project hosted at the original location anymore, so I published it here. All credit goes to the original authors.
Fork of http://asterixdb.ics.uci.edu/fuzzyjoin/
Efficient Parallel Set-Similarity Joins Using MapReduce.
Rares Vernica, Michael J. Carey, Chen Li.
SIGMOD 2010
Copyright 2010-2011 The Regents of the University of California
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Author: Rares Vernica <rares (at) ics.uci.edu>
Please see fuzzyjoin-hadoop/README.txt or fuzzyjoin-hadoop/README.html for general usage and a Quick Start guide.
Please see http://asterix.ics.uci.edu/fuzzyjoin-mapreduce/FAQ.html for an updated FAQ list.
For the SIGMOD 2010 Repeatability & Workability Evaluation, please see fuzzyjoin-hadoop/src/test/scripts/rwe/README.txt.