A collection of Pig scripts and UDFs for web archive analysis tasks. The scripts rely on Apache Hadoop and Pig Latin and use the archive-metadata-extractor library.