Pig UDFs (User-Defined Functions)

LoadFunc / StoreFunc : Wonderdog — an ElasticSearch UDF

placeholder


[^1] It doesn’t have to be Pig: Hive, Cascading, Crunch, and other high-level frameworks abstract out the boring stuff while still making it easy to write custom components.

[^2] If the novel lasts all week, someone will tell this joke and then we will walk carefully to the bar.

The church, it is close by -- but the way is cold and icy.
The bar, it is far away -- but we shall walk carefully.

[^3] … and when the harsh reality of a production dataset reveals that your data has an unforeseen and crippling "stuck reducer" problem, you’re facing a fundamental re-think of your program’s design rather than a one-line change from JOIN to SKEW JOIN. See the chapter on Advanced Pig.

Algebraic UDFs let Pig go fast

One of the great things

Geographic Merge JOIN

We will probably want a specialized merge join for the geo-gridded data; that material will go here.