Skip to content

Latest commit

 

History

History
22 lines (14 loc) · 752 Bytes

fu06-processing_graphs.asciidoc

File metadata and controls

22 lines (14 loc) · 752 Bytes

Processing Graphs

Assumptions:

  • V fits in ram

  • E does not fit in ram

  • Graph is sparse — E/V^2 << 1 (the typical degree is smear less than V)

Power law Distribution of keys and the Skew Problem

Notation

TODO: move this to early in book TODO: reconcile with pig notation

{a, b} -- record with fields a and b
<P | a, b> -- reduce able record with partition key P
<P | S | a, b> -- reduceable record with partition key P and secondary sort key S (all objects with same P arrive to same reducer, sorted by P then S)
<P1, P2 | S1, S2 | a, b> -- partitioned by P1 and P2, sorted by P1 then by P2 then by S1 then by S2. Objects with the same P1 may go to different reducers.
{a, b, {(c})} -- bag field c