Sorted Batches

Santa delivers presents in order as the holidays arrive, racing the sun from New Zealand, through Asia and Africa and Europe, until the finish in American Samoa.

This is a literal locality: the presents for Auckland must go in a sack together, and Sydney, and Petropavlovsk, and so forth.

Recall that each elephant carries the work orders destined for one workstation. What’s more, on the back of each pygmy elephant is a vertical file like you find at a very organized person’s desk:

[image: paper sorter]

Chimpanzees file each toy request in the order of Santa’s path through the world. This is easy, because the files never grow very large and because chimpanzees are very dextrous. So when a pygmy elephant trundles off, all the puppy requests are together in order from Auckland to Samoa, and the robot requests are together, also in order, and so on:

[image: Secondary sort]

This makes life very efficient for the elves. They just start pulling work orders from their elephants, always choosing the request that’s next in Santa Visit Order:

[image: Secondary sort]

Elves do not have the prodigious memory that elephants do, but they can easily keep track of the next few dozen work orders each elephant holds. That way there is very little time spent seeking out the next work order. Elves assemble toys as fast as their hammers can fly, and the toys come out in the order Santa needs to make little children happy.
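What the elves are doing is a plain merge of several already-sorted streams: peek at the next work order on each elephant, take whichever comes first in Santa Visit Order, repeat. Here is a minimal sketch of that merge in Python; the toy data and the use of `heapq.merge` are illustrative assumptions, not anything the elves (or Hadoop) literally run:

[source,python]
----
import heapq

# Each "elephant" carries one workstation's orders, already filed in
# Santa Visit Order (smaller number = visited earlier).
puppy_orders = [(1, "Auckland: puppy"), (4, "Sydney: puppy"), (9, "Pago Pago: puppy")]
robot_orders = [(2, "Auckland: robot"), (7, "Nairobi: robot")]
doll_orders  = [(3, "Sydney: doll"),    (8, "Lisbon: doll")]

# The elf only ever peeks at the next order on each elephant, yet the
# merged stream comes out in global Santa Visit Order.
for visit_stop, work_order in heapq.merge(puppy_orders, robot_orders, doll_orders):
    print(visit_stop, work_order)
----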

The Map-Reduce Haiku

As you recall, the bargain that Map/Reduce proposes is that you agree to write only programs that fit this haiku:

data flutters by
    elephants make sturdy piles
  insight shuffles forth

More prosaically,

  1. label — turn each input record into any number of labelled records.

  2. group/sort — Hadoop gathers all records sharing a label into a single group, and delivers the groups in sorted order.

  3. reduce — for each group, process its records in order; emit anything you want.

The trick lies in the 'group/sort' step: assigning the same label to two records in the 'label' step guarantees that they will end up together, local to the same reducer, in the reduce step.
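To make the haiku concrete, here is a toy version of all three steps in plain Python. This is only a sketch of the contract, not Hadoop itself; the word-count labelling, the record format, and every function name here are made up for illustration:

[source,python]
----
from itertools import groupby

# 1. label: turn each input record into any number of (label, value) pairs.
def mapper(line):
    for word in line.split():
        yield (word.lower(), 1)

# 2. group/sort: bring records sharing a label together, in sorted order.
def group_sort(labeled_records):
    return groupby(sorted(labeled_records), key=lambda pair: pair[0])

# 3. reduce: for each group, process its records in order; emit anything.
def reducer(label, records):
    yield (label, sum(count for _, count in records))

lines = ["a puppy and a robot", "a doll and a puppy"]
labeled = (pair for line in lines for pair in mapper(line))
for label, records in group_sort(labeled):
    for output in reducer(label, records):
        print(output)   # e.g. ('a', 4), ('and', 2), ('doll', 1), ...
----

In a real Hadoop job the `sorted(...)` call is the part you never write yourself: the framework performs the grouping and sorting between the map and reduce phases, across machines.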

The machines in stage 1 ('label') are promised no locality. Collectively they see each record exactly once, but with no promises as to order and no promises as to which machine sees which record. We’ve 'moved the compute to the data', allowing each process to work quietly on the data in its workspace.

As each pile of output products starts to accumulate, we can begin to group them. Every group is assigned to its own reducer. When a pile reaches a convenient size, it is shipped to the appropriate reducer while the mapper keeps working. Once the map finishes, each reducer organizes the piles it has received so that it can process them in proper order.

Notice that the only time data moves from one machine to another is when those intermediate piles get shipped. Instead of monkeys flinging poo, we now have a dignified elephant parade conducted in concert with the efforts of our diligent workers.

Note

Stream steps become mapper-only jobs, but don’t conflate a reshape step with the reduce phase of a job: a reshape step typically becomes at least one map phase plus one reduce phase.

The Reducer Guarantee

It’s very important that you understand the guarantee Hadoop provides to each reducer, and what it unlocks.

  • Each mapper-output record goes to exactly one reducer, determined solely by its key (a sketch of this routing follows the list).

  • All records with a given key go to the same reducer.

  • If a reducer sees any element of a group, it sees all elements of the group. A reducer may see multiple groups.

  • Records are fed to the reducer in order by key.
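Behind the first two guarantees is a simple rule: which reducer a record goes to is a pure function of its key. Hadoop’s default partitioner does essentially this with the Java key’s hashCode; the sketch below is an illustrative Python stand-in, with a made-up reducer count:

[source,python]
----
import hashlib

NUM_REDUCERS = 4   # illustrative; set by the job configuration in real Hadoop

def choose_reducer(key, num_reducers=NUM_REDUCERS):
    # A stable hash of the key. Every record carrying the same key hashes
    # to the same number, so all of them land on the same reducer.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_reducers

for key in ["puppy", "robot", "doll", "puppy"]:
    print(key, "-> reducer", choose_reducer(key))
----

Because the mapping depends only on the key, the two "puppy" records above are guaranteed to meet at the same reducer, no matter which mappers emitted them.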