Edits for Meghan
rjurney committed Jun 15, 2015
1 parent eaea572 commit 9f26ae8
Showing 3 changed files with 5 additions and 7 deletions.
8 changes: 5 additions & 3 deletions Ch00-preface.asciidoc
@@ -53,21 +53,23 @@ That is because for almost everyone, the cost of the cluster is far less than th

The book does have some content on provisioning and deploying Hadoop, and on a few important settings. But it does not cover advanced algorithms, operations or tuning in any real depth.

+We'll be presenting the theory behind Hadoop using an allegory, with our friends J.T. (as in Job Tracker) the chimpanzee and Nanette (as in Name Node) the elephant. Afterwards we'll cover the practice of operating Hadoop to perform simple operations and then move on to analytics patterns.
+
==== Theory: Chimpanzee and Elephant

-Starting with Chapter 2, you'll meet the zealous members of the Chimpanzee and Elephant Company. Elephants have prodigious memories and move large heavy volumes with ease. They'll give you a physical analogue for using relationships to assemble data into context, and help you understand what's easy and what's hard in moving around massive amounts of data. Chimpanzees are clever but can only think about one thing at a time. They'll show you how to write simple transformations with a single concern and how to analyze petabytes of data with no more than megabytes of working space.
+Starting in Chapter 1, you'll meet these zealous members of the Chimpanzee and Elephant Company. Elephants have prodigious memories and move large heavy volumes with ease. They'll give you a physical analogue for using relationships to assemble data into context, and help you understand what's easy and what's hard in moving around massive amounts of data. Chimpanzees are clever but can only think about one thing at a time. They'll show you how to write simple transformations with a single concern and how to analyze petabytes of data with no more than megabytes of working space.

Together, they'll equip you with a physical metaphor for how to work with data at scale.

-==== Practice: Hadoop ====
+==== Practice: Hadoop

In Doug Cutting's words, Hadoop is the "kernel of the big-data operating system". It is the dominant batch-processing solution, has both commercial enterprise support and a huge open source community, runs on every platform and cloud, and there are no signs any of that will change in the near term.

The code in this book will run unmodified on your laptop computer or on an industrial-strength Hadoop cluster. We'll provide you with a virtual Hadoop cluster using docker that will run on your own laptop.

==== Example Code ====

-he [source code for the book](https://github.com/bd4c/big_data_for_chimps-code) is available at `https://github.com/bd4c/big_data_for_chimps-code`. You can check it out with git via:
+The https://github.com/bd4c/big_data_for_chimps-code[source code for the book] is available at `https://github.com/bd4c/big_data_for_chimps-code`. You can check it out with git via:

----
git clone https://github.com/bd4c/big_data_for_chimps-code
2 changes: 0 additions & 2 deletions Ch01-hadoop_basics.asciidoc
@@ -5,8 +5,6 @@
[[hadoop_basics]]
== Hadoop Basics

-=== Introduction
-
++++
<remark>Please make sure the Chimpanzee and Elephant Start a Business big does NOT appear before this Introduction.</remark>
++++
2 changes: 0 additions & 2 deletions Ch02-map_reduce.asciidoc
@@ -6,8 +6,6 @@

== Map/Reduce

-=== Introduction
-
In this chapter we're going to build upon what we learned last chapter about HDFS and the map-only portion of map/reduce and introduce a full map/reduce job and the mechanics of map/reduce. This time we'll include both the shuffle/sort phase, and the reduce phase. Once again, we begin with a physical metaphor in the form of a story. After that we'll walk you through building our first full-blown map/reduce job in Python. At the end of this chapter, you should have an intuitive understanding of how map/reduce works, including its map, shuffle/sort and reduce phases.

First, we begin with a metaphoric story... about how Chimpanzee and Elephant saved Christmas.
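
The context paragraph from Ch02-map_reduce.asciidoc above names three phases: map, shuffle/sort, and reduce. As a rough illustration only, here is a single-process word count in Python that walks through those phases in memory. This is not code from the book or from this commit, and the function names `map_phase`, `shuffle_sort`, `reduce_phase`, and `word_count` are made up for the sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_sort(pairs: Iterable[Tuple[str, int]]) -> List[Tuple[str, List[int]]]:
    """Shuffle/sort: collect all values for each key, then order groups by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(grouped: Iterable[Tuple[str, List[int]]]) -> Iterator[Tuple[str, int]]:
    """Reduce: collapse each key's list of values into a single total."""
    for key, values in grouped:
        yield key, sum(values)


def word_count(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Chain the three phases together (hypothetical helper for this sketch)."""
    return list(reduce_phase(shuffle_sort(map_phase(lines))))


if __name__ == "__main__":
    sample = ["the chimp types", "the elephant files"]
    for word, count in word_count(sample):
        print(f"{word}\t{count}")
```

On a real Hadoop cluster the shuffle/sort step is carried out by the framework across machines rather than by a local dictionary; the sketch only mirrors the order in which data flows through the phases.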