Experiences with MapReduce, an Abstraction for Large-Scale Computation
Jeff Dean
Google, Inc.

Outline
• Overview of our computing environment
• MapReduce
  – overview, examples
  – implementation details
  – usage stats
• Implications for parallel program development

Problem: lots of data
• Example: 20+ billion web pages x 20KB = 400+ terabytes
• One computer can read 30-35 MB/sec from disk
  – ~four

