MapReduce Fundamentals and Examples

MapReduce is a programming model used for parallel and distributed processing of large datasets. It consists of two distinct tasks - the map task and the reduce task. The map task processes the data and generates intermediate key-value pairs. The reduce task aggregates the intermediate key-value pairs into a smaller set of key-value pairs that represent the final output. MapReduce was created by Google to solve the issue of bottlenecking that occurs when trying to process large, complex datasets on centralized systems. It divides tasks into smaller parts that are distributed across many computers to be processed in parallel.

Uploaded by

Sanidhya Singh Rajawat

Fundamentals of MapReduce with Example

MapReduce is one of the core building blocks of processing in the Hadoop
framework, and it was the starting point of the Hadoop processing model.
MapReduce is a programming model that allows us to perform parallel and
distributed processing on huge data sets.

MapReduce consists of two distinct tasks – Map and Reduce. As the name
MapReduce suggests, the reducer phase takes place after the mapper phase has
completed. The first is the map job, where a block of data is read and processed
to produce key-value pairs as intermediate output. The output of a Mapper, or map
job, is the input to the Reducer. The reducer then aggregates those intermediate
key-value pairs into a smaller set of key-value pairs, which is the final output.
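The flow described above (map, then group the intermediate pairs by key, then reduce) can be sketched in a few lines of Python. This is only an illustration running in a single process, not how Hadoop implements it. The names run_map_reduce, parse_reading and hottest, as well as the sample city readings, are made up for this sketch.

```python
from collections import defaultdict


def run_map_reduce(records, mapper, reducer):
    """Apply mapper to every record, group values by key, then reduce each group."""
    grouped = defaultdict(list)
    for record in records:
        # Map phase: each record may emit any number of (key, value) pairs.
        for key, value in mapper(record):
            grouped[key].append(value)
    # Reduce phase: runs only after every record has been mapped.
    return {key: reducer(key, values) for key, values in grouped.items()}


def parse_reading(line):
    """Hypothetical mapper: turn 'city,temperature' into one (city, temperature) pair."""
    city, temperature = line.split(",")
    yield city.strip(), int(temperature)


def hottest(city, temperatures):
    """Hypothetical reducer: collapse all values for a key into a single value."""
    return max(temperatures)


# Made-up block of input data, used only for illustration.
block = ["Pune,31", "Delhi,38", "Pune,34", "Delhi,35"]
result = run_map_reduce(block, parse_reading, hottest)
print(result)
```

Four input records become four intermediate pairs, and the reducer shrinks them to one pair per city, which matches the "smaller set of key-value pairs" in the text.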

But why did MapReduce come into the picture? The answer is pretty simple. Traditional
enterprise systems normally have a centralized server to store and process data.
This approach is not suitable for data that has one or more of the
following aspects: velocity, variety, volume and complexity. The central server
becomes a bottleneck when it has to process such data.

Google solved this bottleneck issue using an algorithm called MapReduce.

MapReduce divides a task into small parts and assigns them to many computers.
Later, the results are collected at one place and integrated to form the result
dataset.
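As a rough sketch of this divide-and-collect idea, the example below splits a list into parts, hands each part to a worker, and integrates the partial results in one place. Threads on one machine stand in for the many computers here, which is an assumption made purely so the example is runnable. The function names and the choice of summation as the task are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(data, num_parts):
    """Divide data into at most num_parts contiguous chunks of similar size."""
    size = max(1, -(-len(data) // num_parts))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def process_part(part):
    """Work done independently on one part (here: a partial sum)."""
    return sum(part)


def distributed_sum(data, num_workers=4):
    """Assign each part to a worker, then combine the partial results."""
    parts = split_into_parts(data, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_results = list(pool.map(process_part, parts))
    # The results are collected at one place and integrated.
    return sum(partial_results)


total = distributed_sum(list(range(1, 101)))
print(total)  # 5050
```

A real cluster also has to move data between machines and recover from failed workers, which this sketch does not attempt to show.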

In a job that processes tweets, for example, the MapReduce algorithm performs
the following actions:

• Tokenize − Tokenizes the tweets into maps of tokens and writes them as
key-value pairs.
• Filter − Filters unwanted words from the maps of tokens and writes the
filtered maps as key-value pairs.
• Count − Generates a token counter per word.
• Aggregate Counters − Prepares an aggregate of similar counter values into
small manageable units.
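The four actions can be sketched as four small Python functions chained together. This is a minimal sketch under stated assumptions: the stop-word list, the tokenizing pattern, and the two sample tweets are invented for illustration, and "aggregate" is interpreted here as merging the per-tweet counters into one.

```python
import re
from collections import Counter

# Hypothetical list of unwanted words; a real job would use its own list.
STOP_WORDS = {"a", "the", "is", "to", "and", "of"}


def tokenize(tweet):
    """Tokenize: split a tweet into lower-case tokens, emitted as (token, 1) pairs."""
    return [(token, 1) for token in re.findall(r"[a-z0-9#@']+", tweet.lower())]


def filter_tokens(pairs):
    """Filter: drop the pairs whose token is an unwanted word."""
    return [(token, n) for token, n in pairs if token not in STOP_WORDS]


def count(pairs):
    """Count: build one counter per word from the filtered pairs."""
    counter = Counter()
    for token, n in pairs:
        counter[token] += n
    return counter


def aggregate(counters):
    """Aggregate Counters: merge the counters for the same word into one total."""
    total = Counter()
    for counter in counters:
        total.update(counter)
    return total


# Made-up sample tweets.
tweets = ["Hadoop is the future", "The future of Hadoop and MapReduce"]
per_tweet = [count(filter_tokens(tokenize(tweet))) for tweet in tweets]
totals = aggregate(per_tweet)
print(dict(totals))
```

Tokenize, Filter and Count run independently per tweet, so they fit the map side, while Aggregate Counters combines results across tweets in the way a reducer would.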

MapReduce consists of 2 steps:
• Map Function – It takes a set of data and converts it into another set of data,
where individual elements are broken down into tuples (key-value pairs).
Example -
Input - Bus, Car, bus, car, train, car, bus, car, train, bus, TRAIN, BUS, buS,
caR, CAR, car, BUS, TRAIN.
Output - A set of (Key, Value) tuples: (Bus,1), (Car,1), (bus,1),
(car,1), (train,1), (car,1), (bus,1), (car,1), (train,1), (bus,1), (TRAIN,1),
(BUS,1), (buS,1), (caR,1), (CAR,1), (car,1), (BUS,1), (TRAIN,1).
• Reduce Function – It takes the output from Map as its input and combines
those data tuples into a smaller set of tuples.
Example -
Input – The set of tuples from the previous step.
Output – A smaller set of tuples, with keys compared case-insensitively:
(BUS,7), (CAR,7), (TRAIN,4)
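The vehicle example can be checked with a short Python sketch. The function names map_fn and reduce_fn are hypothetical, and converting the keys to upper case in the reduce step is an assumption made so that Bus, bus and BUS count as the same key.

```python
from itertools import groupby

RAW_INPUT = ("Bus, Car, bus, car, train, car, bus, car, train, bus, TRAIN,BUS, buS, "
             "caR, CAR, car, BUS, TRAIN")


def map_fn(words):
    """Map: emit one (word, 1) tuple per input element, keeping the original case."""
    return [(word, 1) for word in words]


def reduce_fn(pairs):
    """Reduce: normalize keys to upper case, then sum the values for each key."""
    # Sorting places equal keys next to each other, which groupby requires.
    normalized = sorted((key.upper(), value) for key, value in pairs)
    return [
        (key, sum(value for _, value in group))
        for key, group in groupby(normalized, key=lambda pair: pair[0])
    ]


words = [word.strip() for word in RAW_INPUT.split(",")]
intermediate = map_fn(words)
counts = reduce_fn(intermediate)
print(counts)  # [('BUS', 7), ('CAR', 7), ('TRAIN', 4)]
```

The 18 intermediate tuples shrink to three, and the three counts add up to 18, the number of input words.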
