R20 MCA
SIDDHARTH INSTITUTE OF ENGINEERING & TECHNOLOGY :: PUTTUR
(AUTONOMOUS)
MCA II Year II Semester L T P C
3 1 - 4
(20MC9141) BIG DATA ANALYTICS
(PROGRAMME ELECTIVE IV)
COURSE OBJECTIVES
1. To explore the fundamental concepts of big data analytics.
2. To learn to analyze big data using intelligent techniques.
3. To understand applications that use Map Reduce concepts.
COURSE OUTCOMES
The students will be able to:
1. Work with a big data platform and analyze big data analytic techniques for useful
business applications.
2. Design efficient algorithms for mining data from large volumes.
3. Analyze the Hadoop technologies associated with big data analytics.
4. Analyze the Map Reduce technologies associated with big data analytics.
5. Explore Big Data applications using Pig and Hive.
6. Understand the fundamentals of various big data analysis techniques.
UNIT-I
Introduction to Big Data Platform: Challenges of Conventional Systems - Intelligent Data
Analysis - Nature of Data - Analytic Processes and Tools - Analysis vs Reporting - Modern
Data Analytic Tools.
Statistical Concepts: Sampling Distributions - Re-Sampling - Statistical Inference -
Prediction Error.
UNIT-II
Introduction to Streams Concepts: Stream Data Model and Architecture - Stream
Computing - Sampling Data in a Stream - Filtering Streams - Counting Distinct Elements in
a Stream - Estimating Moments - Counting Ones in a Window - Decaying Window.
Real time Analytics Platform: RTAP Applications - Case Studies - Real Time Sentiment
Analysis, Stock Market Predictions.
UNIT-III
History of Hadoop: The Hadoop Distributed File System - Components of Hadoop -
Analyzing the Data with Hadoop - Scaling Out - Hadoop Streaming - Design of HDFS - Java
interfaces to HDFS Basics.
Developing a Map Reduce Application: How Map Reduce Works - Anatomy of a Map
Reduce Job Run - Failures - Job Scheduling - Shuffle and Sort - Task Execution - Map
Reduce Types and Formats - Map Reduce Features.
UNIT-IV
Setting up a Hadoop Cluster: Cluster Specification - Cluster Setup and Installation -
Hadoop Configuration - Security in Hadoop.
Administering Hadoop: Administering HDFS - Monitoring - Maintenance - Hadoop
benchmarks - Hadoop in the cloud.
UNIT-V
Applications on Big Data Using Pig and Hive: Data processing operators in Pig - Hive
services - HiveQL - Querying Data in Hive - Fundamentals of HBase and ZooKeeper - IBM
InfoSphere BigInsights and Streams.
Visualizations: Visual data analysis techniques, interaction techniques; Systems and
applications.
TEXT BOOKS
1. Intelligent Data Analysis, Michael Berthold, David J. Hand, Springer, 2007.
2. Hadoop: The Definitive Guide, Tom White,
REFERENCES
1. Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming
Data, Chris Eaton, Dirk DeRoos, Tom Deutsch, George Lapis, Paul Zikopoulos,
McGraw-Hill Publishing, 2012.
2. Mining of Massive Datasets, Anand Rajaraman and Jeffrey David Ullman,
Cambridge University Press, 2012.
3. Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams
with Advanced Analytics, Bill Franks, John Wiley & Sons, 2012.
4. Making Sense of Data, Glenn J. Myatt, John Wiley & Sons, 2007.