Big Data Analytics: Challenges & Insights

The document discusses Big Data, its challenges, and the importance of Big Data Analytics. It covers various concepts including the CAP theorem, NewSQL, Hadoop ecosystem components, and NoSQL databases. Additionally, it highlights the characteristics of data and the differences between traditional business intelligence and Big Data analytics.

Uploaded by

shahinmulla851

BIG DATA ANALYTICS (BCS714D)

Module 1
All answers are from the prescribed VTU textbook for the above course.

1 What is Big Data? Explain the challenges with Big Data.

Big Data:
Big data is high-volume, high-velocity, and high-variety information assets that demand cost-effective,
innovative forms of information processing for enhanced insight and decision making.

Challenges Of Big Data:

1. Exponential Data Growth:
Data is growing very rapidly, making it difficult to decide which data is useful, how much to analyze,
and how to separate meaningful insights from noise.
2. Cloud & Infrastructure Decisions:
While cloud computing offers scalability and cost efficiency, deciding whether to host big data
solutions inside or outside the organization remains a challenge.
3. Data Retention Period:
Determining how long to store data is difficult, as some data is valuable long-term while other data
becomes irrelevant very quickly.
4. Lack of Skilled Professionals:
There is a shortage of highly skilled data science professionals required to design and manage big
data solutions.
5. Data Management Complexity:
Capturing, storing, processing, securing, and analyzing large, fast-moving, and unstructured data
exceeds the capabilities of traditional databases.
6. Data Visualization Challenges:
Although data visualization is gaining importance, there is a lack of experts who can effectively
present complex big data insights.

2 What is big data analytics? Explain classification of analytics.

Big data analytics is the process of examining large data sets containing a variety of data types to discover
knowledge in databases: identifying interesting patterns and establishing relationships that help solve
problems and reveal market trends, customer preferences, and other useful information.

Classification Of Analytics:

3 Explain the CAP Theorem.

The CAP theorem is also called Brewer's Theorem. It states that in a distributed computing environment
(a collection of interconnected nodes that share data), it is impossible to provide all three of the
following guarantees at the same time. At best you can have two of the three; one must be sacrificed.
• Consistency
• Availability
• Partition tolerance

1. Consistency (C):
Consistency means that every read operation returns the most recent write. In other words, all
nodes see the same data at the same time.
2. Availability (A):
Availability means that every request (read or write) receives a response within a reasonable amount
of time, even if some nodes fail.
3. Partition Tolerance (P):
Partition tolerance means that the system continues to operate even when network failures or
communication breakdowns occur between nodes.
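
The trade-off described above can be sketched with a toy two-node store. This is a simplified illustration only, not how any real database is implemented: the class names (TwoNodeStore, Node, Unavailable) and the "CP"/"AP" mode flag are hypothetical, and the partition is simulated by a boolean.

```python
from dataclasses import dataclass, field


class Unavailable(Exception):
    """Raised when a CP-mode store refuses a request during a partition."""


@dataclass
class Node:
    name: str
    data: dict = field(default_factory=dict)


class TwoNodeStore:
    """Toy replicated store (hypothetical) that picks C or A when partitioned."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Node("a")
        self.b = Node("b")
        self.partitioned = False  # True simulates a network partition

    def write(self, node, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Keep consistency: refuse the write, giving up availability.
                raise Unavailable("cannot replicate during a partition")
            # Keep availability: accept locally, so replicas may diverge.
            node.data[key] = value
            return
        # No partition: replicate to both nodes.
        self.a.data[key] = value
        self.b.data[key] = value

    def read(self, node, key):
        if self.partitioned and self.mode == "CP":
            # Simplification: a CP store here refuses reads as well.
            raise Unavailable("cannot confirm latest value during a partition")
        return node.data.get(key)
```

In "AP" mode, a write sent to node a during a partition succeeds, but a read from node b still returns the older value, which is the loss of consistency. In "CP" mode the same write raises Unavailable, which is the loss of availability.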

4 What is NewSQL? Explain the Characteristics of NewSQL

NewSQL is a class of modern RDBMS that offers the same scalable performance as NoSQL systems for Online
Transaction Processing (OLTP) while still maintaining the ACID guarantees of a traditional database.
It supports the relational data model and uses SQL as its primary interface.
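
To make the ACID guarantee mentioned above concrete, here is a small sketch of atomicity using Python's built-in sqlite3 module. SQLite is a single-node embedded database, not a NewSQL system; it is used here only as an assumption-free way to show what "all or nothing" means. The accounts table and the helper names (make_db, transfer, balances) are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two sample accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst atomically; return False if rolled back."""
    try:
        # 'with conn' commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

A failed transfer leaves both balances exactly as they were, because the debit that had already been applied is rolled back together with the rest of the transaction.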

5 Explain the Hadoop Ecosystem Components for Data Processing and Data Analysis

Hadoop Ecosystem Components for Data Processing

1. MapReduce
MapReduce is a distributed programming model used to process large datasets in parallel.
• Data is read from HDFS.
• Map phase converts input data into key–value pairs.
• Reduce phase aggregates and processes this data.
• Final output is stored back in HDFS.
It provides fault-tolerant and scalable batch processing.
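
The map and reduce phases above can be sketched as a word count in plain Python. This is an in-memory illustration of the programming model only: it does not use Hadoop or HDFS, it runs on a single machine, and the function names (map_phase, shuffle, reduce_phase, word_count) are chosen for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Map: turn one input line into (word, 1) key-value pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: aggregate the values collected for one key."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce over a list of input lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

In a real cluster, each map call would run on the node holding its block of input, and the shuffle step would move data across the network so that all values for one key reach the same reducer.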

2. Spark
Spark is a fast, in-memory data processing framework and an alternative to MapReduce.
• Processes data in memory, which can make it 10–100 times faster than MapReduce for suitable workloads.
• Reads data from HDFS but does not use MapReduce.
• Can run on YARN or in standalone mode.
• Supports Scala, Python, Java, and R.
Spark Libraries:
• Spark SQL – SQL-based data querying
• Spark Streaming – real-time data processing
• MLlib – machine learning support
• GraphX – graph processing

Hadoop Ecosystem Components for Data Analysis

1. Pig
Pig is a high-level scripting platform for analyzing large datasets.
• Uses Pig Latin, a high-level data-flow scripting language.
• Pig scripts are converted into MapReduce jobs.
• Used mainly for ETL, data transformation, filtering, and analysis.
• Preferred by non-SQL developers.
2. Hive
Hive is a data warehousing tool built on Hadoop.
• Uses HiveQL (HQL), a SQL-like query language.
• Converts queries into MapReduce jobs.
• Used for querying, summarization, and analysis of large datasets.
• Suitable for SQL-oriented users.

Extra Important Questions

1 Characteristics Of Data.

1. Composition
The composition of data deals with the structure of data, such as the sources of data, its granularity, types,
and nature—whether the data is static or real-time streaming.

2. Condition
The condition of data refers to the state and quality of data, that is, whether the data can be used directly
for analysis or requires cleansing, enhancement, or enrichment before use.

3. Context
The context of data explains the background of data, including where it was generated, why it was
generated, how sensitive it is, and the events or situations associated with it.

2 Why Big Data?

3 TRADITIONAL BUSINESS INTELLIGENCE (BI) VERSUS BIG DATA

4 NoSQL

NoSQL (Not Only SQL) is a type of database that stores and manages data without using traditional
relational tables.
It is designed to handle large amounts of data, provide high performance, and support flexible data
structures, especially in distributed systems.

Some features of NoSQL databases are as follows:

1. They are open source.
2. They are non-relational.
3. They are distributed.
4. They are schema-less.
5. They are cluster friendly.
6. They are born out of 21st-century web applications.

Types of NoSQL Databases:

1. Key-Value Store – Stores data as simple key and value pairs.
2. Document Store – Stores data in document format like JSON or XML.
3. Column-Based Store – Stores data in columns instead of rows.
4. Graph Database – Stores data as nodes and relationships.
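
The four data models above can be pictured with ordinary Python structures. This is only a sketch of how the same kind of information is shaped in each model, not the API of any particular NoSQL product; all sample records and helper names are made up for illustration.

```python
# 1. Key-value store: an opaque value looked up by a single key.
key_value = {"user:42": "Asha"}

# 2. Document store: a self-describing, nested record (JSON-like).
document = {"_id": 42, "name": "Asha", "skills": ["Hadoop", "Spark"]}

# 3. Column-based store: values grouped by column, then by row id.
column_store = {
    "name": {42: "Asha", 43: "Ravi"},
    "city": {42: "Pune", 43: "Mysuru"},
}

# 4. Graph database: nodes plus labelled relationships between them.
graph = {
    "nodes": {42: {"name": "Asha"}, 43: {"name": "Ravi"}},
    "edges": [(42, "FOLLOWS", 43)],
}


def column_values(store, column):
    """Read one whole column without touching the other columns."""
    return list(store[column].values())


def neighbours(g, node_id, relation):
    """Follow edges of one relationship type starting from a node."""
    return [dst for src, rel, dst in g["edges"] if src == node_id and rel == relation]
```

Reading every city requires only the "city" column in the column-based layout, while finding who a user follows is a direct edge lookup in the graph layout; these access patterns are the main reason for choosing one type over another.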

5 Typical Hadoop Environment

6 WHY IS BIG DATA ANALYTICS IMPORTANT?

1. Reactive – Business Intelligence (BI):
Uses past and historical data to generate reports, dashboards, alerts, and notifications for better
decision-making.
2. Reactive – Big Data Analytics:
Analyzes very large datasets but still works on static, historical data.
3. Proactive – Analytics:
Uses techniques like data mining and predictive modeling to support future decisions, but has limits
in storage and processing.
4. Proactive – Big Data Analytics:
Analyzes massive data volumes to quickly extract useful insights and solve complex problems.
