
Big Data

Big Data refers to large and complex data sets that require advanced tools and techniques for processing and analysis. It encompasses high-volume, high-velocity, and high-variety information that can provide valuable insights for decision-making across various sectors. The increasing generation of data from diverse sources necessitates innovative approaches to manage and extract meaningful information effectively.

Uploaded by

shashwatmaths
Copyright
© All Rights Reserved

Big Data

Theme

Large-Scale Data Management


Big Data Analytics
Data Science and Analytics
• How to manage very large amounts of data and extract value and
knowledge from them

Introduction to Big Data

What is Big Data?


What makes data “Big” Data?

What is Big Data?
• Big data is a collection of data sets so large and complex that it
becomes difficult to process using on-hand database
management tools.
• The challenges include capture, curation, storage, search,
sharing, analysis, and visualization.
• The trend to larger data sets is due to the additional information
derivable from analysis of a single large set of related data, as
compared to separate smaller sets with the same total amount of
data, allowing correlations to be found to "spot business trends,
determine quality of research, prevent diseases, link legal
citations, combat crime, and determine real-time roadway traffic
conditions." (Wikipedia)

Big Data Definition

• No single standard definition…

“Big Data” is data whose scale, diversity, and
complexity require new architecture, techniques,
algorithms, and analytics to manage it and extract
value and hidden knowledge from it…

Big Data: A definition

• Put another way, big data is the
realization of greater business
intelligence by storing, processing, and
analyzing data that was previously
ignored due to the limitations of
traditional data management
technologies.

Definition and Characteristics
• “Big data is high-volume, high-velocity and high-variety
information assets that demand cost-effective, innovative
forms of information processing for enhanced insight and
decision making” (Gartner)

• “While enterprises struggle to consolidate systems and
collapse redundant databases to enable greater
operational, analytical & collaborative consistencies,
changing economic conditions have made this job more
difficult. E-commerce, in particular, has exploded data management
challenges along three dimensions: volumes, velocity & variety.
IT organizations must compile a variety of approaches to
have at their disposal for dealing with each.” (Doug Laney)

What Made Big Data Necessary?
• Increased analytics need

• Increased computation need

• Increased data volumes

• Lowered barrier to entry and success

• Innovative techniques

• Cost effective

Lots of data

• 2.5 quintillion bytes of data are generated every day!


• A quintillion is 10^18

• Data come from many quarters.


• Social media sites
• Sensors
• Digital photos
• Business transactions
• Location-based data
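
To put the 2.5 quintillion bytes per day figure into more familiar units, the conversion can be sketched as below. This is only an illustration: the decimal (SI) unit table and the names BYTES_PER_DAY, SI_UNITS and convert are assumptions made for the example, and the daily figure itself is the slide's claim, not a measured value.

```python
# The daily volume quoted on the slide: 2.5 quintillion bytes (2.5 x 10^18).
BYTES_PER_DAY = 2.5e18

# Decimal (SI) byte units. Assumption: the slide means powers of ten,
# not binary units such as exbibytes.
SI_UNITS = {
    "terabytes": 10**12,
    "petabytes": 10**15,
    "exabytes": 10**18,
    "zettabytes": 10**21,
}


def convert(n_bytes):
    """Express a byte count in each of the SI units above."""
    return {name: n_bytes / size for name, size in SI_UNITS.items()}


daily = convert(BYTES_PER_DAY)
# Naive extrapolation to a year, assuming the daily rate stays constant.
yearly_zb = BYTES_PER_DAY * 365 / SI_UNITS["zettabytes"]

print(f"Per day:  {daily['exabytes']:.1f} exabytes")
print(f"Per year: {yearly_zb:.2f} zettabytes")
```

At a constant rate this comes to 2.5 exabytes per day, or a little under one zettabyte per year.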

Characteristics of Big Data:
1-Scale (Volume)
• Data Volume
• 44x increase from 2009 to 2020
• From 0.8 zettabytes (ZB) to 35 ZB

• Data volume is increasing exponentially

Exponential increase in
collected/generated data
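
As a rough check on what "exponential" means here, the growth from 0.8 ZB to 35 ZB can be turned into an implied yearly rate. The sketch below assumes steady compound growth over the whole period, which the slide does not state; the variable names are chosen for the example.

```python
import math

# Figures from the slide: 0.8 ZB in 2009 growing to 35 ZB in 2020.
start_zb, end_zb = 0.8, 35.0
years = 2020 - 2009

# Overall growth factor (the slide rounds this to 44x).
growth_factor = end_zb / start_zb

# Compound annual growth rate, assuming the same rate every year.
annual_rate = growth_factor ** (1 / years) - 1

# Time needed for the volume to double at that rate.
doubling_years = math.log(2) / math.log(1 + annual_rate)

print(f"Growth factor: {growth_factor:.2f}x over {years} years")
print(f"Implied annual growth: {annual_rate:.0%}")
print(f"Doubling time: {doubling_years:.1f} years")
```

Under that assumption the volume grows by roughly 41% per year, which means it doubles about every two years.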

Characteristics of Big Data:
2-Complexity (Variety)
• Various formats, types, and structures
• Text, numerical, images, audio, video,
sequences, time series, social media
data, multi-dim arrays, etc…
• Static data vs. streaming data
• A single application can be
generating/collecting many types of
data

Characteristics of Big Data:
3-Speed (Velocity)
• Data is being generated fast and needs to be processed fast

• Online Data Analytics

• Late decisions ➔ missing opportunities

• Examples
• E-Promotions: Based on your current location, your purchase history,
what you like ➔ send promotions right now for the store next to you

• Healthcare monitoring: sensors monitoring your activities and body ➔
any abnormal measurements require immediate reaction
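
The healthcare monitoring example can be sketched as a small streaming check that flags a reading as soon as it arrives, without waiting for a batch job. This is a minimal illustration, not a clinical method: the rolling-window rule, the function name detect_abnormal, the parameters window and threshold, and the sample heart-rate values are all assumptions made for the example.

```python
from collections import deque
from statistics import mean, stdev


def detect_abnormal(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent rolling mean.

    Each reading is checked the moment it arrives, against the last
    `window` normal readings.
    """
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu, sigma = mean(recent), stdev(recent)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
                continue  # keep the outlier out of the baseline window
        recent.append(value)


# Hypothetical heart-rate stream (beats per minute) with one spike.
heart_rate = [72, 74, 71, 73, 75, 72, 140, 74, 73]
alerts = list(detect_abnormal(heart_rate))
print(alerts)
```

Because the function is a generator, an alert is produced as soon as the abnormal reading is seen, which matches the "immediate reaction" requirement in the velocity examples.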

Big Data: 3V’s

Some Make it 4V’s

The four dimensions of use

• Aspects of the way in which users want to interact with
their data…
• Totality: Users have an increased desire to process and
analyze all available data
• Exploration: Users apply analytic approaches where the
schema is defined in response to the nature of the query
• Frequency: Users have a desire to increase the rate of
analysis in order to generate more accurate and timely
business intelligence
• Dependency: Users need to balance investment in existing
technologies and skills with the adoption of new techniques
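
The "Exploration" dimension, where the schema is defined in response to the query, is often described as schema-on-read. A minimal sketch of the idea follows; the event records, the field names, and the helper read_with_schema are hypothetical and only serve to illustrate the approach.

```python
import json

# Raw events are stored as-is, with no fixed schema (hypothetical data).
RAW_EVENTS = [
    '{"user": "a", "type": "purchase", "amount": 30.0, "store": "north"}',
    '{"user": "b", "type": "click", "page": "/home"}',
    '{"user": "a", "type": "purchase", "amount": 12.5}',
]


def read_with_schema(raw_lines, schema):
    """Parse raw JSON lines and shape them to the fields a query needs.

    `schema` maps a field name to (cast function, default value); it is
    chosen at read time, per query, not when the data is stored.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {
            name: cast(record.get(name, default))
            for name, (cast, default) in schema.items()
        }


# Schema chosen for one particular question: spend per user.
purchase_schema = {"user": (str, ""), "type": (str, ""), "amount": (float, 0.0)}
rows = list(read_with_schema(RAW_EVENTS, purchase_schema))

spend = {}
for row in rows:
    if row["type"] == "purchase":
        spend[row["user"]] = spend.get(row["user"], 0.0) + row["amount"]

print(spend)
```

A different question, such as page views per user, would apply a different schema to the same stored events without reloading or restructuring them.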

So, in a nutshell

• Big Data is about better analytics!

Why Big Data

Big Data Conundrum

• Problems:
• Although there is a massive spike in
available data, the percentage of the
data that an enterprise can understand
is on the decline
• The data that the enterprise is trying
to understand is saturated with both
useful signals and lots of noise.

The Big Data platform Manifesto
imperatives and underlying technologies

IBM’s Big Data Platform

What to do with the data

Harnessing Big Data

• OLTP: Online Transaction Processing (DBMSs)

• OLAP: Online Analytical Processing (Data Warehousing)

• RTAP: Real-Time Analytics Processing (Big Data Architecture & technology)

Who’s Generating Big Data

• Mobile devices (tracking all objects all the time)
• Social media and networks (all of us are generating data)
• Scientific instruments (collecting all sorts of data)
• Sensor technology and networks (measuring all kinds of data)

• Progress and innovation are no longer hindered by the ability to collect data

• But by the ability to manage, analyze, summarize, visualize, and discover
knowledge from the collected data in a timely manner and in a scalable fashion

The Model Has Changed…
• The Model of Generating/Consuming Data has Changed

Old Model: A few companies generate data; all others consume data

New Model: All of us generate data, and all of us consume data

What’s driving Big Data
- Optimizations and predictive analytics
- Complex statistical analysis
- All types of data, and many sources
- Very large datasets
- More of a real-time

- Ad-hoc querying and reporting
- Data mining techniques
- Structured data, typical sources
- Small to mid-size datasets

Value of Big Data Analytics
• Big data is more real-time in nature
than traditional DW applications

• Traditional DW architectures (e.g.
Exadata, Teradata) are not
well-suited for big data apps

• Shared-nothing, massively parallel
processing, scale-out architectures
are well-suited for big data apps
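
The shared-nothing, scale-out idea can be sketched in a few lines: records are split across nodes by key, each node aggregates only its own partition, and the partial results are merged at the end. This is a single-process illustration under stated assumptions; a real system would run each partition on a separate machine in parallel, and the names partition, local_aggregate, scale_out_sum and the SALES data are hypothetical.

```python
from collections import Counter
from zlib import crc32


def partition(records, n_nodes):
    """Assign each (key, value) record to exactly one node by hashing its key."""
    nodes = [[] for _ in range(n_nodes)]
    for key, value in records:
        nodes[crc32(key.encode()) % n_nodes].append((key, value))
    return nodes


def local_aggregate(node_records):
    """Work done by one node, using only its own partition (nothing shared)."""
    totals = Counter()
    for key, value in node_records:
        totals[key] += value
    return totals


def scale_out_sum(records, n_nodes):
    """Aggregate per node, then merge the partial results."""
    # Sequential here for simplicity; each call is independent of the others
    # and could run on its own machine.
    partials = [local_aggregate(p) for p in partition(records, n_nodes)]
    merged = Counter()
    for partial in partials:
        merged.update(partial)
    return dict(merged)


# Hypothetical sales records: (region, units sold).
SALES = [("north", 3), ("south", 5), ("north", 4), ("east", 1)]
result = scale_out_sum(SALES, n_nodes=2)
print(result)
```

Because every key lands on exactly one node, adding nodes changes how the work is spread but not the final answer, which is what makes this style of architecture scale out.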

Challenges in Handling Big Data

• The bottleneck is in technology

• New architecture, algorithms, techniques are needed

• Also in technical skills

• Experts in using the new technology and dealing with big data

What Technology Do We Have for Big Data?

Big Data Technology

Big Data Initiatives: Possible Course of Action

• Complex BD applications in Science, Engineering,
Medicine, Healthcare, Finance, Law & Education

• Indian traditional knowledge

• Transportation

• BD analytics in SMEs

• Real-life case studies of value creation through BD
analytics
