Mod 6

The document discusses the importance of data visualization, emphasizing that visual representations of data enhance understanding, retention, and decision-making. It outlines a workflow for creating effective visualizations, including data collection, preparation, exploration, and communication. Additionally, it explains different dataset types, data semantics, and the significance of choosing appropriate chart types for effective data representation.

Uploaded by

saipranav.ss2023

Data Visualization

Visualization
• “Data is the new oil”
• Like oil, data in its raw, unrefined form is pretty worthless
• To unlock its value, data needs to be refined, analyzed and
understood
• More and more organizations are seeing potential in their data
connections, but how do you allow non-experts to analyze data at
scale and extract potentially complex insights? One answer is
through interactive graph visualization.
Visualization
• Visualization is the quickest way to understand data
• 90% of the information transmitted to the brain is visual
• Humans process images 60,000 times faster than text
• 70% of our sensory receptors are in our eyes
• 65% of people are visual learners
• In short, visual data is easier to remember than words
• Studies have also shown that while only 10-20% of written or spoken data is
remembered, 65% of information is remembered when it’s presented visually
• That’s why it’s important to present your most important data visually
• It ensures your information is processed faster, more easily understood and
remembered.
Visualization
Why is data visualization important?
Data visualization has 5 additional benefits:
1. Amplifies your message: Your message is amplified in a few different
ways
• First of all, by taking the time to create data visualizations, you show
your audience that you’ve done your homework
• That alone gives a sense of credibility to your content
• Without visualizations, you run the risk of your audience not
understanding what you are trying to present
• Your data might even be received as meaningless, and your entire
message lost
Visualization
2. Gives meaning to your data
• Visualizations communicate valuable insights by creating visual
representations of your data
• For example, an Excel spreadsheet showing that Microsoft’s sales
revenue has almost doubled between 2011 and 2018 isn’t nearly as
effective as graphing that data in a simple column chart with some formatting
• The change in revenue is much easier to see in the chart than in
the spreadsheet
• This also gets into the importance of highlighting your point visually in
your data visualizations
Visualization
3. Saves time
• Instead of spending the time trying to figure out what the facts and
figures mean, your audience members can ENGAGE with the meaning
• A visual representation allows you to analyze huge amounts of info in
the blink of an eye
• As we know, the human eye can recognize and process visual
information much faster than text
4. Makes for better decision making
• Assuming your data visualizations contain correct data and are done
properly, you’ll not only be able to make decisions faster, but they will
be based on data that you fully comprehend
Visualization
5. Is more shareable and digestible
• One of the best things about data visualizations is that they are
accessible and easy to share across departments, with
colleagues, your boss, or with a large audience
• They can be inserted in your PowerPoint presentation, printed for
seminar handouts, or even posted and shared on social media.
Visualization Workflow
• Visualization workflow is the process of converting raw data into
meaningful visuals
• Helps in better understanding, analysis, and decision-making
• Widely used in business, research, and analytics
Workflow Overview

• Data Collection
• Data Preparation
• Data Exploration
• Define Objective
• Choose Visualization
• Build Visualization
• Interpretation
• Communication
• Feedback & Deployment
Data Collection
• Gather data from multiple sources:
• Databases
• APIs
• Surveys
• Files (CSV, Excel)
• Ensure data is relevant and accurate
Data Preparation
• Clean the data:
• Remove duplicates
• Handle missing values
• Transform data:
• Normalize values
• Convert formats
• Select required attributes
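
The preparation steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the `prepare` function, the `region`/`sales`/`note` columns, and the choice to fill a missing value with the mean are all hypothetical, not something the slides prescribe.

```python
def prepare(rows):
    """Clean and transform a list of record dicts (hypothetical schema)."""
    # Remove duplicates: keep the first occurrence of each identical record.
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))

    # Handle missing values: fill a missing "sales" with the mean of the
    # known values (one possible strategy among many).
    known = [r["sales"] for r in unique if r["sales"] is not None]
    mean_sales = sum(known) / len(known)
    for r in unique:
        if r["sales"] is None:
            r["sales"] = mean_sales

    # Normalize values: min-max scale "sales" into the range [0, 1].
    lo = min(r["sales"] for r in unique)
    hi = max(r["sales"] for r in unique)
    span = (hi - lo) or 1.0  # avoid division by zero when all values match

    # Select required attributes: drop columns the chart does not need.
    return [
        {"region": r["region"], "sales_norm": (r["sales"] - lo) / span}
        for r in unique
    ]


raw = [
    {"region": "North", "sales": 100.0, "note": "ok"},
    {"region": "North", "sales": 100.0, "note": "ok"},   # duplicate
    {"region": "South", "sales": None, "note": "late"},  # missing value
    {"region": "East", "sales": 300.0, "note": "ok"},
]
clean = prepare(raw)
for record in clean:
    print(record)
```

In practice a library such as pandas would usually handle these steps; the plain-Python version just makes each step visible.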
Data Exploration (EDA)
• Understand the dataset
• Use:
• Summary statistics
• Correlation analysis
• Identify:
• Patterns
• Trends
• Outliers
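
A small sketch of the exploration step, using only Python's standard `statistics` module. The `ad_spend` and `revenue` numbers are made up for illustration, and the 1.5 × IQR rule used in `iqr_outliers` is one common rule of thumb for flagging outliers, not the only one.

```python
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def iqr_outliers(values):
    """Values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    fence = 1.5 * (q3 - q1)
    return [v for v in values if v < q1 - fence or v > q3 + fence]


# Hypothetical data: the last revenue value looks suspicious.
ad_spend = [10, 20, 30, 40, 50, 60]
revenue = [12, 24, 33, 41, 55, 300]

# Summary statistics
summary = {
    "mean": statistics.fmean(revenue),
    "median": statistics.median(revenue),
    "stdev": statistics.stdev(revenue),
}

# Correlation analysis and outlier detection
r = pearson(ad_spend, revenue)
outliers = iqr_outliers(revenue)

print(summary)
print("correlation:", round(r, 3))
print("outliers:", outliers)
```

The gap between the mean and the median is itself a hint that an outlier is pulling the average upward.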
Define Objective
• Identify the goal of visualization
• Questions to consider:
• What problem are we solving?
• Who is the audience?
• Example:
• Business → dashboards
• Research → detailed analysis
Choose Visualization Type
• Based on data:
• Bar Chart → Comparison
• Line Graph → Trends
• Pie Chart → Composition
• Histogram → Distribution
• Scatter Plot → Relationships
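
The mapping above can be written down as a simple lookup. The function name `choose_chart` and the task keywords are assumptions made for this sketch; the chart suggestions themselves come straight from the list above.

```python
# Task keyword -> suggested chart type, as listed on the slide.
CHART_FOR_TASK = {
    "comparison": "bar chart",
    "trend": "line graph",
    "composition": "pie chart",
    "distribution": "histogram",
    "relationship": "scatter plot",
}


def choose_chart(task):
    """Return the suggested chart type for an analysis task."""
    try:
        return CHART_FOR_TASK[task.lower()]
    except KeyError:
        raise ValueError(f"no chart suggestion for task {task!r}") from None


print(choose_chart("Trend"))         # line graph
print(choose_chart("distribution"))  # histogram
```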
Design & Build Visualization

• Use tools like:

• Tableau
• Power BI
• Python libraries
• Follow design principles:
• Clear labels and titles
• Proper color usage
• Avoid clutter
Interpretation
• Analyze the visuals
• Identify:
• Key insights
• Patterns
• Anomalies
• Convert data into meaningful information
Communication
• Present results through:
• Dashboards
• Reports
• Presentations
• Make it easy for the audience to understand
Feedback & Iteration
• Collect feedback from users
• Improve visualization clarity
• Refine based on requirements
Deployment

• Share dashboards via:

• Web platforms
• Business intelligence tools
• Enable real-time updates if needed
Data Abstraction

❖ This figure shows the abstract types of what can be visualized.

❖ The four basic dataset types are tables, networks, fields, and geometry; other
possible collections of items include clusters, sets, and lists.
❖ These datasets are made up of different combinations of the five data types:
items, attributes, links, positions, and grids.

❖ For any of these dataset types, the full dataset could be available immediately in
the form of a static file, or it might be dynamic data processed gradually in the
form of a stream.

❖ The type of an attribute can be categorical or ordered, with a further split into
ordinal and quantitative.

❖ The ordering direction of attributes can be sequential, diverging, or cyclic.

Why Do Data Semantics and Types Matter?

❑ What kind of data are you given?

❑ What information can you figure out from the data, versus the meanings that you
must be told explicitly?

❑ What high-level concepts will allow you to split datasets apart into general and useful
pieces?

Suppose that you see the following data:

14, 2.6, 30, 30, 15, 100001

❖ What does this sequence of six numbers mean?

Similarly, suppose that you see the following data:

Basil, 7, S, Pear

❖ These numbers and words could have many possible meanings.

• To understand the data, two crosscutting pieces of information are required. These
are:
• Semantics of data
• Types of data.

• The semantics of the data is its real-world meaning.

• For instance, does a word represent a human first name,
• or is it the shortened version of a company name where the full name can be looked
up in an external list,
• or is it a city,
• or is it a fruit?

• The type of the data is its structural or mathematical interpretation.

• Two levels:
• At the data level, what kind of thing is it: an item, a link, an attribute?

• At the attribute level: what kinds of mathematical operations are meaningful for
it?
• For example: if a number represents a count of boxes of detergent, then its type is a
quantity, and adding two such numbers together makes sense.

• If the number represents a postal code, then its type is a code rather than a quantity—
it is simply the name for a category that happens to be a number rather than a textual
name.

• Adding two of these numbers together does not make sense.
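
One way to make this distinction concrete in code is to give quantities and codes different types. The `Quantity` and `Code` classes below are hypothetical, and treating 100001 as a postal code is only one possible reading of the earlier number sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A count, such as boxes of detergent: addition is meaningful."""
    value: int

    def __add__(self, other):
        if not isinstance(other, Quantity):
            return NotImplemented
        return Quantity(self.value + other.value)


@dataclass(frozen=True)
class Code:
    """A category name that happens to be numeric, such as a postal code."""
    label: str  # stored as text; no arithmetic is defined on purpose


boxes = Quantity(14) + Quantity(30)   # adding two counts makes sense
postal = Code("100001")

try:
    postal + postal                   # adding two codes is rejected
    can_add_codes = True
except TypeError:
    can_add_codes = False

print(boxes)           # Quantity(value=44)
print(can_add_codes)   # False
```

Codes can still be compared for equality, which is the only operation a category name needs.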

• Metadata:
• Additional information (often textual) about the original dataset is called
metadata

ID   Name    Age  Shirt Size  Favorite Fruit
1    Amy     8    S           Apple
2    Basil   7    S           Pear
3    Clara   9    M           Durian
4    Desm    13   L           Elderberry
5    Ernest  12   L           Peach
6    Fanny   10   S           Lychee
7    George  9    M           Orange
8    Hect    8    L           Loquat
9    Ida     10   M           Pear
10   Amy     12   M           Orange
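
A short sketch of how metadata turns the earlier raw values "Basil, 7, S, Pear" into something interpretable. The column names come from the table above; the `attribute_types` annotations are an assumption added for illustration.

```python
# Metadata: column names (semantics) and attribute types.
header = ["Name", "Age", "Shirt Size", "Favorite Fruit"]
attribute_types = {
    "Name": "categorical",
    "Age": "quantitative",
    "Shirt Size": "ordinal",
    "Favorite Fruit": "categorical",
}

# Data: the raw values on their own carry no meaning.
raw_row = ["Basil", "7", "S", "Pear"]

# Pair each value with its column name to recover the semantics.
item = dict(zip(header, raw_row))

# Use the type information to convert quantitative values to numbers.
for name, kind in attribute_types.items():
    if kind == "quantitative":
        item[name] = int(item[name])

print(item)
```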
Data types

❖ An attribute is some specific property that can be measured, observed, or logged.

❖ Synonyms for attribute are variable and data dimension, or just dimension for short.
❖ Example: attributes could be salary, price, number of sales, protein expression
levels, or temperature.
❖ An item is an individual entity that is discrete, such as a row in a simple table or a
node in a network.
❖ For example, items may be people, stocks, coffee shops, genes, or cities.
❖ A link is a relationship between items, typically within a network.
❖ A grid specifies the strategy for sampling continuous data in terms of both
geometric and topological relationships between its cells.
❖ A position is spatial data, providing a location in two-dimensional (2D) or three-
dimensional (3D) space.
❖ For example, a position might be a latitude–longitude pair describing a
location on the Earth’s surface or three numbers specifying a location within
the region of space measured by a medical scanner.
Dataset types

❖ A dataset is any collection of information that is the target of analysis.
❖ The four basic dataset types are tables, networks, fields, and geometry.
Dataset types
Tables: made up of rows and columns, as in a spreadsheet.
❖ In a flat table, each row represents an item of data, and each column is an attribute
of the dataset.

❖ Each cell in the table is fully specified by the combination of a row and a
column—an item and an attribute—and contains a value for that pair.

❖ A multidimensional table has a more complex structure for indexing into a cell,
with multiple keys.
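
The difference in cell indexing can be sketched with plain dictionaries. The `sales` table indexed by store and month is a hypothetical example, not taken from the slides.

```python
# Flat table: one key (the item) plus an attribute name selects a cell.
flat = {
    1: {"Name": "Amy", "Age": 8},
    2: {"Name": "Basil", "Age": 7},
}
basil_age = flat[2]["Age"]

# Multidimensional table: several keys together select a cell.
# Hypothetical example: units sold, indexed by (store, month).
sales = {
    ("Downtown", "Jan"): 120,
    ("Downtown", "Feb"): 95,
    ("Airport", "Jan"): 60,
}
downtown_feb = sales[("Downtown", "Feb")]

# Fixing one key and aggregating over the other is a typical operation.
downtown_total = sum(
    units for (store, _month), units in sales.items() if store == "Downtown"
)

print(basil_age, downtown_feb, downtown_total)
```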
Networks: well suited for specifying that there is some kind of relationship
between two or more items.
❖ An item in a network is known as a node
❖ A link is a relation between two items

❖ For example, in an articulated social network the nodes are people, and
links mean friendship.
❖ In a gene interaction network, the nodes are genes, and links between
them mean that these genes have been observed to interact with each other.

❖ Networks with hierarchical structure are called trees.

❖ Note: trees do not have cycles: each child node has only one parent node pointing
to it.
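
The tree property can be checked directly on a list of links. This is a minimal sketch assuming directed parent-to-child links and a single root; the function name `is_tree` and the example nodes are hypothetical.

```python
def is_tree(nodes, links):
    """Return True if directed (parent, child) links form one rooted tree."""
    parent = {}
    for p, c in links:
        if c in parent:       # a second parent points to this child
            return False
        parent[c] = p

    roots = [n for n in nodes if n not in parent]
    if len(roots) != 1:       # exactly one node may lack a parent
        return False

    # Walk upward from every node: revisiting a node means a cycle.
    for n in nodes:
        seen = set()
        while n in parent:
            if n in seen:
                return False
            seen.add(n)
            n = parent[n]
        if n != roots[0]:     # every walk must end at the single root
            return False
    return True


org_nodes = ["CEO", "CTO", "CFO", "Dev"]
org_links = [("CEO", "CTO"), ("CEO", "CFO"), ("CTO", "Dev")]

hierarchy_ok = is_tree(org_nodes, org_links)
# Giving "Dev" a second parent turns the tree into a general network.
two_parents = is_tree(org_nodes, org_links + [("CFO", "Dev")])
# Two nodes pointing at each other form a cycle.
cyclic = is_tree(["R", "A", "B"], [("A", "B"), ("B", "A")])

print(hierarchy_ok, two_parents, cyclic)
```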
Fields: the field dataset type also contains attribute values associated with cells.
❖ Each cell in a field contains measurements or calculations from a continuous
domain.

❖ Continuous data requires careful treatment that takes into account the
mathematical questions of sampling and interpolation:

❖ Sampling: how frequently to take the measurements, and

❖ Interpolation: how to show values in between the sampled points in a
way that does not mislead.
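
A small sketch of how sampling frequency affects interpolated values. The continuous field `f(x) = x**2` and the helper names `sample` and `lerp` are assumptions for illustration; linear interpolation is only one of several possible interpolation schemes.

```python
def sample(f, start, stop, step):
    """Measure f at evenly spaced positions from start to stop inclusive."""
    n = int(round((stop - start) / step))
    xs = [start + i * step for i in range(n + 1)]
    return xs, [f(x) for x in xs]


def lerp(xs, ys, x):
    """Linearly interpolate a value at x between the two nearest samples."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x is outside the sampled range")
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return ys[i] + t * (ys[i + 1] - ys[i])


def field(x):
    """Hypothetical continuous quantity being measured."""
    return x ** 2


true_value = field(2.5)

coarse_xs, coarse_ys = sample(field, 0, 4, 2)  # samples at 0, 2, 4
fine_xs, fine_ys = sample(field, 0, 4, 1)      # samples at 0, 1, 2, 3, 4

coarse_estimate = lerp(coarse_xs, coarse_ys, 2.5)
fine_estimate = lerp(fine_xs, fine_ys, 2.5)

print(true_value, coarse_estimate, fine_estimate)
```

Sampling more frequently brings the interpolated value closer to the true one, which is why coarse sampling can produce a misleading picture between the measured points.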

❖ Continuous data is often found in the form of a spatial field, where the cell
structure of the field is based on sampling at spatial positions.
❖ Most datasets that contain inherently spatial data occur in the context of tasks
that require understanding aspects of its spatial structure, especially shape
❖ Grid types: uniform grid, unstructured grid
❖ The geometry dataset type specifies information about the shape of items with
explicit spatial positions.
❖ The items could be points, or one-dimensional lines or curves, or 2D surfaces
or regions, or 3D volumes.
❖ Geometry datasets are intrinsically spatial, and like spatial fields they typically
occur in the context of tasks that require shape understanding

Other Combinations

❖ Set

❖ Lists

❖ Cluster

❖ Path
Data availability

There are two kinds of dataset availability: static or dynamic.

❖ Static: the entire dataset is available all at once, as a static file.

❖ Dynamic: the dataset arrives as a stream. One kind of dynamic
change is to add new items or delete previous items.
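
The practical difference can be sketched as follows: with a static dataset a statistic is computed once over everything, while with a stream it has to be updated as each item arrives. The `readings` values and the `running_mean` generator are hypothetical.

```python
import statistics


def running_mean(stream):
    """Yield the mean of all values seen so far, one value at a time."""
    count = 0
    total = 0.0
    for value in stream:
        count += 1
        total += value
        yield total / count


readings = [4.0, 6.0, 8.0, 2.0]

# Static: the whole dataset is available, so compute the mean in one pass.
static_mean = statistics.fmean(readings)

# Dynamic: items arrive one by one; the mean is revised after each arrival.
stream_means = list(running_mean(iter(readings)))

print(static_mean)
print(stream_means)
```

A visualization of streaming data has to be redrawn or extended after each update, whereas a static one can be laid out once.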
Attribute types
❖ Categorical data, such as favorite fruit or names, doesn’t have an
implicit ordering, but it often has hierarchical structure.
❖ Examples of categorical attributes are fruits (apples, oranges, etc.), movie genres,
file types, and city names.

❖ All ordered data does have an implicit ordering, as opposed to unordered
categorical data.
❖ This type can be further subdivided into ordinal and quantitative data.
❖ With ordinal data, such as shirt size, we cannot do full-fledged arithmetic, but
there is a well-defined ordering. For example, large minus medium is not a
meaningful concept, but we know that medium falls between small and large.

❖ A subset of ordered data is quantitative data, namely, a measurement of
magnitude that supports arithmetic comparison.
❖ For example, the quantity of 68 inches minus 42 inches is a meaningful
concept, and the answer of 26 inches can be calculated.
❖ Other examples of quantitative data are height, weight, temperature, stock
price, number of calling functions in a program, and number of drinks sold at a
coffee shop in a day.
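
The three attribute types differ in which operations are meaningful, and that can be expressed in code. The `ShirtSize` class is a hypothetical sketch: it defines ordering but deliberately no arithmetic.

```python
from enum import Enum


class ShirtSize(Enum):
    """Ordinal attribute: ordering is defined, arithmetic is not."""
    S = 1
    M = 2
    L = 3

    def __lt__(self, other):
        if self.__class__ is other.__class__:
            return self.value < other.value
        return NotImplemented


# Categorical: only equality tests are meaningful.
same_fruit = "Pear" == "Pear"

# Ordinal: comparison and sorting work.
ordered = sorted([ShirtSize.L, ShirtSize.S, ShirtSize.M])
medium_between = ShirtSize.S < ShirtSize.M < ShirtSize.L

# Ordinal: "large minus medium" is rejected.
try:
    ShirtSize.L - ShirtSize.M
    ordinal_subtraction_ok = True
except TypeError:
    ordinal_subtraction_ok = False

# Quantitative: full arithmetic is meaningful.
height_difference = 68 - 42   # inches

print(ordered, medium_between, ordinal_subtraction_ok, height_difference)
```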
Task Abstraction
Analysis: Four Levels for Validation
Data Representation
Chart Types
Data Representation: Chart Types
• When visualizing data, choosing the right chart type is crucial to
effectively communicate patterns, trends, and relationships.
• Chart types illustrated: donut chart, sunburst and treemap charts, Sankey diagram,
scatter map, choropleth map
• Flow Map
• A flow map is a variation of thematic maps used in cartography to
visualize how objects, for example people or goods, move
between different geographical locations.
• Flow maps are capable of displaying both quantitative and
qualitative types of data.
• Lines in flow maps can have different widths to visualize the flow
volume, and markers to show the direction.
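
Encoding flow volume as line width comes down to a scaling function. The sketch below assumes a linear scale and pixel limits of 1 to 12; the function name `line_width`, the limits, and the sample flows are all hypothetical choices.

```python
def line_width(volume, max_volume, min_px=1.0, max_px=12.0):
    """Map a flow volume linearly onto a line width in pixels."""
    if max_volume <= 0:
        raise ValueError("max_volume must be positive")
    share = max(0.0, min(volume / max_volume, 1.0))  # clamp to [0, 1]
    return min_px + share * (max_px - min_px)


# Hypothetical flows: (origin, destination, volume). The tuple order
# carries the direction that a marker or arrowhead would show.
flows = [
    ("A", "B", 1000),
    ("A", "C", 500),
    ("B", "C", 0),
]
largest = max(volume for _, _, volume in flows)
widths = {
    (src, dst): line_width(volume, largest) for src, dst, volume in flows
}

for (src, dst), width in widths.items():
    print(f"{src} -> {dst}: {width} px")
```

Keeping a minimum width above zero means that even the smallest flows remain visible on the map.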
