Conventional data visualization methods
Conventional data visualization methods are long-established techniques for
representing data in charts, graphs, and other visual formats so that people
can understand and interpret it more easily. They have been in wide use for
many years and provide a solid foundation for data presentation. Common
conventional data visualization methods include:
Bar Charts: Bar charts represent data using rectangular bars, with the
length (or height) of each bar proportional to the value it represents. Bars
can be horizontal or vertical and are suitable for comparing values across
categories.
Line Charts: Line charts display data points connected by lines to show
trends and changes over time. They are often used to visualize time-series
data.
Pie Charts: Pie charts divide a whole into segments (or slices) to represent
the proportion of each category in the data. They are useful for showing
parts of a whole but make it hard to compare slices of similar size
precisely.
Scatter Plots: Scatter plots display individual data points as dots on a
two-dimensional coordinate system, making it easy to identify relationships
and patterns between variables.
Area Charts: Area charts are similar to line charts but have the area under
the line filled with color, which emphasizes the magnitude of the values
and, when several series are stacked, their cumulative total over time.
Histograms: Histograms are used to represent the distribution of
continuous data by dividing it into intervals or bins and displaying the
frequency of data points within each bin.
Box Plots: Box plots (box-and-whisker plots) provide a summary of the
distribution of data, showing the median, quartiles, and potential outliers.
Heatmaps: Heatmaps use color-coding to represent data values on a
two-dimensional grid, making it easy to identify patterns and trends in
large datasets.
Gantt Charts: Gantt charts are used for project management to visualize
tasks, their start and end times, and their dependencies in a timeline
format.
Pareto Charts: Pareto charts combine a bar chart and a line chart: the bars
show individual values sorted in descending order, and the line shows the
cumulative total. This helps to prioritize factors based on their impact.
Radar Charts: Radar charts (spider charts) are useful for comparing
multiple variables across different categories, with each variable
represented by a spoke on a radial chart.
Tree Diagrams: Tree diagrams are hierarchical visualizations that show
relationships and subdivisions within a dataset, often used for
organizational structures and decision trees.
Sankey Diagrams: Sankey diagrams depict the flow of resources or
information between different stages or categories, showing the
proportional movement between them.
Venn Diagrams: Venn diagrams use overlapping circles to visualize the
relationships between different sets or categories.
Scatterplot Matrix: A scatterplot matrix is a grid of scatter plots that helps
visualize relationships and correlations between multiple variables
simultaneously.
These conventional data visualization methods serve as building blocks for
creating effective and informative data visualizations. The choice of the
most suitable method depends on the type of data, the message you want
to convey, and the audience you are targeting.
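Several of the charts listed above are drawn from simple summary statistics.
The following is a minimal Python sketch, not any plotting library's
implementation, of the numbers behind a histogram and a box plot. The
function names, the equal-width binning rule, and the sample data are
assumptions made for illustration; plotting libraries may choose bins and
compute quartiles differently.

```python
from statistics import quantiles


def histogram_counts(values, bins):
    """Count how many values fall into each of `bins` equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        # All values are identical: put everything in the first bin.
        return [len(values)] + [0] * (bins - 1)
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum), the basis of a box plot."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


data = [1, 2, 2, 3, 4, 5, 6, 7, 9]  # illustrative sample, not real data
counts = histogram_counts(data, bins=4)
summary = five_number_summary(data)
print(counts)
print(summary)
```

The bar heights of the histogram are the entries of counts, and the box plot
is drawn from the five values in summary: the box spans Q1 to Q3, the line
inside it marks the median, and the whiskers extend toward the extremes.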
Retinal variables
Retinal variables, in the context of data visualization, refer to the visual
attributes that can be used to encode and represent data in a chart or
graph. These attributes are called "retinal" because they are related to the
human retina's ability to perceive and distinguish visual cues. The choice of
retinal variables can significantly impact the effectiveness and
interpretability of a data visualization. Some common retinal variables
include:
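Position: Position is one of the most accurate and discriminative visual
variables. It uses spatial coordinates to represent data points. For example,
on a scatter plot, the x and y positions of data points can encode two
different data variables. (Bertin's original scheme treats the two dimensions
of the plane separately from the retinal variables proper, but position is
commonly listed alongside them.)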
Length: Length can be used to represent quantitative values. Longer bars in
a bar chart or taller columns in a column chart can correspond to larger
data values.
Angle: The angle of lines or wedges can be used to represent data values.
Pie charts are an example where the angle of each slice represents a
portion of a whole.
Area: Area is the space occupied by a graphical element, such as the size
of circles or bubbles in a bubble chart. The area can represent data values,
but it can be less accurate than length.
Color: Color is a versatile retinal variable that can represent categorical or
continuous data. Different colors can distinguish categories or gradients of
values, but color perception can vary among individuals and devices.
Texture: Texture refers to patterns applied to shapes, such as hatching,
cross-hatching, or stippling. It is used less frequently than other retinal
variables due to potential perceptual issues and lack of compatibility with
some printing methods.
Shape: Different shapes, such as circles, squares, or triangles, can be used
to represent distinct data categories. However, distinguishing shapes may
be less accurate than distinguishing colors or positions.
Size: Size can be used to represent quantitative data, where larger or
smaller graphical elements indicate different values. For example, the size
of each point in a bubble chart can encode a third variable in addition to
its x and y position.
Value (Shade or Lightness): The perceived value or shade of color can be
used to represent quantitative data by varying the lightness of a color.
Darker values can represent higher values, and lighter values can represent
lower values.
Saturation: Saturation refers to the intensity of a color. It can be used to
distinguish different data categories or highlight specific data points within
a dataset.
When creating data visualizations, choose retinal variables to suit the type
of data you are working with and the goals of the visualization. The
selection affects the clarity, interpretability, and visual impact of the
result, so it deserves careful consideration if the data is to be accessible
and informative to the audience.
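Two of the variables above, angle and area, involve a small calculation when
data values are mapped to them. The sketch below shows one way to do this in
Python. The function names and the 40-pixel maximum radius are hypothetical
choices for illustration, not part of any charting library.

```python
import math


def slice_angles(values):
    """Angle in degrees of each pie slice, proportional to its share of the total."""
    total = sum(values)
    return [360.0 * v / total for v in values]


def bubble_radius(value, max_value, max_radius=40.0):
    """Radius that makes the circle's *area* proportional to `value`.

    Area grows with the square of the radius, so the radius has to scale
    with the square root of the value. `max_radius` is an arbitrary size.
    """
    return max_radius * math.sqrt(value / max_value)


angles = slice_angles([50, 30, 20])
r_small = bubble_radius(25, max_value=100)
r_large = bubble_radius(100, max_value=100)
print(angles)
print(r_small, r_large)
```

Scaling the radius directly with the value would make a bubble for 100 look
sixteen times larger in area than a bubble for 25, instead of four times.
This is one reason area is read less accurately than length.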
Types of visual variables
Selective, associative, and ordered visual variables are important concepts
in data visualization that help designers make effective choices when
encoding data. These concepts come from Jacques Bertin's work on
semiotics in visualization and can be classified as follows:
Selective Visual Variables:
Selective visual variables are those that help viewers distinguish between
different data categories or attributes.
They are typically used to encode categorical data or highlight specific data
points.
Examples of selective visual variables include color, shape, and texture.
Color: Different colors can be used to represent distinct categories.
Shape: Various shapes (e.g., circles, squares, triangles) can be used to
differentiate categories.
Texture: Patterns or textures can be applied to elements to create visual
distinctions.
Associative Visual Variables:
A visual variable is associative when marks that differ in it can still be
perceived together as a single group, so the variation can be ignored while
another variable is being read.
Associative variables are useful for encoding a secondary attribute without
disturbing the groupings established by the primary one.
In Bertin's classification, examples of associative visual variables include
position, orientation, shape, color (hue), and texture.
Position: Marks remain readable as one group wherever they are placed on
the plane, and spatial proximity is a powerful way to convey relationships.
Orientation: Elements drawn at different angles still carry the same visual
weight, so they can be read together.
Size: Size is usually considered dissociative rather than associative,
because large marks dominate small ones; the same applies to lightness.
Ordered Visual Variables:
Ordered visual variables are used to represent ordered, sequential, or
ranked data. They emphasize the arrangement or progression of data.
These variables provide a clear sense of direction or movement within the
data.
Examples of ordered visual variables include length, size, and color value
(lightness/darkness). Hue is a weaker case, as noted below.
Length: Changes in the length of elements can represent ordered or ranked
data values.
Color Value: The lightness or darkness of color can be used to show
gradations or rank data.
Hue: Hue has no inherent perceptual order, so it conveys ordered
information only when arranged in a deliberate sequence (e.g., from warm to
cool colors), ideally reinforced by a change in lightness.
Designers select from these visual variables based on the specific
characteristics of the data and the message they want to convey. By using
selective, associative, and ordered visual variables appropriately, data
visualizations can be more effective in communicating information and
patterns to the audience.
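As an illustration of an ordered variable, the following Python sketch maps
a list of values onto a single-hue lightness ramp, so that larger values
appear darker. The function name, the hue, the saturation of 0.6, and the
lightness limits are assumptions chosen for the example; only the standard
colorsys module is used.

```python
import colorsys


def lightness_ramp(values, hue=0.6, lightest=0.9, darkest=0.25):
    """Map each value to a hex color of one hue; larger values are darker."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    colors = []
    for v in values:
        t = (v - lo) / span  # 0 for the smallest value, 1 for the largest
        lightness = lightest + t * (darkest - lightest)
        r, g, b = colorsys.hls_to_rgb(hue, lightness, 0.6)
        colors.append(
            "#{:02x}{:02x}{:02x}".format(
                round(r * 255), round(g * 255), round(b * 255)
            )
        )
    return colors


ramp = lightness_ramp([10, 20, 30, 40])
print(ramp)
```

Because only lightness changes, a viewer can rank the colors without a
legend, which is what distinguishes an ordered variable from a purely
selective one.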
Mapping variables to encoding:
Mapping variables to encoding in data visualization means choosing which
visual property, or channel, represents each data attribute. This is a
critical step in creating effective data visualizations, because the same
attribute can be conveyed more or less accurately depending on the channel.
Here are some common encoding channels and the types of data attributes they
are typically used to represent:
Position:
Data Attribute: Quantitative or categorical data.
Encoding Channel: Spatial coordinates (x, y, or polar coordinates).
Example: Scatter plots use position to represent two quantitative variables.
Length:
Data Attribute: Quantitative data.
Encoding Channel: The length of lines, bars, or other graphical elements.
Example: Bar charts represent data values using the length of bars.
Color:
Data Attribute: Categorical or sequential data.
Encoding Channel: Different colors or color gradients.
Example: A heat map uses color to represent data values in a grid.
Shape:
Data Attribute: Categorical data.
Encoding Channel: Different shapes (e.g., circles, squares, triangles).
Example: Different data categories represented by various shapes in a
scatter plot.
Size:
Data Attribute: Quantitative data.
Encoding Channel: The size of graphical elements, such as points or
symbols.
Example: Bubble charts use size to represent data values in addition to
position.
Texture or Pattern:
Data Attribute: Categorical data.
Encoding Channel: Different patterns or textures applied to elements.
Example: A line chart with multiple data series can use different line
patterns for each series.
Orientation:
Data Attribute: Quantitative data.
Encoding Channel: The angle or direction of graphical elements.
Example: A wind rose chart uses orientation to represent wind direction,
with the length of each spoke showing how frequent or strong the wind is.
Transparency (Opacity):
Data Attribute: Quantitative data.
Encoding Channel: The level of transparency of elements.
Example: A density plot uses transparency to represent data density.
Value (Shade or Lightness):
Data Attribute: Quantitative data.
Encoding Channel: The perceived lightness or darkness of color.
Example: A choropleth map uses color value to represent variations in data
values.
Motion:
Data Attribute: Temporal or sequential data.
Encoding Channel: The movement or animation of elements.
Example: Animated line charts can show the evolution of data over time.
When mapping variables to encoding channels, it's essential to consider
the type of data (quantitative, categorical, ordinal), the goals of the
visualization, and the perceptual characteristics of the chosen channels.
Proper encoding helps convey the intended message, reveal patterns, and
make the data more understandable to the audience. The choice of
encoding should aim for clarity and effectiveness in data communication.
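The channel list above can be written down as a small lookup table, and the
position channel as a linear scale from data values to pixels. The sketch
below is one possible Python rendering: the table simply restates the
pairings given in this section, and the names CHANNELS, channels_for, and
linear_scale, as well as the 400-pixel axis, are hypothetical.

```python
# Data types each channel is described as suiting in the list above.
CHANNELS = {
    "position": {"quantitative", "categorical"},
    "length": {"quantitative"},
    "color": {"categorical", "sequential"},
    "shape": {"categorical"},
    "size": {"quantitative"},
    "texture": {"categorical"},
    "orientation": {"quantitative"},
    "transparency": {"quantitative"},
    "value": {"quantitative"},
    "motion": {"temporal", "sequential"},
}


def channels_for(data_type):
    """Return the channels listed as suitable for `data_type`, alphabetically."""
    return sorted(name for name, types in CHANNELS.items() if data_type in types)


def linear_scale(domain, pixel_range):
    """Build a function mapping values in `domain` to positions in `pixel_range`."""
    (d0, d1), (r0, r1) = domain, pixel_range

    def scale(x):
        return r0 + (x - d0) / (d1 - d0) * (r1 - r0)

    return scale


x_scale = linear_scale(domain=(0, 50), pixel_range=(0, 400))
print(channels_for("categorical"))
print(x_scale(25))
```

A lookup like this only narrows the options. The final choice still depends
on the goals of the visualization and on how accurately each channel is
perceived.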
Choosing appropriate visual encodings:
Natural Ordering:
Natural ordering, also known as intrinsic or inherent ordering, is the order
that data has by virtue of its own characteristics, without explicit sorting
or manipulation. Because it arises from the nature of the data itself, it is
usually intuitive and meaningful to people. Understanding and using natural
ordering matters in data analysis, data visualization, and programming. Here
are some examples of natural ordering in different domains:
Numerical Values: Numerical data, such as integers or real numbers,
typically have a natural order based on their magnitude. For example, in the
set of integers, natural ordering is from smallest to largest: ..., -3, -2, -1, 0, 1,
2, 3, ...
Dates and Time: Dates and timestamps have a natural chronological order.
This ordering proceeds from the past to the future, allowing for easy
interpretation and analysis of time-related data.
Alphabetical Ordering: In the case of textual data, natural ordering is often
based on the alphabetical order of characters or words. This ordering is
commonly used for sorting lists of words or names.
Geospatial Data: Geospatial data (e.g., locations, coordinates) can have a
natural order based on spatial relationships. For example, in a map,
locations can be naturally ordered from west to east or south to north.
Categorical Data: Some categorical data may have a natural order, such as
rankings (e.g., "low," "medium," "high") or educational levels (e.g.,
"elementary," "middle school," "high school").
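Color Scales: Colors can be ordered based on their position in the visible
spectrum. For instance, a rainbow color scale follows the order of colors in
the spectrum from red to violet. This physical order is not perceived as an
intuitive low-to-high ranking, however, so scales that vary in lightness are
usually preferred for ordered data.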
Logical Sequences: Sequences in mathematics, logic, or computer
programming often have a natural order, such as the order of operations
(parentheses, then exponentiation, then multiplication and division, then
addition and subtraction).
Using natural ordering in data analysis and visualization can improve
interpretability and the user experience. Data organized according to its
natural order is easier to compare, and patterns in it are easier to spot.
However, natural ordering is not always appropriate for a specific analysis
or presentation, so be prepared to apply custom sorting or ordering when
needed.
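The difference between a default sort and the natural order of the data can
be shown with a short Python sketch. The category labels, their ranking in
RANK, and the dates are made-up illustrative values.

```python
from datetime import date

levels = ["medium", "high", "low", "medium"]

# Assumed ordinal scale for the categories above.
RANK = {"low": 0, "medium": 1, "high": 2}

# A plain sort on strings is alphabetical, which ignores the ranking.
alphabetical = sorted(levels)

# Sorting by rank restores the natural order of the categories.
by_rank = sorted(levels, key=RANK.__getitem__)

# Dates compare chronologically, so a plain sort gives their natural order.
days = [date(2024, 3, 1), date(2023, 12, 31), date(2024, 1, 15)]
chronological = sorted(days)

print(alphabetical)
print(by_rank)
print(chronological)
```

Numbers and dates sort into their natural order by default, while ordinal
categories stored as text need an explicit ranking. Otherwise a chart axis
would place "high" before "low".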