Chapter 3

The document provides an introduction to data visualization using Matplotlib, focusing on creating bar charts, histograms, and scatter plots. It includes examples of visualizing Olympic medal data, customizing plots with legends and error bars, and interpreting boxplots. The content is aimed at helping users understand how to effectively represent quantitative comparisons in data.

Quantitative comparisons: bar charts
INTRODUCTION TO DATA VISUALIZATION WITH MATPLOTLIB

Ariel Rokem
Data Scientist
Olympic medals
,Gold, Silver, Bronze
United States, 137, 52, 67
Germany, 47, 43, 67
Great Britain, 64, 55, 26
Russia, 50, 28, 35
China, 44, 30, 35
France, 20, 55, 21
Australia, 23, 34, 25
Italy, 8, 38, 24
Canada, 4, 4, 61
Japan, 17, 13, 34

Olympic medals: visualizing the data
import pandas as pd
import matplotlib.pyplot as plt

medals = pd.read_csv('medals_by_country_2016.csv', index_col=0)
fig, ax = plt.subplots()
ax.bar(medals.index, medals["Gold"])
plt.show()

Interlude: rotate the tick labels
fig, ax = plt.subplots()
ax.bar(medals.index, medals["Gold"])
ax.set_xticklabels(medals.index, rotation=90)
ax.set_ylabel("Number of medals")
plt.show()
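In some Matplotlib versions, calling set_xticklabels without first fixing the tick positions may emit a warning. As an alternative sketch (not the approach used on the slide), tick_params can rotate the labels that are already there. The three countries and their gold counts are copied from the medal table above.

```python
import matplotlib.pyplot as plt

# Gold-medal counts for three countries, copied from the table above.
countries = ["United States", "Germany", "Great Britain"]
gold = [137, 47, 64]

fig, ax = plt.subplots()
ax.bar(countries, gold)
# Rotate the existing tick labels instead of replacing their text.
ax.tick_params(axis="x", labelrotation=90)
ax.set_ylabel("Number of medals")

fig.canvas.draw()  # lay out the figure so the tick labels are populated
rotations = [label.get_rotation() for label in ax.get_xticklabels()]
# Call plt.show() to display the figure.
```

Because the label text is left untouched, this variant cannot put the labels out of step with the bars.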

Olympic medals: visualizing the other medals
fig, ax = plt.subplots()
ax.bar(medals.index, medals["Gold"])
ax.bar(medals.index, medals["Silver"], bottom=medals["Gold"])
ax.set_xticklabels(medals.index, rotation=90)
ax.set_ylabel("Number of medals")
plt.show()

Olympic medals: visualizing all three
fig, ax = plt.subplots()
ax.bar(medals.index, medals["Gold"])
ax.bar(medals.index, medals["Silver"], bottom=medals["Gold"])
ax.bar(medals.index, medals["Bronze"],
       bottom=medals["Gold"] + medals["Silver"])
ax.set_xticklabels(medals.index, rotation=90)
ax.set_ylabel("Number of medals")
plt.show()

Stacked bar chart

Adding a legend
fig, ax = plt.subplots()
ax.bar(medals.index, medals["Gold"])
ax.bar(medals.index, medals["Silver"], bottom=medals["Gold"])
ax.bar(medals.index, medals["Bronze"],
       bottom=medals["Gold"] + medals["Silver"])

ax.set_xticklabels(medals.index, rotation=90)
ax.set_ylabel("Number of medals")

Adding a legend
fig, ax = plt.subplots()
ax.bar(medals.index, medals["Gold"], label="Gold")
ax.bar(medals.index, medals["Silver"], bottom=medals["Gold"],
       label="Silver")
ax.bar(medals.index, medals["Bronze"],
       bottom=medals["Gold"] + medals["Silver"],
       label="Bronze")

ax.set_xticklabels(medals.index, rotation=90)
ax.set_ylabel("Number of medals")
ax.legend()
plt.show()
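The three bar calls above repeat the same pattern: each layer starts where the previous one ended. One possible way to express that, sketched here with a loop and a running `bottom` series (both are illustrative choices, not part of the slides), uses the first three rows of the medal table shown earlier.

```python
import matplotlib.pyplot as plt
import pandas as pd

# First three rows of the medal table shown earlier in this chapter.
medals = pd.DataFrame(
    {"Gold": [137, 47, 64], "Silver": [52, 43, 55], "Bronze": [67, 67, 26]},
    index=["United States", "Germany", "Great Britain"],
)

fig, ax = plt.subplots()
bottom = pd.Series(0, index=medals.index)  # running height of each stack
for medal in ["Gold", "Silver", "Bronze"]:
    ax.bar(medals.index, medals[medal], bottom=bottom, label=medal)
    bottom = bottom + medals[medal]  # the next layer starts where this one ends

ax.tick_params(axis="x", labelrotation=90)
ax.set_ylabel("Number of medals")
ax.legend()  # one legend entry per label= passed to ax.bar
# Call plt.show() to display the figure.
```

After the loop, `bottom` holds the total medal count per country, which is also the height of each finished stack.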

Stacked bar chart with legend

Create a bar chart!

Quantitative comparisons: histograms

Ariel Rokem
Data Scientist
Histograms

A bar chart again
fig, ax = plt.subplots()
ax.bar("Rowing", mens_rowing["Height"].mean())
ax.bar("Gymnastics", mens_gymnastics["Height"].mean())
ax.set_ylabel("Height (cm)")
plt.show()

Introducing histograms
fig, ax = plt.subplots()
ax.hist(mens_rowing["Height"])
ax.hist(mens_gymnastics["Height"])
ax.set_xlabel("Height (cm)")
ax.set_ylabel("# of observations")
plt.show()

Labels are needed
fig, ax = plt.subplots()
ax.hist(mens_rowing["Height"], label="Rowing")
ax.hist(mens_gymnastics["Height"], label="Gymnastics")
ax.set_xlabel("Height (cm)")
ax.set_ylabel("# of observations")
ax.legend()
plt.show()

Customizing histograms: setting the number of bins
fig, ax = plt.subplots()
ax.hist(mens_rowing["Height"], label="Rowing", bins=5)
ax.hist(mens_gymnastics["Height"], label="Gymnastics", bins=5)
ax.set_xlabel("Height (cm)")
ax.set_ylabel("# of observations")
ax.legend()
plt.show()

Customizing histograms: setting bin boundaries
fig, ax = plt.subplots()
ax.hist(mens_rowing["Height"], label="Rowing",
        bins=[150, 160, 170, 180, 190, 200, 210])

ax.hist(mens_gymnastics["Height"], label="Gymnastics",
        bins=[150, 160, 170, 180, 190, 200, 210])

ax.set_xlabel("Height (cm)")
ax.set_ylabel("# of observations")
ax.legend()
plt.show()
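To see what a list of bin boundaries does, it can help to look at the counts that ax.hist returns. The sketch below uses a small set of made-up heights (hypothetical values, not the course data) with the same boundaries as above.

```python
import matplotlib.pyplot as plt

# Hypothetical heights in cm, made up for illustration (not the course data).
heights = [158, 163, 167, 171, 174, 176, 179, 183, 188, 195]
edges = [150, 160, 170, 180, 190, 200, 210]

fig, ax = plt.subplots()
# ax.hist returns the count in each bin, the bin edges, and the drawn patches.
counts, returned_edges, _ = ax.hist(heights, bins=edges, label="Example")
ax.set_xlabel("Height (cm)")
ax.set_ylabel("# of observations")
ax.legend()
# Call plt.show() to display the figure.
```

Seven boundaries define six bins, so `counts` has six entries. Passing the same boundaries to both histograms keeps their bars aligned, which makes the two groups directly comparable.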

Customizing histograms: transparency
fig, ax = plt.subplots()
ax.hist(mens_rowing["Height"], label="Rowing",
        bins=[150, 160, 170, 180, 190, 200, 210],
        histtype="step")

ax.hist(mens_gymnastics["Height"], label="Gymnastics",
        bins=[150, 160, 170, 180, 190, 200, 210],
        histtype="step")

ax.set_xlabel("Height (cm)")
ax.set_ylabel("# of observations")
ax.legend()
plt.show()
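The slide handles overlapping histograms by drawing outlines with histtype="step". Another option, sketched here as an alternative rather than the slide's method, is to keep filled bars and make them partly transparent with the alpha keyword. The heights and group names are made up for illustration.

```python
import matplotlib.pyplot as plt

# Hypothetical heights in cm, made up for illustration (not the course data).
group_a = [172, 178, 181, 185, 188, 192, 196]
group_b = [158, 162, 165, 167, 170, 173, 176]
edges = [150, 160, 170, 180, 190, 200, 210]

fig, ax = plt.subplots()
# alpha=0.5 makes each filled bar half transparent, so overlaps stay visible.
_, _, patches_a = ax.hist(group_a, bins=edges, alpha=0.5, label="Group A")
_, _, patches_b = ax.hist(group_b, bins=edges, alpha=0.5, label="Group B")
ax.set_xlabel("Height (cm)")
ax.set_ylabel("# of observations")
ax.legend()
# Call plt.show() to display the figure.
```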

Histogram with a histtype of step

Create your own histogram!

Statistical plotting

Ariel Rokem
Data Scientist
Adding error bars to bar charts
fig, ax = plt.subplots()

ax.bar("Rowing",
       mens_rowing["Height"].mean(),
       yerr=mens_rowing["Height"].std())

ax.bar("Gymnastics",
       mens_gymnastics["Height"].mean(),
       yerr=mens_gymnastics["Height"].std())

ax.set_ylabel("Height (cm)")

plt.show()
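The error bar drawn by yerr extends one standard deviation above and below the top of each bar. A minimal sketch with five made-up heights (hypothetical values) shows the two numbers involved. Note that the pandas .std() method computes the sample standard deviation (ddof=1) by default.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical heights in cm, made up for illustration (not the course data).
heights = pd.Series([180, 185, 190, 195, 200])

mean = heights.mean()  # bar height
std = heights.std()    # sample standard deviation (pandas default, ddof=1)

fig, ax = plt.subplots()
bars = ax.bar("Example", mean, yerr=std)  # error bar spans mean - std to mean + std
ax.set_ylabel("Height (cm)")
# Call plt.show() to display the figure.
```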

Error bars in a bar chart

Adding error bars to plots
fig, ax = plt.subplots()

ax.errorbar(seattle_weather["MONTH"],
            seattle_weather["MLY-TAVG-NORMAL"],
            yerr=seattle_weather["MLY-TAVG-STDDEV"])

ax.errorbar(austin_weather["MONTH"],
            austin_weather["MLY-TAVG-NORMAL"],
            yerr=austin_weather["MLY-TAVG-STDDEV"])

ax.set_ylabel("Temperature (Fahrenheit)")

plt.show()

Error bars in plots

Adding boxplots
fig, ax = plt.subplots()
ax.boxplot([mens_rowing["Height"],
            mens_gymnastics["Height"]])
ax.set_xticklabels(["Rowing", "Gymnastics"])
ax.set_ylabel("Height (cm)")

plt.show()
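With Matplotlib's default settings, a boxplot draws a line at the median, a box from the first to the third quartile, and whiskers that reach at most 1.5 times the interquartile range beyond the box. Points further out are drawn individually as outliers. The sketch below recomputes those quantities with NumPy for a made-up sample (hypothetical values) and compares them with what ax.boxplot draws.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical heights in cm, made up for illustration (not the course data).
data = [150, 170, 172, 174, 176, 178, 180, 182, 184, 210]

q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1                  # height of the box
low_limit = q1 - 1.5 * iqr     # whiskers cannot extend below this value
high_limit = q3 + 1.5 * iqr    # whiskers cannot extend above this value
outliers = [x for x in data if x < low_limit or x > high_limit]

fig, ax = plt.subplots()
result = ax.boxplot([data])
ax.set_xticklabels(["Example"])
ax.set_ylabel("Height (cm)")

# The points Matplotlib drew beyond the whiskers.
fliers = sorted(result["fliers"][0].get_ydata())
# Call plt.show() to display the figure.
```

For this sample the two extreme values fall outside the limits, so they appear as separate points rather than stretching the whiskers.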

Interpreting boxplots

Try it yourself!

Quantitative comparisons: scatter plots

Ariel Rokem
Data Scientist
Introducing scatter plots
fig, ax = plt.subplots()
ax.scatter(climate_change["co2"], climate_change["relative_temp"])
ax.set_xlabel("CO2 (ppm)")
ax.set_ylabel("Relative temperature (Celsius)")
plt.show()

Customizing scatter plots
eighties = climate_change["1980-01-01":"1989-12-31"]
nineties = climate_change["1990-01-01":"1999-12-31"]
fig, ax = plt.subplots()
ax.scatter(eighties["co2"], eighties["relative_temp"],
           color="red", label="eighties")
ax.scatter(nineties["co2"], nineties["relative_temp"],
           color="blue", label="nineties")
ax.legend()

ax.set_xlabel("CO2 (ppm)")
ax.set_ylabel("Relative temperature (Celsius)")

plt.show()
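Slicing by date strings as above relies on the DataFrame having a sorted DatetimeIndex. The following sketch builds a small stand-in table to show the selection on its own. The name `climate` and all values are made up for illustration, and it uses .loc for the row selection, which is an assumption about style rather than what the slide does.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical yearly readings, made up for illustration (not the course data).
years = range(1985, 1995)
climate = pd.DataFrame(
    {
        "co2": [346, 347, 349, 351, 353, 354, 355, 356, 357, 359],
        "relative_temp": [0.12, 0.18, 0.33, 0.40, 0.28,
                          0.44, 0.41, 0.23, 0.24, 0.31],
    },
    index=pd.to_datetime([f"{year}-06-01" for year in years]),
)

# Date strings select every row whose timestamp falls inside the range.
eighties = climate.loc["1980-01-01":"1989-12-31"]
nineties = climate.loc["1990-01-01":"1999-12-31"]

fig, ax = plt.subplots()
ax.scatter(eighties["co2"], eighties["relative_temp"],
           color="red", label="eighties")
ax.scatter(nineties["co2"], nineties["relative_temp"],
           color="blue", label="nineties")
ax.legend()
ax.set_xlabel("CO2 (ppm)")
ax.set_ylabel("Relative temperature (Celsius)")
# Call plt.show() to display the figure.
```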

Encoding a comparison by color

Encoding a third variable by color
fig, ax = plt.subplots()
ax.scatter(climate_change["co2"], climate_change["relative_temp"],
           c=climate_change.index)
ax.set_xlabel("CO2 (ppm)")
ax.set_ylabel("Relative temperature (Celsius)")
plt.show()
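When color encodes a third variable, a colorbar tells the reader which color corresponds to which value. A minimal sketch, assuming a plain numeric year is an acceptable stand-in for the time index (the years and readings below are made up for illustration):

```python
import matplotlib.pyplot as plt

# Hypothetical readings, made up for illustration (not the course data).
years = [1985, 1990, 1995, 2000, 2005, 2010]
co2 = [346, 354, 361, 370, 380, 390]
relative_temp = [0.12, 0.44, 0.45, 0.42, 0.68, 0.72]

fig, ax = plt.subplots()
# c= maps each point's year onto the default colormap.
points = ax.scatter(co2, relative_temp, c=years)
# The colorbar acts as the legend for the color scale.
colorbar = fig.colorbar(points, ax=ax, label="Year")
ax.set_xlabel("CO2 (ppm)")
ax.set_ylabel("Relative temperature (Celsius)")
# Call plt.show() to display the figure.
```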

Encoding time in color

Practice making your own scatter plots!