Python
ggplot
Published

February 26, 2023

Python visualization with Altair

I have been looking for a Python visualization library that is similar to ggplot2 in R. I like the quick plots in pandas, but Matplotlib has never worked for me.

For customized visualizations I like the ggplot2 approach with layers. Its grammar of graphics provides a good balance between adjusting a visualization and productivity. The code with layers is nicely readable.

Vega-Altair, a Python library, also uses a grammar to create visualizations. So I gave it a try and it went well; see the code below.

pandas and Altair are likely to be the visualization tools that I use primarily in Python.

Here is some information on the penguins dataset 🐧 used in the visualization. I didn’t want to use mtcars 🚗 again.

import altair as alt
import numpy as np
import pandas as pd
import seaborn as sns

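# Load the penguins dataset as a pandas DataFrame
dt = sns.load_dataset("penguins")

# Scatter plot of flipper length vs. body mass, colored by species
(alt.Chart(dt)
    .mark_circle()
    .encode(
        # zero=False lets each axis start near the data instead of at 0
        alt.X('flipper_length_mm', scale=alt.Scale(zero=False)),
        alt.Y('body_mass_g', scale=alt.Scale(zero=False)),
        color='species',
        # Columns shown when hovering over a point
        tooltip=['species', 'island', 'sex']
    )
)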
dt
     species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
0     Adelie  Torgersen            39.1           18.7              181.0       3750.0    Male
1     Adelie  Torgersen            39.5           17.4              186.0       3800.0  Female
2     Adelie  Torgersen            40.3           18.0              195.0       3250.0  Female
3     Adelie  Torgersen             NaN            NaN                NaN          NaN     NaN
4     Adelie  Torgersen            36.7           19.3              193.0       3450.0  Female
...      ...        ...             ...            ...                ...          ...     ...
339   Gentoo     Biscoe             NaN            NaN                NaN          NaN     NaN
340   Gentoo     Biscoe            46.8           14.3              215.0       4850.0  Female
341   Gentoo     Biscoe            50.4           15.7              222.0       5750.0    Male
342   Gentoo     Biscoe            45.2           14.8              212.0       5200.0  Female
343   Gentoo     Biscoe            49.9           16.1              213.0       5400.0    Male

344 rows × 7 columns