Code of the Day

Seaborn statistical plots

Seaborn understands DataFrames and adds statistical awareness — learn when pairplot, heatmap, catplot, and boxplot each reveal insight.

Data Science · Intermediate · 6 min read
By the end of this lesson you will be able to:
  • Explain seaborn's positioning relative to matplotlib
  • Identify when pairplot, heatmap, catplot, and boxplot each provide useful insight
  • Describe what "statistical visualisation" means in practice

Matplotlib is a drawing library — it draws lines, rectangles, and text precisely where you tell it to. Seaborn is a statistical visualisation library. The difference is that seaborn understands DataFrames natively, knows about groups and categories, and applies sensible statistical defaults automatically.

When you call sns.boxplot(data=df, x="category", y="value"), seaborn groups the data by category, computes the box statistics, and draws a labelled plot, all in one line. The matplotlib equivalent requires splitting the values by group yourself, passing the resulting list of arrays to the lower-level ax.boxplot(), and setting the tick and axis labels by hand.
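
To make the comparison concrete, here is a minimal sketch that draws the same grouped box plot both ways. The toy DataFrame is hypothetical (only the column names come from the call above), and the Agg backend is chosen only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data; the column names mirror the call in the text.
df = pd.DataFrame({
    "category": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "value": [1.0, 2.0, 3.0, 4.0, 2.0, 4.0, 6.0, 8.0],
})

fig, (ax_sns, ax_mpl) = plt.subplots(1, 2, figsize=(8, 3))

# Seaborn: grouping, box statistics, and axis labels all come from the DataFrame.
sns.boxplot(data=df, x="category", y="value", ax=ax_sns)

# Matplotlib: split the values by group yourself, then label everything by hand.
labels = sorted(df["category"].unique())
groups = [df.loc[df["category"] == label, "value"].to_numpy() for label in labels]
ax_mpl.boxplot(groups)  # boxes are drawn at x positions 1, 2, ...
ax_mpl.set_xticks(range(1, len(labels) + 1))
ax_mpl.set_xticklabels(labels)
ax_mpl.set_xlabel("category")
ax_mpl.set_ylabel("value")
```

Both panels show the same boxes; the difference is how much bookkeeping the caller has to do.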

Seaborn draws on matplotlib under the hood. Everything it creates is a matplotlib figure, so you can always retrieve the axes and customise with standard ax.set_title() calls.

Four plots worth knowing

pairplot — multi-variable overview

sns.pairplot(df) takes a DataFrame and draws a grid where each numeric column is plotted against every other. The diagonal shows each column's distribution; the off-diagonal cells show scatter plots for each pair. It is the fastest way to see all bivariate relationships in a dataset at once — a standard first step in exploratory analysis.

Use it when you have four to ten numeric columns and want a broad overview. With more than ten columns the grid becomes unreadably dense.

heatmap — correlations and matrices

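sns.heatmap(df.corr()) converts a correlation matrix into a colour-coded grid. Seaborn's default colormap is sequential, so for correlations it helps to pass a diverging one, for example cmap="coolwarm" with center=0. Strong positive correlations then appear in one colour (red), strong negative correlations in the other (blue), and near-zero correlations are neutral. At a glance you can see which features move together. If the DataFrame contains non-numeric columns, select the numeric ones first, for example with df.select_dtypes("number").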

Heatmaps also work for any matrix of values — for example, a pivot table of sales by product and region.
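
The sketch below shows both uses side by side: a correlation matrix and a pivot table. The data, column names, and the choice of the "coolwarm" colormap are assumptions made for this example, not part of the lesson's dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical numeric features with known relationships to "x".
rng = np.random.default_rng(0)
x = np.arange(20, dtype=float)
features = pd.DataFrame({
    "x": x,
    "up": 2 * x + 1,               # perfectly positively correlated with x
    "down": -x,                    # perfectly negatively correlated with x
    "noise": rng.normal(size=20),  # unrelated random values
})
corr = features.corr()

# Hypothetical sales records, pivoted into a product-by-region matrix.
sales = pd.DataFrame({
    "product": ["widget", "widget", "gadget", "gadget", "widget"],
    "region": ["north", "south", "north", "south", "north"],
    "sales": [100.0, 80.0, 60.0, 90.0, 20.0],
})
table = sales.pivot_table(index="product", columns="region",
                          values="sales", aggfunc="sum")

fig, (ax_corr, ax_sales) = plt.subplots(1, 2, figsize=(10, 4))

# A diverging colormap centred on zero separates positive from negative values.
sns.heatmap(corr, cmap="coolwarm", vmin=-1, vmax=1, center=0,
            annot=True, ax=ax_corr)

# Any matrix of values works; annot=True writes each value into its cell.
sns.heatmap(table, annot=True, fmt=".0f", ax=ax_sales)
```

Fixing vmin and vmax to the range of a correlation coefficient keeps the colours comparable between datasets.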

catplot — grouped comparisons

sns.catplot(data=df, x="category", y="value", kind="bar") is a high-level interface for categorical plots. Setting kind= to "bar", "box", "strip", or "violin" switches the plot type without changing anything else. When you also set hue= to a second categorical column, seaborn automatically splits each group into coloured sub-bars or sub-boxes.
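
A minimal sketch of that switch, assuming a hypothetical DataFrame with one numeric column and two categorical ones (the "segment" column and all values are invented here):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: 40 rows, two categories, two segments.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "category": np.tile(["a", "b"], 20),
    "segment": np.repeat(["new", "old"], 20),
    "value": rng.normal(10, 2, 40),
})

grids = {}
for kind in ("bar", "box", "strip", "violin"):
    # Only kind= changes between calls; hue= splits each category by segment.
    grid = sns.catplot(data=df, x="category", y="value",
                       hue="segment", kind=kind)
    grids[kind] = grid
    plt.close(grid.figure)  # each catplot call creates its own figure
```

Each call returns a seaborn FacetGrid rather than a single Axes, because catplot manages a whole figure.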

boxplot — distributions and outliers

sns.boxplot(data=df, x="group", y="value") is the most direct way to compare distributions across groups. The box spans the interquartile range (IQR), each whisker extends to the most extreme data point that lies within 1.5× IQR of the box edge, and points beyond that are plotted individually as outliers. It is the right first tool when you suspect outliers or skewed distributions in a grouped dataset.
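
The arithmetic behind that convention can be worked through by hand. The sketch below uses NumPy on a made-up sample; it illustrates the 1.5× IQR rule described above and is not a copy of seaborn's internal code.

```python
import numpy as np

# Made-up sample with one obviously extreme value.
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 30], dtype=float)

# Box: first quartile, median, third quartile.
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1

# Fences sit 1.5 x IQR beyond the box edges.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

# Whiskers stop at the most extreme data points inside the fences.
inside = values[(values >= low_fence) & (values <= high_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

# Anything outside the fences is drawn as an individual outlier point.
outliers = values[(values < low_fence) | (values > high_fence)]

print(q1, median, q3)             # 3.25 5.5 7.75
print(whisker_low, whisker_high)  # 1.0 9.0
print(outliers)                   # [30.]
```

Note that the upper whisker ends at 9, the largest observed value inside the fence, not at the fence value of 14.5 itself.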

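Axes-level seaborn functions such as sns.boxplot() and sns.heatmap() return the matplotlib Axes they drew on, so afterwards you can call plt.title() or plt.xlabel(), or retrieve the current axes with plt.gca() and use any ax. method on it. Figure-level functions such as sns.pairplot() and sns.catplot() instead return a seaborn grid object (PairGrid or FacetGrid) that wraps a matplotlib figure and exposes its axes through the grid's axes attribute.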
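
A short sketch of customising both kinds of return value, using a hypothetical DataFrame. It assumes seaborn 0.11.2 or later, where grid objects expose a figure attribute.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.axes import Axes
import pandas as pd
import seaborn as sns

# Hypothetical data; the column names mirror the boxplot call in the text.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "value": [1.0, 2.0, 4.0, 3.0, 5.0, 9.0],
})

# Axes-level function: the return value is a plain matplotlib Axes.
ax = sns.boxplot(data=df, x="group", y="value")
ax.set_title("Value by group")
same_axes = ax is plt.gca()  # the plot went onto the current axes

# Figure-level function: the return value is a seaborn FacetGrid.
grid = sns.catplot(data=df, x="group", y="value", kind="box")
grid.set_axis_labels("Group", "Value")       # grid-level helper
grid.ax.set_title("Value by group (catplot)")  # the single underlying Axes
grid.figure.suptitle("Figure-level title")     # the underlying matplotlib Figure
```

With a multi-facet grid there is no single grid.ax, so iterate over grid.axes.flat instead.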

When to use each

Plot      Best for
pairplot  First-look overview of all numeric columns
heatmap   Correlation matrix or any value grid
catplot   Comparing a numeric value across categories
boxplot   Spread, skew, and outliers within groups

Where to go next

Next: seaborn in practice — running pairplot, heatmap, and catplot on an inline DataFrame so you can see exactly what each produces.
