Code of the Day
BeginnerExploring Data

Calculating stats

Use pandas .mean(), .median(), .std(), and .value_counts() to compute summary statistics on a Series and DataFrame.

Data ScienceBeginner8 min read
By the end of this lesson you will be able to:
  • Call .mean(), .median(), .std() on a numeric pandas Series
  • Use .value_counts() to count occurrences of each category
  • Apply column-level statistics to understand a DataFrame

The concepts from the previous lesson translate directly into method calls. Each statistic is one method call on a column (a Series). The code below builds a small dataset of online course completions and asks several questions of it.

Python — editable, runs in your browser

What each call returns

.mean() returns a single float — the arithmetic average. For score, this is the sum of all scores divided by 8.

.median() returns the middle value. With 8 students (even count), pandas averages the 4th and 5th sorted values. Compare it to the mean: if they are close, the distribution is roughly symmetric.

.std() returns the standard deviation. A value of around 13 here means that a typical student's score is about 13 points away from the mean — there is real spread in the class.

.value_counts() counts how many times each distinct value appears in a Series. Essential for categorical columns: here it tells you which track has the most students. It returns the counts sorted descending by default, so the most common category is first.

You can call .describe() to get mean, std, min, quartiles, and max all at once on a column: df["score"].describe(). Use individual methods when you want to embed a specific number in a calculation or comparison; use .describe() when you want a quick overview.

Applying stats to the whole DataFrame

Call .mean() or .median() on the whole DataFrame (not a single column) and pandas computes the statistic for every numeric column at once:

df[["score", "hours"]].mean()

This produces a Series with one value per column — useful for a quick overview when your DataFrame has many numeric columns.

Where to go next

Next: grouping concepts — the split-apply-combine pattern that lets you calculate statistics within groups, which is where most real analysis starts.

Finished reading? Mark it complete to track your progress.

On this page