Code of the Day
IntermediateSorting and Applications

Python's sort

Understand timsort, key= for custom sort orders, sort stability, and operator.itemgetter for efficient dict and object sorting.

CS FundamentalsIntermediate6 min read
By the end of this lesson you will be able to:
  • Explain timsort as a merge-plus-insertion hybrid and why it is fast in practice
  • Use key= to sort by arbitrary criteria
  • Understand sort stability and why it matters
  • Use operator.itemgetter and operator.attrgetter for concise sort keys

Python's sort algorithm is timsort — a hybrid of merge sort and insertion sort, designed to perform well on real-world data that is often partially ordered. Understanding what it does (and the key= parameter) makes you significantly more effective at data manipulation in Python.

Timsort: faster in practice

Timsort scans the input for runs — sequences that are already sorted ascending or descending. If it finds a descending run, it reverses it. It then merges runs together using a merge strategy similar to merge sort.

On random data, timsort is O(n log n) — the same as a pure merge sort. But on real data (logs sorted by timestamp with only recent additions, nearly-sorted configuration lists, already-sorted sub-sequences) it is dramatically faster. Insertion sort is used for small runs (fewer than ~32 elements) because its low overhead beats merge sort at small sizes.

The practical upshot: you almost never need to implement your own sort in Python. sorted() and list.sort() are fast enough for nearly all use cases.

The key= parameter

key accepts a function; Python calls it on each element before comparing. The element's position in the sorted output is determined by its key value.

words = ['banana', 'fig', 'apple', 'cherry', 'date']

# Sort by length
by_length = sorted(words, key=len)
# ['fig', 'date', 'apple', 'banana', 'cherry']

# Sort by last character
by_last = sorted(words, key=lambda w: w[-1])
# ['banana', 'apple', 'fig', 'date', 'cherry']

# Sort descending: negate a numeric key or use reverse=True
by_length_desc = sorted(words, key=len, reverse=True)
# ['banana', 'cherry', 'apple', 'date', 'fig']

Python computes each key once and caches it during the sort — you do not pay for repeated calls. This is the "Schwartzian transform" built in.

Sort stability

Python's sort is stable: if two elements compare as equal (their keys are equal), they retain their original relative order in the output.

This matters when you sort by multiple criteria in two passes. Sort by secondary key first, then by primary key. Because the sort is stable, elements with equal primary keys preserve the secondary-key order from the first pass.

students = [
    {'name': 'Alice', 'grade': 90},
    {'name': 'Bob',   'grade': 85},
    {'name': 'Carol', 'grade': 90},
]
# Sort by grade descending, ties broken by name ascending:
# Pass 1: sort by name (secondary key)
students.sort(key=lambda s: s['name'])
# Pass 2: stable sort by grade descending
students.sort(key=lambda s: s['grade'], reverse=True)
# Alice and Carol both have grade 90; Alice comes first (alphabetically)

Alternatively, sort in one pass with a tuple key (covered in the next lesson).

operator.itemgetter and operator.attrgetter

lambda works, but operator.itemgetter and operator.attrgetter are slightly faster (implemented in C) and more readable for common cases:

from operator import itemgetter, attrgetter

records = [{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]
records.sort(key=itemgetter('age'))          # sort dicts by 'age'

# Multi-key: sort by 'grade' then 'name'
records.sort(key=itemgetter('grade', 'name'))

# For objects with attributes:
class Student:
    def __init__(self, name, gpa):
        self.name = name
        self.gpa = gpa

students.sort(key=attrgetter('gpa', 'name'))

itemgetter and attrgetter accept multiple arguments. The result is a tuple key, which Python compares element by element — this is exactly the "multi-key sort in one pass" pattern. Use them when you sort the same key repeatedly: they are slightly faster than a lambda because they avoid Python function-call overhead.

Where to go next

Next: custom sorting — sorting a list of dicts by multiple keys, using functools.cmp_to_key for a comparison function, all runnable in the browser.

Finished reading? Mark it complete to track your progress.

On this page