Code of the Day

Parsing JSON and CSV

Python's built-in json and csv modules turn raw text into dictionaries and lists you can filter, transform, and write back out.

Workflow · Beginner · 10 min read
By the end of this lesson you will be able to:
  • Parse a JSON string with json.loads() into a Python dict or list
  • Parse a CSV string with csv.DictReader into a list of dicts
  • Write Python data back to a JSON string with json.dumps()
  • Recognise that the same logical data looks different in each format

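Python ships with modules for both formats you need most: the json module and the csv module. Neither requires installation. Both follow the same basic pattern: text goes in, a Python data structure comes out. json.loads() takes a string directly; csv.DictReader takes a file-like object, so a string is wrapped in io.StringIO first.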

Parsing JSON

json.loads() converts a JSON string into a Python object. Objects become dicts; arrays become lists:

import json

raw = '{"name": "Alice", "score": 42, "tags": ["fast", "reliable"]}'
data = json.loads(raw)

print(data["name"])      # Alice
print(data["tags"][0])   # fast
print(type(data))        # <class 'dict'>
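
The mapping covers the other JSON types as well. The sketch below uses a made-up raw_record string (not part of the lesson's dataset) to show how numbers, true/false, and null arrive in Python, and how malformed input surfaces as json.JSONDecodeError:

```python
import json

# Hypothetical input that touches each JSON type.
raw_record = '{"id": 7, "ratio": 0.5, "active": true, "note": null, "tags": []}'
record = json.loads(raw_record)

print(type(record["id"]))      # <class 'int'>
print(type(record["ratio"]))   # <class 'float'>
print(record["active"])        # True
print(record["note"])          # None
print(record["tags"])          # []

# Malformed JSON raises json.JSONDecodeError (a subclass of ValueError).
try:
    json.loads("{'name': 'Alice'}")   # single quotes are not valid JSON
    parsed_ok = True
except json.JSONDecodeError as err:
    parsed_ok = False
    print("Invalid JSON:", err.msg)
```

Catching the error is usually worthwhile whenever the JSON comes from outside your own program.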

json.dumps() goes the other direction — Python dict to JSON string. The indent=2 argument makes the output human-readable:

result = {"status": "done", "count": 7}
print(json.dumps(result, indent=2))

Output:

{
  "status": "done",
  "count": 7
}
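
A round trip through json.dumps() and json.loads() is a quick way to see what survives serialisation. This is a minimal sketch with an illustrative dict (the scores tuple is an assumption added to show one difference): JSON has no tuple type, so the tuple comes back as a list.

```python
import json

# Illustrative data; the tuple is here to show what JSON does with it.
original = {"status": "done", "count": 7, "scores": (42, 35)}

text = json.dumps(original)        # compact form, no indent
restored = json.loads(text)

print(text)                  # {"status": "done", "count": 7, "scores": [42, 35]}
print(restored["scores"])    # [42, 35]

# sort_keys=True writes keys in alphabetical order, which keeps output stable.
sorted_text = json.dumps(original, sort_keys=True)
print(sorted_text)           # {"count": 7, "scores": [42, 35], "status": "done"}
```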

Parsing CSV

csv.DictReader reads CSV from a file object (or from a string wrapped in io.StringIO) and yields each row as a dictionary keyed by the column headers:

import csv
import io

raw = """name,score,status
Alice,42,done
Bob,35,pending
Carol,51,done"""

reader = csv.DictReader(io.StringIO(raw))
for row in reader:
    print(row)
# {'name': 'Alice', 'score': '42', 'status': 'done'}
# {'name': 'Bob', 'score': '35', 'status': 'pending'}
# {'name': 'Carol', 'score': '51', 'status': 'done'}
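
To get a list of dicts rather than printing row by row, the reader can be passed to list(). A small sketch using the same data; the variable name rows is just a convenient choice:

```python
import csv
import io

raw = """name,score,status
Alice,42,done
Bob,35,pending
Carol,51,done"""

reader = csv.DictReader(io.StringIO(raw))
rows = list(reader)            # consume the reader once and keep every row

print(reader.fieldnames)       # ['name', 'score', 'status']
print(len(rows))               # 3
print(rows[1]["name"])         # Bob
```

A reader can only be iterated once, so storing the rows in a list is the simplest option when you need to loop over them more than once.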

Notice that every value is a string — CSV has no type information. If you need score as an integer for arithmetic, convert it explicitly: int(row["score"]).

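With default settings, csv.DictReader gives you a string for every field. This trips up almost everyone the first time: in Python 3, row["score"] > 40 raises a TypeError because a string cannot be compared with an integer, and row["score"] > "40" compares the two strings lexicographically, not numerically (so "9" > "40" is True). Convert to int or float before comparing numbers.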
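
The difference is easy to see with a single value. The sketch below uses a hypothetical row (Dana is not in the lesson's dataset) with a score of "9". In Python 3, string-to-string comparison is lexicographic, while string-to-integer comparison raises TypeError:

```python
# Hypothetical row, shaped the way csv.DictReader would yield it.
row = {"name": "Dana", "score": "9"}

# Both sides are strings, so the comparison is character by character:
# "9" sorts after "4", which makes the result True even though 9 < 40.
as_text = row["score"] > "40"
print(as_text)                    # True

# Mixing a string with an integer is an error in Python 3.
try:
    row["score"] > 40
    mixed = "no error"
except TypeError:
    mixed = "TypeError"
print(mixed)                      # TypeError

# Converting first gives the numeric answer.
as_number = int(row["score"]) > 40
print(as_number)                  # False
```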

The same data in both formats

Here is the same three-record dataset expressed first as CSV, then as JSON. Both represent identical information; the format changes only the shape of the text:

CSV:

name,score,status
Alice,42,done
Bob,35,pending
Carol,51,done

JSON:

[
  {"name": "Alice", "score": 42, "status": "done"},
  {"name": "Bob",   "score": 35, "status": "pending"},
  {"name": "Carol", "score": 51, "status": "done"}
]

CSV is more compact for simple tables. JSON is unambiguous about types (42 is a number, not a string) and extends naturally if you later need to add nested fields.
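
The type difference can be checked directly by parsing both versions and comparing them. In this sketch the names csv_text, json_text, and typed are illustrative, and the conversion assumes the score column always holds whole numbers:

```python
import csv
import io
import json

csv_text = """name,score,status
Alice,42,done
Bob,35,pending
Carol,51,done"""

json_text = """[
  {"name": "Alice", "score": 42, "status": "done"},
  {"name": "Bob",   "score": 35, "status": "pending"},
  {"name": "Carol", "score": 51, "status": "done"}
]"""

from_csv = list(csv.DictReader(io.StringIO(csv_text)))
from_json = json.loads(json_text)

print(from_csv == from_json)    # False: '42' (str) is not 42 (int)

# Convert the one numeric column, leaving the other fields untouched.
typed = [{**row, "score": int(row["score"])} for row in from_csv]
print(typed == from_json)       # True
```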

Try it

Parse CSV, filter rows, and serialise the results as JSON:
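
One way this exercise might look, as a minimal sketch: it reuses the three-record dataset from earlier, and the choice to keep only rows whose status is done is an illustrative assumption rather than the lesson's required filter.

```python
import csv
import io
import json

raw = """name,score,status
Alice,42,done
Bob,35,pending
Carol,51,done"""

# 1. Parse: CSV text to a list of dicts (all values are strings).
rows = list(csv.DictReader(io.StringIO(raw)))

# 2. Filter and transform: keep finished records, convert score to int.
done = [
    {"name": row["name"], "score": int(row["score"])}
    for row in rows
    if row["status"] == "done"
]

# 3. Serialise: list of dicts to a JSON string.
output = json.dumps(done, indent=2)
print(output)
```

Try changing the condition in step 2, for example to int(row["score"]) > 40, and see how the output changes.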


This is the core loop of data wrangling: parse, filter/transform, serialise. The specific operations change; the loop stays the same.
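
The same loop works in the other direction. As a hedged sketch, the example below parses JSON, adds a derived column, and writes CSV with csv.DictWriter from the standard library; the two-record input and the pass mark of 40 are assumptions made up for illustration.

```python
import csv
import io
import json

# Illustrative input, not the lesson's dataset.
json_text = '[{"name": "Alice", "score": 42}, {"name": "Bob", "score": 35}]'

# 1. Parse
records = json.loads(json_text)

# 2. Transform: add a derived column (the pass mark of 40 is assumed).
for record in records:
    record["passed"] = record["score"] >= 40

# 3. Serialise: write the rows as CSV into an in-memory text buffer.
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=["name", "score", "passed"],
    lineterminator="\n",
)
writer.writeheader()
writer.writerows(records)

csv_text = buffer.getvalue()
print(csv_text)
```

Note that the booleans are written as the text True and False; reading this CSV back would give strings again.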

Where to go next

Next: lab — wrangle — read a CSV of product inventory, filter by stock level, and write the filtered results to JSON.
