How to Convert JSON to CSV in Python (Complete Guide)

What is JSON to CSV Conversion?

csv.DictWriter — Convert JSON to CSV Without Pandas

Handling Non-Standard Types: datetime, UUID, and Decimal

csv.DictWriter Parameters Reference

pandas — Convert JSON to CSV with DataFrames

Flattening Nested JSON with json_normalize

DataFrame.to_csv() Parameters Reference

Convert JSON to CSV from a File and API Response

File on Disk — Read, Convert, Save

HTTP API Response — Fetch and Convert

Command-Line JSON to CSV Conversion

High-Performance Alternative — pandas with pyarrow

Working with Large JSON Files

Streaming JSON Array to CSV with ijson

NDJSON / JSON Lines — One Object Per Line

Common Mistakes

❌ Missing newline='' in open() — blank rows on Windows

❌ Forgetting index=False in pandas to_csv()

❌ Using records[0].keys() when records have inconsistent keys

❌ Writing nested dicts directly to CSV without flattening

csv.DictWriter vs pandas — Quick Comparison

Frequently Asked Questions

How do I convert JSON to CSV in Python without pandas?

How do I handle nested JSON when converting to CSV?

Why does my CSV have blank rows between data rows on Windows?

How do I append JSON records to an existing CSV file?

What is the fastest way to convert a large JSON file to CSV in Python?

Can I write CSV output to stdout instead of a file in Python?

Almost every data pipeline eventually hits the same step: an API returns JSON, but the next consumer — a spreadsheet, an import script, a Redshift COPY command — needs CSV. Converting JSON to CSV in Python sounds trivial until you hit nested objects, inconsistent keys, or datetime values that need special handling. Python gives you two solid paths: the built-in json + csv modules for zero-dependency scripts, and pandas for nested flattening and larger datasets — or the online JSON to CSV converter for quick one-off conversions without any code. This guide covers both approaches end to end, with runnable Python 3.8+ examples.

What is JSON to CSV Conversion?

JSON to CSV conversion transforms an array of JSON objects into a tabular format where each object becomes a row and each key becomes a column header. JSON is hierarchical — objects can nest arbitrarily deep. CSV is flat — every value sits in a row-column grid. The conversion works cleanly when every object shares the same set of top-level keys. Nested objects, arrays, and inconsistent keys are where things get interesting.

csv.DictWriter — Convert JSON to CSV Without Pandas

The csv module ships with every Python installation. No pip install, no virtual environment gymnastics. csv.DictWriter takes a list of dictionaries and writes each one as a CSV row, mapping dict keys to column headers. The fieldnames parameter controls both the column order and which keys get included.

That newline="" argument on open() is not optional on Windows. Without it, you get double carriage returns, which show up as blank rows between every data row in Excel. On macOS and Linux it is harmless, so always include it.

The minimal example uses json.loads() for a string. Use json.load() (no trailing s) when reading from a file handle. This trips people up constantly — one reads a string, the other reads a file object.

Setting extrasaction="ignore" silently drops any keys in the dicts that are not in your fieldnames list.
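Both extrasaction modes can be sketched with an in-memory buffer (the field names here are illustrative):

```python
import csv
import io

record = {"id": 1, "name": "Ada", "debug_flag": True}  # "debug_flag" is not a wanted column

# extrasaction="ignore": the extra key is silently dropped
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "name"], extrasaction="ignore")
writer.writeheader()
writer.writerow(record)
print(buf.getvalue())  # id,name / 1,Ada

# extrasaction="raise" (the default): the extra key triggers ValueError
strict = csv.DictWriter(io.StringIO(), fieldnames=["id", "name"])
try:
    strict.writerow(record)
except ValueError as exc:
    print(exc)  # dict contains fields not in fieldnames
```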
The default is "raise", which throws a ValueError if any dict has an unexpected key. Pick whichever matches your tolerance for surprises.

Note: csv.DictWriter vs csv.writer: DictWriter maps dict keys to column positions automatically. csv.writer writes raw lists as rows — you handle the column ordering yourself. DictWriter is almost always the right choice for JSON-to-CSV because JSON records are already dictionaries.

Python's csv module ships with three named dialects: excel (comma delimiter, CRLF line endings — the default), excel-tab (tab delimiter, CRLF endings), and unix (LF line endings, quotes all non-numeric fields). For most JSON-to-CSV workflows the excel dialect is correct, but switch to unix when writing files that will be processed by POSIX tools like awk or sort.

Handling Non-Standard Types: datetime, UUID, and Decimal

JSON from APIs often contains dates as ISO strings, UUIDs as hyphenated strings, and monetary values as floats. When you parse these into Python objects for processing before writing CSV, you need to convert them back to strings. The csv module calls str() on every value, so most types just work. But datetime objects produce messy default string representations, and Decimal values need explicit formatting to avoid scientific notation.

Warning: If you let Decimal values pass through without formatting, very small or very large numbers may render in scientific notation (e.g., 1.5E+7). Always format Decimal with an explicit f-string like f"{value:.2f}" when writing financial data to CSV.

csv.DictWriter Parameters Reference

The full constructor signature is csv.DictWriter(f, fieldnames, restval="", extrasaction="raise", dialect="excel", **fmtparams). Most of these have sensible defaults. The ones you will actually change are fieldnames, delimiter, and extrasaction.

pandas — Convert JSON to CSV with DataFrames

If you are already working in a pandas-heavy codebase, or your JSON has nested objects that you need to flatten, the pandas approach is significantly less code than the stdlib version. The tradeoff: pandas is a ~30 MB dependency. For a throwaway script, that is fine.
For a Docker image you ship to production, the stdlib approach keeps things lighter.

The index=False flag is one of those things you look up every single time. Without it, pandas writes a 0, 1, 2, ... column as the first column of your CSV. Nobody wants that.

Flattening Nested JSON with json_normalize

Real API responses are rarely flat. Orders contain shipping addresses, users contain nested preferences, telemetry events contain nested metadata. pd.json_normalize() walks nested dictionaries and flattens them into columns with dot-separated names.

The sep="_" parameter controls how nested key names are joined. The default is ".", which produces columns like customer.name. I prefer underscores because dots in column names cause trouble with SQL imports and some spreadsheet formulas.

For API responses that wrap the records array under a nested key, use the record_path parameter. If the response looks like {"data": {"orders": [...]}}, pass record_path=["data", "orders"] to navigate to the right list.

DataFrame.to_csv() Parameters Reference

DataFrame.to_csv() has over 20 parameters. These are the ones that matter for JSON-to-CSV workflows.

Convert JSON to CSV from a File and API Response

The two most common real-world scenarios: reading JSON from a file on disk and converting it, or fetching JSON from an HTTP API and saving the result as CSV.

Note: The API fetch example uses urllib from the standard library to keep the script dependency-free. If you have requests installed, replace the urllib section with resp = requests.get(url, timeout=30); records = resp.json() — the rest of the CSV writing code stays identical.

Command-Line JSON to CSV Conversion

Sometimes you just need a one-liner in the terminal. Python's -c flag lets you run a quick conversion without creating a script file.

Miller (mlr) is a standalone binary that treats JSON, CSV, and TSV as first-class formats with no Python runtime required. The --json2csv flag converts JSON input to CSV in a single pass. Install via Homebrew on macOS (brew install miller) or your Linux package manager.
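The record_path navigation mentioned earlier can be sketched as follows, assuming pandas is available (the wrapped response shape is invented for the example):

```python
import pandas as pd

# API response that wraps the records under data.orders
response = {
    "data": {
        "orders": [
            {"order_id": "ord_91a3", "total": 299.95},
            {"order_id": "ord_b7f2", "total": 87.50},
        ]
    }
}

# record_path walks into the nested structure to find the list of records
df = pd.json_normalize(response, record_path=["data", "orders"])
print(df.columns.tolist())  # ['order_id', 'total']
```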
High-Performance Alternative — pandas with pyarrow

For datasets in the tens-of-millions-of-rows range, pandas with the pyarrow backend reads and writes significantly faster than the default. If you are processing more than a few hundred MB of JSON and the final consumer accepts Parquet, skip CSV entirely. Parquet is smaller, preserves column types, and both Redshift and BigQuery load it natively. CSV is a lossy format — every value becomes a string.

Working with Large JSON Files

json.load() reads the entire file into memory. For a 200 MB JSON file, that means ~200 MB of raw text plus the Python object overhead — easily 500 MB+ of heap usage.

Streaming JSON Array to CSV with ijson

For files over 100 MB, stream the input with ijson and write CSV rows as you go.

NDJSON / JSON Lines — One Object Per Line

NDJSON (Newline-Delimited JSON), also called JSON Lines or .jsonl, stores one valid JSON object per line with no wrapping array. This format is common in log pipelines, event streams (Kafka, Kinesis), and bulk exports from services like Elasticsearch and BigQuery.

Note: Switch to streaming when the JSON file exceeds 100 MB. A 1 GB JSON array loaded with json.load() can consume 3–5 GB of RAM due to Python object overhead. With ijson, memory stays flat regardless of file size. If you just need a quick conversion of a small file, paste it into the JSON to CSV converter instead.

Common Mistakes

❌ Missing newline='' in open() — blank rows on Windows

Problem: The csv module writes \r\n line endings. Without newline='', Python's text mode adds another \r on Windows, producing double-spaced output.

Fix: Always pass newline="" when opening a file for CSV writing. It is harmless on macOS/Linux.

❌ Forgetting index=False in pandas to_csv()

Problem: Without index=False, pandas prepends an auto-incrementing row number column (0, 1, 2, ...) that pollutes the CSV with data that was never in the original JSON.

Fix: Pass index=False to to_csv().

❌ Using records[0].keys() when records have inconsistent keys

Problem: If JSON objects have different keys (some records have optional fields), using the first record's keys as fieldnames silently drops columns that only appear in later records.

Fix: Collect all unique keys across all records before creating the DictWriter.
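The fix can be sketched as a small helper; dict.fromkeys keeps first-seen order while deduplicating (record contents here are illustrative):

```python
import csv
import io

records = [
    {"id": 1, "name": "build"},
    {"id": 2, "name": "deploy", "duration_s": 42},  # optional key appears later
]

# Union of all keys in first-seen order, not just records[0].keys()
fieldnames = list(dict.fromkeys(k for r in records for k in r))

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=fieldnames, restval="")  # restval fills missing cells
writer.writeheader()
writer.writerows(records)
print(buf.getvalue())
# id,name,duration_s
# 1,build,
# 2,deploy,42
```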
❌ Writing nested dicts directly to CSV without flattening

Problem: csv.DictWriter calls str() on nested dicts, producing columns with values like "{'city': 'Portland'}" — raw Python repr, not actual data.

Fix: Flatten nested objects first using pd.json_normalize() or a custom flattening function.

csv.DictWriter vs pandas — Quick Comparison

Use csv.DictWriter when you need zero dependencies, your JSON is flat, and the script runs in a restricted environment (CI containers, Lambda functions, embedded Python). Use pd.json_normalize() + to_csv() when the JSON is nested, you need to transform or filter data before export, or you are already in a pandas workflow. For files that do not fit in memory, combine ijson with csv.DictWriter for constant-memory streaming. For quick, no-code conversions, the JSON to CSV converter on ToolDeck handles it without any Python setup.

Frequently Asked Questions

How do I convert JSON to CSV in Python without pandas?

Use the built-in json and csv modules. Call json.load() to parse the JSON file into a list of dicts, extract fieldnames from the first dict's keys, create a csv.DictWriter, call writeheader(), then writerows(). This approach has zero external dependencies and works in any Python 3.x environment. If your JSON objects have inconsistent keys across records, collect all unique keys first with dict.fromkeys(k for r in records for k in r) before passing them as fieldnames to avoid missing columns.

How do I handle nested JSON when converting to CSV?

Flat JSON arrays map directly to CSV rows, but nested objects need flattening first. With pandas, pd.json_normalize() handles this automatically — it joins nested keys with a dot separator (e.g., "address.city"). Without pandas, write a recursive function that walks the dict and concatenates keys with a delimiter.

Why does my CSV have blank rows between data rows on Windows?

The csv module writes \r\n line endings by default. On Windows, opening the file in text mode adds another \r, producing \r\r\n — which displays as a blank row. The fix is to always pass newline="" to open().

How do I append JSON records to an existing CSV file?

Open the file in append mode ("a") and create a DictWriter with the same fieldnames. Skip writeheader() since the header row already exists.
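A sketch of the append pattern (file name and fields are illustrative; the first block only exists to make the example self-contained):

```python
import csv

fieldnames = ["order_id", "total"]

# Existing CSV with a header row, created here for the demo
with open("orders.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerow({"order_id": "ord_91a3", "total": 149.99})

# Append later records: mode "a", same fieldnames, no writeheader()
with open("orders.csv", "a", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writerow({"order_id": "ord_d4c7", "total": 12.00})
```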
What is the fastest way to convert a large JSON file to CSV in Python?

For files under 500 MB, pd.read_json() followed by to_csv() is the fastest single-call approach — pandas uses optimized C code internally. For files above 500 MB, use ijson to stream JSON records and write them to CSV with csv.DictWriter row by row. For NDJSON files, a plain Python for loop over the file handle achieves constant memory without any third-party library.

Can I write CSV output to stdout instead of a file in Python?

Yes. Pass sys.stdout as the file object to csv.writer() or csv.DictWriter(). With pandas, call to_csv(sys.stdout, index=False), or call to_csv(None) to get a string you can print.
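A sketch of the stdout pattern with csv.DictWriter; the hypothetical write_csv helper accepts any writable text stream, so the same code serves files, stdout, or an in-memory buffer:

```python
import csv
import sys

def write_csv(records, stream):
    """Write a list of dicts as CSV to any writable text stream."""
    writer = csv.DictWriter(stream, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)

rows = [{"id": 1, "status": "ok"}, {"id": 2, "status": "retry"}]
write_csv(rows, sys.stdout)  # CSV goes straight to stdout, ready for piping
```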

Code Examples

[{"order_id":"ord_91a3","total":149.99,"status":"shipped"}, {"order_id":"ord_b7f2","total":34.50,"status":"pending"}] [{"order_id":"ord_91a3","total":149.99,"status":"shipped"}, {"order_id":"ord_b7f2","total":34.50,"status":"pending"}] [{"order_id":"ord_91a3","total":149.99,"status":"shipped"}, {"order_id":"ord_b7f2","total":34.50,"status":"pending"}] order_id,total,status ord_91a3,149.99,shipped ord_b7f2,34.50,pending order_id,total,status ord_91a3,149.99,shipped ord_b7f2,34.50,pending order_id,total,status ord_91a3,149.99,shipped ord_b7f2,34.50,pending # Python 3.8+ — minimal json to csv example import json import csv # Sample JSON data — an array of order objects json_string = """ [ {"order_id": "ord_91a3", "product": "Wireless Keyboard", "quantity": 2, "unit_price": 74.99}, {"order_id": "ord_b7f2", "product": "USB-C Hub", "quantity": 1, "unit_price": 34.50}, {"order_id": "ord_c4e8", "product": "Monitor Stand", "quantity": 3, "unit_price": 29.95} ] """ records = json.loads(json_string) with open("orders.csv", "w", newline="", encoding="utf-8") as csvfile: writer = csv.DictWriter(csvfile, fieldnames=records[0].keys()) writer.writeheader() writer.writerows(records) # orders.csv: # order_id,product,quantity,unit_price # ord_91a3,Wireless Keyboard,2,74.99 # ord_b7f2,USB-C Hub,1,34.50 # ord_c4e8,Monitor Stand,3,29.95 # Python 3.8+ — minimal json to csv example import json import csv # Sample JSON data — an array of order objects json_string = """ [ {"order_id": "ord_91a3", "product": "Wireless Keyboard", "quantity": 2, "unit_price": 74.99}, {"order_id": "ord_b7f2", "product": "USB-C Hub", "quantity": 1, "unit_price": 34.50}, {"order_id": "ord_c4e8", "product": "Monitor Stand", "quantity": 3, "unit_price": 29.95} ] """ records = json.loads(json_string) with open("orders.csv", "w", newline="", encoding="utf-8") as csvfile: writer = csv.DictWriter(csvfile, fieldnames=records[0].keys()) writer.writeheader() writer.writerows(records) # orders.csv: # 
order_id,product,quantity,unit_price # ord_91a3,Wireless Keyboard,2,74.99 # ord_b7f2,USB-C Hub,1,34.50 # ord_c4e8,Monitor Stand,3,29.95 # Python 3.8+ — minimal json to csv example import json import csv # Sample JSON data — an array of order objects json_string = """ [ {"order_id": "ord_91a3", "product": "Wireless Keyboard", "quantity": 2, "unit_price": 74.99}, {"order_id": "ord_b7f2", "product": "USB-C Hub", "quantity": 1, "unit_price": 34.50}, {"order_id": "ord_c4e8", "product": "Monitor Stand", "quantity": 3, "unit_price": 29.95} ] """ records = json.loads(json_string) with open("orders.csv", "w", newline="", encoding="utf-8") as csvfile: writer = csv.DictWriter(csvfile, fieldnames=records[0].keys()) writer.writeheader() writer.writerows(records) # orders.csv: # order_id,product,quantity,unit_price # ord_91a3,Wireless Keyboard,2,74.99 # ord_b7f2,USB-C Hub,1,34.50 # ord_c4e8,Monitor Stand,3,29.95 # Python 3.8+ — read JSON file, write CSV file import json import csv with open("server_metrics.json", encoding="utf-8") as jf: metrics = json.load(jf) # json.load() for file objects # Explicit fieldnames control column order columns = ["timestamp", "hostname", "cpu_percent", "memory_mb", "disk_io_ops"] with open("server_metrics.csv", "w", newline="", encoding="utf-8") as cf: writer = csv.DictWriter(cf, fieldnames=columns, extrasaction="ignore") writer.writeheader() writer.writerows(metrics) # Only the five specified columns appear, in exactly that order # Python 3.8+ — read JSON file, write CSV file import json import csv with open("server_metrics.json", encoding="utf-8") as jf: metrics = json.load(jf) # json.load() for file objects # Explicit fieldnames control column order columns = ["timestamp", "hostname", "cpu_percent", "memory_mb", "disk_io_ops"] with open("server_metrics.csv", "w", newline="", encoding="utf-8") as cf: writer = csv.DictWriter(cf, fieldnames=columns, extrasaction="ignore") writer.writeheader() writer.writerows(metrics) # Only the five specified 
columns appear, in exactly that order # Python 3.8+ — read JSON file, write CSV file import json import csv with open("server_metrics.json", encoding="utf-8") as jf: metrics = json.load(jf) # json.load() for file objects # Explicit fieldnames control column order columns = ["timestamp", "hostname", "cpu_percent", "memory_mb", "disk_io_ops"] with open("server_metrics.csv", "w", newline="", encoding="utf-8") as cf: writer = csv.DictWriter(cf, fieldnames=columns, extrasaction="ignore") writer.writeheader() writer.writerows(metrics) # Only the five specified columns appear, in exactly that order # Python 3.8+ — pre-process datetime and Decimal before CSV write import json import csv from datetime import datetime, timezone from decimal import Decimal from uuid import UUID # Simulating parsed API response with Python types transactions = [ { "txn_id": UUID("a1b2c3d4-e5f6-7890-abcd-ef1234567890"), "created_at": datetime(2026, 3, 15, 9, 30, 0, tzinfo=timezone.utc), "amount": Decimal("1249.99"), "currency": "USD", "merchant": "CloudHost Inc.", }, { "txn_id": UUID("b2c3d4e5-f6a7-8901-bcde-f12345678901"), "created_at": datetime(2026, 3, 15, 14, 12, 0, tzinfo=timezone.utc), "amount": Decimal("87.50"), "currency": "EUR", "merchant": "DataSync GmbH", }, ] def prepare_row(record: dict) -> dict: """Convert non-string types to CSV-friendly strings.""" return { "txn_id": str(record["txn_id"]), "created_at": record["created_at"].isoformat(), "amount": f"{record['amount']:.2f}", "currency": record["currency"], "merchant": record["merchant"], } with open("transactions.csv", "w", newline="", encoding="utf-8") as f: writer = csv.DictWriter(f, fieldnames=["txn_id", "created_at", "amount", "currency", "merchant"]) writer.writeheader() for txn in transactions: writer.writerow(prepare_row(txn)) # transactions.csv: # txn_id,created_at,amount,currency,merchant # a1b2c3d4-e5f6-7890-abcd-ef1234567890,2026-03-15T09:30:00+00:00,1249.99,USD,CloudHost Inc. 
# b2c3d4e5-f6a7-8901-bcde-f12345678901,2026-03-15T14:12:00+00:00,87.50,EUR,DataSync GmbH # Python 3.8+ — pre-process datetime and Decimal before CSV write import json import csv from datetime import datetime, timezone from decimal import Decimal from uuid import UUID # Simulating parsed API response with Python types transactions = [ { "txn_id": UUID("a1b2c3d4-e5f6-7890-abcd-ef1234567890"), "created_at": datetime(2026, 3, 15, 9, 30, 0, tzinfo=timezone.utc), "amount": Decimal("1249.99"), "currency": "USD", "merchant": "CloudHost Inc.", }, { "txn_id": UUID("b2c3d4e5-f6a7-8901-bcde-f12345678901"), "created_at": datetime(2026, 3, 15, 14, 12, 0, tzinfo=timezone.utc), "amount": Decimal("87.50"), "currency": "EUR", "merchant": "DataSync GmbH", }, ] def prepare_row(record: dict) -> dict: """Convert non-string types to CSV-friendly strings.""" return { "txn_id": str(record["txn_id"]), "created_at": record["created_at"].isoformat(), "amount": f"{record['amount']:.2f}", "currency": record["currency"], "merchant": record["merchant"], } with open("transactions.csv", "w", newline="", encoding="utf-8") as f: writer = csv.DictWriter(f, fieldnames=["txn_id", "created_at", "amount", "currency", "merchant"]) writer.writeheader() for txn in transactions: writer.writerow(prepare_row(txn)) # transactions.csv: # txn_id,created_at,amount,currency,merchant # a1b2c3d4-e5f6-7890-abcd-ef1234567890,2026-03-15T09:30:00+00:00,1249.99,USD,CloudHost Inc. 
# b2c3d4e5-f6a7-8901-bcde-f12345678901,2026-03-15T14:12:00+00:00,87.50,EUR,DataSync GmbH # Python 3.8+ — pre-process datetime and Decimal before CSV write import json import csv from datetime import datetime, timezone from decimal import Decimal from uuid import UUID # Simulating parsed API response with Python types transactions = [ { "txn_id": UUID("a1b2c3d4-e5f6-7890-abcd-ef1234567890"), "created_at": datetime(2026, 3, 15, 9, 30, 0, tzinfo=timezone.utc), "amount": Decimal("1249.99"), "currency": "USD", "merchant": "CloudHost Inc.", }, { "txn_id": UUID("b2c3d4e5-f6a7-8901-bcde-f12345678901"), "created_at": datetime(2026, 3, 15, 14, 12, 0, tzinfo=timezone.utc), "amount": Decimal("87.50"), "currency": "EUR", "merchant": "DataSync GmbH", }, ] def prepare_row(record: dict) -> dict: """Convert non-string types to CSV-friendly strings.""" return { "txn_id": str(record["txn_id"]), "created_at": record["created_at"].isoformat(), "amount": f"{record['amount']:.2f}", "currency": record["currency"], "merchant": record["merchant"], } with open("transactions.csv", "w", newline="", encoding="utf-8") as f: writer = csv.DictWriter(f, fieldnames=["txn_id", "created_at", "amount", "currency", "merchant"]) writer.writeheader() for txn in transactions: writer.writerow(prepare_row(txn)) # transactions.csv: # txn_id,created_at,amount,currency,merchant # a1b2c3d4-e5f6-7890-abcd-ef1234567890,2026-03-15T09:30:00+00:00,1249.99,USD,CloudHost Inc. # b2c3d4e5-f6a7-8901-bcde-f12345678901,2026-03-15T14:12:00+00:00,87.50,EUR,DataSync GmbH # Python 3.8+ — pandas read_json then to_csv import pandas as pd # Read JSON array directly into a DataFrame df = pd.read_json("warehouse_inventory.json") # Write to CSV — index=False prevents the auto-generated row numbers df.to_csv("warehouse_inventory.csv", index=False) # That's it. Two lines. pandas infers column types automatically. 
# Python 3.8+ — pandas read_json then to_csv import pandas as pd # Read JSON array directly into a DataFrame df = pd.read_json("warehouse_inventory.json") # Write to CSV — index=False prevents the auto-generated row numbers df.to_csv("warehouse_inventory.csv", index=False) # That's it. Two lines. pandas infers column types automatically. # Python 3.8+ — pandas read_json then to_csv import pandas as pd # Read JSON array directly into a DataFrame df = pd.read_json("warehouse_inventory.json") # Write to CSV — index=False prevents the auto-generated row numbers df.to_csv("warehouse_inventory.csv", index=False) # That's it. Two lines. pandas infers column types automatically. # Python 3.8+ — flatten nested JSON using json_normalize import json import pandas as pd api_response = """ [ { "order_id": "ord_91a3", "placed_at": "2026-03-15T09:30:00Z", "customer": { "name": "Sarah Chen", "email": "[email protected]", "tier": "premium" }, "shipping": { "method": "express", "address": { "city": "Portland", "state": "OR", "zip": "97201" } }, "total": 299.95 }, { "order_id": "ord_b7f2", "placed_at": "2026-03-15T14:12:00Z", "customer": { "name": "James Park", "email": "[email protected]", "tier": "standard" }, "shipping": { "method": "standard", "address": { "city": "Austin", "state": "TX", "zip": "73301" } }, "total": 87.50 } ] """ orders = json.loads(api_response) # json_normalize flattens nested dicts — sep controls the delimiter df = pd.json_normalize(orders, sep="_") df.to_csv("flat_orders.csv", index=False) # Resulting columns: # order_id, placed_at, customer_name, customer_email, customer_tier, # shipping_method, shipping_address_city, shipping_address_state, # shipping_address_zip, total # Python 3.8+ — flatten nested JSON using json_normalize import json import pandas as pd api_response = """ [ { "order_id": "ord_91a3", "placed_at": "2026-03-15T09:30:00Z", "customer": { "name": "Sarah Chen", "email": "[email protected]", "tier": "premium" }, "shipping": { "method": 
"express", "address": { "city": "Portland", "state": "OR", "zip": "97201" } }, "total": 299.95 }, { "order_id": "ord_b7f2", "placed_at": "2026-03-15T14:12:00Z", "customer": { "name": "James Park", "email": "[email protected]", "tier": "standard" }, "shipping": { "method": "standard", "address": { "city": "Austin", "state": "TX", "zip": "73301" } }, "total": 87.50 } ] """ orders = json.loads(api_response) # json_normalize flattens nested dicts — sep controls the delimiter df = pd.json_normalize(orders, sep="_") df.to_csv("flat_orders.csv", index=False) # Resulting columns: # order_id, placed_at, customer_name, customer_email, customer_tier, # shipping_method, shipping_address_city, shipping_address_state, # shipping_address_zip, total # Python 3.8+ — flatten nested JSON using json_normalize import json import pandas as pd api_response = """ [ { "order_id": "ord_91a3", "placed_at": "2026-03-15T09:30:00Z", "customer": { "name": "Sarah Chen", "email": "[email protected]", "tier": "premium" }, "shipping": { "method": "express", "address": { "city": "Portland", "state": "OR", "zip": "97201" } }, "total": 299.95 }, { "order_id": "ord_b7f2", "placed_at": "2026-03-15T14:12:00Z", "customer": { "name": "James Park", "email": "[email protected]", "tier": "standard" }, "shipping": { "method": "standard", "address": { "city": "Austin", "state": "TX", "zip": "73301" } }, "total": 87.50 } ] """ orders = json.loads(api_response) # json_normalize flattens nested dicts — sep controls the delimiter df = pd.json_normalize(orders, sep="_") df.to_csv("flat_orders.csv", index=False) # Resulting columns: # order_id, placed_at, customer_name, customer_email, customer_tier, # shipping_method, shipping_address_city, shipping_address_state, # shipping_address_zip, total # Python 3.8+ — to_csv with common parameter overrides import pandas as pd df = pd.read_json("telemetry_events.json") # TSV output with explicit encoding and missing value handling df.to_csv( "telemetry_events.tsv", sep="\t", 
index=False, encoding="utf-8", na_rep="NULL", columns=["event_id", "timestamp", "source", "severity", "message"], ) # Write to stdout for piping in shell scripts print(df.to_csv(index=False)) # Return as string (no file written) csv_string = df.to_csv(index=False) print(len(csv_string), "characters") # Python 3.8+ — to_csv with common parameter overrides import pandas as pd df = pd.read_json("telemetry_events.json") # TSV output with explicit encoding and missing value handling df.to_csv( "telemetry_events.tsv", sep="\t", index=False, encoding="utf-8", na_rep="NULL", columns=["event_id", "timestamp", "source", "severity", "message"], ) # Write to stdout for piping in shell scripts print(df.to_csv(index=False)) # Return as string (no file written) csv_string = df.to_csv(index=False) print(len(csv_string), "characters") # Python 3.8+ — to_csv with common parameter overrides import pandas as pd df = pd.read_json("telemetry_events.json") # TSV output with explicit encoding and missing value handling df.to_csv( "telemetry_events.tsv", sep="\t", index=False, encoding="utf-8", na_rep="NULL", columns=["event_id", "timestamp", "source", "severity", "message"], ) # Write to stdout for piping in shell scripts print(df.to_csv(index=False)) # Return as string (no file written) csv_string = df.to_csv(index=False) print(len(csv_string), "characters") # Python 3.8+ — convert JSON file to CSV with error handling import json import csv import sys def json_file_to_csv(input_path: str, output_path: str) -> int: """Convert a JSON file containing an array of objects to CSV. Returns the number of rows written. 
""" try: with open(input_path, encoding="utf-8") as jf: data = json.load(jf) except FileNotFoundError: print(f"Error: {input_path} not found", file=sys.stderr) return 0 except json.JSONDecodeError as exc: print(f"Error: invalid JSON in {input_path}: {exc.msg} at line {exc.lineno}", file=sys.stderr) return 0 if not isinstance(data, list) or not data: print(f"Error: expected a non-empty JSON array in {input_path}", file=sys.stderr) return 0 # Collect all unique keys across all records — handles inconsistent schemas all_keys: list[str] = [] seen: set[str] = set() for record in data: for key in record: if key not in seen: all_keys.append(key) seen.add(key) with open(output_path, "w", newline="", encoding="utf-8") as cf: writer = csv.DictWriter(cf, fieldnames=all_keys, restval="", extrasaction="ignore") writer.writeheader() writer.writerows(data) return len(data) rows = json_file_to_csv("deploy_logs.json", "deploy_logs.csv") print(f"Wrote {rows} rows to deploy_logs.csv") # Python 3.8+ — convert JSON file to CSV with error handling import json import csv import sys def json_file_to_csv(input_path: str, output_path: str) -> int: """Convert a JSON file containing an array of objects to CSV. Returns the number of rows written. 
""" try: with open(input_path, encoding="utf-8") as jf: data = json.load(jf) except FileNotFoundError: print(f"Error: {input_path} not found", file=sys.stderr) return 0 except json.JSONDecodeError as exc: print(f"Error: invalid JSON in {input_path}: {exc.msg} at line {exc.lineno}", file=sys.stderr) return 0 if not isinstance(data, list) or not data: print(f"Error: expected a non-empty JSON array in {input_path}", file=sys.stderr) return 0 # Collect all unique keys across all records — handles inconsistent schemas all_keys: list[str] = [] seen: set[str] = set() for record in data: for key in record: if key not in seen: all_keys.append(key) seen.add(key) with open(output_path, "w", newline="", encoding="utf-8") as cf: writer = csv.DictWriter(cf, fieldnames=all_keys, restval="", extrasaction="ignore") writer.writeheader() writer.writerows(data) return len(data) rows = json_file_to_csv("deploy_logs.json", "deploy_logs.csv") print(f"Wrote {rows} rows to deploy_logs.csv") # Python 3.8+ — convert JSON file to CSV with error handling import json import csv import sys def json_file_to_csv(input_path: str, output_path: str) -> int: """Convert a JSON file containing an array of objects to CSV. Returns the number of rows written. 
""" try: with open(input_path, encoding="utf-8") as jf: data = json.load(jf) except FileNotFoundError: print(f"Error: {input_path} not found", file=sys.stderr) return 0 except json.JSONDecodeError as exc: print(f"Error: invalid JSON in {input_path}: {exc.msg} at line {exc.lineno}", file=sys.stderr) return 0 if not isinstance(data, list) or not data: print(f"Error: expected a non-empty JSON array in {input_path}", file=sys.stderr) return 0 # Collect all unique keys across all records — handles inconsistent schemas all_keys: list[str] = [] seen: set[str] = set() for record in data: for key in record: if key not in seen: all_keys.append(key) seen.add(key) with open(output_path, "w", newline="", encoding="utf-8") as cf: writer = csv.DictWriter(cf, fieldnames=all_keys, restval="", extrasaction="ignore") writer.writeheader() writer.writerows(data) return len(data) rows = json_file_to_csv("deploy_logs.json", "deploy_logs.csv") print(f"Wrote {rows} rows to deploy_logs.csv") # Python 3.8+ — fetch JSON from API and save as CSV import json import csv import urllib.request import urllib.error def api_response_to_csv(url: str, output_path: str) -> int: """Fetch JSON from a REST API endpoint and write it as CSV.""" try: req = urllib.request.Request(url, headers={"Accept": "application/json"}) with urllib.request.urlopen(req, timeout=30) as resp: if resp.status != 200: print(f"Error: API returned status {resp.status}") return 0 body = resp.read().decode("utf-8") except urllib.error.URLError as exc: print(f"Error: could not reach {url}: {exc.reason}") return 0 try: records = json.loads(body) except json.JSONDecodeError as exc: print(f"Error: API returned invalid JSON: {exc.msg}") return 0 if not isinstance(records, list) or not records: print("Error: expected a non-empty JSON array from the API") return 0 with open(output_path, "w", newline="", encoding="utf-8") as cf: writer = csv.DictWriter(cf, fieldnames=records[0].keys()) writer.writeheader() writer.writerows(records) return 
len(records) # Python 3.8+ — fetch JSON from API and save as CSV import json import csv import urllib.request import urllib.error def api_response_to_csv(url: str, output_path: str) -> int: """Fetch JSON from a REST API endpoint and write it as CSV.""" try: req = urllib.request.Request(url, headers={"Accept": "application/json"}) with urllib.request.urlopen(req, timeout=30) as resp: if resp.status != 200: print(f"Error: API returned status {resp.status}") return 0 body = resp.read().decode("utf-8") except urllib.error.URLError as exc: print(f"Error: could not reach {url}: {exc.reason}") return 0 try: records = json.loads(body) except json.JSONDecodeError as exc: print(f"Error: API returned invalid JSON: {exc.msg}") return 0 if not isinstance(records, list) or not records: print("Error: expected a non-empty JSON array from the API") return 0 with open(output_path, "w", newline="", encoding="utf-8") as cf: writer = csv.DictWriter(cf, fieldnames=records[0].keys()) writer.writeheader() writer.writerows(records) return len(records) # Python 3.8+ — fetch JSON from API and save as CSV import json import csv import urllib.request import urllib.error def api_response_to_csv(url: str, output_path: str) -> int: """Fetch JSON from a REST API endpoint and write it as CSV.""" try: req = urllib.request.Request(url, headers={"Accept": "application/json"}) with urllib.request.urlopen(req, timeout=30) as resp: if resp.status != 200: print(f"Error: API returned status {resp.status}") return 0 body = resp.read().decode("utf-8") except urllib.error.URLError as exc: print(f"Error: could not reach {url}: {exc.reason}") return 0 try: records = json.loads(body) except json.JSONDecodeError as exc: print(f"Error: API returned invalid JSON: {exc.msg}") return 0 if not isinstance(records, list) or not records: print("Error: expected a non-empty JSON array from the API") return 0 with open(output_path, "w", newline="", encoding="utf-8") as cf: writer = csv.DictWriter(cf, 
fieldnames=records[0].keys()) writer.writeheader() writer.writerows(records) return len(records) # Python one-liner: reads JSON from stdin, writes CSV to stdout cat orders.json | python3 -c " import json, csv, sys data = json.load(sys.stdin) w = csv.DictWriter(sys.stdout, fieldnames=data[0].keys()) w.writeheader() w.writerows(data) " # Save output to a file cat orders.json | python3 -c " import json, csv, sys data = json.load(sys.stdin) w = csv.DictWriter(sys.stdout, fieldnames=data[0].keys()) w.writeheader() w.writerows(data) " > orders.csv # Python one-liner: reads JSON from stdin, writes CSV to stdout cat orders.json | python3 -c " import json, csv, sys data = json.load(sys.stdin) w = csv.DictWriter(sys.stdout, fieldnames=data[0].keys()) w.writeheader() w.writerows(data) " # Save output to a file cat orders.json | python3 -c " import json, csv, sys data = json.load(sys.stdin) w = csv.DictWriter(sys.stdout, fieldnames=data[0].keys()) w.writeheader() w.writerows(data) " > orders.csv # Python one-liner: reads JSON from stdin, writes CSV to stdout cat orders.json | python3 -c " import json, csv, sys data = json.load(sys.stdin) w = csv.DictWriter(sys.stdout, fieldnames=data[0].keys()) w.writeheader() w.writerows(data) " # Save output to a file cat orders.json | python3 -c " import json, csv, sys data = json.load(sys.stdin) w = csv.DictWriter(sys.stdout, fieldnames=data[0].keys()) w.writeheader() w.writerows(data) " > orders.csv # Using jq + csvkit for complex transformations # Install csvkit: pip install csvkit # jq flattens and selects fields, in2csv handles the CSV formatting cat api_response.json | jq '[.[] | {id: .order_id, customer: .customer.name, total}]' | in2csv -f json > orders.csv # Miller (mlr) is another option for JSON-to-CSV mlr --json2csv cat orders.json > orders.csv # Using jq + csvkit for complex transformations # Install csvkit: pip install csvkit # jq flattens and selects fields, in2csv handles the CSV formatting cat api_response.json | jq '[.[] | 
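One caveat with the one-liner: it takes its header from data[0].keys(), so a key that first appears in a later record is silently dropped. A sketch of a more forgiving variant that unions keys across all records (the json_text_to_csv_text helper name is illustrative, not part of the article's code):

```python
# Pipe-friendly variant that tolerates inconsistent keys across records.
import csv
import io
import json

def json_text_to_csv_text(json_text: str) -> str:
    """Convert a JSON array (as text) to CSV text using the union of all keys."""
    records = json.loads(json_text)
    # dict.fromkeys de-duplicates while preserving first-seen key order
    fieldnames = list(dict.fromkeys(k for r in records for k in r))
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

print(json_text_to_csv_text('[{"a": 1}, {"a": 2, "b": 3}]'), end="")
# prints:
# a,b
# 1,
# 2,3
```

The same body works inside a `python3 -c "…"` invocation with sys.stdin and sys.stdout swapped in for the string and buffer.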
pip install pyarrow

# Python 3.8+ — pandas with pyarrow for faster CSV writing
import pandas as pd

# Read JSON with pyarrow engine (faster parsing for large files)
df = pd.read_json("sensor_readings.json", engine="pyarrow")

# to_csv doesn't have an engine parameter, but the DataFrame operations
# between read and write benefit from pyarrow's columnar layout
df.to_csv("sensor_readings.csv", index=False)

# For truly large exports, consider writing to Parquet instead of CSV
# — binary format, 5-10x smaller, preserves types
df.to_parquet("sensor_readings.parquet", engine="pyarrow")

pip install ijson

# Python 3.8+ — stream large JSON array to CSV with constant memory
import ijson
import csv

def stream_json_to_csv(json_path: str, csv_path: str) -> int:
    """Convert a large JSON array to CSV without loading it all into memory."""
    with open(json_path, "rb") as jf, open(csv_path, "w", newline="", encoding="utf-8") as cf:
        # ijson.items yields each element of the top-level array one at a time
        records = ijson.items(jf, "item")
        first_record = next(records)
        fieldnames = list(first_record.keys())
        writer = csv.DictWriter(cf, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerow(first_record)
        count = 1
        for record in records:
            writer.writerow(record)
            count += 1
    return count

rows = stream_json_to_csv("clickstream_2026_03.json", "clickstream_2026_03.csv")
print(f"Streamed {rows} records to CSV")

# Python 3.8+ — convert NDJSON to CSV line by line
import json
import csv

def ndjson_to_csv(ndjson_path: str, csv_path: str) -> int:
    """Convert a newline-delimited JSON file to CSV, one line at a time."""
    with open(ndjson_path, encoding="utf-8") as nf:
        first_line = nf.readline()
        first_record = json.loads(first_line)
        fieldnames = list(first_record.keys())
        with open(csv_path, "w", newline="", encoding="utf-8") as cf:
            writer = csv.DictWriter(cf, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerow(first_record)
            count = 1
            for line in nf:
                line = line.strip()
                if not line:
                    continue
                try:
                    record = json.loads(line)
                    writer.writerow(record)
                    count += 1
                except json.JSONDecodeError:
                    continue  # skip malformed lines
    return count

rows = ndjson_to_csv("access_log.ndjson", "access_log.csv")
print(f"Converted {rows} log entries to CSV")
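The line-by-line converter locks the header to the first line's keys, so a later line with an extra key makes DictWriter raise ValueError (its default extrasaction is "raise"). Where the schema drifts, a two-pass variant can scan for the full key set first. A sketch, at the cost of reading the file twice (the function name is illustrative):

```python
# Two-pass NDJSON conversion: scan once for the full key set, then write.
import csv
import json

def ndjson_to_csv_two_pass(ndjson_path: str, csv_path: str) -> int:
    """Convert NDJSON to CSV using the union of keys from every line."""
    # Pass 1: collect keys across all lines, preserving first-seen order
    fieldnames = {}
    with open(ndjson_path, encoding="utf-8") as nf:
        for line in nf:
            line = line.strip()
            if not line:
                continue
            try:
                for key in json.loads(line):
                    fieldnames.setdefault(key)
            except json.JSONDecodeError:
                continue
    # Pass 2: write rows; restval fills keys a given line is missing
    count = 0
    with open(ndjson_path, encoding="utf-8") as nf, \
         open(csv_path, "w", newline="", encoding="utf-8") as cf:
        writer = csv.DictWriter(cf, fieldnames=list(fieldnames), restval="")
        writer.writeheader()
        for line in nf:
            line = line.strip()
            if not line:
                continue
            try:
                writer.writerow(json.loads(line))
                count += 1
            except json.JSONDecodeError:
                continue  # skip malformed lines
    return count
```

Memory stays constant either way; the only cost is one extra sequential read, which is usually cheap next to the JSON parsing itself.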
# Before — blank rows on Windows
with open("output.csv", "w") as f:
    writer = csv.DictWriter(f, fieldnames=columns)
    writer.writeheader()
    writer.writerows(data)

# After — clean output on all platforms
with open("output.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=columns)
    writer.writeheader()
    writer.writerows(data)

# Before
df = pd.read_json("events.json")
df.to_csv("events.csv")
# CSV gets an extra unnamed column: ,event_id,timestamp,...

# After
df = pd.read_json("events.json")
df.to_csv("events.csv", index=False)
# Clean CSV: event_id,timestamp,...
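A quick way to see the difference without opening the file is to round-trip through an in-memory buffer and compare header lines; a minimal sketch:

```python
# Demonstrating the stray index column with an in-memory buffer.
import io
import pandas as pd

df = pd.DataFrame([{"event_id": "evt_1", "value": 10}])

with_index = io.StringIO()
df.to_csv(with_index)            # default index=True
clean = io.StringIO()
df.to_csv(clean, index=False)

# The default adds an unnamed leading column; index=False does not
assert with_index.getvalue().splitlines()[0] == ",event_id,value"
assert clean.getvalue().splitlines()[0] == "event_id,value"
```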
# Before
records = json.load(f)
writer = csv.DictWriter(out, fieldnames=records[0].keys())
# Misses "discount" field that only appears in records[2]

# After
records = json.load(f)
all_keys = list(dict.fromkeys(k for r in records for k in r))
writer = csv.DictWriter(out, fieldnames=all_keys, restval="")
# Every key from every record is included as a column

# Before
records = [{"id": "evt_1", "meta": {"source": "web", "region": "us-west"}}]
writer = csv.DictWriter(f, fieldnames=["id", "meta"])
writer.writerows(records)
# meta column contains: {'source': 'web', 'region': 'us-west'}

# After
import pandas as pd
records = [{"id": "evt_1", "meta": {"source": "web", "region": "us-west"}}]
df = pd.json_normalize(records, sep="_")
df.to_csv("events.csv", index=False)
# Columns: id, meta_source, meta_region

import json
import csv

with open("orders.json") as f:
    records = json.load(f)

with open("orders.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=records[0].keys())
    writer.writeheader()
    writer.writerows(records)

import pandas as pd

nested_data = [
    {"id": "ord_91a3", "customer": {"name": "Sarah Chen", "email": "[email protected]"}},
]
df = pd.json_normalize(nested_data, sep="_")
# Columns: id, customer_name, customer_email
df.to_csv("flat_orders.csv", index=False)

# Wrong — blank rows on Windows
with open("output.csv", "w") as f:
    writer = csv.writer(f)

# Correct — newline="" prevents double \r
with open("output.csv", "w", newline="") as f:
    writer = csv.writer(f)

import csv

new_records = [
    {"order_id": "ord_f4c1", "total": 89.50, "status": "shipped"},
]

with open("orders.csv", "a", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["order_id", "total", "status"])
    writer.writerows(new_records)

# Fast for files that fit in memory
import pandas as pd
df = pd.read_json("large_dataset.json")
df.to_csv("large_dataset.csv", index=False)

# Streaming for files that don't fit in memory
import ijson, csv
with open("huge.json", "rb") as jf, open("huge.csv", "w", newline="") as cf:
    records = ijson.items(jf, "item")
    first = next(records)
    writer = csv.DictWriter(cf, fieldnames=first.keys())
    writer.writeheader()
    writer.writerow(first)
    for record in records:
        writer.writerow(record)
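When a large export must stay in CSV form, pandas can also compress it on the fly: to_csv infers gzip from a .gz suffix, and read_csv auto-detects it on the way back in. A sketch (file names are illustrative):

```python
# Writing a gzip-compressed CSV — pandas infers compression from the suffix.
import pandas as pd

df = pd.DataFrame({"host": ["web-1", "web-2"], "cpu": [72.3, 45.1]})
df.to_csv("metrics.csv.gz", index=False)                       # inferred
df.to_csv("metrics.csv.gz", index=False, compression="gzip")   # explicit

# Reading it back also auto-detects gzip from the extension
round_trip = pd.read_csv("metrics.csv.gz")
```

For text-heavy records this typically shrinks the file several-fold while keeping it a plain CSV once decompressed, so downstream tools that accept gzip input need no changes.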
import csv
import sys
import json

data = json.loads('[{"host":"web-1","cpu":72.3},{"host":"web-2","cpu":45.1}]')
writer = csv.DictWriter(sys.stdout, fieldnames=data[0].keys())
writer.writeheader()
writer.writerows(data)
# host,cpu
# web-1,72.3
# web-2,45.1

- csv.DictWriter converts a list of dicts to CSV with zero dependencies — use json.load() to parse, then writeheader() + writerows().
- Always open CSV files with newline="" on Windows to prevent blank rows between data rows.
- pd.json_normalize() flattens nested JSON into a flat DataFrame before calling to_csv() — handles multi-level nesting automatically.
- Pass index=False to DataFrame.to_csv() — without it, pandas writes an unwanted row-number column.
- For files over 500 MB, use ijson for streaming JSON parsing combined with csv.DictWriter for constant memory usage.