Tools: Essential Guide: I Built a jq Alternative That Speaks JSONPath — and Deliberately Wrote Almost No Parser Code

Tools: Essential Guide: I Built a jq Alternative That Speaks JSONPath — and Deliberately Wrote Almost No Parser Code

I Built a jq Alternative That Speaks JSONPath — and Deliberately Wrote Almost No Parser Code

The problem in one example

Why Python, and why wrap an existing library

The engine layer is a very thin wrapper

Output formatting is where the actual work went

Streams in, exit codes out

What JSONPath actually isn't — and where jsonpath-ng helps

Where this tool falls short

Try it in 30 seconds

The lesson

Closing A small Python CLI that queries JSON using JSONPath (RFC 9535) instead of jq's DSL. The interesting part isn't the query engine — I wrapped an existing library for that. The interesting part is everything around it: error reporting, exit codes, output modes, Docker packaging, and the judgment call to not rewrite a solved problem. 📦 GitHub: https://github.com/sen-ltd/jsonpath-cli Every couple of months I end up in the same spot: I need to extract a value from a blob of JSON in a shell pipeline. I reach for jq, I get the syntax wrong, I read the man page, I get it wrong again, I curse, I eventually succeed. Next time, I've forgotten everything. Meanwhile, in the rest of my day, I'm writing JSONPath expressions constantly — in Postman tests, in Kubernetes kubectl -o jsonpath, in RestAssured assertions, in my IDE's JSON inspector. JSONPath is in my fingers. jq's DSL is not. This gap has a fix, and the fix is boring: build a CLI that takes JSONPath expressions, reads JSON, prints matches. That's the whole product. I'm going to walk through how I built it, why I didn't write a parser, and what I learned about a query language I thought I understood. Say you have this JSON: You want the names of users older than 40. In jq: Both are fine. Both express the same query. But if you've been writing the bottom one in six different tools all week, having to translate it to the top one — remembering the pipe syntax, the select function, the leading dot instead of $, the fact that .users[] is a flatten and not an index — is a small tax that adds up across a career. jq is a fantastic tool. I just want a version that speaks the dialect I already know. Read stdin or a file. Print matches. Grep-style exit codes. Done. The first serious design decision was: do I write the JSONPath parser myself? The RFC 9535 grammar is not huge — it fits in a few pages. I've written PEG parsers before. It would be interesting. It would also be at least two weeks of work to handle every edge case, and I'd be implementing the fifth or sixth production JSONPath parser in Python. There is already jsonpath-ng, which has been maintained for years, has a compatible extension module for filters (the ?(...) syntax), handles the ambiguous cases the original Goessner 2007 blog post left open, and is one pip install away. The honest answer is: writing another parser would have been a vanity project. The value I can add is not in the parser. It's in how the tool behaves at the shell boundary — error messages, exit codes, output formats, Docker packaging, documentation. That's where the UX of jq is very good and the UX of random JSONPath playgrounds is almost uniformly bad. So: wrap jsonpath-ng, and spend my attention on everything else. A few more judgment calls I made before writing any code: Here's the entire engine module, minus imports and docstrings: The whole engine is about 80 lines, and most of that is error translation. _ext_parse is jsonpath_ng.ext.parse, which understands filter expressions — the plain jsonpath_ng.parse doesn't, which tripped me up for half an hour. The ext module is the right default and not mentioned in the first page of the library's README. The interesting design detail here is that the engine raises my exception type (JSONPathParseError), not jsonpath-ng's internal error. That matters because it lets me re-render the error with a caret pointer the user can actually read: The column is extracted from jsonpath-ng's own error string with a small regex. If I can't find a column, I render the error without the caret — degraded but not broken. The general principle: never let a dependency's error bubble up verbatim, even when the dependency is good, because your users don't know or care what library you wrapped. The engine returns a list of matched Python values. Turning that list into text on stdout is the part users actually see, and I wanted four distinct modes: Four modes in 15 lines, and I could fold them into the CLI function — but keeping formatting in its own function makes the tests almost trivial. You call _format_matches([...], raw=True, ...) and compare strings. No subprocess, no captured streams, no flakes. One subtlety in --raw mode: it only unquotes strings. If your expression matches a number or an object, raw mode falls back to JSON-encoding it. That mirrors jq -r's behavior and is what you actually want in pipelines — jsonpath-cli --raw '$.user.id' | xargs echo should work whether the ID is a string or a number. ensure_ascii=False is there because I know some of my data is Japanese. Without it, {"名前": "ada"} would come out as "\u540d\u524d", which is technically correct JSON and absolutely useless in a terminal. One flag flip, one test case, one less paper cut. The CLI's main function is structured so every external thing it touches — argv, stdin, stdout, stderr — is a parameter with a default: This shape — a main that returns an int and takes injectable streams — is the single biggest upgrade you can make to a Python CLI's testability. My end-to-end tests are literally this: No subprocess.run, no temp files for most cases, no slow startup. Twenty-odd tests run in 0.2 seconds. When a test fails, the assertion error points at the exact string mismatch, not at a wall of captured stderr. The exit code rule is the only place I had to think. Should --count exit 1 when the count is zero? I decided yes, because if you write jsonpath-cli --count '$.errors' in CI, you want a non-zero exit to fire when there are no matches but you're checking match existence, and you want $? to flip on match presence. That's the whole point of grep-style exit codes. Here's the thing I had to accept halfway through building this: JSONPath is not one language. It's a family of mutually incompatible dialects, and until RFC 9535 landed in February 2024, there was no official standard at all — just Stefan Goessner's 2007 blog post and whatever each implementation decided to do. That's why jq, jsonpath-ng, Kubernetes' JSONPath, and Java's JSONPath library all disagree on edge cases like: jsonpath-ng picks a reasonable set of answers and mostly aligns with the RFC. "Mostly" is honest: the RFC is new enough that no Python library is fully conformant yet, and jsonpath-ng predates the RFC by years. For the 95% of queries people actually write — field access, wildcards, recursive descent, simple filters, slices — behavior is boring and consistent, and that's the part I care about. For the 5% edge cases, I documented my position: whatever jsonpath-ng does is what the CLI does, and if the RFC disagrees I'll upgrade when the library does. This is a judgment call I'm comfortable defending. The alternative — writing my own parser that tracks the RFC to the letter — means owning a compliance treadmill for a tool I wrote in a weekend. Not a good trade. Exit codes work the way you'd want: The thing I want to leave you with isn't about JSONPath. It's about the decision I made on day one: I'm not writing the parser. Every time I'm tempted to reimplement a solved problem "for fun" or "to understand it better," I have to weigh that against the actual product I'm trying to ship. Sometimes the learning is the point and reimplementing is right. But sometimes — most times, honestly — the value I can add is at the edges: the error message a user sees at 2am, the exit code their CI script checks, the docker run command that works without a README. Those are the things that turn a library into a tool. And those are the things no one else has already written for you. If there's a user-facing CLI behavior you wish existed, and the hard algorithmic work is already in a library somewhere, go wrap it. The wrapper is the product. Entry #102 in a 100+ portfolio series by SEN LLC. I'm also building this same "wrap a library, ship a great CLI" pattern for a few other query languages. If that sounds interesting, follow along. Templates let you quickly answer FAQs or store snippets for re-use. Hide child comments as well For further actions, you may consider blocking this person and/or reporting abuse

Code Block

Copy

{"users":[{"name":"ada","age":36},{"name":"grace","age":85}]} {"users":[{"name":"ada","age":36},{"name":"grace","age":85}]} {"users":[{"name":"ada","age":36},{"name":"grace","age":85}]} jq -r '.users[] | select(.age > 40) | .name' jq -r '.users[] | select(.age > 40) | .name' jq -r '.users[] | select(.age > 40) | .name' $.users[?(@.age > 40)].name $.users[?(@.age > 40)].name $.users[?(@.age > 40)].name jsonpath-cli '$.users[?(@.age > 40)].name' jsonpath-cli '$.users[?(@.age > 40)].name' jsonpath-cli '$.users[?(@.age > 40)].name' class JSONPathParseError(JSONPathError): def __init__(self, expression: str, detail: str, position: int | None = None): self.expression = expression self.detail = detail self.position = position super().__init__(self._format()) def _format(self) -> str: if self.position is None: return f"invalid JSONPath expression: {self.detail}" pointer = " " * self.position + "^" return ( f"invalid JSONPath expression at column {self.position + 1}: {self.detail}\n" f" {self.expression}\n" f" {pointer}" ) def compile_expression(expression: str) -> Any: if not isinstance(expression, str): raise JSONPathParseError(str(expression), "expression must be a string") if not expression.strip(): raise JSONPathParseError(expression, "expression is empty") try: return _ext_parse(expression) except (JsonPathParserError, JsonPathLexerError) as exc: detail, position = _extract_position(str(exc)) raise JSONPathParseError(expression, detail, position) from exc class JSONPathParseError(JSONPathError): def __init__(self, expression: str, detail: str, position: int | None = None): self.expression = expression self.detail = detail self.position = position super().__init__(self._format()) def _format(self) -> str: if self.position is None: return f"invalid JSONPath expression: {self.detail}" pointer = " " * self.position + "^" return ( f"invalid JSONPath expression at column {self.position + 1}: {self.detail}\n" f" {self.expression}\n" f" {pointer}" ) def compile_expression(expression: str) -> Any: if not isinstance(expression, str): raise JSONPathParseError(str(expression), "expression must be a string") if not expression.strip(): raise JSONPathParseError(expression, "expression is empty") try: return _ext_parse(expression) except (JsonPathParserError, JsonPathLexerError) as exc: detail, position = _extract_position(str(exc)) raise JSONPathParseError(expression, detail, position) from exc class JSONPathParseError(JSONPathError): def __init__(self, expression: str, detail: str, position: int | None = None): self.expression = expression self.detail = detail self.position = position super().__init__(self._format()) def _format(self) -> str: if self.position is None: return f"invalid JSONPath expression: {self.detail}" pointer = " " * self.position + "^" return ( f"invalid JSONPath expression at column {self.position + 1}: {self.detail}\n" f" {self.expression}\n" f" {pointer}" ) def compile_expression(expression: str) -> Any: if not isinstance(expression, str): raise JSONPathParseError(str(expression), "expression must be a string") if not expression.strip(): raise JSONPathParseError(expression, "expression is empty") try: return _ext_parse(expression) except (JsonPathParserError, JsonPathLexerError) as exc: detail, position = _extract_position(str(exc)) raise JSONPathParseError(expression, detail, position) from exc $ jsonpath-cli '$.users[?(' - jsonpath-cli: invalid JSONPath expression at column 9: Parse error at 1:9 near token ( (() $.users[?( ^ $ jsonpath-cli '$.users[?(' - jsonpath-cli: invalid JSONPath expression at column 9: Parse error at 1:9 near token ( (() $.users[?( ^ $ jsonpath-cli '$.users[?(' - jsonpath-cli: invalid JSONPath expression at column 9: Parse error at 1:9 near token ( (() $.users[?( ^ def _format_matches(matches, *, raw, json_output, count, indent): if count: return f"{len(matches)}\n" if json_output: return json.dumps(matches, ensure_ascii=False, indent=indent) + "\n" lines = [] for value in matches: if raw and isinstance(value, str): lines.append(value) else: lines.append(json.dumps(value, ensure_ascii=False, indent=indent)) return "".join(line + "\n" for line in lines) def _format_matches(matches, *, raw, json_output, count, indent): if count: return f"{len(matches)}\n" if json_output: return json.dumps(matches, ensure_ascii=False, indent=indent) + "\n" lines = [] for value in matches: if raw and isinstance(value, str): lines.append(value) else: lines.append(json.dumps(value, ensure_ascii=False, indent=indent)) return "".join(line + "\n" for line in lines) def _format_matches(matches, *, raw, json_output, count, indent): if count: return f"{len(matches)}\n" if json_output: return json.dumps(matches, ensure_ascii=False, indent=indent) + "\n" lines = [] for value in matches: if raw and isinstance(value, str): lines.append(value) else: lines.append(json.dumps(value, ensure_ascii=False, indent=indent)) return "".join(line + "\n" for line in lines) def main(argv=None, *, stdin=None, stdout=None, stderr=None) -> int: parser = build_parser() args = parser.parse_args(argv) stdin = stdin or sys.stdin stdout = stdout or sys.stdout stderr = stderr or sys.stderr try: data = _read_input(args.file, stdin) except _IOProblem as exc: print(f"jsonpath-cli: {exc}", file=stderr) return EXIT_IO_ERROR try: matches = evaluate(args.expression, data) except JSONPathParseError as exc: print(f"jsonpath-cli: {exc}", file=stderr) return EXIT_PARSE_ERROR output = _format_matches(matches, raw=args.raw, json_output=args.json_output, count=args.count, indent=args.indent) stdout.write(output) return EXIT_OK if matches else EXIT_NO_MATCH def main(argv=None, *, stdin=None, stdout=None, stderr=None) -> int: parser = build_parser() args = parser.parse_args(argv) stdin = stdin or sys.stdin stdout = stdout or sys.stdout stderr = stderr or sys.stderr try: data = _read_input(args.file, stdin) except _IOProblem as exc: print(f"jsonpath-cli: {exc}", file=stderr) return EXIT_IO_ERROR try: matches = evaluate(args.expression, data) except JSONPathParseError as exc: print(f"jsonpath-cli: {exc}", file=stderr) return EXIT_PARSE_ERROR output = _format_matches(matches, raw=args.raw, json_output=args.json_output, count=args.count, indent=args.indent) stdout.write(output) return EXIT_OK if matches else EXIT_NO_MATCH def main(argv=None, *, stdin=None, stdout=None, stderr=None) -> int: parser = build_parser() args = parser.parse_args(argv) stdin = stdin or sys.stdin stdout = stdout or sys.stdout stderr = stderr or sys.stderr try: data = _read_input(args.file, stdin) except _IOProblem as exc: print(f"jsonpath-cli: {exc}", file=stderr) return EXIT_IO_ERROR try: matches = evaluate(args.expression, data) except JSONPathParseError as exc: print(f"jsonpath-cli: {exc}", file=stderr) return EXIT_PARSE_ERROR output = _format_matches(matches, raw=args.raw, json_output=args.json_output, count=args.count, indent=args.indent) stdout.write(output) return EXIT_OK if matches else EXIT_NO_MATCH def _run(argv, stdin_text=""): stdin = io.StringIO(stdin_text) stdout = io.StringIO() stderr = io.StringIO() code = main(argv, stdin=stdin, stdout=stdout, stderr=stderr) return code, stdout.getvalue(), stderr.getvalue() def test_no_match_returns_exit_code_1(): code, out, _ = _run(["$.missing"], stdin_text='{"a": 1}') assert code == 1 assert out == "" def _run(argv, stdin_text=""): stdin = io.StringIO(stdin_text) stdout = io.StringIO() stderr = io.StringIO() code = main(argv, stdin=stdin, stdout=stdout, stderr=stderr) return code, stdout.getvalue(), stderr.getvalue() def test_no_match_returns_exit_code_1(): code, out, _ = _run(["$.missing"], stdin_text='{"a": 1}') assert code == 1 assert out == "" def _run(argv, stdin_text=""): stdin = io.StringIO(stdin_text) stdout = io.StringIO() stderr = io.StringIO() code = main(argv, stdin=stdin, stdout=stdout, stderr=stderr) return code, stdout.getvalue(), stderr.getvalue() def test_no_match_returns_exit_code_1(): code, out, _ = _run(["$.missing"], stdin_text='{"a": 1}') assert code == 1 assert out == "" # Build it git clone https://github.com/sen-ltd/jsonpath-cli cd jsonpath-cli docker build -t jsonpath-cli . # Basic field echo '{"user":{"name":"ada"}}' \ | docker run --rm -i jsonpath-cli --raw '$.user.name' # => ada # Filter echo '{"users":[{"name":"ada","age":36},{"name":"grace","age":85}]}' \ | docker run --rm -i jsonpath-cli --raw '$.users[?(@.age > 40)].name' # => grace # Recursive descent + JSON output echo '{"a":{"b":{"c":42}},"d":{"c":99}}' \ | docker run --rm -i jsonpath-cli --json '$..c' # => [42, 99] # Run the test suite in the image docker run --rm --entrypoint pytest jsonpath-cli # => 29 passed in 0.24s # Build it git clone https://github.com/sen-ltd/jsonpath-cli cd jsonpath-cli docker build -t jsonpath-cli . # Basic field echo '{"user":{"name":"ada"}}' \ | docker run --rm -i jsonpath-cli --raw '$.user.name' # => ada # Filter echo '{"users":[{"name":"ada","age":36},{"name":"grace","age":85}]}' \ | docker run --rm -i jsonpath-cli --raw '$.users[?(@.age > 40)].name' # => grace # Recursive descent + JSON output echo '{"a":{"b":{"c":42}},"d":{"c":99}}' \ | docker run --rm -i jsonpath-cli --json '$..c' # => [42, 99] # Run the test suite in the image docker run --rm --entrypoint pytest jsonpath-cli # => 29 passed in 0.24s # Build it git clone https://github.com/sen-ltd/jsonpath-cli cd jsonpath-cli docker build -t jsonpath-cli . # Basic field echo '{"user":{"name":"ada"}}' \ | docker run --rm -i jsonpath-cli --raw '$.user.name' # => ada # Filter echo '{"users":[{"name":"ada","age":36},{"name":"grace","age":85}]}' \ | docker run --rm -i jsonpath-cli --raw '$.users[?(@.age > 40)].name' # => grace # Recursive descent + JSON output echo '{"a":{"b":{"c":42}},"d":{"c":99}}' \ | docker run --rm -i jsonpath-cli --json '$..c' # => [42, 99] # Run the test suite in the image docker run --rm --entrypoint pytest jsonpath-cli # => 29 passed in 0.24s echo '{"a":1}' | docker run --rm -i jsonpath-cli '$.nothing' ; echo $? # => 1 echo 'not json' | docker run --rm -i jsonpath-cli '$.a' ; echo $? # => 3 echo '{"a":1}' | docker run --rm -i jsonpath-cli '$.nothing' ; echo $? # => 1 echo 'not json' | docker run --rm -i jsonpath-cli '$.a' ; echo $? # => 3 echo '{"a":1}' | docker run --rm -i jsonpath-cli '$.nothing' ; echo $? # => 1 echo 'not json' | docker run --rm -i jsonpath-cli '$.a' ; echo $? # => 3 - CLI-first, not library-first. The Python ecosystem already has plenty of JSONPath libraries you can import. What it doesn't have is one with a clean jq-style command-line UX. So jsonpath-cli has a minimal library surface (just compile_expression, evaluate, iter_matches for tests), and the real product is cli.py. - Docker as the primary distribution channel. pip install jsonpath-cli is fine for Python folks, but the people who would benefit most from this tool don't necessarily have a clean Python environment — they have bash, a Mac, and Docker. Making docker run the first-class entry point means anyone can try it in 30 seconds without touching their system Python. - Grep-like exit codes. jq exits 0 when a query parses, regardless of whether it matched anything. That's fine for some workflows but awful when you're writing a shell pipeline like if jsonpath-cli '$.error' logs.json; then .... I wanted grep behavior: 0 on match, 1 on no match, 2 on expression error, 3 on I/O / invalid JSON. - Does $[*] on an object iterate values or keys? - Is $..['a','b'] valid, or do you need two separate queries? - What does $.store.books[?(@.price)] return — truthy-filter semantics, existence-filter semantics, or a parse error? - Is the root $ mandatory, and what does a path without it mean? - Streaming. The CLI reads the entire input into memory and parses it as one JSON document. If you pipe a 5 GB log file in, it'll OOM. A real streaming implementation would need an event-based JSON parser (ijson) and a reimplementation of the JSONPath engine in a push-based style. Neither is small work. For log files, jq --stream is still your answer. - Output ordering on recursive descent. $..price returns matches in the order jsonpath-ng walks the tree, which is stable but not necessarily what you'd expect from a textual reading of the source JSON. I documented this with a test that asserts the set matches, not the sequence. - No JSONPath 2 features yet. Things like $..[?(@.tags contains 'python')] work in some dialects but not in jsonpath-ng's ext module. That's upstream, not something I'm going to fix. - Docker image is ~60 MB. Multi-stage Alpine build, but the python:3.12-alpine base is still 45 MB by itself. A Go rewrite would get this under 10 MB. That's a rewrite I might actually do, because for a CLI tool installable via docker run, download size is user-facing performance.