Claude Code JSON Output in CI

In short: Claude Code JSON output is structured machine-parseable output produced by the --output-format json flag, optionally constrained to a shape with --json-schema. Instead of free-form prose, the run returns a JSON object containing the result, session metadata, and (with a schema) a structured_output field that downstream automation can parse reliably.

Why Claude Code JSON output matters in CI

Once a job runs Claude Code headlessly, the next question is what your pipeline does with the answer. By default the answer is prose: a paragraph a human would read. Prose is hostile to automation. If a later step needs to post each finding as a pull-request comment, open a ticket, or fail the build when a critical issue appears, it has to dig structured facts out of natural language, and that means regular expressions that break the moment the wording shifts. Claude Code JSON output solves this at the source by returning the answer as data instead of prose.

You opt in with the --output-format flag, which accepts three values. text is the default human-readable prose. json wraps the response in a structured object that includes the text result, the session ID, and run metadata. stream-json emits newline-delimited JSON events as they happen, which suits long-running interactive UIs. For a CI step that needs to hand findings to another tool, json is the format that makes the rest of the pipeline deterministic.

Structured CI output: Output emitted by --output-format json (optionally shaped by --json-schema) that a downstream program can parse field by field, replacing fragile text parsing with a reliable data contract between Claude Code and the rest of the pipeline.

What the JSON object actually contains

Running claude -p "summarise this project" --output-format json returns a single JSON object rather than a bare string. The human-readable answer lives in the result field, and alongside it the object carries metadata: the session_id for the run, token usage, and cost details such as total_cost_usd and a per-model breakdown. That metadata is useful in its own right: a CI job can track spend per invocation, or capture the session ID to continue the same conversation in a later step. The key shift is that every part of the answer now has a named field instead of being buried in a paragraph.

To extract a field you reach for an ordinary JSON tool. Piping the output through jq is the documented pattern: claude -p "summarise this project" --output-format json | jq -r '.result' pulls just the text answer out as a raw string. Because the shape is predictable, the parsing step is stable across runs in a way that text scraping never is.
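For steps that prefer a script to jq, the same extraction might look like the minimal Python sketch below. The sample envelope is hand-written for illustration: only the field names result, session_id, and total_cost_usd come from the description above, the values are invented, and read_envelope is a hypothetical helper name.

```python
import json

# Hand-written stand-in for what `claude -p ... --output-format json` prints.
# Field names follow the text above; the values are made up for illustration.
SAMPLE_ENVELOPE = """
{
  "result": "This project is a small CLI for resizing images.",
  "session_id": "example-session-id",
  "total_cost_usd": 0.0123
}
"""


def read_envelope(raw):
    """Parse the JSON envelope and return the fields a CI step usually wants."""
    envelope = json.loads(raw)
    return {
        "answer": envelope["result"],               # same value as jq -r '.result'
        "session_id": envelope.get("session_id"),   # handle for a later step
        "cost_usd": envelope.get("total_cost_usd"), # spend for this invocation
    }


fields = read_envelope(SAMPLE_ENVELOPE)
print(fields["answer"])
```

Reading result with a hard key lookup and the metadata with .get() is a deliberate choice here: a missing answer should stop the step, while missing metadata usually should not.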

json: the --output-format value for CI

.result: field holding the text answer

structured_output: field holding schema-conforming data

Forcing a shape with --json-schema

Plain --output-format json gives you a reliable envelope, but the result field inside it is still free-form text. When a downstream step needs specific fields (say, a list of findings, each with a file, a line, and a severity), you constrain the answer with --json-schema and a JSON Schema definition. Claude then returns data conforming to that schema in the structured_output field of the response.

For example, asking Claude to extract function names with a schema that requires an object with a functions array of strings returns exactly that array, ready to consume with jq '.structured_output'. Applied to code review, you would define a schema whose findings array contains objects with file, line, severity, and message, and your comment-posting step iterates over that array directly. The schema turns an open-ended request into a data contract: the shape your automation expects is the shape the run is constrained to return, which is the heart of what makes AI automation deterministic enough to trust in a pipeline.
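One way to keep such a schema maintainable is to hold it as data and serialise it only when building the command. The sketch below assumes a Python wrapper around the CLI; FINDINGS_SCHEMA and build_command are hypothetical names, and the schema simply mirrors the file, line, severity, and message fields described above.

```python
import json

# Schema for a code-review step: a required "findings" array of objects.
FINDINGS_SCHEMA = {
    "type": "object",
    "properties": {
        "findings": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "file": {"type": "string"},
                    "line": {"type": "integer"},
                    "severity": {"type": "string"},
                    "message": {"type": "string"},
                },
                "required": ["file", "line", "severity", "message"],
            },
        }
    },
    "required": ["findings"],
}


def build_command(prompt, schema):
    """Return the argv for a headless, schema-constrained run."""
    return [
        "claude", "-p", prompt,
        "--output-format", "json",
        "--json-schema", json.dumps(schema),  # schema travels as one argument
    ]


command = build_command("find security issues in the diff", FINDINGS_SCHEMA)
```

Passing the command as an argument list rather than a shell string sidesteps the quoting problems that a long inline schema tends to cause.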

From a headless run to inline PR comments

Diagram: a schema-constrained JSON answer flows through jq into a loop that posts one PR comment per finding.

The pattern this knowledge point replaces

The clearest way to understand structured output is to see the anti-pattern it removes. Without it, a team asks Claude to "list each issue on its own line, with the file and line number first", then writes a regex to peel apart those lines. It works in the demo and fails in production: one run uses a colon, the next uses a dash; one wraps long messages, the next does not; an unexpected preamble shifts every match. Each variation is a silent break that posts garbage comments or none at all. Lowering the temperature does not fix it, because the failure is structural, not stylistic.
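The brittleness is easy to reproduce. In the sketch below, the pattern and the three replies are invented for illustration; they stand for one finding phrased three plausible ways, of which a typical line-scraping regex catches only one.

```python
import re

# A typical line-scraping pattern: "<file>:<line>: <message>".
FINDING_LINE = re.compile(r"^(?P<file>\S+):(?P<line>\d+): (?P<message>.+)$")

# Three phrasings of the same finding (invented examples).
replies = [
    "src/auth.py:42: password compared with ==",
    "src/auth.py - line 42 - password compared with ==",
    "Here is what I found:\nsrc/auth.py (42): password compared with ==",
]

# Only the first phrasing matches; the other two fail silently.
matched = [bool(FINDING_LINE.match(reply)) for reply in replies]
print(matched)  # [True, False, False]
```

Nothing raises an error in the two failing cases, which is exactly why the breakage tends to surface as missing comments rather than a failed build.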

Switching to --output-format json with a schema eliminates this entire class of bug. There is no formatting to guess at because the contract is explicit, and there is no prose to parse because the fields are named. This is the same principle that Domain 4 applies to extraction with tool use and JSON schemas, brought into the CI context: define the structure you need and let the model fill it, rather than hoping its prose happens to be parseable.

The three output formats, and when each fits

--output-format is not a binary switch between prose and JSON; it has three settings, and choosing the wrong one is a quieter mistake than skipping it entirely. text is the default human-readable prose. json returns a single structured object once the run completes, the right choice for a CI step whose job is to hand a finished result to another tool. stream-json emits newline-delimited JSON, one event object per line, as the run progresses; it is built for real-time interfaces that render tokens and tool calls as they happen.

The trap is reaching for stream-json because it sounds the most advanced. For a pipeline step that produces findings and passes them on, streaming is the wrong shape: you would have to reassemble many event lines into a final answer yourself, which reintroduces parsing fragility from a different direction. A CI reviewer wants one object it can hand to jq, so json is the format. Reserve stream-json for places where a human is watching output arrive (a live dashboard, say), where the documented pattern pairs it with --verbose and --include-partial-messages and filters text deltas with a small jq expression. The stream even carries system events such as system/init, which reports the model and loaded plugins, and system/api_retry, which fires before a retryable API error is retried, so a streaming consumer can surface progress. A step that only needs the final answer has no use for any of that.
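To make the reassembly cost concrete, here is a minimal sketch of what a consumer of stream-json has to do just to recover the text. The event shapes in SAMPLE_STREAM (a stream_event wrapping a text_delta, plus a system event with subtype init) are assumptions modelled on the filtering pattern mentioned above, not a complete event reference, and collect_text is a hypothetical helper.

```python
import json

# Invented sample of newline-delimited events; shapes are assumptions.
SAMPLE_STREAM = "\n".join([
    json.dumps({"type": "system", "subtype": "init"}),
    json.dumps({"type": "stream_event",
                "event": {"delta": {"type": "text_delta", "text": "Hello, "}}}),
    json.dumps({"type": "stream_event",
                "event": {"delta": {"type": "text_delta", "text": "world"}}}),
])


def collect_text(stream):
    """Concatenate text deltas, skipping every other event type."""
    parts = []
    for line in stream.splitlines():
        if not line.strip():
            continue                      # tolerate blank lines
        event = json.loads(line)          # one JSON object per line
        if event.get("type") != "stream_event":
            continue                      # e.g. system events
        delta = event.get("event", {}).get("delta", {})
        if delta.get("type") == "text_delta":
            parts.append(delta.get("text", ""))
    return "".join(parts)


print(collect_text(SAMPLE_STREAM))  # Hello, world
```

With --output-format json none of this loop exists: the finished answer is already one object.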

The metadata you get for free

A common misread of --output-format json is that the envelope matters only for the result text. In fact the object carries operational metadata that turns a one-shot run into something a pipeline can govern. The response includes total_cost_usd and a per-model cost breakdown, so a scripted caller can track spend per invocation without opening the usage dashboard. That is useful for a job that wants to alert or fail when a single run grows unexpectedly expensive. The object also carries the session_id, the thread you capture now and resume later. That single field is the bridge to incremental review: store it from the first review and a later run can continue the very same conversation with --resume rather than starting cold.
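Both uses of the metadata can be sketched in a few lines of Python. The budget threshold, the helper names, and the sample envelope below are all hypothetical; only total_cost_usd, session_id, and --resume come from the text above.

```python
import json

MAX_COST_USD = 0.50  # hypothetical per-run budget; choose your own threshold


def within_budget(envelope, limit=MAX_COST_USD):
    """Return True when the run's reported cost is at or under the limit."""
    return float(envelope.get("total_cost_usd", 0.0)) <= limit


def resume_command(envelope, prompt):
    """Build the argv for a follow-up run that continues the same session."""
    return [
        "claude", "-p", prompt,
        "--resume", envelope["session_id"],  # captured from the first run
        "--output-format", "json",
    ]


# Invented envelope standing in for the first review's output.
first_run = json.loads(
    '{"result": "ok", "session_id": "abc-123", "total_cost_usd": 0.07}'
)

print(within_budget(first_run))
print(resume_command(first_run, "Review only the new commits"))
```

A CI job would typically persist the session ID as a build artifact or cache entry so the later run can find it; where to store it is left open here.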

This is why structured output is worth adopting even before you need a strict schema. The plain json envelope already gives you a stable place to read three things a maturing pipeline tends to want before long: the answer, the cost, and the session handle. The schema is the second layer you add only when a downstream step needs specific, typed fields. Reaching for json first and --json-schema second keeps the progression natural: get a reliable envelope, then pin the shape inside it when the consumer demands it.

Pairing a reviewer persona with structured output

The flags compose cleanly. A frequent CI pattern sets the reviewer's role with --append-system-prompt while still returning machine-parseable output. For example, a pull-request diff is piped into a run that is told to behave as a security engineer and to emit JSON. Anthropic documents exactly this shape: a gh pr diff piped into claude -p with an appended system prompt of "You are a security engineer. Review for vulnerabilities." and --output-format json. The persona shapes what Claude looks for; the output format guarantees the answer returns as data the next step can consume. Keeping those two concerns separate (what to review versus how to return it) is what makes the invocation both expressive and parseable, and it generalises to any reviewer role you want to script.
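Driven from a script rather than a shell pipe, the same composition might look like the sketch below. It assumes the diff has already been fetched as text (for instance from gh pr diff); review_diff and its injectable run parameter are hypothetical conveniences so the function can be exercised without the CLI installed.

```python
import json
import subprocess

PERSONA = "You are a security engineer. Review for vulnerabilities."


def review_diff(diff_text, run=subprocess.run):
    """Pipe a diff into a headless run and return the parsed JSON envelope.

    `run` defaults to subprocess.run and can be swapped for a stub in tests.
    """
    completed = run(
        ["claude", "-p",
         "--append-system-prompt", PERSONA,   # what to review
         "--output-format", "json"],          # how to return it
        input=diff_text,       # the diff arrives on stdin, as with a pipe
        capture_output=True,
        text=True,
        check=True,            # a non-zero exit raises instead of passing silently
    )
    return json.loads(completed.stdout)
```

Swapping PERSONA for another role changes what the run looks for without touching the parsing side, which is the separation of concerns the paragraph above describes.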

What still belongs to you after the schema

A schema constrains the shape of the answer, but it does not absolve the pipeline of judgement about the contents. Two habits keep schema-driven steps robust. First, treat the structured field as a checkpoint: read structured_output and confirm the expected key exists before you loop over it, so a run that somehow returned prose instead of conforming data fails loudly rather than feeding an empty array into your comment step. Second, keep the schema as tight as the downstream step actually needs. Marking the fields your loop reads as required means a missing severity or line is caught at parse time instead of producing a half-built comment. The schema is a contract, and a contract is only as useful as the side that checks it.
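The checkpoint habit can be sketched as a small guard that runs before the comment loop. REQUIRED_KEYS and extract_findings are hypothetical names, and the field list assumes the findings schema discussed earlier in this article.

```python
REQUIRED_KEYS = ("file", "line", "severity", "message")


def extract_findings(envelope):
    """Return the findings list, or raise ValueError if the contract is broken."""
    structured = envelope.get("structured_output")
    if not isinstance(structured, dict) or "findings" not in structured:
        # Covers a run that returned only prose in `result`.
        raise ValueError("structured_output.findings is missing")
    findings = structured["findings"]
    if not isinstance(findings, list):
        raise ValueError("findings is not an array")
    for index, finding in enumerate(findings):
        if not isinstance(finding, dict):
            raise ValueError(f"finding {index} is not an object")
        missing = [key for key in REQUIRED_KEYS if key not in finding]
        if missing:
            raise ValueError(f"finding {index} lacks {missing}")
    return findings
```

Raising rather than returning an empty list is the point: an empty list is a legitimate "no findings" result, so a broken contract must not be allowed to look like one.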

None of this is heavyweight. The whole point of structured output is that the checks become simple field lookups rather than fragile text heuristics. Where free-form parsing forced you to anticipate every way a sentence might be phrased, a schema lets you assert a handful of named properties and move on. That shift, from guessing at prose to asserting over fields, is the reason this knowledge point sits at the apply level rather than at remember: you are not recalling a flag, you are designing a reliable data contract between Claude and the rest of the pipeline.

How this knowledge point is tested

This is an apply-level knowledge point in Scenario 5, so exam items hand you a concrete CI failure and ask for the most robust fix. The tell-tale setup is a pipeline that parses Claude's text output and keeps breaking, or a requirement to feed findings into another automated system such as the GitHub API. The wrong answers are tempting because they sound like reasonable engineering: tune the temperature, ask for a tidier text format, or add more parsing rules. The right answer recognises that text is the problem and that --output-format json (with --json-schema when a specific shape is required) is the durable solution. Remember that structured output is a property of how you invoke the run, which is why headless mode is its hard prerequisite.

Worked example

A CI job finds security issues and a later step must post each one as an inline PR comment, but the text-parsing step keeps breaking.

The pipeline runs claude -p "find security issues in the diff and describe each one" and captures the prose. A Bash script then tries to split that prose into individual findings with a regular expression and call the GitHub API for each. It worked when it shipped, but two weeks later half the PRs get either no comments or a single garbled one, because Claude phrased a few findings differently and the regex stopped matching.

The durable fix changes the invocation, not the parser. Run claude -p "find security issues in the diff" --output-format json --json-schema '{"type":"object","properties":{"findings":{"type":"array","items":{"type":"object","properties":{"file":{"type":"string"},"line":{"type":"integer"},"severity":{"type":"string"},"message":{"type":"string"}},"required":["file","line","severity","message"]}}},"required":["findings"]}'. Now the response carries a structured_output.findings array with a guaranteed shape.

The comment step becomes trivial and stable: jq -c '.structured_output.findings[]' yields one finding per line, and the script loops over them, reading .file, .line, .severity, and .message to build each inline comment. No phrasing change can break it, because the script never reads prose. It reads fields whose names and types the schema pins down. The same data also lets the job fail the build when any finding has a critical severity, simply by checking the array.
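A minimal Python sketch of that comment step follows. It only builds payloads and an exit code; the actual HTTP call is left out. The payload keys (body, commit_id, path, line, side) follow GitHub's REST endpoint for pull-request review comments as I understand it, so check them against the current API reference. The "critical" label is also an assumption, since the schema above leaves severity as a free string.

```python
def to_review_comment(finding, commit_sha):
    """Map one finding to a pull-request review comment payload."""
    return {
        "body": f"[{finding['severity']}] {finding['message']}",
        "commit_id": commit_sha,
        "path": finding["file"],
        "line": finding["line"],
        "side": "RIGHT",  # comment on the new version of the file
    }


def exit_code(findings, blocking=("critical",)):
    """Return 1 if any finding has a blocking severity, else 0."""
    is_blocking = any(f["severity"].lower() in blocking for f in findings)
    return 1 if is_blocking else 0


# Invented findings standing in for .structured_output.findings.
findings = [
    {"file": "src/auth.py", "line": 42, "severity": "critical",
     "message": "Password compared with == instead of a constant-time check."},
    {"file": "src/api.py", "line": 7, "severity": "low",
     "message": "Verbose error message leaks a stack trace."},
]

payloads = [to_review_comment(f, "0123abc") for f in findings]
print(len(payloads), exit_code(findings))  # 2 1
```

Adding an enum for severity to the schema would remove the need for the lower() normalisation and make the blocking check exact.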

Misconceptions worth pinning down

These three traps separate architects who understand the mechanism from those who pattern-match on symptoms.

Misconception

If the text output keeps breaking my parser, I should lower the temperature so the formatting is identical every run.

What's actually true

Temperature nudges word choice; it does not guarantee a parseable structure, and any drift still breaks the regex. The reliable fix is to stop parsing prose: --output-format json returns named fields, and --json-schema pins their shape, so there is nothing to format-match.

Misconception

--output-format json and --json-schema are the same thing, so I only need one of them.

What's actually true

They are layered. --output-format json gives you a structured envelope whose result field is still free-form text. Adding --json-schema constrains the answer to a shape you define and returns it in structured_output. Use the schema when a downstream step needs specific, typed fields.

Misconception

stream-json is the structured format I should use to feed findings into another CI tool.

What's actually true

stream-json emits one JSON event per line as the run progresses, which suits a live streaming UI. A CI step that hands a finished result to another tool wants the single consolidated object from --output-format json, optionally shaped by --json-schema. Choosing stream-json there forces you to reassemble events into a final answer yourself, trading one parsing problem for another.

Check your understanding

A CI job runs Claude Code to find security issues, and a later step must post each finding as an inline PR comment through the GitHub API. The current step captures Claude's plain-text reply, and a brittle regex keeps breaking when the wording changes. What is the most robust fix?
