JSON output different from normal output

Hi everyone,

Does anyone know how OpenAI formats the output as JSON when we define a schema in a function via the API?

Does the same LLM that generates the response also process the JSON formatting? If so, has anyone measured whether using JSON mode impacts performance? If a single LLM is handling both tasks (following the prompt request and adhering to the schema instructions), I imagine it could affect the model’s self-attention and potentially change its performance.

It wouldn’t be too hard to test; I could set up a coding experiment or a math question and compare responses in normal mode versus JSON mode (something like the sketch below). But I’m curious whether anyone has already run such experiments. I’m particularly interested in non-coding and non-math questions; it’s just that differences and mistakes are harder to measure there.
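A minimal comparison sketch, assuming the official openai Python SDK (1.x) and an OPENAI_API_KEY in the environment; the model name and the question are placeholders:

```python
# Same question asked twice: plain chat vs. json_object mode.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # placeholder model
QUESTION = "A train travels 120 km in 1.5 hours. What is its average speed?"

# Normal mode: no output constraints.
plain = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": QUESTION}],
)

# JSON mode: same question, plus the formatting instructions the mode requires.
structured = client.chat.completions.create(
    model=MODEL,
    response_format={"type": "json_object"},
    messages=[
        {
            "role": "system",
            "content": 'Reply as a JSON object like {"reasoning": "...", "answer": "..."}.',
        },
        {"role": "user", "content": QUESTION},
    ],
)

print(plain.choices[0].message.content)
print(structured.choices[0].message.content)
```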

The alternative would be that a separate LLM or program post-processes the initial model response into JSON. If that were the case, we would expect output quality to remain the same, but latency could increase (which does seem to happen with JSON responses). On top of that, this setup would likely be more costly for OpenAI, since a single request would require two inference passes, so I don’t think they would want that.

Also, is it just me, or does this whole thing feel a bit unscientific? Features get released, sometimes they work, sometimes they don’t, and the inner workings remain secret, making it difficult for companies to decide whether to adopt them. That doesn’t exactly encourage businesses to integrate LLMs into their systems.

First: if you are sending a “function” schema, you are getting an AI that emits a call to that function, with arguments in that structure, whenever the function is useful.
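For example, a minimal tool-calling sketch, assuming the openai Python SDK; the function name and fields are illustrative:

```python
# The "function" case: the model emits a tool call whose arguments
# follow your schema, rather than a JSON response body.
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical function
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's the weather in Lisbon?"}],
    tools=tools,
)

# If the model decided the function is useful, the call (not a normal
# message) comes back here, with JSON arguments matching the schema.
print(resp.choices[0].message.tool_calls)
```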

Perhaps you are instead referring to JSON output requested via the response_format parameter, which comes in several variants (a usage sketch of the strict variant follows this list):

  • text - you simply tell the AI in the prompt what the structured response must look like, as if it were being sent directly to an API.
  • json_object - output is constrained to valid JSON, but you must still instruct the AI in a developer/system message how to write the desired JSON response, as though addressing a parser.
  • json_schema - you provide a JSON schema that instructs, contains, and validates the AI’s JSON response; the schema itself is given to the AI to follow.
  • json_schema + strict - you provide a JSON schema, and additional enforcement ensures that only output matching that schema (including key ordering) can be produced.
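Here is a sketch of the json_schema + strict variant, assuming the openai Python SDK; the schema and model name are illustrative:

```python
# json_schema + strict: the schema both instructs the model and
# constrains sampling so only conforming output can be produced.
from openai import OpenAI

client = OpenAI()

schema = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "confidence": {"type": "number"},
    },
    "required": ["answer", "confidence"],
    "additionalProperties": False,  # required when strict is true
}

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is the capital of Portugal?"}],
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "qa", "strict": True, "schema": schema},
    },
)

print(resp.choices[0].message.content)  # guaranteed to match the schema
```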

In all cases, the contents of the strings and the selection and use of keys come from the same chat model. It is still the model itself writing the JSON you receive.

A bunch of instructions along the lines of “you aren’t talking to a user, you are sending to a pretty display API” will naturally change the quality of the output. The more you want “user_sentiment” and “user_memories_to_store” or other concurrent jobs, the more a “response_to_user” key will diverge from what the AI would normally produce (a hypothetical schema of that shape is sketched below).
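A hypothetical multi-job schema of this kind, using the key names from above; the structure is only an assumption for illustration:

```python
# The more concurrent jobs ride along in sibling keys, the more
# "response_to_user" can drift from what plain chat would produce.
multi_job_schema = {
    "type": "object",
    "properties": {
        "user_sentiment": {"type": "string"},
        "user_memories_to_store": {
            "type": "array",
            "items": {"type": "string"},
        },
        "response_to_user": {"type": "string"},
    },
    "required": ["user_sentiment", "user_memories_to_store", "response_to_user"],
    "additionalProperties": False,
}
```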

You could encapsulate the whole response in a single JSON key for some other use, but that doesn’t seem productive. JSON is typically useful when you want more data fields than just “Fine, and how are you?”.