Open AI, The Company Behind the popular ChatGPT Large language model Generative AI service, has released a tool that will return query results in the exact format specified by users.

It is most requested among developers using AI technology, According to To the CEO of OpenAI Sam Altman.

Since its launch in November 2022, the ChatGPT service has been used by countless application developers. Agent-based and other apps. But while ChatGPT uses unstructured data to deliver its results (via API-driven Function calling), application developers will prefer to deliver these results as structured data, so that they can be better analyzed by their respective apps.

Last year, OpenAI expanded its API to include rendered results. JSON. This week, the company goes a step further, revealing a new API feature called Structured outputswhich will provide whatever results. JSON schema The developer states in the query.

“Developers have long been working around the limitations of LLMs. […] Through open source tooling, prompting, and retrying frequent requests to ensure that model output matches the formats required to interoperate with their systems,” wrote Michelle Pokras, Open AI Technical staff member, in one Blog item Posted on Tuesday.

“Structured outputs solve this problem by training OpenAI models to match developer-supplied schemas and to better understand complex schemas.”

How Structured Outputs Improve LLM Quality

Structured outputs differ from those produced by simple user prompts in that they are constrained in what information they can convey, a technique known as “constrained sampling” or “constrained decoding”. .

“To force correct outputs, we limit our models to only the tokens that will be valid according to the provided schema, rather than all available tokens,” Pokras explained.

For ChatGPT, this extra step of filling in the schema improves its accuracy.

OpenAI developers found in tests that ChgatGPT was able to fill in the default schema 100% correctly, but would only provide correct answers 85% of the time with simple prompts.

The OpenAI chart compares the reliability of ChatGPT responses generated for JSON schemas versus those generated from the command prompt.

The OpenAI chart compares the reliability of ChatGPT responses generated for JSON schemas versus those generated from the command prompt.

How to Generate Structured Output

Now, developers provide a JSON schema when their apps submit a request. json_schemawhich is a new option for response_format Given the parameter tough The value must be set as “true” inside the function definition. The output of the model then best matches the answers to the schema (this works for both AI tools and direct user inquiries).

Answers will still adhere to OpenAI. Safety requirements — which blocks potentially harmful content — and will return refusal A string value for questions that will not answer this. And there are other limitations: it only supports a subset. JSON schema. This will not prevent errors in the model definition, and there is an additional delay in the first response, as ChatGPT configures the developer’s schema.

both of them Node.js And The python The software development kits from OpenAI have been updated with new response_format parameter

The biggest potential use case is, of course, formatting. Unstructured data In structured data, it can therefore be entered and analyzed. Relational database systems. Which has long been a challenge for organizations with a lot of information buried in office documents.

But poker does illustrate some potential innovative uses of the technology, including creating a user interface based on user input and providing a single answer without supporting content.

It’s a “great feature and much needed,” says the machine learning researcher Elvis Saravia noted On X. Saravia created a tutorial on using Structured Outputs for YouTube:

The group Created with Sketch.





Source link