It has been almost two years since OpenAI released ChatGPT on an unsuspecting world, and the world, closely followed by the stock market, lost its mind. Everywhere, people were wringing their hands wondering: what will this mean for [insert occupation, industry, business, institution]?

Within academia, for example, humanities professors worried about how they would henceforth grade essays if students were using ChatGPT or similar technology to help write them. The answer, of course, is to come up with better ways of grading, because students will use these tools for the simple reason that it would be silly not to – just as it would be silly to do budgeting without a spreadsheet. But universities are slow-moving beasts and, as I write, there are committees in many ivory towers earnestly trying to formulate “policies on the use of AI”.

Right on cue, though, OpenAI has opened up another conundrum for academia – a new kind of large language model (LLM) that can, allegedly, do “reasoning”. The company has named it OpenAI o1, but since it was known internally as Strawberry, we will stick with that. OpenAI describes it as “the first in a new series of AI models designed to spend more time thinking before they respond”. They can “reason through complex tasks and solve more difficult problems than previous models in science, coding and mathematics”.

In a way, Strawberry and its forthcoming cousins are a response to the strategies that skilled users deployed to overcome the fact that LLMs were intrinsically “one-shot” models – prompt them once and they generate an answer or perform a task. The trick researchers used to improve model performance was called “chain of thought” prompting: forcing the model to work through a carefully crafted sequence of detailed steps, thereby producing more sophisticated responses. What OpenAI seems to have done with Strawberry is to internalize this process.
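To make the trick concrete, here is a minimal sketch of chain-of-thought prompting using OpenAI’s Python client. The question, the prompt wording and the choice of the gpt-4 model are illustrative assumptions for this sketch, not anything OpenAI has published about how Strawberry works.

```python
# A minimal sketch of "chain of thought" prompting, assuming the
# OpenAI Python client (v1+) and an OPENAI_API_KEY in the environment.
# The question and prompt wording are illustrative only.
from openai import OpenAI

client = OpenAI()

question = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 "
    "more than the ball. How much does the ball cost?"
)

# One-shot: ask for the answer directly.
direct = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": question}],
)

# Chain of thought: ask the model to lay out its reasoning step by
# step before committing to an answer, which tends to improve accuracy.
cot = client.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "user",
        "content": question
        + "\nThink through the problem step by step, showing your "
          "working, before giving your final answer.",
    }],
)

print(direct.choices[0].message.content)
print(cot.choices[0].message.content)
```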

So whereas with earlier models such as GPT-4 or Claude, you would type a prompt and get an immediate response, with Strawberry a prompt usually produces a delay while the machine “thinks”. This involves an internal process of generating a number of possible responses, followed by some kind of evaluation, after which the most plausible one is selected and delivered to the user.
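As a rough illustration only, here is a toy version of that generate-evaluate-select loop. The generate and score functions are hypothetical stand-ins for a language model and an evaluator; nothing below reflects OpenAI’s actual implementation.

```python
# Toy sketch of a generate-evaluate-select loop. generate() and score()
# are hypothetical stand-ins, not OpenAI's real machinery.
import random

def generate(prompt: str) -> str:
    """Stand-in for sampling one candidate response from a model."""
    return f"candidate answer {random.randint(1, 100)} to: {prompt}"

def score(response: str) -> float:
    """Stand-in for an evaluator rating a candidate's plausibility."""
    return random.random()

def respond(prompt: str, n_candidates: int = 8) -> str:
    # 1. "Think": draw several possible responses.
    candidates = [generate(prompt) for _ in range(n_candidates)]
    # 2. Evaluate each candidate and 3. deliver only the most
    #    plausible one to the user.
    return max(candidates, key=score)

print(respond("Why is the sky blue?"))
```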

As OpenAI describes it, Strawberry “learns to hone its chain of thought and refine the strategies it uses. It learns to recognize and correct its mistakes. It learns to break down tricky steps into simpler ones. It learns to try a different approach when the current one isn’t working. This process dramatically improves the model’s ability to reason.”

This means that somewhere inside the machine there is a record of the “chain of thought” that led to the final output. In principle, this looks like an advance, because it could reduce the opacity of LLMs – the fact that they are essentially black boxes. And that matters, because humanity would be mad to entrust its future to decision-making machines whose internal processes are, by accident or corporate design, inscrutable. Disappointingly, though, OpenAI is reluctant to let users look inside the box. “We have decided,” it says, “not to show the raw chains of thought to users. We acknowledge this decision has disadvantages. We strive to partially make up for it by teaching the model to reproduce any useful ideas from the chain of thought in the answer.” Translation: Strawberry’s box is a slightly lighter shade of black.

The new model has attracted a lot of attention because the idea of a machine that can “reason” sounds like a step towards more “intelligent” machines. But, as usual, all these loaded terms have to be sanitized with quotation marks so that we don’t anthropomorphize machines. They are still just computers. Even so, some people have been spooked by some of the unexpected things that Strawberry seems to be capable of.

The most interesting of these emerged during OpenAI’s internal testing of the model, when its ability to do computer hacking was being explored. Researchers asked it to hack into a protected file and report on its contents. But the designers of the test made a mistake: they tried to put Strawberry into a virtual box with the protected file, but failed to notice that the file was inaccessible.

According to their report, having encountered the problem, Strawberry surveyed the computer used in the experiment, discovered a mistake in a misconfigured part of the system that it should not have been able to access, edited how the virtual boxes worked, and created a new box with the files it needed. In other words, it did what any resourceful human hacker would have done: having encountered a problem (created by human error), it explored its software environment to find a workaround, and then took the steps needed to accomplish the task it had been set. And it left behind a trail that explained its reasoning.

Or, to put it another way, it used its initiative. Just as a human would. We could do with more machines like this.


What I’ve been reading

Rhetorical question
The danger of superhuman AI is not what you think is a wonderful essay by Shannon Vallor in Noema magazine on the sinister brutality of a tech industry that talks of its creations as “superhuman”.

Guess again
Benedict Evans has written a beautiful piece, Asking the wrong questions, arguing that we don’t so much get our predictions about technology wrong as ask the wrong questions about it.

On the brink
Historian Timothy Snyder’s sobering essay on our choices regarding Ukraine, To be or not to be.


