
Google researchers last week unveiled a new artificial intelligence (AI) architecture that could enable large language models (LLMs) to remember the long-term context of events and topics. The Mountain View-based tech giant published a paper on the subject, and the researchers claim that AI models trained using this architecture showed a more “human-like” memory retention ability. Notably, Google moved beyond traditional transformer and recurrent neural network (RNN) architectures to develop a new way to teach AI models how to remember contextual information.
Titans can scale the context window of AI models to over 2 million tokens.
The project’s lead researcher, Ali Behrouz, posted about the new architecture on X (formerly known as Twitter). He claimed that the new architecture provides a meta-contextual memory with attention that teaches AI models how to remember information during test-time computation.
According to Google’s paper, published in the preprint online journal arXiv, the Titans architecture can expand the context window of AI models beyond two million tokens. Memory has been a difficult problem to solve for AI developers.
Humans remember information and events with context. If someone asks a person what they wore last weekend, they may be able to recall additional contextual information, such as attending a birthday party for someone they have known for the past 12 years. That way, when asked a follow-up question about why they wore a brown jacket and denim jeans last weekend, the person will be able to contextualise all of that short-term and long-term information.
AI models, on the other hand, typically use retrieval-augmented generation (RAG) systems, modified for transformer and RNN architectures. These treat pieces of information as neural nodes. When an AI model is asked a question, it accesses the specific node that contains the key information, as well as nearby nodes that may contain additional or related information. However, once a query is resolved, the information is removed from the system to save processing power.
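The retrieval step described above can be sketched with a toy example. This is purely illustrative and not Google's or any library's actual implementation: real RAG systems use learned embeddings and vector databases, and all names and vectors below are hypothetical.

```python
# Toy sketch of RAG-style retrieval: rank stored "nodes" by similarity
# to a query vector and return the closest snippets. Hypothetical data.

def dot(a, b):
    # Dot product as a stand-in for a real similarity measure.
    return sum(x * y for x, y in zip(a, b))

# Each node pairs a (made-up) embedding vector with a text snippet.
nodes = [
    ([1.0, 0.0], "Titans was announced in January 2025."),
    ([0.9, 0.1], "Titans scales context to 2M tokens."),
    ([0.0, 1.0], "An unrelated fact about something else."),
]

def retrieve(query_vec, k=2):
    # Rank all nodes by similarity to the query; keep the top-k snippets
    # (the "key node" plus its nearest neighbours).
    ranked = sorted(nodes, key=lambda n: dot(query_vec, n[0]), reverse=True)
    return [text for _, text in ranked[:k]]

context = retrieve([1.0, 0.0])
# The retrieved snippets are prepended to the prompt for one answer,
# then discarded -- nothing persists into the next session.
```

The last comment is the crux of the limitation the article goes on to describe: retrieval is per-query, so nothing carries over once the session ends.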
This approach has two downsides. First, an AI model cannot remember information in the long term. If someone wants to ask a follow-up question after a session is over, they have to provide the full context again (unlike how humans function). Second, AI models do a poor job of retrieving long-term contextual information.
With Titans AI, Behrouz and his fellow Google researchers sought to build an architecture that enables AI models to develop a long-term memory that can run continuously, while forgetting information so that it remains computationally efficient.
To this end, the researchers designed an architecture that encodes history into the parameters of a neural network. Three variants were used – memory as context (MAC), memory as gating (MAG), and memory as layer (MAL) – each suited to different tasks.
Additionally, Titans uses a new surprise-based learning system, which tells AI models to remember unexpected or important information about a subject. These two changes allow the Titans architecture to show improved memory function in LLMs.
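The idea of surprise-gated memory can be illustrated with a minimal sketch. This is an assumption-laden toy, not Google's code: the paper measures surprise via gradients of the memory's loss, whereas here it is approximated by simple prediction error, and all class and parameter names are invented for illustration.

```python
# Hypothetical sketch of a surprise-gated memory write. "Surprise" is
# approximated as prediction error; entries decay over time so the
# memory can forget, keeping its size (and compute cost) bounded.

class SurpriseMemory:
    def __init__(self, threshold=0.5, decay=0.9, capacity=8):
        self.store = []              # list of (surprise_score, item)
        self.threshold = threshold   # minimum surprise to be written
        self.decay = decay           # per-step forgetting factor
        self.capacity = capacity     # bounded number of memory slots

    def observe(self, item, predicted, actual):
        surprise = abs(actual - predicted)  # crude proxy for surprise
        # Decay existing scores so stale memories fade out over time.
        self.store = [(s * self.decay, it) for s, it in self.store]
        # Only sufficiently surprising observations are written.
        if surprise > self.threshold:
            self.store.append((surprise, item))
        # Keep only the most surprising entries within capacity.
        self.store = sorted(self.store, reverse=True)[:self.capacity]

mem = SurpriseMemory()
mem.observe("brown jacket at the party", predicted=0.1, actual=0.9)  # kept
mem.observe("a routine, expected fact", predicted=0.5, actual=0.55)  # dropped
```

The combination of a write threshold and score decay mirrors the article's description: remember what is unexpected, and forget the rest to stay efficient.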
In the BABILong benchmark, Titans (MAC) shows outstanding performance, where it effectively scales to a larger than 2M context window, outperforming large models such as GPT-4, Llama3 + RAG, and Llama3-70B.
— Ali Behrouz (@behrouz_ali) January 13, 2025
In a separate post, Behrouz claimed that based on internal testing on the BABILong benchmark, Titans (MAC) models were able to outperform major AI models such as GPT-4, Llama 3 + RAG, and Llama 3 70B.