In the world of large language models, the tech underpinning artificial intelligence, size matters. And Google said it’s allowing users to feed its Gemini 1.5 Pro model more data than ever.
During the Google I/O developers conference on Tuesday, Alphabet CEO Sundar Pichai said Google is increasing Gemini 1.5 Pro’s context window from 1 million to 2 million tokens. Pichai said the update will be made available to developers in “private preview,” but stopped short of saying when it may be available more broadly.
“It’s amazing to look back and see just how much progress we’ve made in a few months,” Pichai said after announcing that Google is doubling Gemini 1.5 Pro’s context window. “And this represents the next step on our journey towards the ultimate goal of infinite context.”
Large language models, or LLMs, such as Gemini 1.5 Pro, are AI models trained on enormous amounts of data to understand language, so that tools like Gemini — the search giant’s competitor to ChatGPT — can generate content humans can understand.
Doubling Gemini 1.5 Pro’s context window from 1 million to 2 million tokens could dramatically improve the results you get from Google’s LLM. But tokens, context windows and other AI jargon are decidedly nebulous. And without some of that context Pichai was so interested in discussing, it may be difficult to know why 2 million tokens is such a big deal.
Read on for a primer on tokens, and how increasing the number can change how you use and interact with Gemini going forward. And for more on Gemini and other AI tools like ChatGPT, Microsoft Copilot, Perplexity and Claude as well as news, tips and explainers on all things AI, check out CNET’s AI Atlas resource.
What are tokens in AI?
In AI, tokens are pieces of words that the LLM evaluates to understand the broader context of a query. A token averages about four characters in English. Those characters can be letters and numbers, of course, but also spaces, special characters and more. It’s also important to note that a token’s length will vary by language.
As AI models add the ability to analyze images, video and audio, they similarly use tokens to get the full picture. If you input an image into a model for context, the model will break the picture down into parts, with each part converted into tokens.
Tokens are used both as inputs and outputs. When users input a query into an AI model, the model breaks the words down into tokens, analyzes them and delivers a response in tokens that are then converted back into words humans understand.
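If you're curious what that roundtrip looks like, here's a minimal sketch using tiktoken, OpenAI's open-source tokenizer library. Gemini uses its own tokenizer, so the token IDs and counts below are illustrative, not what Gemini would actually produce.

```python
# pip install tiktoken
import tiktoken

# cl100k_base is an encoding used by several OpenAI models; it stands in
# here for whatever tokenizer a given LLM actually uses.
enc = tiktoken.get_encoding("cl100k_base")

prompt = "How do context windows work?"
token_ids = enc.encode(prompt)   # words -> tokens (a list of integers the model sees)
print(token_ids)

text = enc.decode(token_ids)     # tokens -> words humans understand
print(text)                      # "How do context windows work?"
```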
OpenAI, the company that owns ChatGPT, offers a handy example for understanding tokens. Have you ever heard Wayne Gretzky’s famous quote, “You miss 100% of the shots you don’t take”? That sentence is made up of 11 tokens. If you swap out the percentage symbol for the word percent, the count increases to 13 tokens.
If you’re interested in seeing how many tokens make up your text, check out OpenAI’s Tokenizer tool, which allows you to input text and see how many tokens it uses.
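You can run the same check programmatically. Here's a short sketch with tiktoken; exact counts vary from tokenizer to tokenizer, so the numbers it prints may differ slightly from OpenAI's published example.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for quote in (
    "You miss 100% of the shots you don't take",
    "You miss 100 percent of the shots you don't take",
):
    # Small wording changes can shift the token count even when the
    # character count barely moves.
    print(len(enc.encode(quote)), "tokens:", quote)
```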
Understanding how many tokens are contained in any word or sentence is important. The more tokens available in a context window, the more data you can input into a query and the more data the AI model will understand and use to deliver results.
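In practice, that means checking whether your input fits before you send it. Here's a minimal sketch, assuming a 2 million-token limit and again using tiktoken as a stand-in for the model's real tokenizer:

```python
import tiktoken

CONTEXT_WINDOW = 2_000_000  # Gemini 1.5 Pro's expanded limit, per Google

enc = tiktoken.get_encoding("cl100k_base")

def fits(text: str, window: int = CONTEXT_WINDOW) -> bool:
    """True if the text's estimated token count fits within the context window."""
    return len(enc.encode(text)) <= window
```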
What does the context window do?
No conversation about tokens is complete without explaining the context window. Indeed, it’s in the context window where tokens are used — and matter most.
Think of a context window as the length of your memory. The bigger the context window, the more memory you can access to understand what someone is saying and answer them appropriately. Context windows help AI models remember information and reuse it to deliver better results to users. The larger the context window (meaning, the more tokens it can use in a dialogue with users), the better its results.
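A long context window is, in effect, that memory. Here's a minimal sketch of how a chat app might trim its history to fit a token budget; the function and budget are illustrative, not how Gemini actually manages context.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def trim_history(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose combined token count fits the budget.

    Older messages get dropped first, which is why a chatbot with a small
    context window appears to "forget" the start of a conversation.
    """
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):    # walk from newest to oldest
        cost = len(enc.encode(msg))
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))       # restore chronological order
```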
“You might have had an experience where a chatbot ‘forgot’ information after a few turns,” Google wrote in a blog post earlier this year. “That’s where long context windows can help.”
Why would it be better to have more tokens?
So, why are more tokens better? It comes down to simple math.
The more tokens a context window can accept, the more data you can input into a model. The more data you can input, the more information the AI model can use to deliver responses. The better the responses, the more valuable the experience of using an AI model.
Think of it this way: If you wanted a synopsis of an important moment in world history, giving an AI model only a sentence to digest wouldn’t produce a very useful summary. But imagine feeding it an entire book about the event, and the superior result you’d receive. The latter is only made possible with more tokens.
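Some rough back-of-the-envelope math shows why, using the roughly four-characters-per-token rule of thumb mentioned earlier. The book length and word size here are illustrative assumptions; actual counts vary by tokenizer and text.

```python
CHARS_PER_TOKEN = 4      # rough English average
WORDS_IN_BOOK = 100_000  # a typical full-length book
CHARS_PER_WORD = 6       # ~5 letters plus a space

book_tokens = WORDS_IN_BOOK * CHARS_PER_WORD / CHARS_PER_TOKEN
print(f"One book is roughly {book_tokens:,.0f} tokens")                    # ~150,000
print(f"A 2M-token window could hold ~{2_000_000 / book_tokens:.0f} such books")
```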
When will Google’s updated context window be available?
Google’s updated context window is only launching on its Gemini 1.5 Pro model for now. Pichai said it’ll be available to developers in a “private preview” first, with Google revealing later during the I/O event that it would be launched “later this year.” So, stay tuned.
What is infinite context and when will we get there?
Pichai referenced a future in which we’ll get to “infinite context,” a point at which LLMs will be able to ingest and output an infinite amount of data, effectively giving them access to all the world’s knowledge to deliver superior results. But truth be told, we’re nowhere close.
One of the problems with expanding context windows is that each increase demands more compute power. And while infinite context is indeed something AI supporters are looking forward to, no one can say for sure when, or even if, compute power will reach a level where that’s possible.
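To see why, consider a simplified cost model: in a standard transformer, self-attention compares every token with every other token, so compute grows roughly with the square of the input length. Real systems layer on optimizations, so treat this as an illustration, not a measurement of Gemini.

```python
# Relative attention cost under a naive n^2 model, normalized to 1M tokens.
BASE = 1_000_000
for n in (1_000_000, 2_000_000, 10_000_000):
    print(f"{n:>10,} tokens -> ~{(n / BASE) ** 2:,.0f}x the attention compute")
```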
In a blog post in February, Google touted how, at the time, Gemini 1.5 Pro supported 1 million tokens. And while the company said it was working on expanding context windows further, its research at the time had achieved a context window of 10 million tokens — a far cry from infinite.
However, as you continue to use AI models, expect context windows to increase not only from Google but from other providers as well. And along the way, enjoy the better results that expanded token availability makes possible.
Editor’s note: CNET is using an AI engine to help create a handful of stories. Reviews of AI products like this, just like CNET’s other hands-on reviews, are written by our human team of in-house experts. For more, see CNET’s AI policy and how we test AI.