Google used its annual developer conference to showcase what the company is calling its lightest and most efficient artificial intelligence models.
At Google I/O on Tuesday, the company announced Gemini 1.5 Flash, the newest addition to the Gemini series. Google said in a blog post that the new model can quickly summarize conversations, caption images and videos and extract data from large documents and tables.
“We heard from developers that they wanted something faster and even more cost-effective,” said Demis Hassabis, CEO of Google DeepMind, in a press briefing.
The unveiling comes as tech companies increasingly refocus their product development and rollouts around generative AI, which is of particular importance to Google because the new tools give consumers more advanced and creative ways to access online information compared with traditional web search.
OpenAI on Monday launched a new AI model and desktop version of ChatGPT, along with a new user interface. The new model, called GPT-4o, is twice as fast as GPT-4 Turbo and half the cost, the company said.
Google recently announced an improved Gemini 1.5 Pro model, which can make sense of multiple large documents — 1,500 pages total — or summarize 100 emails, according to a vice president working on Gemini.
Gemini 1.5 Pro will soon be able to handle an hour of video content, or codebases with more than 30,000 lines, said Sissie Hsiao, a vice president at Google and the general manager for Gemini experiences.
“You can quickly get answers and insights about dense documents, like figuring out the details of the pet policy in your rental agreement or comparing key arguments of multiple long research papers,” Hsiao said.
OpenAI’s latest upgrade brings with it improved quality and speed and allows ChatGPT to handle 50 different languages. It will also be available via OpenAI’s application programming interface, or API, allowing developers to begin building applications using the new model immediately, executives said.
Google says Gemini 1.5 Pro supports 35 languages and has a 2 million token context window, a measure of how much information the model can process at once. The new model has improved logical reasoning, planning and image understanding, company executives said.
“It offers the longest context window of any foundational model yet,” Alphabet CEO Sundar Pichai said in the press briefing. At the event, he gave an example of a parent asking Gemini to summarize all recent emails from their child’s school.
Gemini 1.5 Pro will initially be available for testing in Workspace Labs. Gemini 1.5 Flash will be available for testing and in Vertex AI, Google’s machine learning platform that lets developers train and deploy AI applications.