Google’s Gemini AI: The Multimodal Supermodel Aiming to Outshine GPT-4 and Beyond
Gemini debuted in December 2023 as Google DeepMind’s multimodal AI with Ultra, Pro, and Nano variants. Gemini is trained as a native multimodal model that can process text, images, audio, code, and video. Gemini Ultra achieved 90.0% on the MMLU test, leading on 30 of 32 academic benchmarks and beating GPT-4’s ~86.4%. Gemini 1.0 released December 2023 offered Ultra, Pro, and Nano sizes, with Nano capable of running on Pixel phones for on-device tasks. Gemini 1.5 Pro, revealed February 2024 and broadly deployed by May 2024, used a multimodal Mixture-of-Experts architecture and supported a 1,000,000-token context window (up to 2,000,000