UPDATED: SAN FRANCISCO, May 6, 2026, 05:05 (PDT)
Google’s claim that a Bard-era AI system learned Bengali on its own is facing scrutiny after a former company researcher pointed to Google training data showing Bengali was already in the mix. The dispute cuts to a basic question in the AI race: whether companies are describing real technical surprises or dressing up known training effects as mystery.
It matters now because Bard no longer sits on the edge of Google’s product line. It has been folded into Gemini, which Google is pushing into cars, business software and developer tools, giving old claims about what its models “learn” fresh commercial weight, Reuters has reported.
The issue also lands in a harder market. Google is fighting OpenAI, Microsoft and other rivals for users and enterprise customers while trying to prove that its heavy AI spending can turn into durable revenue. Reuters reported in April that Google was putting AI agents at the heart of an enterprise push under the Gemini Enterprise name.
The Bengali claim traces back to a 2023 “60 Minutes” segment on Google’s AI work. James Manyika, a senior Google executive, said the company found that after limited prompting in Bengali, the system could “translate all of Bengali,” while Chief Executive Sundar Pichai described parts of AI behavior as a “black box” — industry shorthand for systems whose internal decision-making is hard to trace.
Margaret Mitchell, a former Google AI ethics researcher, challenged that framing. Google’s PaLM research paper listed Bengali in its non-code training data: 0.194 billion Bengali tokens, or 0.026%, and 0.042 billion tokens of Latin-script Bengali, or 0.006%. A token is a chunk of text used to train a model. That table does not prove the exact Bard system shown on television used the same data, but it weakens any simple claim that the model had no prior exposure to Bengali.
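The quoted figures also imply the scale of the dataset they came from. A back-of-the-envelope check, using only the rounded numbers above (so the result is approximate, not a figure from the paper), recovers the implied total size of the non-code training corpus:

```python
# Rough check using only the figures quoted above; rounding in the
# published percentages makes the result approximate.
bengali_tokens = 0.194e9   # 0.194 billion Bengali tokens
bengali_share = 0.00026    # 0.026% of the non-code training data

# Implied total size of the non-code training corpus.
implied_total = bengali_tokens / bengali_share
print(f"{implied_total / 1e9:.0f} billion tokens")  # prints "746 billion tokens"
```

Even at a fraction of a percent, that share corresponds to hundreds of millions of Bengali tokens in a corpus of hundreds of billions — small relative weight, but far from zero exposure.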
Emily M. Bender, a University of Washington linguistics professor, wrote in a Medium post after the broadcast that opacity about training data makes model performance look more surprising than it is. She said the rhetorical move of treating Bard like a person was misleading: “IT. IS. NOT.”
Google’s own early Bard blog used more restrained language. The company described a large language model, or LLM, as a “prediction engine” that chooses likely next words, and warned that such systems can produce inaccurate or false information while sounding confident.
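The “prediction engine” framing can be illustrated with a deliberately tiny sketch. This is not Google’s system — modern LLMs use neural networks over vast corpora — just a toy next-word predictor built from bigram counts, with an invented corpus and function name, to show the basic idea of choosing a likely next word:

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration only.
corpus = "the model predicts the next word the model predicts likely words".split()

# Count which word follows which (a minimal bigram "prediction engine").
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # prints "model" ("model" follows "the" twice, "next" once)
```

The sketch also shows why such systems can sound confident while being wrong: the predictor always returns its statistically likeliest continuation, with no notion of whether that continuation is true.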
Bard’s model lineage also changed quickly. Google first described Bard as powered by a lightweight version of LaMDA; in May 2023, it said PaLM 2’s multilingual capabilities were helping expand Bard to new languages. In February 2024, Google renamed Bard as Gemini and launched a mobile app and paid Gemini Advanced plan.
That rebrand was part of a broader contest with Microsoft and OpenAI. Jack Krawczyk, then a Google product lead, told Reuters that a $20-a-month AI product needed more than raw model access: “access to a model alone is not really enough.”
But the criticism has a limit. A training table cannot reconstruct every prompt, fine-tuning step or product layer behind Bard, and large models can still show unexpected behavior after sparse examples. The risk for Google is different: loose claims about self-learning may invite tougher questions from customers, regulators and researchers about training data, evaluation methods and what companies know before they ship.
For Bengali speakers and other users of lower-resource languages, the practical question is more concrete. Independent research on ChatGPT translation across Bengali and five other languages found gender defaults and stereotype-linked errors, underscoring that language coverage in AI systems still needs direct testing, not just broad claims of fluency.
Google has kept moving. In February, the company said in a blog post that Gemini 3.1 Pro was rolling out across consumer, developer and enterprise products, including the Gemini API, Vertex AI, the Gemini app and NotebookLM. That makes the old Bard Bengali row more than a footnote: as Gemini spreads, Google has less room for fuzzy language about what its models know, where that knowledge came from and how much of it was learned “on its own.”