19 September 2025

10X Faster Than Any Supercomputer: Inside Microsoft’s AI Mega-Datacenter

  • Microsoft unveils “Fairwater,” a 315-acre AI datacenter campus in Wisconsin – its largest AI facility yet – part of a multi-billion global expansion of purpose-built AI infrastructure.
  • Massive scale: Three buildings totaling 1.2 million sq. ft. house hundreds of thousands of NVIDIA GPUs, forming a single supercomputer 10× more powerful than today’s fastest supercomputer.
  • Cutting-edge hardware: Uses the latest NVIDIA Blackwell (GB200/GB300) GPUs (72 per rack) with ultra-fast NVLink networks, delivering 865,000 tokens/sec throughput – the highest of any cloud platform. Custom networking (800 Gbps InfiniBand/Ethernet) knits tens of thousands of GPUs into one giant cluster.
  • Azure & OpenAI integration: These AI datacenters power OpenAI’s models (ChatGPT/GPT-4), Microsoft Copilots, and Azure AI services. OpenAI’s GPU shortages have underscored the need – CEO Sam Altman admitted the company was “out of GPUs” and had to delay model rollouts, a gap Microsoft’s new supercomputers help fill.
  • Advanced cooling & energy: Features closed-loop liquid cooling with zero water waste – chilled water circulates through the servers and is re-cooled by 172 giant fans, keeping the facility cool with minimal water use (roughly equivalent to one restaurant’s yearly usage). Over 90% of the campus is liquid-cooled with a one-time water fill, drastically reducing consumption. Microsoft will build on-site solar farms to offset energy draw, though local grid upgrades and new gas generators are also planned to meet the huge power demand.
  • Location & investment: Situated in Mt. Pleasant, Wisconsin – a site once slated for Foxconn – the campus benefits from abundant land, a cool climate, and state incentives. Microsoft has committed $7+ billion for two AI datacenters here, creating ~800 high-tech jobs once operational (and 10,000+ union construction jobs during building) wispolitics.com. Globally, Microsoft is spending tens of billions on AI infrastructure in 2025, including new hydropowered AI datacenters in Norway and a supercomputer project in the UK blogs.microsoft.com.
  • “AI factory” paradigm: Microsoft’s Fairwater facility is built as a single-purpose AI factory, unlike general cloud datacenters. A flat network enables all GPUs to work as one, so giant AI models can be trained and served at unprecedented scale. “These aren’t data centers – they are factories that manufacture intelligence,” says NVIDIA CEO Jensen Huang, highlighting how such facilities have become the new engines of the digital economy.
  • Broader impact: The project anchors a regional tech hub, including a Microsoft AI Co-Innovation Lab at UW-Milwaukee to train local businesses in AI wispolitics.com. Officials hail the investment as putting Wisconsin “on the cutting edge of AI power… while creating good, family-supporting jobs”. Microsoft is coordinating with local authorities on workforce training and grid improvements to grow responsibly.
  • Rivals in the AI arms race: Microsoft’s move comes amid fierce competition.
    ◦ Amazon (AWS) is pouring comparable sums (>$100 billion in 2025 CapEx) into AI infrastructure, using both NVIDIA GPUs and its custom Trainium chips. AWS has teamed with NVIDIA on “Project Ceiba” to build one of the fastest AI supercomputers, capable of 414 exaflops of AI performance, and offers UltraClusters linking up to 20,000 H100 GPUs for a single workload.
    ◦ Google has built its own TPU-based AI supercomputers – TPU v4 pods with 4,096 chips and new TPU v5p pods with 8,960 AI chips each, connected by optical networks. Leveraging millions of TPUs and high-end GPUs, Google Cloud’s infrastructure is among the most advanced; Google calls its new multi-pod AI system a “Hypercomputer” and claims it outperforms equivalent GPU clusters.
    ◦ Meta (Facebook) is refitting its datacenters for AI at breakneck pace – investing $60–65 billion through 2025 and aiming to deploy 1.3 million GPUs by late 2025. Meta built the AI Research SuperCluster (RSC), one of the world’s fastest in 2022, and is constructing two new supercomputing clusters with 24,000+ NVIDIA H100 GPUs each to train its next-gen models.
    ◦ Oracle has partnered deeply with NVIDIA to offer massive cloud AI clusters: Oracle’s OCI Superclusters will scale beyond 100,000 NVIDIA Blackwell GPUs in a single distributed system, using NVIDIA’s latest Grace-Blackwell “Superchip” architecture. Oracle is already deploying thousands of Blackwell GPUs with ultra-fast interconnects and plans to host one of the world’s largest AI clusters to meet surging demand.
  • Accelerating trend: Industry experts liken this build-out to an “AI infrastructure arms race.” U.S. tech giants’ combined data center spend is now over $300 billion per year, targeting ever-bigger AI factories. Even new entrants are joining – Elon Musk’s startup xAI recently constructed “Colossus,” a private supercomputer with 100,000 NVIDIA GPUs (doubling to 200,000) to power its AI models. The AI gold rush is spurring innovation but also straining resources: these GPU farms consume enormous power, raising concerns about electricity grids and sustainability. Microsoft, for instance, had to coordinate with utilities in Wisconsin to ensure the grid can handle its gigawatt-scale energy needs without raising local rates. Analysts predict global data center power usage could double by 2030 due to AI growth – prompting efforts like Microsoft’s to use renewables and efficient cooling at every site.

In summary, Microsoft’s new AI datacenter in Wisconsin – billed as the world’s most powerful – embodies the scale and ambition of today’s AI era. It merges cutting-edge silicon, novel cooling and networking, and massive cloud integration to enable “frontier” AI models that were previously impossible. By tightly coupling hundreds of thousands of GPUs into one system, Microsoft is effectively launching a cloud-based supercomputer for AI, boosting both its Azure platform and partners like OpenAI. This effort is a cornerstone of Microsoft’s strategy to democratize AI – delivering advanced AI services globally via Azure’s network of 400+ datacenters. At the same time, it’s a high-stakes bet in a broader race: Amazon, Google, Meta, and Oracle are all scaling their own “AI factories,” each with unique tactics but a common goal of leading the next wave of computing.

Aerial view of Microsoft’s new Fairwater AI datacenter campus in Mount Pleasant, Wisconsin – a 315-acre site housing three massive buildings (1.2 million sq. ft. total) purpose-built as a single AI supercomputer. This facility will connect hundreds of thousands of NVIDIA GPUs via ultra-fast networks, delivering 10× the performance of the world’s top supercomputer and powering services like OpenAI’s GPT models and Microsoft’s Copilot.

Microsoft’s leadership frames this project not just as a datacenter upgrade, but as a paradigm shift. “In the heart of the American Midwest, a modern marvel is rising,” said Microsoft President Brad Smith, referring to the Wisconsin campus, calling it the world’s most powerful AI datacenter and “more than a technological feat – it’s a promise to grow responsibly, invest deeply, and create opportunities” for the community and nation. By investing at frontier scale, Microsoft hopes to stay at the forefront of AI advancements. The datacenter’s sheer power will allow training of the largest AI models (with trillions of parameters) and handle AI inference (e.g. running GPT-powered services) for millions of users at once. In practical terms, that means faster improvements to generative AI tools, more capable AI assistants in Microsoft 365, and new AI services available via Azure for enterprises around the world.

Microsoft’s AI Datacenter Initiative: Purpose and Unprecedented Scale

Microsoft’s push to build AI-specific datacenters comes as AI models grow exponentially in size and importance. Traditional cloud datacenters – optimized for multitasking many small apps – are not efficient for “AI at scale”, where one enormous neural network might run across thousands of GPUs in parallel. Thus, Microsoft is establishing dedicated AI “factories” to meet surging demand from products like OpenAI’s ChatGPT and Microsoft’s own Bing AI and Copilot features. “This week we introduced a wave of purpose-built datacenters and infrastructure investments… to support the global adoption of cutting-edge AI workloads,” wrote Microsoft Cloud EVP Scott Guthrie blogs.microsoft.com. The crown jewel is “Fairwater” in Wisconsin, unveiled in September 2025, which Microsoft calls the largest, most sophisticated AI datacenter it has ever built.

Scale: The Fairwater campus spans 315 acres in Mount Pleasant, WI, with three huge server buildings totaling 1.2 million square feet – roughly the size of 20+ football fields under roof. Constructing it required staggering quantities of materials and work: 46.6 miles of deep foundation pilings, 26.5 million pounds of steel, 120 miles of power cabling, and 72.6 miles of pipes installed. In short, it’s a megaproject on par with the largest factories on Earth. And Fairwater is just the start – Microsoft confirms it has “multiple identical Fairwater datacenters under construction” elsewhere in the U.S. to create a distributed capacity. Each such site represents an investment of billions of dollars. (Indeed, Microsoft announced an additional $4 billion to build a second AI datacenter in Wisconsin by 2028, bringing the state total to $7+ billion.)

Purpose: Unlike multi-purpose cloud regions, these AI datacenters are built solely to train and run large AI models – what Microsoft calls “frontier AI.” Guthrie explains that effective AI requires “thousands of computers working together” with specialized accelerators (GPUs) doing massive parallel math, ultra-fast networks to synchronize them, and enormous storage for the data these models learn from. The goal is to eliminate bottlenecks so the GPUs remain busy at all times. By designing everything under one roof – compute, networking, storage – as one cohesive AI supercomputer, Microsoft can push performance to levels unattainable in a normal cloud setup.

Crucially, these AI supercomputing sites plug into Azure’s global cloud. Microsoft has more than 400 datacenters in 70 regions worldwide for its cloud services. The new AI sites will “seamlessly connect” via Microsoft’s network backbone, forming a distributed AI supercomputer that spans continents. This means an Azure customer (or OpenAI) can tap into multiple AI datacenters at once, training models across geographic regions for resilience and scale. “Through innovation to link these AI datacenters in a distributed network, we multiply the efficiency and compute exponentially to democratize access to AI globally,” Microsoft says. In effect, Microsoft is weaving a Worldwide AI Web – a strategy to ensure AI services are fast and available everywhere, and not limited by the confines of a single building.

From a strategic standpoint, Microsoft’s massive investment is driven by competition (to offer the most capable AI cloud) and by partnership obligations. Microsoft’s multi-billion-dollar stake in OpenAI came with an agreement to provide the cloud horsepower for OpenAI’s research and products. As ChatGPT’s popularity exploded, so did the compute requirements – reportedly, running ChatGPT (GPT-4) for everyone strained Azure’s capacity in 2023. Sam Altman, OpenAI’s CEO, publicly noted that “a lack of computing capacity is delaying [OpenAI] products”, and that OpenAI was forced to stagger new releases like GPT-4.5 because they were “out of GPUs” in early 2025. Those revelations underscored Microsoft’s need to “go big” on infrastructure. Indeed, Microsoft is doubling down, with Guthrie stating the Wisconsin AI hub will “power OpenAI, Microsoft AI, Copilot and many more leading AI workloads” going forward. Simply put, Microsoft sees these AI datacenters as critical assets to maintain leadership in the AI-powered cloud era.

Hardware and Infrastructure: A Supercomputer Under the Hood

What makes Microsoft’s AI datacenter so powerful? In essence, Microsoft has engineered a giant cluster computer from the ground up. It starts with cutting-edge AI accelerators – specifically, NVIDIA’s latest GPUs – and interconnects them with networks and servers into one cohesive system. The Wisconsin facility uses NVIDIA’s Blackwell generation chips (the successor to NVIDIA’s current “Hopper” H100 GPUs). Microsoft was the first cloud provider to deploy NVIDIA’s new GB200 systems – referring to Grace-Blackwell superchips and servers – at full rack and datacenter scale. Each server node in this cluster pairs multiple Blackwell GPUs with CPUs (likely NVIDIA Grace ARM CPUs) and high-speed memory. But the magic lies in how these GPUs are linked horizontally (within racks) and vertically (across the entire building).

Each rack in the Fairwater datacenter packs 72 NVIDIA Blackwell GPUs connected via NVIDIA’s ultra-fast NVLink and NVSwitch fabric. This effectively fuses 72 GPUs into one “super-GPU” with 1.8 terabytes per second of GPU-to-GPU bandwidth and a shared memory pool of 14 TB accessible by any GPU. As a result, a single rack behaves like a gigantic accelerator capable of processing an astonishing 865,000 tokens per second (a measure of AI throughput) – making it the highest-throughput AI system available in any cloud as of 2025. For context, that means one rack can handle the equivalent of hundreds of pages of text generated per second, or simultaneously run dozens of AI model instances at high speed. And this is just one rack – the Wisconsin datacenter contains many hundreds of such racks working together.
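A quick arithmetic sketch, using only the figures quoted above, shows why the rack is the tight-coupling domain: the NVLink fabric inside a rack offers an order of magnitude more per-link bandwidth than the 800 Gbps inter-rack network described below. (The calculation simply interprets the published numbers literally; it is not an official Microsoft breakdown.)

```python
# Back-of-envelope checks on the per-rack figures quoted in the text.

GPUS_PER_RACK = 72
POOLED_MEMORY_TB = 14            # NVLink-shared memory across the rack
NVLINK_TB_PER_S = 1.8            # GPU-to-GPU bandwidth inside the rack
INTER_RACK_GBPS = 800            # per-link InfiniBand/Ethernet, in gigabits

memory_per_gpu_gb = POOLED_MEMORY_TB * 1000 / GPUS_PER_RACK    # ~194 GB
inter_rack_gb_per_s = INTER_RACK_GBPS / 8                      # 100 GB/s
bandwidth_ratio = NVLINK_TB_PER_S * 1000 / inter_rack_gb_per_s  # 18x

print(f"Memory share per GPU: ~{memory_per_gpu_gb:.0f} GB")
print(f"In-rack vs. inter-rack bandwidth per link: {bandwidth_ratio:.0f}x")
```

The ~194 GB memory share per GPU is consistent with the high-bandwidth memory capacities of current Blackwell-class parts, which is a useful sanity check on the 14 TB pooled figure.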

To scale beyond a rack, Microsoft had to solve networking challenges at multiple levels. Inside each rack, NVLink handles GPU communication at several TB/s, eliminating bottlenecks in that domain. Between racks, the datacenter uses a combination of InfiniBand and high-speed Ethernet links – running at 800 Gbps – arranged in a fat-tree topology. A fat-tree network ensures any rack can talk to any other with minimal hops and no congestion, even at full throughput. Essentially, Microsoft built a custom switching fabric so that “every GPU can talk to every other GPU at full line rate without congestion” across the whole datacenter. Furthermore, Fairwater’s layout is unique: it’s a two-story datacenter design, meaning racks are stacked in vertical pairs and connected not only to neighbors on the same floor but also directly to those above/below. This clever design reduces physical distance (and thus latency) between far-flung racks – a critical tweak because even a few extra meters of cable can slow down communication in these ultra-fast clusters.
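Microsoft’s exact switch counts aren’t public, but the capacity math of a textbook three-level k-ary fat-tree (the generic topology the text describes) shows how modest switch radixes reach tens of thousands of endpoints at full bisection bandwidth. A minimal sketch, assuming the classic k-ary construction rather than Microsoft’s actual fabric:

```python
def fat_tree_hosts(k: int) -> int:
    """Hosts supported by a classic 3-level k-ary fat-tree (k even).

    Built from k-port switches: k pods, each with (k/2)**2 hosts,
    plus (k/2)**2 core switches -> k**3 / 4 hosts total, all able to
    communicate at full line rate with no oversubscription.
    """
    if k % 2:
        raise ValueError("k must be even")
    return k ** 3 // 4

# With 64-port switches, a textbook fat-tree already supports
# tens of thousands of endpoints without congestion:
print(fat_tree_hosts(64))   # 65536
```

In practice hyperscalers tune this with oversubscription and multi-plane designs, but the cubic growth in k is why a flat, congestion-free network over tens of thousands of GPUs is feasible at all.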

Inside Microsoft’s AI supercomputer: High-density racks of NVIDIA GPU servers form the core of the Fairwater datacenter. Each rack contains 72 tightly interconnected GPUs (via NVLink/NVSwitch), effectively acting as one massive accelerator. These racks are linked with 800 Gb/s InfiniBand/Ethernet networks in a fat-tree topology, allowing tens of thousands of GPUs to function as a single system for training AI models. Advanced liquid cooling pipes are visible, which dissipate heat from the densely packed hardware.

By co-engineering across hardware and software, Microsoft claims to have “the most powerful, tightly coupled AI supercomputer in the world, purpose-built for frontier models”. Each layer – from custom firmware to the Azure AI software stack – is optimized to treat the datacenter like one big computer. For example, Azure developed tools like BlobFuse2 to feed data to GPUs at high throughput, ensuring no GPU sits idle waiting for data. The storage subsystem was rearchitected to aggregate bandwidth from thousands of storage nodes, yielding exabyte-scale capacity that can sustain millions of reads/writes per second per account. This eliminates the need for tedious data sharding when training on petabytes of data – the storage just scales to whatever the AI workload demands. Such refinements in storage and I/O are crucial when training advanced models that consume trillions of tokens (text, images, etc.), because any slowdown in data supply would idle costly GPUs.

In short, Microsoft built an AI-optimized beast: thousands of NVIDIA GPUs plus a high-speed nervous system (network + storage) that keeps them fed with data and working in concert. While exact GPU counts aren’t disclosed, phrases like “hundreds of thousands of cutting-edge AI chips” were used. Reuters reported that by adding a second building, the Wisconsin campus would tie together “hundreds of thousands of powerful chips from Nvidia” into what will be the world’s most powerful AI supercomputer. For comparison, the world’s current top supercomputer (the Frontier system at Oak Ridge) uses roughly 38,000 AMD GPUs across some 9,400 nodes; Microsoft’s AI supercomputer will likely tie together several times as many. This leap in scale (enabled by Microsoft’s deep pockets and the cloud demand for AI) lets Microsoft assert that Fairwater will deliver “10× the performance of the world’s fastest supercomputer today”. Such a claim likely refers to AI-specific performance (mixed-precision computing for neural nets), but it underlines just how far cloud giants are pushing beyond even the best public research computers.

Integration with Azure Cloud and OpenAI

A key aspect of Microsoft’s AI datacenter strategy is its tight integration with the Azure cloud platform and its flagship AI partner, OpenAI. Rather than siloing this as a special-purpose cluster for internal use, Microsoft is making the AI supercomputer part of Azure, so that both Microsoft’s own teams and Azure customers (including OpenAI) can leverage its power on demand.

Notably, Microsoft’s exclusive cloud partnership with OpenAI means OpenAI runs its large models on Azure. Services like the Azure OpenAI Service allow enterprise customers to use OpenAI’s GPT-4 (and future GPT-5) via Azure’s infrastructure. The new AI datacenters directly benefit this arrangement. In Microsoft’s words, “Microsoft’s AI datacenters power OpenAI [and] Microsoft AI” across various products. For example, when you interact with Bing Chat or GitHub Copilot, the requests likely hit Azure GPUs that may physically reside in one of these AI-optimized facilities. Likewise, OpenAI’s model training runs on Azure – in 2020 Microsoft built a dedicated supercomputer with 10,000 GPUs for OpenAI, and the Wisconsin cluster vastly surpasses that. OpenAI’s newest models (like GPT-4.5, GPT-5 in the future) will need even more compute; having multiple Fairwater-class datacenters ensures Azure can meet those needs, keeping OpenAI tied to Microsoft’s cloud for the long run.

The Azure integration also means smaller firms and researchers can benefit. Azure will offer slices of this mega-computer to customers who need heavy AI compute. Microsoft already has Azure AI Supercomputing offerings – the new infrastructure takes it to another level. For instance, a company training a large language model on Azure could request a cluster of, say, 2,000 GPUs in one job. Azure can allocate a partition of the AI supercomputer (thanks to its flat network design) to that customer while isolating it from others. This essentially democratizes access to an AI capability that only a Google or Meta might have had in-house before. “We’re building a distributed, resilient and scalable system that operates as a single, powerful AI machine,” Microsoft says about its global AI datacenter network. The company emphasizes that by pooling compute across regions, even customers outside the U.S. can tap into giant models without latency or reliability issues. This is a selling point as Microsoft courts enterprises and governments seeking advanced AI: plug into Azure and get supercomputer-grade AI without building your own.

For OpenAI, the benefits are clear: virtually unlimited scaling and support. However, it’s a two-way street – Microsoft’s fortunes in AI are tied to OpenAI’s success (and vice versa). In 2023, the partnership was so exclusive that OpenAI used only Azure. By late 2025, there were reports OpenAI might diversify to other clouds for redundancy, but Microsoft’s massive capacity build-out is likely aimed at keeping OpenAI primarily on Azure by being the biggest and fastest platform. Microsoft’s own AI services (Microsoft 365 Copilot, Azure AI services like Cognitive Search, etc.) will also run directly on this infrastructure. Every Outlook email auto-draft, every Teams meeting transcription, every Bing image generation – those AI features consume GPU cycles on Azure. Microsoft has even launched “Azure AI Foundry” services and is integrating AI across Windows and Office, all of which lean on Azure’s backbone. Without scaling the backend, the user-facing innovations would stall.

It’s also worth noting that Microsoft is co-designing software optimizations with OpenAI. For example, OpenAI’s models might be tuned to Azure’s architecture – perhaps custom fiber paths or scheduling algorithms ensure that something like GPT-5 trains efficiently across thousands of Azure GPUs. Conversely, insights from OpenAI (such as model parallelism techniques or tooling) feed into Azure’s improvements. This virtuous cycle is part of Microsoft’s strategic goal: ensure that if any company is building the next breakthrough AI, Azure is the obvious place to do it. That’s one reason Microsoft is also investing in AI startups and opening labs – to create an ecosystem that gravitates to its cloud.

Finally, Microsoft has signaled interest in AI-specific chips of its own (Project Athena is reportedly its custom AI chip effort). While NVIDIA GPUs power the present, Microsoft likely foresees a future where it mixes in proprietary accelerators for cost and independence. Integration with Azure would make that transition seamless for customers – Azure’s AI services could run on whatever backend (GPU or ASIC) as long as performance is there. In summary, the AI datacenters are a platform play: they bolster Azure’s position as a top cloud for AI, ensure OpenAI’s heavy workloads stay in-house, and provide Microsoft an edge in delivering AI-powered experiences to end-users.

Sustainability and Energy: Cooling a 2-Story AI Furnace

Building the world’s most powerful AI datacenter comes with intense power and cooling needs. Modern AI supercomputers consume tens of megawatts of power – on par with small towns – and pack hardware so densely that traditional cooling (air conditioning) can’t keep up. Microsoft recognized from the outset that liquid cooling was essential for Fairwater. As the blog notes, “Traditional air cooling can’t handle the density of modern AI hardware”, so Microsoft engineered advanced liquid cooling “at facility scale”.

Closed-loop liquid cooling: In Microsoft’s design, cold water is piped directly to the server racks, where it flows through cold plates or heat exchangers attached to the GPUs and CPUs. This water absorbs heat far more efficiently than blowing air over the components. The now-hot water is then pumped out of the server hall to external cooling units – Microsoft calls them cooling “fins” – lining the sides of the building. At Fairwater, there are 172 giant fans (20-foot diameter each) on these exterior cooling fins, which chill the water before it recirculates back inside. Importantly, the system is a closed loop: the water is reused continuously, with no evaporation or discharge, meaning virtually zero water wastage during operation. Microsoft only needs to fill the system once (during construction); after that, it’s a sealed circuit. This is in stark contrast to typical data centers, which often use evaporative cooling towers consuming vast amounts of water daily (and in arid regions, that’s a big environmental stress).
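The closed-loop design is governed by the basic calorimetry relation Q = ṁ·c_p·ΔT: the hotter the water is allowed to run, the less of it must circulate per megawatt of IT load. A minimal sizing sketch, assuming round illustrative numbers (a 1 MW load and a 10 K temperature rise – not published Fairwater figures):

```python
# Coolant flow needed to absorb a heat load: Q = m_dot * c_p * dT.
# The 1 MW load and 10 K rise below are illustrative assumptions.

CP_WATER = 4186.0   # J/(kg*K), specific heat capacity of water

def coolant_flow_kg_s(heat_load_w: float, delta_t_k: float) -> float:
    """Mass flow of water (kg/s) that absorbs heat_load_w with a
    temperature rise of delta_t_k across the cold plates."""
    return heat_load_w / (CP_WATER * delta_t_k)

flow = coolant_flow_kg_s(1_000_000, 10.0)   # ~23.9 kg/s (~23.9 L/s)
print(f"~{flow:.1f} kg/s of water per MW of IT load at a 10 K rise")
```

Because the loop is sealed, this water is pumped around continuously rather than consumed – the external fans only have to reject the same Q to the outside air.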

Microsoft reports that over 90% of the datacenter’s capacity uses this liquid loop and requires “water only once during construction,” with “no evaporation losses.” The remaining 10% of servers (likely standard cloud servers co-located on site) can use outside air cooling in cooler months and only use water on the hottest days. By doing so, Microsoft dramatically cuts water usage compared to typical designs. In an era where some large data centers guzzle millions of gallons a day, this is a noteworthy achievement. In Wisconsin’s case, the cool climate provides an additional boost – chillers can leverage cold outside air for part of the year, increasing efficiency. Microsoft even remarked that with these techniques, the Wisconsin AI campus’s annual water use will be about the same as a single restaurant, an almost unbelievable comparison given the scale – but it underscores the efficacy of the closed-loop approach.

On the energy front, the challenges are equally large. Powering hundreds of thousands of GPUs requires massive electrical capacity. Microsoft is working with Wisconsin’s utilities to ensure power delivery for both datacenters (the region between Milwaukee and Chicago was attractive partly due to robust transmission infrastructure). Notably, Microsoft pre-paid for grid upgrades to avoid passing costs to local ratepayers. Nevertheless, the load is so high that new power plants are being considered: “the project will entail new fossil fuel power generation near the facilities,” reported Reuters. Brad Smith acknowledged the area is “LNG territory,” hinting that natural gas turbines might be installed to guarantee reliable supply. This reveals a tension in sustainability – while Microsoft pledges to run on 100% renewable energy (it plans a solar farm in Wisconsin to offset the datacenter’s draw), the reality is that to meet peak demands and provide 24/7 power, some on-site or local gas generation is a fallback.

However, renewables do play a big role in Microsoft’s strategy. The company has a commitment to carbon neutrality and has been one of the largest buyers of green power. For this campus, they’re investing in solar energy projects elsewhere in the state to equal the consumption of the AI datacenters. Meanwhile, in Norway, Microsoft chose Narvik for a new AI datacenter specifically because of abundant hydropower – that site will be “powered by entirely renewable energy” from hydroelectric sources. Narvik’s cool climate and cheap green electricity made it an ideal sustainable choice, and Microsoft is partnering with local firms (nScale and Aker) there to build a $6.2 billion hyperscale AI facility. This shows Microsoft’s pragmatic approach: use site selection to minimize carbon footprint (where possible, like Norway’s hydro), and apply cutting-edge cooling tech to reduce resource use everywhere.

Beyond electricity and water, Microsoft also looks at the broader environmental footprint. The liquid cooling system in Fairwater is supported by what Microsoft says is the “second largest water-cooled chiller plant on the planet” – an indication of just how much heat they need to dissipate. By consolidating GPUs in a few super-efficient hubs, Microsoft could argue it’s better for the environment than having the same GPUs scattered in less efficient smaller data centers. There’s also an emphasis on heat reuse research (though not mentioned explicitly in this blog, some data centers reuse heat for local heating – it’s unclear if that’s planned here). Microsoft’s Datacenter Community Pledge (referenced in related blogs) suggests they aim to be good stewards – e.g., designing facilities that don’t strain local water sources and that contribute positively to their communities. In Wisconsin, they worked with the community so well that local officials lauded the project’s responsible growth and the state even adjusted laws to attract such datacenters wispolitics.com.

In summary, Microsoft’s AI datacenters marry extreme computing with extreme cooling. By using liquid cooling at scale, efficient chilling, and renewable power offsets, Microsoft is trying to balance the huge energy appetite of AI with its sustainability goals. Yet, the challenge remains enormous: industry analysts note that data centers already consume ~2% of global electricity (≈536 TWh in 2025) and the AI boom could double that by 2030. Power and cooling are now as critical as processors in this new era – one reason Microsoft, Google, and others are even exploring on-site small nuclear reactors as a future stable power source for AI farms. While nuclear datacenters might be years away, Microsoft’s innovations in Wisconsin are a crucial step toward making AI infrastructure greener and more efficient today.
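The “could double by 2030” projection implies a steep compound growth rate, which a quick calculation makes concrete (using the ≈536 TWh 2025 figure quoted above; the five-year horizon is the only other input):

```python
# If data center electricity use doubles between 2025 and 2030,
# the implied compound annual growth rate is 2**(1/5) - 1.

base_twh = 536      # approximate 2025 data center consumption, from the text
years = 5
cagr = 2 ** (1 / years) - 1     # ~14.9% per year

print(f"Implied growth: {cagr:.1%}/yr -> ~{base_twh * 2} TWh by 2030")
```

A sustained ~15% annual growth in electricity demand from a single industry is the scale of pressure driving the renewables, cooling, and even nuclear explorations described above.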

Location and Strategic Considerations: Why Wisconsin (and Beyond)

It raised some eyebrows when Microsoft chose Wisconsin’s Racine County – far from Silicon Valley – to build its epic AI campus. The choice was quite intentional, influenced by a mix of practical and strategic factors.

Mt. Pleasant, WI offered a large, already-prepped site thanks to the earlier Foxconn development attempt. Back in 2017, Foxconn (the Taiwanese manufacturer) had acquired land there with plans for a huge LCD panel factory, backed by state incentives. That project largely fell through, leaving a site with infrastructure ready (power lines, roads) but no tenant. Microsoft swooped in, announcing in 2023 it would repurpose the area for a datacenter – a big win for the state and a clever use of an existing industrial zone. The location sits “nestled between Milwaukee and Chicago,” meaning it has robust grid connectivity and fiber optic networks along the I-94 corridor. It also places the datacenter within reasonable distance of multiple metropolitan areas (for workforce and for low-latency internet routing to Midwest and East Coast users).

The state of Wisconsin actively wooed Microsoft. Governor Tony Evers highlighted that this investment “puts Wisconsin on the very cutting edge of AI power… in the world”. The state passed legislation to make itself more attractive for datacenters (tax incentives, etc.). It also designated a Regional Tech Hub and assembled packages to ensure trained workforce and community support wispolitics.com. In short, Wisconsin signaled it was “open for AI business.” The choice also diversifies Microsoft’s datacenter geography – many U.S. hyperscale datacenters are in Washington, Iowa, Virginia, etc. Spreading to Wisconsin taps into new energy grids and avoids over-concentration in any one region (which can be risky due to local power limits or natural disasters).

Additionally, the cool climate of the upper Midwest helps with cooling (free air cooling in winter, and efficient chiller operation year-round). Abundant freshwater in the region also means, if water were needed, it’s not in a drought-prone area – though Microsoft’s design minimizes water use regardless. The community also matters: Racine County and Mount Pleasant were eager for the jobs and economic boost after the Foxconn disappointment. Microsoft’s engagement, like setting up the AI lab at UW-Milwaukee and using local union labor, has likely created good local rapport wispolitics.com.

Strategically, having a central U.S. location for an AI hub can reduce latency for serving AI across the country. If Microsoft’s other AI datacenters in the U.S. are, say, on the West Coast or in the South, placing one in the Midwest ensures that East Coast users can also get fast responses from Azure AI (speed-of-light limitations mean physical distance adds latency; a distributed approach mitigates that). Moreover, multiple identical sites mean Azure can do geo-redundant training: splitting a giant training job across two datacenters in different states, which can improve reliability (if one has an outage, the job continues on the other). Microsoft explicitly mentioned linking AI datacenters to operate as one distributed supercomputer, so location diversity is part of that vision.
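The speed-of-light point can be made concrete: light in optical fiber travels at roughly 200,000 km/s (c divided by the glass’s refractive index of about 1.5), so distance sets a hard floor on latency before any routing or queuing delay. A small sketch, with an approximate illustrative coast-to-coast distance:

```python
# Best-case propagation delay over optical fiber.
# ~200,000 km/s is c reduced by the fiber's refractive index (~1.5);
# the 3,900 km distance is a rough coast-to-coast illustration.

FIBER_KM_PER_S = 200_000.0

def one_way_latency_ms(distance_km: float) -> float:
    """Minimum one-way propagation delay over fiber, in milliseconds."""
    return distance_km / FIBER_KM_PER_S * 1000

print(f"~{one_way_latency_ms(3900):.1f} ms one-way coast to coast")
```

At ~20 ms each way coast to coast, a Midwest site roughly halves the worst-case distance to users on either seaboard, which is exactly the argument made above.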

Outside the U.S., Microsoft is also picking strategic spots: Narvik, Norway for Europe’s AI hub – chosen for cheap renewable energy and cool climate (plus perhaps political stability in Scandinavia). Loughton, UK for a British AI supercomputer – likely because the UK government is keen on hosting AI infrastructure domestically for sovereignty reasons blogs.microsoft.com. By partnering with local players (nScale, etc.), Microsoft also shares the investment load and navigates local regulations more smoothly.

In essence, Microsoft’s site selection marries pragmatics (power, cooling, infrastructure) with strategy (customer proximity, political goodwill). Wisconsin ticked many boxes: ready-made site, supportive government, central location, and room to expand (315 acres gives space for additional buildings or solar farms). As Microsoft continues rolling out AI datacenters, expect them to pop up in places with similar profiles – not necessarily major cities, but often in suburban or rural tech hubs with good connectivity. For example, Microsoft has large datacenters in Iowa and Virginia (for standard cloud) and is now extending to new states. This not only spreads economic benefit (earning political points) but also ensures Azure can claim multi-region resilience for AI services, a key for enterprise customers who need uptime and compliance across jurisdictions.

Economic and Workforce Impact

The creation of the “world’s most powerful AI datacenter” is not just a tech story – it’s also a major economic development story for the region and the industry. These AI megaprojects bring significant investment, jobs, and community initiatives.

In Wisconsin, Microsoft’s initial $3.3 billion investment (for the first datacenter) and the new $4 billion expansion (for the second) total over $7 billion by 2028 wispolitics.com. During construction, this has translated into a bonanza for local labor: nearly 10,000 workers have had roles in the build so far, including over 3,000 construction workers at peak activity. Trades such as electricians, plumbers, pipefitters, ironworkers, concrete specialists, etc., have been heavily engaged. These are largely union jobs with good wages, providing a boost to the local economy. Microsoft indicated the second datacenter will similarly require thousands of workers over the multi-year build. For a region that was anticipating Foxconn’s manufacturing jobs that never fully materialized, Microsoft’s project has been a welcome source of employment.

Once operational, datacenters employ far fewer people than during construction – but the positions they do provide are highly skilled and well paid. Microsoft says the Mount Pleasant campus will eventually have around 800 permanent jobs once both datacenters are running (about 500 when the first comes online, scaling to 800 with the second). These include datacenter technicians, engineers, security personnel, operations managers, network specialists, and more. While 800 may sound modest, these are stable tech jobs in an area not traditionally known as a tech hub – a big win for diversifying the state’s economy. Brad Smith emphasized that “all the things we build need to be operated… these are good jobs.” Indeed, roles at an AI datacenter might involve managing advanced AI hardware, monitoring supercomputer performance, and maintaining cutting-edge cooling systems – offering local talent a chance to work on world-leading technology without moving to Seattle or Silicon Valley.

Beyond direct jobs, there are ancillary economic benefits. The project will likely boost local suppliers, from concrete companies to electrical equipment vendors. It can increase tax revenue for the county (though datacenter incentives often include tax breaks, the broader economic activity usually compensates). Microsoft’s presence can also attract other businesses – e.g., suppliers, or companies that want to be near an AI infrastructure center. The designation of the area as a technology hub could draw startups or research institutions. In fact, Microsoft’s AI Co-Innovation Lab at the University of Wisconsin-Milwaukee is aimed at empowering hundreds of small and mid-sized businesses to adopt AI. This suggests Microsoft is investing in local talent and entrepreneurship, which can spur innovation outside the coasts. The state’s economic development corporation (WEDC) even gave a grant to support this lab, recognizing that knowledge spillover from Microsoft can help upskill the region’s workforce in AI.

Furthermore, projects of this scale often involve community commitments by the company. Microsoft, under its Datacenter Community Pledge, might fund local STEM education, sustainability projects, or infrastructure improvements as part of being a good neighbor. We saw a hint of this with the WEDC partnership and the governor’s Task Force on AI Workforce that was set up. Microsoft’s engagement with that task force and support for technical college AI training (the state budget put $2 million into AI workforce grants) indicates a collaborative effort to ensure local workers can fill the high-tech roles the datacenter brings.

On a broader scale, Microsoft’s AI datacenters reflect a shift in where tech jobs are created. While much of the AI software development happens in big-city labs, the infrastructure for AI is often built in smaller communities. This can be a positive force for economic balance. For example, Meta is building a 2 GW datacenter in rural Louisiana (drawing massive power, but also creating jobs in construction and operation). Google and Amazon too have large datacenters in places like Alabama, Oregon, etc. The AI era’s “factories” are rising in heartland areas, analogous to when auto plants or steel mills defined Rust Belt towns – except these factories manufacture “intelligence” rather than cars.

However, there are also workforce challenges. Operating an AI datacenter requires specialized skills (HVAC engineers for liquid cooling, high-voltage electricians for big substations, IT engineers who understand distributed computing). There could be talent shortages initially, hence Microsoft’s efforts to train local workers. Also, as AI gets more efficient, these sites might automate some tasks (e.g., AI for datacenter monitoring), meaning headcounts might not grow linearly with capacity. But overall, the trend is that AI infrastructure build-out is a significant new source of tech employment – one that ranges from construction trades to cloud computing experts, thus bridging blue-collar and white-collar realms.

Finally, Microsoft’s investment fosters a strategic tech ecosystem in Wisconsin. The presence of a world-class AI facility could attract research grants or partnerships with local universities (UW-Madison, for instance, has strong computer science programs that might collaborate with Microsoft). It puts Wisconsin on the map for AI, which could inspire students to pursue AI-related careers knowing opportunities exist locally. The Governor’s enthusiasm and the bipartisan support for datacenter incentives show that political leaders see this as a long-term economic catalyst. In effect, the AI datacenter is both an engine of Microsoft’s cloud and a potential engine of regional growth.

Partnerships and Strategic Goals

Microsoft hasn’t built this gargantuan AI endeavor alone – it’s leveraging key partnerships across industry and geographies, all aligned with its strategic goals in AI.

On the technology side, Microsoft’s deep partnership with NVIDIA is evident. NVIDIA supplies the GPUs (Blackwell GB200, and later GB300) that form the computational heart of these datacenters. Microsoft worked closely with NVIDIA to be the first cloud to deploy the full GB200 rack-scale systems, and has co-engineered custom solutions like the two-story rack layout and networking to maximize NVIDIA GPU performance. Jensen Huang, NVIDIA’s CEO, has even described NVIDIA as having evolved from selling chips to building “AI factories” with partners. For Microsoft, a preferential pipeline to NVIDIA’s latest hardware (which has been in extreme demand globally) is strategically crucial – it ensures Azure can offer top-notch AI services ahead of rivals. The partnership is mutually beneficial: Microsoft gets early access to new chips; NVIDIA gets a premier showcase (Azure) to prove out and sell its biggest systems. The Oracle–NVIDIA collaboration similarly highlights how GPU vendors and cloud providers are teaming up to break performance barriers.

Microsoft is also collaborating with infrastructure firms like nScale and Aker (in Norway) and presumably similar partners in other locales blogs.microsoft.com. In Norway, nScale (an AI infrastructure startup) and Aker (a Norwegian industrial company) formed a joint venture with Microsoft to build the Narvik datacenter. This partnership likely helps Microsoft navigate local requirements, secure renewable energy (Aker has energy expertise), and share some capital expense. In the UK, Microsoft again teamed with nScale to build what’s touted as “the UK’s largest supercomputer” for AI blogs.microsoft.com. These moves align with Microsoft’s strategy to globalize AI capacity quickly by joining forces with companies that have regional footholds or specializations (e.g., nScale focusing on sustainable datacenters).

Another key partnership is, of course, OpenAI. Microsoft’s multi-year, multibillion-dollar partnership with OpenAI (including a reported $10 billion investment in 2023) gave it exclusive cloud provider status for OpenAI’s advanced models. This has been a linchpin of Microsoft’s AI strategy: it gets to integrate OpenAI’s tech into its products (from Bing to Azure to Office) and in return provides the massive cloud infrastructure OpenAI needs. Microsoft’s AI datacenters are essentially the fulfillment of that deal – they provide the “muscle” behind OpenAI’s “brains.” Microsoft’s strategic goal is to be the platform for the AI revolution, and having OpenAI’s breakthrough models run on Azure draws AI customers to Microsoft who might otherwise use Google or AWS. It’s a symbiotic partnership driving Azure’s AI adoption – for instance, the Azure OpenAI Service had attracted 11,000+ businesses to use GPT-4 and other models via Azure APIs by late 2025. Each of those is using Microsoft’s cloud (and paying for GPU time on Microsoft’s new hardware).

Microsoft is also partnering with the open-source AI community indirectly. For example, Windows and Azure now support things like ONNX models, Hugging Face collaborations, etc. By building the infrastructure, Microsoft can woo AI startups, research labs, and open-source projects to train models on Azure credits or use Microsoft’s toolchains. This broadens Microsoft’s influence in the AI ecosystem beyond just OpenAI.

One more partnership angle: government and enterprise clients. Microsoft is likely coordinating with governments (such as the UK’s, which is keen on AI safety research – possibly using that UK supercomputer) to host sensitive AI projects. Winning government AI cloud contracts is strategic, and having the world’s most powerful AI datacenter helps Microsoft bid for, say, a defense department AI program or a national lab collaboration. Similarly, Microsoft’s strategic goal to embed AI into every industry means partnering with companies in healthcare, finance, and beyond to give them tailored AI supercomputing solutions on Azure. Already, we see companies like Meta and Elon Musk’s xAI building their own clusters because they need immense compute. Microsoft could offer, as a partnership model, to build dedicated AI infrastructure for big clients as part of Azure (somewhat like Oracle offering Dedicated Region clusters). This could keep customers from going the DIY route.

Strategically, Microsoft’s goals with these AI datacenters are to secure AI leadership, drive Azure growth, and advance AI capabilities in a way that benefits its whole product portfolio. CEO Satya Nadella has been vocal that AI is the new platform wave (after PC, after mobile/cloud) – and Microsoft intends to ride it at all levels. By 2025, Microsoft aimed to integrate AI “Copilot” features into Windows, Office, Dynamics, GitHub, and more. All those features funnel users into consuming Azure AI services on the backend. So the strategic goal is clear: make Azure the de facto AI utility for the world. To do that, Microsoft is betting on scale and partnerships – scale via mega-datacenters and partnerships with the best model-makers (OpenAI) and chip-makers (NVIDIA).

In doing so, Microsoft also hedges against competitors. Google has its TPU advantage, Amazon has custom silicon – Microsoft’s partnership with OpenAI gave it a top-tier model advantage, and its partnership with NVIDIA ensures a top-tier hardware advantage. Additionally, Microsoft reportedly has custom AI chips (the effort codenamed “Athena”) potentially coming; if those pan out, Microsoft could reduce its dependence on NVIDIA in the long term and offer cheaper or more optimized AI clouds. But even that would be a partnership story – likely drawing on acquired teams such as Fungible, the datacenter-chip maker Microsoft bought in 2023.

Finally, Microsoft’s strategic vision is also democratization of AI (as they often state). The datacenters are a means to an end: enabling developers and organizations worldwide to use AI without having to worry about the gargantuan compute behind it. By providing this as a service, Microsoft positions itself as a key driver in transforming industries with AI. The partnership with customers is the ultimate partnership – if companies choose Azure for their AI needs, Microsoft wins. So far, Microsoft has an edge with offerings like Azure OpenAI and by showcasing itself as the first to bring things like the NVIDIA GB200 systems online globally. Its strategic goal can be summarized as: build the biggest, greenest, most connected AI supercomputer network, so that all roads in the AI era lead to Azure.

The AI Datacenter Arms Race: Microsoft vs. Other Tech Giants

Microsoft is not alone in building next-generation AI infrastructure – it’s a fierce race involving all the major cloud providers and some social media and enterprise players. Each is taking a slightly different approach, but the scale of ambition is enormous across the board. Let’s compare how Microsoft’s efforts stack up against Amazon, Google, Meta, and Oracle, who are often cited as leading players in this space:

Amazon Web Services (AWS)

Amazon, the biggest cloud provider, is investing heavily to ensure it can serve massive AI workloads on its AWS platform. In 2025, Amazon’s capital expenditures (much of which go to data centers) were projected around $100 billion – even higher than Microsoft’s, reflecting AWS’s broader size. AWS’s strategy for AI infrastructure has two prongs: NVIDIA GPUs and custom AWS chips. On the GPU side, AWS has worked closely with NVIDIA. At re:Invent 2023, AWS and NVIDIA announced Project Ceiba, an effort to build one of the world’s fastest AI supercomputers in the AWS Cloud. This system is reportedly capable of 414 exaFLOPS of AI performance, which AWS said is ~375× the power of the world’s current fastest traditional supercomputer. While the exact comparison details are complex (AI FLOPS vs HPC FLOPS), it signals AWS’s intent to claim the “fastest AI computer” title as well.
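The “~375×” figure checks out arithmetically if we take Frontier’s ~1.1 exaFLOPS FP64 Linpack result as the assumed baseline – bearing in mind that dividing low-precision AI FLOPS by FP64 HPC FLOPS is apples-to-oranges, exactly the comparison caveat noted above:

```python
# Sanity-check of the "~375x" claim. Assumed baseline: Frontier's
# ~1.1 exaFLOPS FP64 Linpack result. This divides low-precision AI FLOPS
# by FP64 FLOPS, so it is a marketing comparison, not a benchmark.
ceiba_ai_exaflops = 414.0       # quoted Project Ceiba AI performance
frontier_fp64_exaflops = 1.1    # assumed fastest traditional supercomputer

ratio = ceiba_ai_exaflops / frontier_fp64_exaflops
print(f"~{ratio:.0f}x")
```

The ratio lands right around 376, consistent with AWS’s stated multiple.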

AWS has introduced P5 instances featuring NVIDIA H100 GPUs and its custom low-latency EFA (Elastic Fabric Adapter) networking, allowing customers to scale jobs to 20,000 GPUs in a cluster – similar in scale to Microsoft’s one-building cluster (though Microsoft ultimately plans even larger, with multiple clusters linked). AWS touts that you can “spin up an exascale supercomputer on demand” on its cloud, and has built EC2 UltraCluster infrastructure in regions like US East (N. Virginia) and US West (Oregon) to support this.

On the custom silicon front, AWS has Trainium (for training) and Inferentia (for inference) chips. While these currently don’t match NVIDIA’s top-end performance per chip, Amazon is on a roadmap to improve them. A Yahoo Finance report noted AWS aims to quadruple the performance of its AI processors by late 2025 as it readies an exaflop-scale supercomputer using those chips. If AWS’s in-house silicon matures, it could run large models on its own hardware – a different path than Microsoft, which so far relies on NVIDIA (though Microsoft’s Athena chip could be analogous down the line).

In summary, AWS’s AI datacenter efforts: building out massive GPU clusters, integrating them tightly with its cloud (SageMaker, EC2, etc.), and supplementing with proprietary chips. AWS also benefits from an enormous global infrastructure and a huge customer base already on its cloud, making adoption of new AI hardware somewhat straightforward. Microsoft’s advantage in the AI race has been its exclusive OpenAI deal, but Amazon is countering by partnering broadly (for instance, partnering with Hugging Face to offer thousands of open models on AWS, and investing in startups like Anthropic to ensure they also use AWS). The arms race here is as much about talent and partnerships as hardware.

Google

Google has a unique approach: it designs its own TPUs (Tensor Processing Units) specifically for AI, and also operates massive GPU clusters especially for external cloud customers. Google’s datacenter prowess is well-known – its infrastructure underpins both Google’s services (Search, YouTube, etc.) and Google Cloud offerings. For AI, Google unveiled what it calls an “AI Hypercomputer.” At Google I/O and Cloud Next events, it announced TPU v5p pods and a multi-pod architecture that is one of the most powerful in the world. Each TPU v5p pod contains 8,960 TPU chips in a 3D torus network with an astonishing interconnect bandwidth of 4,800 Gbps per chip. This configuration is said to provide 42.5 exaFLOPS of AI compute power in one pod (if we use TF32 or BF16 ops) – making it arguably on par or beyond any single-machine cluster out there. Google has also indicated these pods can be networked together, essentially scaling out into an “AI supercomputer” that isn’t confined to one physical unit.
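Taking the pod-level figures quoted above at face value, a quick sketch derives what they imply per chip (the resulting per-chip FLOPS is clearly a low-precision figure, and the aggregate bandwidth is a naive sum of per-chip link capacities):

```python
# Derive per-chip numbers from the pod-level figures quoted in the text
# (taken at face value; the precision of the FLOPS figure matters a lot).
chips_per_pod = 8_960
pod_exaflops = 42.5        # quoted pod total ("TF32 or BF16 ops")
per_chip_gbps = 4_800      # quoted interconnect bandwidth per chip

per_chip_pflops = pod_exaflops * 1e18 / chips_per_pod / 1e15
aggregate_pbps = chips_per_pod * per_chip_gbps / 1e6  # naive sum, petabits/s

print(f"~{per_chip_pflops:.1f} PFLOPS per chip")
print(f"~{aggregate_pbps:.0f} Pbps aggregate interconnect")
```

Roughly 4.7 PFLOPS per chip and tens of petabits per second of total fabric bandwidth – the kind of interconnect density that makes a pod behave like one machine rather than 8,960 separate ones.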

Google’s TPU v4 pods (the previous generation), with 4,096 TPUs per pod, already ranked among the world’s most powerful AI systems. Google famously kept much of its AI training internal to TPUs – for instance, PaLM (540B) and other large models were trained on TPU v4 pods. By doing this, Google could optimize end-to-end (from model to chip) and claim better efficiency. Google’s AI datacenters, such as its facility in Mayes County, Oklahoma, are known to house many of these TPU pods as well as GPUs for its Cloud customers.

Interestingly, Google has also not shied from using GPUs: Google Cloud offers NVIDIA A100 and H100 instances, and they reportedly built a GPU supercomputer with over 26,000 GPUs for Google Cloud customers at one point. So Google’s approach is hybrid: use TPUs to give itself a competitive edge (especially for internal products like DeepMind’s models or Google’s own Gemini model), but also provide GPUs to customers who rely on the NVIDIA ecosystem.

In terms of spending, Google too is on a tear – it likely spent tens of billions on datacenters and networking specifically for AI in recent years. One analysis mentioned Google’s aggressive TPU deployment made it the “third-largest data center chip supplier in 2023,” behind only NVIDIA and AMD, if one counts all the TPUs it has in production. Google’s edge might be in efficiency (TPUs are designed for optimal performance-per-watt on AI tasks), whereas Microsoft and AWS are throwing more raw GPU power at the problem. A strategic goal for Google is to prove “AI supremacy” by having the fastest training times and most sophisticated models – its TPU hypercomputer is a tool for that. It’s telling that Google refers to its datacenters as evolving from mere warehouses to “factories that manufacture intelligence” (a concept Huang also echoed). Google’s infrastructure head, Urs Hölzle, has spoken about pushing the boundaries of networking (e.g., using optical circuit switches to dynamically rewire TPU clusters). These kinds of innovations allow Google to maximize utilization and performance, arguably giving it a lead in certain scenarios.

Compared to Microsoft: Microsoft now has a partner (OpenAI) with a model arguably ahead of Google’s in the public eye, but Google has self-reliance in hardware (TPUs) that Microsoft doesn’t yet. If Microsoft’s claim is 10× current fastest supercomputer, Google might counter it has an AI supercomputer with similar or greater muscle (though direct comparisons are tricky). The arms race between these two may well come down to whose full stack integration yields better AI at scale – Microsoft with NVIDIA and OpenAI, or Google with TPUs and DeepMind/Google Brain.

Meta (Facebook)

Meta took a slightly different route: it’s not a public cloud provider (so far), but it needs huge AI capacity for its own products (Facebook, Instagram, and their AI features, plus VR/metaverse work, etc.). Meta in 2022 announced its AI Research SuperCluster (RSC), which at the time used 6,080 NVIDIA A100 GPUs (760 NVIDIA DGX nodes) in phase 1, with plans to expand to 16,000 GPUs in phase 2. RSC was built to be one of the fastest AI supercomputers and indeed ranked among the top such systems worldwide for AI tasks. It was used to train models like the large language model behind LLaMA.

But Meta did not stop at RSC. Reports in 2023-2024 revealed Meta’s massive ramp-up: Meta committed $30+ billion a year on new datacenter construction to pivot its entire infrastructure to AI. In numbers, Meta was said to be “rapidly converting its infrastructure for AI workloads, investing $60–65 billion in 2024-25” and aiming to deploy 1.3 million GPUs by end of 2025. That figure dwarfs everyone else’s in terms of sheer GPU count – it likely includes not just cutting-edge GPUs but also many smaller ones across their fleet. It signals Meta’s intent to infuse AI across all user experiences (feeds, content recommendation, etc.) and to develop advanced AI like their next-gen “personas” or generative models for the metaverse.

One marquee project: Meta is building a new hyperscale datacenter in rural Louisiana that reportedly will draw on the order of 2 GW when fully outfitted – consuming as much power as a large nuclear reactor – specifically for AI and ML workloads. There was even an instance where Meta explored siting a small nuclear reactor alongside a datacenter for power (an idea shelved after endangered bees were discovered on the intended site, interestingly). Meta’s scale is such that it has to worry about power availability at a national grid level.

Meta’s approach to hardware has been mostly NVIDIA GPUs (A100s, H100s). However, Meta also designed an AI accelerator chip internally (the “Meta Training and Inference Accelerator”, MTIA) and a data center optimized CPU (Borrego). Early efforts stumbled, leading Meta to scrap some chip projects in 2022, but they’re likely trying again to reduce dependency on NVIDIA eventually. For now, though, Meta in 2023 reportedly ordered a huge number of NVIDIA H100 GPUs – so many that it was one of NVIDIA’s top customers.

In terms of how Meta’s efforts compare to Microsoft’s: Meta is building for itself, not selling cloud access (though they open-sourced their LLaMA models which others can run). Microsoft is building to serve customers and OpenAI. Meta’s RSC with 16k GPUs is smaller than Microsoft’s multi-datacenter cluster plan, but Meta may be building multiple RSC-like clusters around the world. A recent list of powerful AI supercomputers placed xAI’s Colossus and a rumored Meta “Research Cluster 2” (with ~24k H100) near the top. Additionally, Meta has a different philosophy: they release a lot of AI research openly (like LLaMA models), which then anyone (possibly on Microsoft’s Azure or elsewhere) can run, whereas OpenAI (Microsoft’s partner) keeps models proprietary. This ideological difference doesn’t change the infrastructure needs, but it does mean Meta’s compute is directed at in-house research that then proliferates.

If Microsoft’s AI datacenter is a factory for rent (via Azure), Meta’s are factories for internal use to make the “AI products” (like better recommendation algorithms, generative AI for users, etc.) that keep Facebook/Instagram competitive. One interesting twist: because Meta open-sourced LLaMA, many cloud providers (including Azure) ended up hosting LLaMA fine-tunes and benefiting from Meta’s model indirectly. So in a way, Meta’s contribution to the arms race is providing open AI models that spur more usage of infrastructure broadly.

Oracle

Oracle is a bit of a dark horse in this race. As a smaller cloud provider (Oracle Cloud Infrastructure – OCI), it has carved out a niche by focusing on high-performance GPU offerings and partnering aggressively. Oracle saw an opportunity: while AWS, Azure, GCP have many services, some AI startups felt they could get more personal attention or capacity by going to Oracle, which was hungry to grow its cloud business. Oracle’s strategy has been to align closely with NVIDIA and offer very attractive terms or capacity for big AI customers.

For instance, Oracle in late 2022 partnered with NVIDIA to start offering NVIDIA’s DGX Cloud on OCI – effectively renting entire GPU clusters to customers with NVIDIA’s backing. Oracle invested in expanding GPU capacity significantly, and NVIDIA in turn highlighted OCI as a cloud with cutting-edge networking (for example, 100 Gbps RDMA cluster networks in OCI). Early on, companies like Zoom, Cohere, and even OpenAI (for some workloads) used Oracle Cloud for GPU resources. Elon Musk’s xAI initially used Oracle as well, reportedly utilizing 16,000 GPUs on OCI to train its first model. (xAI later moved to build its own cluster, Colossus, possibly for cost or control reasons.)

Oracle has branded its setup as “OCI Superclusters.” According to Oracle, an OCI Supercluster can be configured with different GPU types: up to 131,072 AMD MI300 GPUs, or 65,536 NVIDIA H200 GPUs, or 100,000+ Blackwell GPUs in the future. These numbers are highly ambitious and likely forward-looking (they depend on future hardware releases like NVIDIA H200, B200 etc.). But Oracle has essentially announced “the world’s largest AI supercomputer in the cloud” in April 2025, saying it can scale to >100,000 GPUs for a single customer. Oracle’s design uses either NVIDIA’s Quantum-2 InfiniBand or RoCE v2 (Ethernet) networks to link nodes, and they support multi-node distributed training with both GPU and even upcoming AI-optimized CPUs (like AmpereOne or others). Oracle also has strong partnerships with AMD – they plan to offer zettascale clusters with upcoming AMD Instinct accelerators (mentioning up to 131k of those).

What Oracle lacks in giant self-funded builds (their overall cloud footprint is smaller) they make up via focus – their marketing and efforts are squarely aimed at being the go-to cloud for AI startups and enterprises needing large dedicated clusters. They even launched an AI startup accelerator offering free credits.

Comparatively, Microsoft’s advantage is a far larger global network and integration with enterprise IT (Azure is more widely used, and has the Azure OpenAI Service). Oracle’s advantage is perhaps flexibility and top-tier hardware availability with potentially less contention. For example, an AI startup might get a 2,000-GPU cluster on OCI faster or cheaper than on Azure because Azure’s capacity might be prioritized for its own or OpenAI’s needs, whereas Oracle will bend over backwards to win that customer. Oracle’s partnership with NVIDIA is such that NVIDIA’s Ian Buck (VP at NVIDIA) authored a blog praising OCI’s deployment of Blackwell racks and calling OCI “one of the world’s largest and fastest-growing clouds” for AI. It mentions OCI is “among the first to deploy NVIDIA GB200 systems” and will use them for both NVIDIA’s DGX Cloud and OCI’s own services.

One notable story was Elon Musk’s xAI leaving Oracle in 2024: initially xAI used Oracle Cloud, but as it scaled to 100k GPUs, Musk decided to build an in-house cluster likely for control and cost reasons. Oracle losing that account shows the limits – some clients may graduate to owning infrastructure if they get big enough (as OpenAI is considering with Microsoft’s help, building their own datacenters or chips). Nonetheless, Oracle has many other clients (e.g., Meta was rumored to be using Oracle for overflow GPU capacity at one point, and Zoom used OCI for AI processing during the pandemic).

All in all, Oracle’s efforts indicate that even companies outside the top 3 clouds are in the game, leveraging partnerships to stay relevant. Oracle’s strategic goal is to position OCI as the specialized AI cloud, which could complement its strength in enterprise database and application hosting. This competition has likely pushed Microsoft and others to raise their game (e.g., Microsoft partnered with CoreWeave, a GPU cloud startup, to cover some demand surges in 2023 when GPUs were scarce).

The Bigger Picture

Beyond these four, it’s worth noting the broader context: the AI infrastructure arms race is not just corporate but geopolitical. The U.S. and its allies are sprinting ahead in building these compute hubs, while China is also investing heavily (though hampered by export controls on top GPUs). For instance, U.S. national labs built Frontier (using AMD GPUs) and are building El Capitan (another AMD-powered exascale system) for scientific AI and defense – those are not commercial, but they do contribute to overall AI capacity (and, via collaborations, companies can learn from them). Meanwhile, cloud providers are expanding in Europe and the Middle East (e.g., Microsoft and others are building cloud regions in Saudi Arabia and the UAE, possibly for AI usage).

Experts have raised concerns: Is this build-out sustainable? Power is a major choke point – Northern Virginia, the world’s largest datacenter hub, has already faced power provisioning delays due to the surge in demand. Some local communities oppose new datacenters because of noise, water, or environmental impact. This has led to innovative thinking: from advanced immersion cooling (where servers dunk in special fluids) to exploring nuclear reactors on-site to guarantee clean power. As Ali Azhar wrote, “AI growth [is] outpacing the grid”, and meeting AI’s hunger might require novel energy solutions like small modular nuclear or massive renewable plus storage projects.

For now, Microsoft’s Wisconsin datacenter serves as a microcosm of these trends: it’s huge, expensive, cutting-edge, and promises world-leading performance – but it also needed creative cooling, renewable offsets, and even then falls back to gas turbines to ensure 24/7 reliability. It’s creating jobs and excitement, but also requiring substantial power and water planning. Each tech giant’s project has similar stories (for example, Google in Oregon faced community pushback on water usage for its data centers, leading it to invest in wastewater reuse programs).

In terms of metrics in the arms race: sometimes it’s about number of GPUs (e.g., Microsoft/Oracle talking 100k+, xAI hitting 100k, Meta aiming for 1.3M cumulative), sometimes about network speed (Azure 800 Gbps, Spectrum-X Ethernet in xAI’s cluster proving Ethernet can scale to 100k GPUs with 95% throughput), and often about flops or tokens per second (Microsoft quoting tokens/sec, NVIDIA quoting exaflops). It can be dizzying, but the bottom line is everyone is scaling up by an order of magnitude or more.

Microsoft’s 10× claim (vs. the fastest supercomputer) is mirrored by Google’s claim that TPU v4 pods were 10× faster than the previous generation (with TPU v5p going beyond), by Amazon’s claim of hundreds of exaflops in the cloud, and by Oracle’s 100k-GPU statement. It’s an arms race in marketing too – each wants the prestige of “most powerful” to attract talent and customers.

Expert commentary generally agrees that this AI infrastructure race is akin to building the next generation of computing "railroads" or "power grids." Jensen Huang calls these datacenters the new factories and emphasizes that in AI, throughput equals revenue – companies with bigger "AI factories" can produce more advanced AI products or serve more AI queries, and thus make more money. This dynamic is driving the spend. Some analysts warn of diminishing returns, or that software breakthroughs (such as more efficient algorithms) could blunt the need for ever more GPUs. But for now, every indication is that demand is insatiable: as long as AI models improve with scale (which, per the "scaling laws," they have), companies will invest in scale.
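Huang's "throughput = revenue" framing can be made concrete with a toy model. The 865,000 tokens/sec figure is the throughput Microsoft quotes; the price per million tokens and the utilization rate are made-up assumptions for illustration only:

```python
# Toy model of the "throughput = revenue" dynamic. Price and
# utilization below are hypothetical assumptions, not real rates.

def daily_token_revenue(tokens_per_sec: float,
                        price_per_million_tokens: float,
                        utilization: float = 0.7) -> float:
    """Revenue per day from serving tokens at a sustained rate."""
    tokens_per_day = tokens_per_sec * 86_400 * utilization
    return tokens_per_day / 1e6 * price_per_million_tokens

# A cluster sustaining 865,000 tokens/sec at an assumed
# $1 per million tokens and 70% utilization:
revenue = daily_token_revenue(865_000, 1.00)
print(f"${revenue:,.0f}/day")  # → $52,315/day
```

However rough the inputs, the model makes the incentive structure plain: revenue scales linearly with sustained throughput, so every extra rack that stays busy pays for itself.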

To wrap up the comparison: Microsoft's AI datacenter initiative is among the front-runners in this race, arguably leapfrogging what was publicly known of others by connecting multiple giant clusters and partnering with a top AI research outfit (OpenAI). But Amazon and Google are right there as well, each with a slightly different tech stack. Meta and Oracle, while not as broad in cloud services, are pushing boundaries in their own domains. The competition is intense but also somewhat collaborative – all use NVIDIA to some extent, all benefit when NVIDIA or AMD innovate, and all are contributing to an ecosystem that advances AI capabilities for humanity.

Conclusion: The New Era of AI Infrastructure

Microsoft’s “Fairwater” AI datacenter in Wisconsin offers a glimpse into the future of computing. No longer are breakthrough AI models limited by compute – tech giants are building industrial-scale brain factories to unlock next-generation artificial intelligence. As we’ve seen, Microsoft’s approach knits together massive GPU arrays, custom engineering, sustainable practices, and strategic partnerships to create a facility of unprecedented power. By doing so, Microsoft positions itself to drive innovation across its products and services, from cloud to consumer software. At the same time, it contributes to an accelerating global race, one where companies and nations pour resources into AI infrastructure as the new strategic asset, akin to oil reserves or semiconductor fabs in previous eras.

The impact of these AI megacenters will be far-reaching. In the near term, users will experience faster, smarter AI – more capable chatbots, more insightful analytics, more natural language interactions – because models can be trained on trillions of data points and fine-tuned continuously on these supercomputers. In the broader economy, regions like Mount Pleasant are transformed into high-tech hubs, and skilled jobs emerge in places that once might have relied on traditional manufacturing. Yet, challenges around energy and environment must be navigated carefully; the industry must ensure that the pursuit of machine intelligence doesn’t conflict with sustainability or equity.

Microsoft's Wisconsin datacenter is slated to come online in early 2026 (wispolitics.com), heralding what Microsoft calls "a new era of cloud-powered intelligence" that is "secure, adaptive and ready for what's next." It symbolizes Microsoft's confidence that investing in infrastructure is key to AI leadership. And it sends a message: the company that builds the biggest "brain" may well attract the brightest minds (human and AI alike) to its platform.

In a sense, computing has come full circle – from mainframes to personal computers to cloud data centers, and now to AI supercomputers. Each shift brought exponential increases in capability. We now stand at the dawn of the AI supercomputing age, where the scale of infrastructure is hard to fathom – acres of land covered in server racks, all working in concert on a single problem, like training a model to understand human language or discover new drugs. As Microsoft and its rivals continue to outdo each other, the beneficiaries will be those of us who use technology: we'll see AI systems grow more powerful, hopefully helping solve big societal challenges (from healthcare to climate modeling), not just composing emails.

The arms race in AI datacenters also suggests that computing power is the new strategic resource. Just as past eras saw races for faster chips or bigger data storage, today it’s about integrated capabilities – compute + data + algorithms – at planet scale. Microsoft’s bet is that by owning a good chunk of that power and offering it via Azure, it can capture a significant share of the AI economy. If AI is the “new oil,” Microsoft is building some of the biggest refineries.

One technologist, reflecting on these developments, put it this way: "These aren't data centers. These are factories that manufacture intelligence." The metaphor encapsulates it well. Microsoft's AI datacenter is a factory of the 21st century, where raw data and compute are the inputs, and trained AI models and insights are the outputs. It runs 24/7, generating "tokens" of intelligence much as a mill produces goods. And just as the industrial revolution's factories transformed economies, these AI factories could transform the digital economy – enabling services and capabilities previously seen only in science fiction.

In conclusion, Microsoft’s creation of the world’s most powerful AI datacenter isn’t an isolated feat, but part of a broader narrative: the global build-out of AI infrastructure that will define the coming decade. It demonstrates what’s possible when engineering ambition meets capital and urgency – a datacenter so large and advanced that it blurs the line with supercomputers. As these facilities come online, we’ll likely look back on this moment much like the early days of the internet or smartphones – as a foundational investment that unlocked a wave of innovation.

The AI arms race shows no sign of slowing. For Microsoft and others, the challenge will be not just to build the biggest, but to use them wisely – to push AI forward in a way that’s safe, inclusive, and beneficial. The world’s most powerful AI datacenter will help answer some of AI’s hardest technical questions; it’s up to humans to ensure we ask the right questions. For now, the infrastructure is being put in place – titanic computers humming in massive halls – ready to support the next generation of AI breakthroughs that will shape our world.

Sources:

  • Microsoft Official Blog – Inside the world’s most powerful AI datacenter (Scott Guthrie, Sep 18, 2025)
  • Microsoft Azure Blog / Press Releases – Wisconsin AI datacenter investment and details (via wispolitics.com)
  • Reuters – Microsoft boosts Wisconsin data center spending to $7 billion (Sep 18, 2025)
  • Microsoft News Center – Narvik, Norway AI datacenter announcement (Sep 17, 2025)
  • IronArc Venture analysis – The Power Behind AI: America’s Trillion-Dollar Bet on HPC Data Centers (2025)
  • TechCrunch – OpenAI CEO says the company is “out of GPUs” (Feb 27, 2025)
  • Jensen Huang (NVIDIA) keynote quotes – VivaTech 2025 via NextDC blog
  • NVIDIA Blog – OCI deploys thousands of Blackwell GPUs, will scale beyond 100k GPUs (Apr 28, 2025)
  • VisualCapitalist / NVIDIA News – xAI’s Colossus 100k GPU supercomputer (2024)
  • Additional context from Deloitte analysis on data center energy and HPCwire on AI infrastructure trends.