The computer industry is entering an entirely new development phase for artificial intelligence (AI), said Satya Nadella, the chief executive officer (CEO) and chairman of Microsoft Corp., in a keynote address yesterday that kicked off the company’s Ignite 2023 conference taking place this week in Seattle.
“We are not just talking about a technology that is new and interesting, but we are getting into the details of product making, deployment, safety, and real productivity gains, all of the real-world issues, and that’s just the most exciting thing for all of us as builders,” he said. “We are at a tipping point.”
While he spent a great deal of time focusing on Copilot and Microsoft Cloud – details of which will appear in further conference coverage – there was no shortage of news on the Azure front, and it included the launch of:
- Microsoft Azure Maia 100, an AI accelerator chip designed to run cloud-based training and inferencing for AI workloads such as OpenAI models, Bing, GitHub Copilot and ChatGPT.
- Microsoft Azure Cobalt 100, a cloud-native chip based on Arm architecture, optimized for performance, power efficiency and cost-effectiveness in general-purpose workloads.
- The general availability of Azure Boost, a system that makes storage and networking faster by moving those processes off the host servers onto purpose-built hardware and software.
Maia has been designed, said Nadella, “as an end-to-end rack for AI. AI power demands require infrastructure that is dramatically different from other clouds. The compute workloads require a lot more cooling as well as the networking density. And we have designed the cooling unit, known as the sidekick, to match the thermal profile of the chip, and we added rack-level closed loop liquid cooling for higher efficiency.”
According to a Microsoft blog detailing the launch of Maia and Azure Cobalt, “tucked away on Microsoft’s Redmond campus is a lab full of machines probing the basic building block of the digital age: silicon. This multi-step process meticulously tests the silicon in a method that Microsoft engineers have been refining in secret for years.
“The chips will start to roll out early next year to Microsoft’s data centres, initially powering the company’s services such as Microsoft Copilot or Azure OpenAI Service. They will join an expanding range of products from industry partners to help meet the exploding demand for efficient, scalable and sustainable compute power, and the needs of customers eager to take advantage of the latest cloud and AI breakthroughs.”
Nadella described the Azure Cobalt 100, a 64-bit, 128-core Arm-based chip, “as the first CPU designed by us, specifically for the Microsoft Cloud. It’s already powering parts of Microsoft Teams, Azure Communications Services, as well as Azure SQL as we speak, and next year we will make this available to customers.”
Azure Boost, he said, is a system that “offloads server virtualization processes onto purpose-built software and hardware. This enables massive improvements in networking, remote storage and local storage throughput.”
Also on the silicon news front, the following was announced at Ignite:
- The addition of AMD MI300X accelerated virtual machines (VMs) to Azure. The ND MI300 VMs are designed to accelerate the processing of AI workloads for high-range AI model training and generative inferencing, and will feature AMD’s latest GPU, the AMD Instinct MI300X.
- The preview of the new NC H100 v5 Virtual Machine Series built for NVIDIA H100 Tensor Core GPUs, which, a release stated, offers greater performance, reliability and efficiency for mid-range AI training and generative AI inferencing.
- Plans for the upcoming ND H200 v5 Virtual Machine Series, an AI-optimized VM featuring the upcoming NVIDIA H200 Tensor Core GPU.
Approaching the midway point of his keynote, Nadella was joined on stage by Jensen Huang, the co-founder, president and CEO of NVIDIA, a company described by the BBC in an article earlier this year as the “chip maker that became an AI superpower. Originally known for making the type of computer chips that process graphics, particularly for computer games, NVIDIA hardware underpins most AI applications today.”
“AI and accelerated computing are a full stack challenge, and it’s a data centre scale challenge,” said Huang. “From computing to networking, from chips to APIs, everything has been transformed as a result of generative AI.
“Over the last 12 months, our two teams have been accelerating everything we could. One of the initiatives, of course, is accelerated computing, offloading general purpose computing, accelerating all the software we can, because it improves energy efficiency, reduces carbon footprint, reduces costs for our customers, improves their performance, and so on, and so forth.”
He described generative AI (GenAI) as the single most significant platform transition in computing history. “You and I both have been in the computer industry a long time,” he said to Nadella. “In the last 40 years, nothing has been this big. It’s bigger than PC, it’s bigger than mobile, it’s going to be bigger than internet, and surely, by far.
“This is also the largest TAM (total addressable market) expansion of the computer industry in history. There’s a whole new type of data centre that’s now available. Unlike the data centres of the past, this (one) is dedicated to one job and one job only – running AI models and generating intelligence. It’s an AI factory.”