From Core to Edge: Rethinking AI Infrastructure for a Real-Time World

The next wave of AI innovation is no longer confined to the cloud. It’s being deployed at the edge — embedded in streetlamps, inside robots, and powering intelligent systems in the most unpredictable environments on earth. At the recent theCUBE + NYSE Wired Summit in Palo Alto, a panel of AI infrastructure leaders explored what it really takes to move from centralized compute to real-world intelligence.
photo credit: theCUBE + NYSE Wired

The discussion brought together Harminder Sehmi (CFO, Blaize), Gopal Hegde (SVP Engineering, SiMa.ai), and Vijay Nagarajan (VP Strategy, Broadcom), moderated by theCUBE’s John Furrier and Dave Vellante.

A Shift in Paradigm: Why Edge Matters Now

“AI becomes real when it runs where the data is created,” said Harminder Sehmi of Blaize. “We’re working with smart city customers today. Our chips are deployed on lampposts — they’re industrial-grade, designed to handle environmental extremes and deliver real-time insights where latency matters most.”

Grounded, physical AI is rapidly transforming industries, from defense and healthcare to autonomous vehicles and public safety. Unlike traditional cloud-based AI pipelines, edge AI demands hardware and software architectures that are compact, power-efficient, and able to operate independently of the data center. Processing data where it is created cuts latency, enables real-time responses, and reduces dependence on centralized infrastructure, making systems more resilient when connectivity is constrained. It also keeps sensitive data local, which strengthens security and opens up applications that a round trip to the cloud would make impractical.
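
To make the latency argument concrete, here is a back-of-envelope sketch. Every number in it is an assumption chosen for illustration, not a measurement cited by the panel:

```python
# Back-of-envelope latency budget for a single inference request.
# All numbers below are illustrative assumptions, not measured values.

cloud_path_ms = {
    "uplink to cloud": 25,     # sensor to data center over the WAN
    "queueing": 10,            # shared-cluster scheduling delay
    "inference": 15,           # model execution on a cloud accelerator
    "downlink": 25,            # result returned to the device
}

edge_path_ms = {
    "capture to compute": 2,   # data never leaves the device or lamppost
    "inference": 10,           # model execution on local silicon
}

print(f"cloud round trip: ~{sum(cloud_path_ms.values())} ms")
print(f"on-device:        ~{sum(edge_path_ms.values())} ms")
```

Even with generous assumptions for the cloud path, the on-device budget wins by several multiples, which is the whole case for putting compute on the lamppost.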

Core-to-Edge Continuum: Broadcom’s Perspective

While Blaize and SiMa.ai emphasized edge-native compute and device-level intelligence, Broadcom’s Vijay Nagarajan offered a complementary view from the data infrastructure side. “The edge is crucial for delivering AI services directly where data is generated,” he emphasized, pointing to Broadcom’s connectivity and switching silicon, such as Tomahawk 6. These high-performance networking chips provide the bandwidth and scale needed to move data efficiently between cloud and edge environments.
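
As a rough illustration of the scale involved (the capacity figure below is an assumption for a switch in the ~100 Tb/s class, not a Broadcom specification), the port math looks like this:

```python
# Rough port math for a high-radix data center switch.
# Capacity figure is an illustrative assumption, not a vendor spec.
switch_capacity_tbps = 102.4   # assumed aggregate switching capacity
port_speed_gbps = 800          # one 800GbE port

ports = (switch_capacity_tbps * 1000) / port_speed_gbps
print(f"~{ports:.0f} x {port_speed_gbps}G ports at full line rate")
```

A single chip fanning out to well over a hundred 800G ports is what makes it plausible to treat cloud and edge as one continuum rather than two islands.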

Together, these perspectives underscore a shared point: effective edge computing depends on networking infrastructure capable of real-time responsiveness across distributed systems.

Performance Through Efficiency, Not Excess

Unlike general-purpose cloud silicon, edge systems must do more with less: less power, less space, and lower latency, while still delivering maximum performance in a compact form. That constraint is driving a new category of silicon innovation. Blaize, for example, has focused on purpose-built architecture optimized for real-time multimodal AI, processing voice, text, and video directly on-device.

“It’s not about chasing the smallest node,” said Sehmi. “We’ve worked with both TSMC and Samsung. For defense, U.S.-based foundries matter. But even more important is architectural efficiency — getting more out of every TOP we deliver.”
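
One way to read “getting more out of every TOP” is that delivered performance depends on utilization and power draw, not the peak number on the datasheet. A minimal sketch with made-up numbers shows why:

```python
# Illustrative only: every number below is an assumption, not a vendor spec.
chips = {
    "high peak, low utilization":   {"peak_tops": 100, "utilization": 0.15, "watts": 75},
    "lower peak, high utilization": {"peak_tops": 40,  "utilization": 0.60, "watts": 15},
}

for name, c in chips.items():
    effective = c["peak_tops"] * c["utilization"]  # TOPS actually delivered to the workload
    per_watt = effective / c["watts"]              # the metric that matters on a lamppost power budget
    print(f"{name}: {effective:.0f} effective TOPS, {per_watt:.2f} TOPS/W")
```

Under these assumed numbers, the chip with less than half the peak rating delivers more useful compute, and an order of magnitude more per watt.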

Open Ecosystems Will Define the Future

One of the key challenges in edge AI is scale. With diverse applications and fragmented platforms, no single company can address it alone. Sehmi emphasized the importance of ecosystem flexibility:

photo credit: theCUBE + NYSE Wired

“Our belief is that ISV partners are critical. A lot of apps running on GPUs aren’t optimized for the edge. With a programmable SDK and a low-code/no-code AI Studio, we help partners deploy models efficiently — even if they’re not developers.”

This sentiment echoed across the panel: open standards, open platforms, and developer-first tools are essential to democratizing edge AI.

Intelligence That Wakes Up Only When Needed

A recurring theme was intelligent efficiency — selectively activating only the logic required, at the right time, without overburdening the system. Sehmi illustrated this with a sharp analogy:

“Someone asks, ‘What’s two plus three?’ and you’ve woken up the entire model — the math teacher, the lawyer, the dental hygienist. Think about the power that takes. Our approach is to selectively activate what’s needed. That’s how we scale AI efficiently.”

This model of selective activation is not just a software issue — it’s a hardware-software co-design problem that defines the next era of edge AI.
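
The panel did not spell out Blaize’s mechanism, but the analogy maps naturally onto sparse mixture-of-experts routing, in which a lightweight gate scores every expert and only the top-k actually execute. Below is a minimal NumPy sketch of that general pattern; the class, weights, and sizes are all illustrative, not Blaize’s design:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

class SparseRouter:
    """Route each input to only the top-k 'experts' instead of all of them."""
    def __init__(self, n_experts, dim, k=1, seed=0):
        rng = np.random.default_rng(seed)
        self.gate = rng.standard_normal((dim, n_experts))  # learned in practice
        self.experts = [rng.standard_normal((dim, dim)) for _ in range(n_experts)]
        self.k = k

    def __call__(self, x):
        scores = softmax(x @ self.gate)     # affinity of this input to each expert
        top = np.argsort(scores)[-self.k:]  # indices of the k highest-scoring experts
        out = np.zeros_like(x)
        for i in top:                       # the other experts never run at all
            out += scores[i] * (x @ self.experts[i])
        return out, top

router = SparseRouter(n_experts=8, dim=16, k=2)
x = np.random.default_rng(1).standard_normal(16)
_, active = router(x)
print(f"activated experts {active} of 8; the rest stayed asleep")
```

In a sparse design like this, compute cost scales with k rather than with the total number of experts, which is what makes the “wake only the math teacher” behavior affordable on an edge power budget.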

Infrastructure for a Real-Time World

The panel agreed: the future of AI is distributed, decentralized, and deeply embedded into the fabric of our physical world. The focus is shifting from raw compute horsepower to system-level design — how hardware, connectivity, and software interact to solve real-world problems.

“The opportunity in autonomous systems, especially toward the end of the decade, is huge,” Sehmi concluded. “But it won’t be captured by chasing speed alone. It’ll go to those who build smart, resilient, and adaptive infrastructure.”

The Edge Is Here and Now

As AI continues its migration from core to edge, the conversation is no longer about potential — it’s about execution. With programmable compute at the edge and powerful networking at the core, a new generation of AI infrastructure is emerging. And the leaders who understand this shift — and design for it — are already building the future.

Watch the full session: https://bit.ly/3TCUgNS