2024 Outlook with Da Chuang of Expedera

Da Chuang

Expedera provides customizable neural engine semiconductor IP that dramatically improves performance, power, and latency while reducing cost and complexity in edge AI inference applications. Da is co-founder and CEO of Expedera. Previously, he was co-founder and COO of Memoir Systems, an optimized memory IP startup that was successfully acquired by Cisco. At Cisco, he led the datacenter switch ASICs for the Nexus 3K/9K, MDS, and CSPG products. Da brings more than 25 years of ASIC experience at Cisco, Nvidia, and Abrizio. He holds a BS in EECS from UC Berkeley and an MS/PhD in EE from Stanford. Headquartered in Santa Clara, California, the company has engineering development centers and customer support offices in the United Kingdom, China, Japan, Taiwan, and Singapore.

-Tell us a little bit about yourself and your company.

>> My name is Da Chuang, and I am the co-founder and CEO of Expedera. Founded in 2018, Expedera has built a reputation for providing the premier customizable NPU IP for edge inference applications, from edge nodes and smartphones to automotive. Our Origin NPU, now in its 4th-generation architecture, supports up to 128 TOPS in a single core while providing industry-leading processing and power efficiencies for the widest range of neural networks, including RNNs, CNNs, LSTMs, DNNs, and LLMs.

-What was the most exciting high point of 2023 for your company?

>> 2023 was a year of tremendous growth for Expedera. We added two new physical locations, in Bath (UK) and Singapore. Both of these offices are focused on future R&D, developing next-generation AI architectures, plus other things you’ll be hearing about in the months and years to come. While that is very exciting for us, perhaps the most significant high point for Expedera in 2023 was our customer and deployment growth. We started the year with the news that our IP had shipped in over 10M consumer devices, a notable number for any semiconductor IP startup. Throughout the year, we continued to expand our customer base, which now includes worldwide Tier 1 smartphone OEMs, consumer device chipset makers, and automotive chipmakers. Our NPU solution is recognized globally as the best in the market, and customers come to us when they want the absolute best AI engine for their products.

-What was the biggest challenge your company faced in 2023?

>> The biggest challenge in 2023, along with the biggest opportunity, has been the emergence of Large Language Models (LLMs) and Stable Diffusion (SD) in the edge AI space. LLMs and SD represent a paradigm shift in AI – they require more specialized processing and more compute horsepower than the typical CNN/RNN networks most customers were deploying in 2022 and prior. The sheer number of LLM/SD-based applications our customers are implementing has been incredible to see. However, the main challenge of LLMs and SD on the edge has been getting those networks to run within the power and performance envelope of a battery-powered edge device.

-How is your company’s work addressing this biggest challenge?

>> Our customers want to offer products that are AI-differentiated; products that bring real value to the consumer with a fantastic user experience. However, a significant hit to battery life isn’t accepted as part of that user experience. As we integrated LLM and SD support into our now-available 4th-generation architecture, our design emphasis was on providing the most memory-efficient, highest-utilization, lowest-latency NPU IP we could possibly build. We drilled into the underlying workings of these new network types – data movements, propagation, dependencies, and so on – to understand the right way to evolve both our hardware and software architectures to best match future needs. As an example of how we’ve evolved, our 4th-generation architecture features new matrix multiplication and vector blocks optimized for LLMs and SD, while maintaining our market-leading processing efficiencies in traditional RNN- and CNN-style networks.

-What do you think the biggest growth area for 2024 will be, and why?

>> One of our biggest growth areas in 2024 is going to be supporting an increasing variety of AI deployments in automobiles. While most are likely familiar with the use of AI in the autonomous driving stack for vision-based networks, many more opportunities and uses are emerging. Certainly, we’re seeing LLM usage in automobiles skyrocket, as in many other markets. However, we’re also seeing increased use of AI in other aspects of the car – driver attentiveness, rear-seat passenger detection, infotainment, predictive maintenance, personalization, and many others. All of these are aimed at providing the consumer with the best possible user experience, one of the key reasons for implementing AI. However, the AI processing needs of these uses vary dramatically, not only in raw performance requirements but also in the types of neural networks each use case presents.

-How is your company’s work addressing this growth?

>> Along with the aforementioned LLM and SD support, Expedera’s 4th-generation architecture is also readily customizable. When Expedera engages in a new design-in with a customer, we seek to understand all the application conditions (performance goals, required network support, area and power limitations, future needs, and others) so that we can best customize our IP – essentially, giving the customer exactly what they want without having to make sacrifices for things they don’t. If the customer desires a centralized, high-performance engine handling a number of different uses and supporting a variety of networks, we can support that. If the customer wants to deploy decentralized engines handling only specific tasks and networks, we can support that as well – or anywhere in between. And this is all from the same IP architecture, done without time-to-market penalties.

-What conferences did you attend in 2023 and how was the traffic?

>> Expedera exhibits at a targeted group of conferences focused on edge AI, including but not limited to the Embedded Vision Summit and the AI Hardware & Edge AI Summit, as well as larger events like CES. Traffic at these events seemed on par with 2022, which is to say respectable. AI is obviously a very hot topic in the tech world today, and every company is looking at ways to integrate AI into its products, workflows, and design processes. Accordingly, we’ve seen an ever-increasing variety of attendees at these events, all of whom come with different needs and expectations.

-Will you attend conferences in 2024? Same or more?

>> 2024 will likely see a slight expansion of our conference plans, especially those focused on technology. As part of the semiconductor ecosystem, Expedera cannot afford to exist in a vacuum. We’ve spoken at past events about our hardware and software stacks, as well as implementations like our security-centric always-sensing NPU for smartphones. This year, we’ll be spending much of our time detailing edge implementations of LLMs, including at upcoming conferences later this spring. We look forward to meeting many of you there!

Also Read:

Expedera Proposes Stable Diffusion as Benchmark for Edge Hardware for AI

WEBINAR: An Ideal Neural Processing Engine for Always-sensing Deployments

Area-optimized AI inference for cost-sensitive applications
