OpenAI and Broadcom built a custom chip for LLM inference called Jalapeño

OpenAI and Broadcom have announced Jalapeño, a custom ASIC designed specifically for large language model inference in data centers. The chip was developed in nine months, and both companies say it’ll be deployed by the end of this year.

Broadcom designed the chip from scratch, guided by what OpenAI researchers shared about OpenAI’s future model roadmap. The pitch: a chip purpose-built for LLM inference will deliver better performance per watt than the general-purpose hardware currently doing the job in data centers.

OpenAI claims early testing shows “performance per watt substantially better than current state-of-the-art,” but the company isn’t sharing hard numbers yet. A detailed technical report is promised in the coming months.

Why does this matter? Custom silicon is becoming the next battleground for AI companies. OpenAI wants to own the full stack — from chips to applications — reducing dependence on Nvidia and potentially squeezing out more capacity during a global compute crunch. Broadcom, meanwhile, is building a business around making custom chips for hyperscalers and frontier model teams.

The Jalapeño is just the first generation. Both companies frame this as a long-term partnership in which the chips will be refined over time. Whether the chip delivers on the efficiency claims remains to be seen, but the direction is clear: the era of one-size-fits-all GPU clusters for AI inference is giving way to more specialized hardware.