Repost Notes
This blog series is reposted from Modular’s official blog, written by the creator of LLVM, Chris Lattner.
- Source: What about the MLIR compiler infrastructure? (Democratizing AI Compute, Part 8)
- Publish date: April 8, 2025
The original blog was parsed with Jina Reader.
By 2018, AI software had a system fragmentation problem. TensorFlow, PyTorch, JAX, Glow, ONNX, TensorFlow-Lite, XLA, TVM: the list kept growing, and each framework invented its own tangled web of “AI graphs” with different “ops.” The ecosystem was splintering into silos, each racing to optimize for different hardware while reinventing the same ideas with subtle variations. Complexity was exploding, and something had to give.
At the time, I was helping scale Google’s TPUs (and several other internal ASICs) in support of TensorFlow, and it was clear we couldn’t keep reinventing compiler infrastructure from scratch for every project. We needed a better foundation. Fortunately, I had years of experience building LLVM, and Jeff Dean as my manager. Jeff, a legendary engineer and a compiler PhD himself, saw the same problem.
In a 1:1 conversation, Jeff said something like:
“Hey Chris, I agree we have a compiler problem. Why don’t you go build a new compiler to unify this mess?”
And so, MLIR was born: a modular, extensible compiler infrastructure designed to bring order to the chaos. It brought forth a foundation that could scale across hardware platforms, software frameworks, and the rapidly evolving needs of machine learning. It aimed to unify these systems and provide a technology platform that could harmonize compute from many different hardware makers.
But unification is hard. What started as a technical project quickly turned into a battleground: open-source governance, corporate rivalries, and competing visions all collided. What could have been a straightforward engineering win became something much more complicated.
Today, MLIR is embedded in nearly every major AI stack, including CUDA, yet it still hasn’t delivered on the dream of democratized AI compute.
This is the story of MLIR: how it started, what it changed, and the power struggles along the way.
MLIR, the Origin Story
Modern AI systems rely on complex graphs of operations (matrix multiplications, convolutions, attention mechanisms, and more), all strung together into computational pipelines. To optimize and transform these efficiently requires a solid compiler foundation, as discussed in part 6.
But in 2018, most AI frameworks were reinventing compiler technology, and often doing it poorly. Basic techniques like Static Single Assignment (SSA) were missing from many. Each framework had its own ad-hoc graph system, bolted together with hacks that didn’t scale. The result was a fragmented, inefficient ecosystem, riddled with duplication.
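To ground the SSA point: in an SSA-based IR, every value is defined exactly once, which makes dataflow explicit and transformations tractable. Here is a tiny illustrative function in MLIR’s textual IR (my own sketch, not from the original post), using the real `func` and `arith` dialects:

```mlir
// Computes a*a + b*b. Each SSA value (%x, %y, %z) has exactly one
// definition, so passes can reason about dataflow directly.
func.func @sum_of_squares(%a: f32, %b: f32) -> f32 {
  %x = arith.mulf %a, %a : f32
  %y = arith.mulf %b, %b : f32
  %z = arith.addf %x, %y : f32
  return %z : f32
}
```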
I knew we needed a better approach, so I pulled four like-minded folks into a small room at Google. We spent days white-boarding, sketching out what a modern, scalable compiler infrastructure for AI might look like. Our central question: Could we build a unified representation that could support every AI framework, every hardware backend, and every kind of optimization, from algebraic simplification to polyhedral analysis?

Circa 2018: Yours truly and four colleagues gather in front of a whiteboard to brainstorm a next-generation compiler
The breakthrough idea we created is now known as MLIR dialects: a way to cleanly separate domain-specific concerns from the core infrastructure of a compiler. Rather than forcing every user to adopt a rigid, one-size-fits-all intermediate representation (like LLVM and other compilers), MLIR would let compiler engineers define their own representations (custom ops, types, and semantics) tailored to their domain.
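To make the dialect idea concrete, here is an illustrative sketch (mine, not from the original post) in MLIR’s textual IR, where ops from several independently defined dialects compose in a single function. The prefix on each op (`func.`, `arith.`, `tensor.`, `linalg.`) names the dialect it belongs to:

```mlir
// A matmul expressed by mixing four dialects in one function.
func.func @matmul(%a: tensor<4x8xf32>, %b: tensor<8x16xf32>) -> tensor<4x16xf32> {
  %zero = arith.constant 0.0 : f32
  %init = tensor.empty() : tensor<4x16xf32>
  // linalg.matmul accumulates into its outs operand, so zero-fill it first.
  %acc = linalg.fill ins(%zero : f32) outs(%init : tensor<4x16xf32>) -> tensor<4x16xf32>
  %0 = linalg.matmul ins(%a, %b : tensor<4x8xf32>, tensor<8x16xf32>)
                     outs(%acc : tensor<4x16xf32>) -> tensor<4x16xf32>
  return %0 : tensor<4x16xf32>
}
```

Because each dialect is defined separately from the core infrastructure, a hardware vendor can add its own ops and types without forking the compiler.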
Aside: I’m not diving deep on how MLIR works in this post. If you’re curious, check out the original technical keynote or one of the many tutorials online.
At the time, this was a radical departure from how most compilers were built. Traditional infrastructures were monolithic, forcing all frontends and passes into a single, rigid model. But MLIR embraced heterogeneity from day one. It let multiple levels of abstraction coexist, transform, and interoperate seamlessly.
That modularity was the key. Instead of reimplementing the same infrastructure over and over, MLIR gave developers a shared foundation, whether they were working with TensorFlow graphs, PyTorch IR, or custom TPU ops. It made it possible to build specialized compilers without starting from scratch, and it enabled true composability across the AI compiler stack.
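The “multiple levels of abstraction” point can also be sketched in textual IR (again, my own illustration). After progressive lowering, a high-level `linalg.matmul` might become explicit loops, with the `scf`, `arith`, and `memref` dialects coexisting in the very same module:

```mlir
// The same matmul after bufferization and lowering: loops from the scf
// dialect, scalar math from arith, memory access from memref.
func.func @matmul(%A: memref<4x8xf32>, %B: memref<8x16xf32>, %C: memref<4x16xf32>) {
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  %c4 = arith.constant 4 : index
  %c8 = arith.constant 8 : index
  %c16 = arith.constant 16 : index
  scf.for %i = %c0 to %c4 step %c1 {
    scf.for %j = %c0 to %c16 step %c1 {
      scf.for %k = %c0 to %c8 step %c1 {
        %a = memref.load %A[%i, %k] : memref<4x8xf32>
        %b = memref.load %B[%k, %j] : memref<8x16xf32>
        %c = memref.load %C[%i, %j] : memref<4x16xf32>
        %m = arith.mulf %a, %b : f32
        %s = arith.addf %c, %m : f32
        memref.store %s, %C[%i, %j] : memref<4x16xf32>
      }
    }
  }
  return
}
```

Each lowering step is just a rewrite between dialects, so frontends, optimizers, and backends can meet at whichever level of abstraction suits them.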
MLIR wasnât just another compiler: It was a framework for building many compilers.
MLIR Growth Within Google and Beyond
MLIR began as a research project inside Google Brain, with a focused team trying to rethink how AI compilers should work. My team was heads-down on the fundamentals: designing the IR, implementing transformations, and validating that the core ideas actually worked. Meanwhile, Google’s open culture and MLIR’s modular design made it easy for others to pick it up and experiment. Before long, MLIR took on a life of its own.
Across Google, teams working on custom ASICs saw the potential. MLIR gave them a structured way to express and optimize hardware-specific operations. Application-focused teams started using it for mobile AI, and the TensorFlow team brought MLIR into TensorFlow Lite. Even individual researchers, fascinated by MLIR’s flexibility, began using it to prototype novel compiler techniques.
What followed was a mini-explosion of use cases. Every new application brought fresh feedback, often while we were still deep in iteration mode. Crucially, this validated our dialect-first approach, proving that MLIR could scale across wildly different domains, from edge devices to datacenter accelerators. Eventually, we reached a tipping point: MLIR was becoming a critical piece of infrastructure across many projects.
Many of us wanted MLIR to reach its full potential: to go beyond Google’s first-party use cases.

Above: a well-known meme within the MLIR community (Credit: Mehdi Amini)
So we took the leap: we open-sourced MLIR and contributed it to the LLVM Foundation, making it available for the entire industry. To support adoption, we organized regular “open design meetings,” where external contributors could participate in MLIR’s evolution and benefit from the engineering investment behind it. This open collaboration helped catalyze MLIR’s global momentum, especially among compiler developers hungry for a modern infrastructure.
With this as fuel, MLIR took off: it is now the foundation for many major AI projects, including OpenXLA, Triton, and even parts of CUDA itself. It’s also powering compilers in quantum computing, hardware design (via CIRCT), and many other domains. Companies around the world, from scrappy startups to hyperscalers, started building their next-generation compilers using MLIR. Much of MLIR’s early growth and success was directly attributable to Google’s leadership and open approach, something I think the industry still under-appreciates.
Yet for all that success, the grand vision remained out of reach. The ecosystem is still fractured. CUDA still reigns supreme. The dream of truly democratized AI compute remains just that: a dream.
So what happened? Why did MLIR succeed technically, but fail to break the CUDA lock-in?
To understand that, we need to talk about the politics, power struggles, and compromises that shaped MLIR’s evolution.
The Race to Claim an End-to-end AI Solution
From the outset, MLIR was conceived as general-purpose compiler infrastructure: a framework designed to allow for domain-specific compilers. The goal was flexibility and modularity; MLIR was never just about Machine Learning. In fact, the “ML” in MLIR stood for everything but Machine Learning (yep, compiler jokes are nerdy!). However, the AI community was hungry for something more. The AI world wanted an end-to-end compiler, something that could map TensorFlow or PyTorch models cleanly to a broad range of hardware.

The race was on to build the first end-to-end MLIR-based AI solution
As MLIR gained traction, teams inside and outside Google began racing to build an end-to-end AI solution on top of it. Other projects, like OpenXLA, TritonLang, and many others, adopted MLIR as an implementation detail to strengthen their own stacks. This raised a question: everyone wanted to be the next-gen AI stack, so who would get there first?
The race was on. Years later, we know the unfortunate answer: nobody.
MLIR’s AI Dialect Explosion
Contributing MLIR to the LLVM Foundation supercharged adoption. It gave companies a shared foundation, and compiler engineers a chance to prove serious impact inside their organizations. The LLVM Foundation helps with oversight and legal matters, but doesn’t intervene in technical design. For that, the community is left to self-organize.
Engineers across the industry, led by Google, started contributing AI-specific dialects, including arith, linalg, and tensor, providing some of the bits and pieces needed to build a modern AI compiler stack. It started with Google research teams who had early access to MLIR, but the precedent was set: many “potentially useful” contributions were upstreamed, without governance that allowed project leaders to say “no” in a principled way.
Unfortunately, this explosion happened very early in MLIR’s design, and many design decisions in these dialects weren’t ideal for the evolving requirements of GenAI. For example, much of this early work was directed toward improving TensorFlow and building OpenXLA, so these dialects weren’t designed with first-class PyTorch and GenAI support (as we discussed earlier in this series).
While many of these efforts hit their original goals, the world changed around them.
Competitive “Coopetition” Strikes Back
For a variety of reasons, almost all of the early MLIR developers (including myself) moved on from Google, with many of them ending up at hardware companies. This spread of MLIR knowledge was a positive outcome, since it meant the technology would grow far and wide, but it also brought new challenges.
The problem? MLIR’s success scattered its core developers across the industry. Former allies and colleagues, now at competing companies, began building proprietary AI stacks on top of shared MLIR dialects. What began as open collaboration soon collided with commercial competition. With a lack of central coordination, communication between these teams broke down. Competing priorities created tension, and the once-unified vision for MLIR began to splinter.

MLIR’s identity crisis: Machine learning solution or compiler framework?
MLIR now faces an identity crisis: is it a general-purpose compiler framework for any domain, or an AI solution? Today, it remains unmatched as general-purpose, reusable infrastructure, powering everything from hardware design to quantum computing. On the other hand, the built-in AI-related dialects are contested and incomplete, but still critical to many open and proprietary downstream stacks.
It started to feel a lot like OpenCL all over again: no reference stack, competing hardware vendors, and a very polite battlefield, just like the old Khronos committee.
A New Hope: Improved MLIR Governance
The tensions have simmered for years, and they’re deeply felt across the broader LLVM and MLIR communities.
Fortunately, there’s a new hope: LLVM is a meritocratic community with a long track record of aligning engineers, even when their companies are at war in the market. The MLIR community is filled with amazing engineers who have poured years of their hearts and souls into improving the project and working through these challenges, and progress is now happening!
MLIR now has a new Area Team to help guide its evolution, along with a new organizational structure and charter and governance group. The charter defines separate area groups: MLIR Core (the domain-independent infrastructure) and the dialects (like the machine learning-specific pieces). I am extremely thankful to everyone who is spending time to improve MLIR and work through these issues; such work has a profound impact on everyone building into the ecosystem, as well as the downstream users.
If I could have one wish, it would be for “MLIR” to unambiguously refer to the domain-independent compiler infrastructure, and for these dialects to get a new, different name (perhaps “TensorIR”?). This would reduce confusion about what “MLIR” actually is!
Lessons learned from MLIR
The biggest lesson I learned from MLIR is how scaling too early, before the core foundations are fully settled, can cause lasting problems. The early explosion of interest and contribution was exciting, but it also meant that many design decisions were made in parallel, without clear guidance or alignment. We got “many things fast” at the expense of getting “something great at each level,” and then fell prey to Hyrum’s Law.
This also reinforced a management lesson I’ve learned in other places: when you have too many smart engineers running ahead in different directions, it’s hard to steer the ship later, even if the ship is made of beautifully designed IR. In this case, while I remain influential in the LLVM/MLIR community, I learned that influence is no match for the paycheck from an employer, which guides a contributor to get their work into the tree so they can move on to the next bug fix or project.
Another lesson is about infrastructure with ambition. My goal for MLIR was to unify compiler implementations, and it succeeded beyond my hopes. But I also encouraged and catalyzed others to aim beyond that, fueled by a shared optimism that community-led projects could move the world forward. That didn’t work out, and it reinforced a lesson of mine seen across other industry-impactful projects I’ve helped build (LLVM, Clang, Swift, and “MLIR Core”). I learned, more than ever, that small teams are best at aligning on a vision of success and driving it forward. Only once a project’s identity is firmly established does it make sense to scale it to a broader community.

MLIR has many dialects, but many are contested or incomplete.
As with the tradition of my last three blog posts, I’ll try to evaluate the MLIR AI dialects against the wishlist of features for a next-generation AI solution. Here’s my best take:
- “Provide a reference implementation”: While MLIR is excellent for general-purpose compiler infrastructure, it does not include an end-to-end solution that can be used directly for AI workloads, just useful building blocks with “some assembly required”.
- “Have strong leadership and vision”: The MLIR AI dialects lacked clear leadership early on, with contributions often driven by individuals or different teams, resulting in fragmented direction and confusion over the project’s core identity. While strong leadership is emerging, this remains unresolved.
- “Run with top performance on the industry leader’s hardware”: While MLIR Core provides a strong foundation for optimization, I’m not aware of any downstream implementations built on the MLIR AI dialects that match CUDA’s performance for GenAI LLMs on NVIDIA GPUs (including Triton and cuTile, which leave 15-20% of performance on the table).
- “Evolve rapidly”: MLIR’s pace of evolution has been impressive, with contributions flooding in from a broad community. The flexibility of its design has allowed for rapid adaptation to new use cases and domains.
- “Cultivate developer love”: MLIR has certainly won the hearts of compiler engineers and system researchers, offering a flexible and powerful toolkit for building custom compilers. However, AI developers, especially those in the machine learning community, have found the learning curve steep and the integration with existing ML frameworks less seamless.
- “Build an open community”: MLIR has built a truly open and thriving community. Regular design meetings, open contributions, and cross-company collaboration have helped it gain broad adoption and investment from many industry players.
- “Avoid fragmentation”: This is where MLIR has struggled the most. The early explosion of dialects and contributions, combined with a lack of strong central governance, led to fragmentation in downstream systems. The vision for a unified approach to AI compilation was difficult to maintain as competing projects moved in different directions.
Ultimately, as we discussed before, this is a wildly unfair way to measure “MLIR Core” as a compiler-building toolkit: MLIR is widely used by dozens of systems and has certainly succeeded in its original mission. The success of MLIR’s AI dialects is best measured through its impact on the countless downstream AI implementations that utilize it; I’m just not sure how to do that.
Why do HW companies struggle to build AI software?
At this point in the series, a pattern has emerged: whether it’s OpenCL/OneAPI, TVM/XLA, MLIR, or some other well-meaning acronym, we’ve seen powerful attempts to build unifying AI infrastructure, but none have delivered a solution that developers love. Projects fragment, promises fade, and users of alternate hardware are left with tools that don’t “just work”.
The hard truth is this: only one company has ever truly figured this out, and that’s NVIDIA. CUDA isn’t just infrastructure; it’s a strategy, backed by tight vertical integration, application engineers on the ground, and a relentless focus on real-world performance. It’s not open and it’s not pretty, but it works great for NVIDIA, even if the innovator’s dilemma is alive and well in Santa Clara.
So, why can’t other hardware companies pull this off? Why do the industry’s smartest people, backed by billions in funding, keep producing software no one wants to use? When you’re competing against an entrenched, vertically integrated leader, the deck is stacked against you, and the incentives of the industry and the organizations within it shape the outcome:
“Show me the incentive and I’ll show you the outcome.”
– Charlie Munger
We’ll dive deeper into that next time. Until then, let no dialect go uncanonicalized!