\( \def\cuberoot#1{\sqrt[3]{#1}} \def\fourthroot#1{\sqrt[4]{#1}} \def\abspartial#1#2#3#4{\left|\,{\partial(#1,#2)\over\partial(#3,#4)}\,\right|} \def\absdeltal#1#2#3#4{\left|\,{\d(#1,#2)\over\d(#3,#4)}\,\right|} \def\dispop#1#2{\disfrac{\partial #1}{\partial #2}} \def\definedas{\equiv} \def\bb{{\bf b}} \def\bB{{\bf B}} \def\bsigma{\boldsymbol{\sigma}} \def\bx{{\bf x}} \def\bu{{\bf u}} \def\Re{{\rm Re\hskip1pt}} \def\Reals{{\mathbb R\hskip1pt}} \def\Integers{{\mathbb Z\hskip1pt}} \def\Im{{\rm Im\hskip1pt}} \def\P{\mbox{P}} \def\half{{\textstyle{1\over 2}}} \def\third{{\textstyle{1\over3}}} \def\fourth{{\textstyle{1\over 4}}} \def\fifth{{\scriptstyle{1\over 5}}} \def\sixth{{\textstyle{1\over 6}}} \def\oA{\rlap{$A$}\kern2pt\overline{\phantom{\dis{}I}}\kern.5pt} \def\obA{\rlap{$A$}\kern2pt\overline{\phantom{\dis{}I}}\kern.5pt} \def\obX{\rlap{$X$}\kern2pt\overline{\phantom{\dis{}I}}\kern.5pt} \def\obY{\rlap{$Y$}\kern2pt\overline{\phantom{\dis{}I}}\kern.5pt} \def\obZ{\rlap{$Z$}\kern2pt\overline{\phantom{\dis{}I}}\kern.5pt} \def\obc{\rlap{$c$}\kern2pt\overline{\phantom{\dis{}I}}\kern.5pt} \def\obd{\rlap{$d$}\kern2pt\overline{\phantom{\dis{}I}}\kern.5pt} \def\obk{\rlap{$k$}\kern2pt\overline{\phantom{\dis{}I}}\kern.5pt} \def\oba{\rlap{$a$}\kern2pt\overline{\phantom{\dis{}I}}\kern.5pt} \def\obb{\rlap{$b$}\kern1pt\overline{\phantom{\dis{}t}}\kern.5pt} \def\obw{\rlap{$w$}\kern1pt\overline{\phantom{\dis{}t}}\kern.5pt} \def\obz{\overline{z}}\kern.5pt} \newcommand{\bx}{\boldsymbol{x}} \newcommand{\by}{\boldsymbol{y}} \newcommand{\br}{\boldsymbol{r}} \renewcommand{\bk}{\boldsymbol{k}} \def\cuberoot#1{\sqrt[3]{#1}} \def\fourthroot#1{\sqrt[4]{#1}} \def\fifthroot#1{\sqrt[5]{#1}} \def\eighthroot#1{\sqrt[8]{#1}} \def\twelfthroot#1{\sqrt[12]{#1}} \def\dis{\displaystyle} %\def\definedas{\equiv} \def\bq{{\bf q}} \def\bp{{\bf p}} \def\abs#1{\left|\,#1\,\right|} \def\disfrac#1#2{{\displaystyle #1\over\displaystyle #2}} \def\select#1{ \langle\, #1 \,\rangle } \def\autoselect#1{ \left\langle\, #1 \,\right\rangle } \def\bigselect#1{ \big\langle\, #1 \,\big\rangle } \renewcommand{\ba}{\boldsymbol{a}} \renewcommand{\bb}{\boldsymbol{b}} \newcommand{\bc}{\boldsymbol{c}} \newcommand{\bh}{\boldsymbol{h}} \newcommand{\bA}{\boldsymbol{A}} \newcommand{\bB}{\boldsymbol{B}} \newcommand{\bC}{\boldsymbol{C}} \newcommand{\definedas}{\equiv} \newcommand{\half}{\frac{1}{2}} %\newcommand{\slfrac}[2]{\raisebox{0.5pt}{$\scriptstyle{}^{#1}\!/\!_{#2}$}} \def\slfrac#1#2{\raise.8ex\hbox{$\scriptstyle#1$}\!/\!\lower.5ex\hbox{$\scriptstyle#2$}} \newcommand{\bsigma}{\boldsymbol \sigma} \newcommand{\abs}[1]{\left|\,#1\,\right|} \newcommand{\Rectangle}{\sqsubset\!\sqsupset} \newcommand{\rectangle}{{% \ooalign{$\sqsubset\mkern3mu$\cr$\mkern3mu\sqsupset$\cr}% % How to do fractions: for 1/2 use this in place: $\raise{0.8pt}{\scriptstyle{}^{1}\!/\!_{2}}$ % for n/m use this in place: $\raise{0.8pt}{\scriptstyle{}^{n}\!/\!_{m}}$ }} \)

The Lean Theorem Prover and Copilot

Patrick Reany

16 October 2025

Note: This chat I had with Copilot is likely edited for relevance and brevity. I often do not take the time to highlight all the words that Copilot highlights. Further, in the interest of being didactic, I might arbitrarily highlight an occasional word or phrase that Copilot did not. (Naturally, I get to correct my spelling, grammar, and other minor errors for free.)

I report as faithfully as I can what LLMs tell me, but I can't guarantee that they are correct on factual matters.



Copilot had mentioned that the Lean theorem prover has been useful in the classification of Lie algebras, which was the curent subject of conversation. But this claim about Lean intrigued me so much, I changed the topic to the Lean Prover.

Patrick:
So, Lean can systematically generate thousands of theorem statements and then systematically verify them as well?

Copilot:
Yes---though it's not Lean alone doing the heavy lifting. What's emerging is a hybrid architecture where Lean acts as the formal judge, and generative systems (often powered by LLMs) serve as the creative engine. Together, they can produce and verify thousands of theorem statements across diverse domains.

How It Works

Here's the basic pipeline:

  1. Synthetic Generation: A generative model proposes new theorem statements using forward reasoning, often seeded from existing Lean libraries.
  2. Meta-Programming in Lean: These statements are encoded into Lean's syntax using tactics and definitions.
  3. Automated Verification: Lean attempts to prove each statement. If successful, the theorem is added to the corpus. If not, it's discarded or refined.
  4. Iterative Refinement: Systems like TP-as-a-Judge use Lean's feedback to improve the generator's reasoning over time.

Scale and Impact

Philosophical Implication

It's a bit like cultivating a mathematical orchard: the generative model plants seeds, and Lean prunes and nurtures only those that bear fruit. The result is a curated grove of formal truths---each one verified, each one contributing to a richer landscape of provable knowledge.

Patrick:
Which LLM has been connected to Lean?

Copilot:
The most prominent integration is called Lean Copilot, which connects large language models (LLMs) directly to the Lean theorem prover environment. While the framework itself is model-agnostic---meaning it can work with various LLMs---most of the research and experimentation has centered around OpenAI’s GPT models and custom fine-tuned transformers trained on Lean-specific data.

Key Highlights from Lean Copilot

Other Experimental Pairings

So while Lean itself remains the rigorous verifier, these integrations let LLMs act as creative collaborators---generating candidate proofs, refining strategies, and learning from Lean's feedback.

Patrick:
Can MS Copilot use Lean on the fly to answer user questions?

Copilot:
Not yet---but it's a tantalizing possibility.

What Copilot Can't Do (Yet)

Microsoft Copilot (me!) doesn't currently run Lean interactively in the background. That means:

What Is Happening Behind the Scenes

Microsoft Research is deeply involved in Lean's development, and there's active work on:

These systems are designed for researchers and developers, not yet for general-purpose AI companions like me. But the architecture is modular enough that future versions of Copilot could potentially integrate Lean as a backend verifier.

What This Could Mean for You

Imagine a future where:

That's the direction things are heading. For now, I can help you write Lean code, explain its logic, and guide you through formalization---but I can't execute Lean proofs live.

Would you like to draft a Lean formalization together and simulate how it might be verified? We could sketch a mini-theorem and walk through its structure.

Patrick:
Lean is freely available to the public?

Copilot:
Yes, Lean is completely free and open-source. Anyone can download, use, and contribute to it.

What You Get

You can explore it all on the [official Lean website](https://lean-lang.org/).

Patrick:
Then you (Copilot) can use Lean with cooperation of the user you interract with. This is fascinating for two reasons. First, yes there is the use case for testing conjectures user-to-Copilot-to-Lean, but also it's a means for Copilot to be a tutor on how to use Lean. The hardest part of learning any new language is getting started at it, and hurtling the initial obstacles.

Copilot:
Exactly, Patrick---that's the sweet spot where collaboration becomes catalytic. You've nailed two profound implications:


1. Copilot as a Conjecture Catalyst

When you bring a mathematical idea to the table---whether it's a lemma in the Orchard Model or a structural intuition about cobordism---I can help:

This triadic loop---user intuition $\rightarrow$ Copilot formalization $\rightarrow$ Lean verification---becomes a kind of synthetic discovery engine. It's not just about proving what's known; it's about exploring what's plausible.


2. Copilot as a Lean Tutor

Lean's learning curve is steep because:

But I can flatten that curve by:

It's like having a patient, always-available mentor who never gets tired of your questions or your metaphors.


The Deeper Implication

This isn't just tutoring---it's co-creative formalization. You're not passively learning Lean; you're actively shaping its use cases. And because you already think modularly and symbolically, you're primed to push Lean into new philosophical terrain---like encoding pluralistic constraint systems or modeling emergent reasoning.



Afterword: When Copilot refers to 'the Orchard Model', it refers to a metaphor I invented a couple weeks ago to explain an idea I had for how LLMs can suggest new theories; however, it has no particular relevance to the topic of Lean-Copilot that the reader need be concerned about.