Skip to content

extract_theorems

Split a file containing one or more theorems into smaller units, each containing a single theorem along with any required dependencies.

Try this example in the web UI

Input Parameters

content · str · required · Lean source code

The Lean source code to be processed by this tool.

ignore_imports · bool · default: False · Ignore import mismatches

Controls import statement handling:

  • false (default): Validate that imports match the environment. Returns an error if they don't.
  • true: Ignore the imports in content and use the environment's default imports instead. See the troubleshooting page for more details.
environment · str · required · Lean environment or version

The Lean environment to use for evaluation. Each environment includes a specific Lean version and pre-built dependencies (typically Mathlib).

Available environments: lean-4.28.0, lean-4.27.0, lean-4.26.0, etc.

timeout_seconds · float · default: 120 · Max execution time in seconds

Maximum execution time in seconds. Requests exceeding this limit return a timeout error. Note that end-to-end request latency may exceed this timeout due to queue time and other overhead. Additionally, all non-admin requests are subject to an absolute maximum timeout of 900 seconds (15 minutes).

Output Fields

content · string · Processed Lean code

The Lean code that was actually processed. May differ from input if ignore_imports=true caused header injection.

lean_messages · dict · Messages from Lean compiler

Messages from the Lean compiler with errors, warnings, and infos lists. Errors here indicate invalid Lean code (syntax errors, type errors, etc.).

tool_messages · dict · Messages from extraction tool

Messages from the extraction tool with errors, warnings, and infos lists. Errors here indicate tool-specific issues (not Lean compilation errors).

documents · dict · Theorem names mapped to self-contained documents

Dictionary mapping theorem names to self-contained Lean code documents. Each key is a theorem name, and the value is a self-contained breakdown of the theorem, including a content field containing that theorem plus all dependencies it needs (imports, definitions, etc.).

timings · dict · Execution timing breakdown

Timing information in milliseconds for various stages of processing.

Document Fields

Each document in the documents dictionary contains:

kind · str · The kind of declaration

The kind of the declaration. For extract_theorems, this is always "theorem". For extract_decls, possible values are: theorem, def, abbrev, axiom, opaque, structure, class, class inductive, inductive, instance, example, unknown.

declaration · str · The declaration source code

The raw source code of this declaration.

content · str · Standalone content including declaration and dependencies

Complete, self-contained Lean code that includes the declaration and all its local dependencies. Can be compiled independently.

tokens · list[str] · Raw tokens from the declaration

The declaration's source code split into tokens.

signature · str · Declaration signature (everything before the body)

The declaration signature, e.g., theorem foo (x : Nat) : x = x or def bar : Nat.

type · str · Pretty-printed type of the declaration

The type of the declaration as pretty-printed by Lean.

type_hash · int · Hash of the canonical type expression

Hash of the canonical, alpha-invariant type expression. Useful for deduplication.

is_sorry · bool · Whether the declaration contains a sorry

True if the declaration contains a sorry.

index · int · 0-based index in original file

Position of this declaration in the original file. Note: indices may not be contiguous (mutual definitions share indices).

line_pos · int · 1-based line number where declaration starts

Line number where the declaration begins.

end_line_pos · int · 1-based line number where declaration ends

Line number where the declaration ends.

proof_length · int · Approximate number of tactics in proof

Rough measure of proof complexity based on tactic count. Only meaningful for theorems/lemmas with tactic proofs.

tactic_counts · dict[str, int] · Map of tactic names to occurrence counts

Breakdown of which tactics are used and how often. Only meaningful for theorems/lemmas with tactic proofs.

local_type_dependencies · list[str] · Transitive local dependencies of the type

Local declarations that the declaration's type depends on (non-transitive).

local_value_dependencies · list[str] · Transitive local dependencies of the body

Local declarations that the declaration's body/proof depends on (non-transitive).

external_type_dependencies · list[str] · Immediate external dependencies of the type

External constants (builtins, imports) that appear in the type.

external_value_dependencies · list[str] · Immediate external dependencies of the body

External constants (builtins, imports) that appear in the body/proof.

local_syntactic_dependencies · list[str] · Local constants explicitly written in source

Local constants that appear literally in source (not from notation/macro expansion).

external_syntactic_dependencies · list[str] · External constants explicitly written in source

External constants that appear literally in source (not from notation/macro expansion).

declaration_messages · dict · Messages specific to this declaration

Lean messages (errors, warnings, infos) specific to this declaration in the original document.

theorem_messages · dict · (Deprecated) Messages specific to this declaration

Lean messages (errors, warnings, infos) specific to this declaration. For extract_theorems, this contains the same data as declaration_messages. For extract_decls, this is always empty.

Deprecated

This field is deprecated. Use declaration_messages instead for new code.

Python API

result = await axle.extract_theorems(
    content="import Mathlib\ntheorem foo : 1 = 1 := rfl\ntheorem bar : 2 = 2 := rfl",
    environment="lean-4.28.0",
    ignore_imports=False,  # Optional
    timeout_seconds=120,   # Optional
)

print(result.content)  # The processed Lean code
for name, doc in result.documents.items():
    print(f"{name}: {doc.signature}")
    print(f"  Dependencies: {doc.local_value_dependencies}")

CLI

Usage: axle extract-theorems CONTENT [OPTIONS]

# Extract to default directory
axle extract-theorems combined.lean --environment lean-4.28.0
# Extract to custom directory
axle extract-theorems combined.lean -o my_theorems/ --environment lean-4.28.0
# Force overwrite
axle extract-theorems combined.lean -o my_theorems/ -f --environment lean-4.28.0
# Pipeline usage
cat combined.lean | axle extract-theorems - -o output/ --environment lean-4.28.0

HTTP API

curl -s -X POST https://axle.axiommath.ai/api/v1/extract_theorems \
    -d '{"content": "import Mathlib\ntheorem foo : 1 = 1 := rfl", "environment": "lean-4.28.0"}' | jq

Example Response

{
  "content": "import Mathlib\ntheorem foo : 1 = 1 := rfl",
  "lean_messages": {
    "errors": [],
    "warnings": [],
    "infos": []
  },
  "tool_messages": {
    "errors": [],
    "warnings": [],
    "infos": []
  },
  "timings": {
    "total_ms": 92,
    "parse_ms": 87
  },
  "documents": {
    "foo": {
      "kind": "theorem",
      "declaration": "theorem foo : 1 = 1 := rfl",
      "content": "import Mathlib\n\ntheorem foo : 1 = 1 := rfl",
      "tokens": ["theorem", "foo", ":", "1", "=", "1", ":=", "rfl"],
      "signature": "theorem foo : 1 = 1",
      "type": "1 = 1",
      "type_hash": 1326858781,
      "is_sorry": false,
      "index": 0,
      "line_pos": 2,
      "end_line_pos": 2,
      "proof_length": 1,
      "tactic_counts": {},
      "local_value_dependencies": [],
      "local_type_dependencies": [],
      "external_value_dependencies": ["rfl", "Nat", "OfNat.ofNat", "instOfNatNat"],
      "external_type_dependencies": ["Eq", "Nat", "OfNat.ofNat", "instOfNatNat"],
      "local_syntactic_dependencies": [],
      "external_syntactic_dependencies": ["rfl"],
      "theorem_messages": {"errors": [], "warnings": [], "infos": []},
      "declaration_messages": {"errors": [], "warnings": [], "infos": []}
    }
  }
}