Alpha Preview: please note that AIML is still baking and under heavy development.

Brought to you by the experts at Fireworks AI

Build production-ready agents & AI workflows with nothing but prompting

Making the promise of "text as code" a reality with the AIML Agent Runtime!

/ Built for developers who want to get stuff done!

AIML allows you to easily build reliable, deterministic agents and workflows without having to learn a new framework or constantly bounce between code and natural language prompts. This also allows AI experts and non-experts to collaborate more effectively.

Check out some examples
NO FRAMEWORK LOCK-IN

It's just Markdown with XML tags — like you're used to!

The AIML language is designed to be simple and easy to understand. It's just Markdown with some special XML tags that provide deterministic, reliable multi-step actions and tools to any LLM. (and yes, JSX syntax is supported too!)

DEVELOPER EXPERIENCE

OpenAI Chat and Responses API compatible

Just send your AIML-based prompts to your self-deployed or cloud-based AIML Runtime server as the "developer" or "system" message of an OpenAI Chat or Responses API request. It's that simple!
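As a sketch, a request to the Chat Completions-compatible endpoint could look like this (the hostname is a placeholder, and the AIML prompt travels in the system message):

```
POST https://your-aiml-runtime.example.com/v1/chat/completions
Content-Type: application/json

{
  "model": "accounts/fireworks/models/deepseek-v3",
  "messages": [
    {
      "role": "system",
      "content": "---\nmodel: accounts/fireworks/models/deepseek-v3\n---\n\nYou are a helpful assistant that can answer questions and help with tasks."
    },
    { "role": "user", "content": "What's new in LLM research?" }
  ]
}
```

Because this is the standard Chat Completions shape, any existing OpenAI SDK can talk to the runtime just by overriding its base URL.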

CODE WHEN YOU WANT IT

Still want to use code? No problem!

The AIML Runtime supports executing JavaScript or Python code inside script tags in your prompt, within a secure sandboxed environment.
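For example, a script step embedded in a prompt might look like the sketch below (the `language` attribute name is an assumption, not confirmed syntax):

```
{/* Compute something deterministically before handing off to the model */}
<script language="python">
import datetime
print(datetime.date.today().isoformat())
</script>

Using the date printed by the step above, greet the user.
```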

/ Workflows

AIML's runtime is State Graph based, allowing you to define complex workflows with conditional branching, merging, and orchestration... and visualize them with real-time results in the inspector!

AIML Prompt
---
model: accounts/fireworks/models/deepseek-v3
---

{/* 
I can include a comment here that won't be sent to the LLM... USEFUL!
This will be used as a system prompt for the default model defined in the header 
*/}

You are a helpful assistant that can answer questions and help with tasks.
The user has made a request of you; just think out loud to yourself about how to respond.
Wrap your thoughts in 

{/* 
Then we get the actual response as a second llm call to a smaller, cheaper, faster model
*/}
<llm model="accounts/fireworks/models/qwen3-30b-a3b">
    <instructions>
        {({lastElement}) => `Your thoughts on the conversation so far... ${lastElement.output}`}
        Based on your thoughts, respond to the user's request.
    </instructions>
    <prompt>
        {({userInput}) => userInput.message}
    </prompt>
</llm>
AIML Runtime Inspector
Your server / client → Incoming Request → Think → Answer → Stream Response

Simple control flow

Simple semantics for branching, chaining, merging, looping, and conditional execution, with no element enforced. Just sending text? It will be interpreted as a single LLM call and run. Use only the elements you need, when you need them.

Real-time streaming

Stream step completion events to users as they happen, or just wait and use results in the next step. You have complete control of how each step is executed and what data is returned.
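Because the API is OpenAI Chat-compatible, streamed output arrives as standard server-sent events; a token chunk looks like the following (step completion events are additional, and their exact shape is not shown here):

```
data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: [DONE]
```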

State & Context

<data field="name" type="string"/> tags can be added to your prompt to define schemas and default values, which can then be updated with an <assign field="name" value={({lastElement}) => lastElement.output}/> tag. The value will be persisted by the runtime between requests. Learn more
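Putting the two tags together, a minimal sketch of persisted state (using only the tags shown above) might be:

```
{/* Declare a field with a schema and a default value */}
<data field="name" type="string"/>

Ask the user for their name.

{/* Persist the model's answer so later requests can use it */}
<assign field="name" value={({lastElement}) => lastElement.output}/>
```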

/ Features

A lightweight, optimized, and language-agnostic framework designed to simplify AI agent development.

Lightweight Architecture

Lives just above the inference layer with flows defined directly in the system prompt.

Dramatic Latency Reduction

Achieve single-digit latency compared to 120ms+ per step typical in other frameworks.

Language Agnostic

Works across language ecosystems, removing AI/product team bottlenecks.

Simplicity at its Core

While it's a full agentic framework, it's ultimately just a prompt that works with any SDK.

Why Choose AIML?

  • We've heard consistent feedback that frameworks like LangChain can be over-engineered with disconnected abstractions.
  • Our approach enables automatic prompt optimization and eventually self-fine-tuning models.
  • Our solution's API is just a prompt, so it works across language ecosystems.
  • By hosting directly next to models, we achieve single-digit latency compared to 120ms+ per step.

/ Debugging

Powerful debugging tools online or within your IDE

While AIML is as flexible and easy as a prompt, it's also as deterministic as code. You're in complete control with full visibility into every step of your agent's workflow.

  • Real-time tracing – Visualize each step of your agent's execution with detailed logs and state snapshots.
  • IDE integration – Debug your AIML workflows directly in VS Code or your favorite IDE with our extension.
  • UI graphs for visualizing your flow and its execution – Complex and recursive flows can be hard to reason about as they grow; this is true of any agentic framework. With AIML, you can natively visualize your flow and its execution in a UI-based graph to home in on what's happening.
  • Time travel debugging – Step through your agent's execution history to pinpoint issues and optimize performance.
Explore debugging tools
AIML Debugger
WORKFLOW STEPS
Initialize
Parse Query
Search Web
Summarize
Generate Response
EXECUTION DETAILS
Search Web
Searching for: "Latest advancements in LLMs"
Started: 10:42:15 AM
Variables
{
  "query": "Latest advancements in LLMs",
  "searchResults": [
    {
      "title": "GPT-5: What We Know So Far",
      "url": "https://example.com/gpt5",
      "snippet": "The upcoming GPT-5 model is expected to..."
    },
    {
      "title": "Advances in Multimodal LLMs",
      "url": "https://example.com/multimodal",
      "snippet": "Recent research has shown significant..."
    }
  ]
}
Console
[10:42:15] Starting search operation
[10:42:16] Fetching results from API
[10:42:17] Received 5 results
[10:42:17] Filtering relevant information