How agents survive restarts through state snapshots, thread journals, and rehydration.
In-memory agents vanish when a process crashes or a node restarts. Any accumulated state, conversation history, or workflow progress disappears with the process. You need a way to snapshot agent state and reconstruct it later without losing data.
Jido separates persistence into two concerns. Jido.Storage is a behaviour that defines how data reaches durable storage. Jido.Persist is the module that enforces correctness invariants on top of any storage adapter. This separation means you can swap storage backends without changing your persistence logic.
Jido.Storage defines six callbacks organized into two groups.
Checkpoints use key-value overwrite semantics. Each call replaces the previous value for that key.
@callback get_checkpoint(key :: String.t(), opts :: keyword()) ::
            {:ok, map()} | :not_found | {:error, term()}
@callback put_checkpoint(key :: String.t(), data :: map(), opts :: keyword()) ::
            :ok | {:error, term()}
@callback delete_checkpoint(key :: String.t(), opts :: keyword()) ::
            :ok | {:error, term()}
Journals are append-only thread entries. New entries add to the existing log rather than replacing it.
@callback load_thread(thread_id :: String.t(), opts :: keyword()) ::
            {:ok, thread :: map()} | :not_found | {:error, term()}
@callback append_thread(thread_id :: String.t(), entries :: list(), opts :: keyword()) ::
            {:ok, updated_thread :: map()} | {:error, term()}
@callback delete_thread(thread_id :: String.t(), opts :: keyword()) ::
            :ok | {:error, term()}
The append_thread/3 callback accepts an :expected_rev option for optimistic concurrency control. If the stored thread’s revision does not match, the call returns {:error, :conflict}.
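To make the two groups concrete, here is a minimal sketch that contrasts overwrite and append semantics using plain maps in place of a real backend. The module name, the thread shape (`%{id, rev, entries}`), and the rule that the revision advances by the number of appended entries are assumptions for illustration, not Jido's actual implementation. Because the sketch is purely functional, `append_thread` also returns the updated store, which the real callback does not.

```elixir
defmodule StorageSemanticsSketch do
  @moduledoc """
  Illustrative only: plain maps stand in for a storage backend.
  None of these functions are part of Jido.
  """

  # Checkpoints: writing a key replaces whatever was stored under it.
  def put_checkpoint(store, key, data), do: Map.put(store, key, data)

  # Journals: entries are appended and the revision advances.
  # The thread shape and the revision rule are assumptions for this sketch.
  def append_thread(threads, thread_id, entries, opts \\ []) do
    thread = Map.get(threads, thread_id, %{id: thread_id, rev: 0, entries: []})
    expected = Keyword.get(opts, :expected_rev)

    if expected != nil and expected != thread.rev do
      # The caller's view of the thread is stale.
      {:error, :conflict}
    else
      updated = %{
        thread
        | rev: thread.rev + length(entries),
          entries: thread.entries ++ entries
      }

      {:ok, updated, Map.put(threads, thread_id, updated)}
    end
  end
end
```

Writing the same checkpoint key twice leaves only the second value, while appending to the same thread twice keeps both batches of entries.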
| Adapter | Durability | Use case |
|---|---|---|
| Jido.Storage.ETS | Ephemeral | Development and testing |
| Jido.Storage.File | Durable (directory-based) | Single-node persistence |
Configure an adapter as a tuple of module and options:
storage = {Jido.Storage.ETS, []}
storage = {Jido.Storage.File, path: "priv/jido"}
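One plausible way such a tuple gets consumed is to unpack it and forward the call to the adapter module with its options. The sketch below assumes that shape; `StorageDispatchSketch` and `FakeAdapter` are hypothetical names invented for the example and are not part of Jido.

```elixir
defmodule StorageDispatchSketch do
  # Hypothetical helper, not a Jido function: unpack {module, opts}
  # and forward the call to the adapter module.
  def get_checkpoint({adapter, opts}, key) when is_atom(adapter) and is_list(opts) do
    adapter.get_checkpoint(key, opts)
  end
end

defmodule FakeAdapter do
  # Stand-in adapter for the sketch; knows exactly one key.
  def get_checkpoint("known", _opts), do: {:ok, %{state: %{}}}
  def get_checkpoint(_key, _opts), do: :not_found
end

storage = {FakeAdapter, []}
StorageDispatchSketch.get_checkpoint(storage, "known")
```

Because callers only ever hold the tuple, switching from one adapter to another is a one-line configuration change.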
Jido.Persist orchestrates the full lifecycle of saving and restoring agents. It calls through to your storage adapter while enforcing invariants that keep checkpoints and threads consistent.
When you call Jido.Persist.hibernate/2, the following steps execute in order:
1. Extract the thread from agent.state[:__thread__].
2. Flush pending thread entries with append_thread/3, diffing against the stored revision.
3. Call agent_module.checkpoint/2 if the agent implements it, otherwise use default serialization.
4. Remove :__thread__ from the checkpoint state and store a thread pointer (%{id: id, rev: rev}) instead.
5. Write the checkpoint with put_checkpoint/3.
{:ok, checkpoint} = Jido.Persist.hibernate(agent, storage)
The thread pointer separation is the critical invariant. Checkpoints stay small regardless of how long the thread grows. A thread with 10,000 entries produces the same checkpoint size as one with 10 entries.
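The pointer substitution can be sketched as a small pure function. This is an illustration under assumptions, not Jido's internal code: `CheckpointSketch.build/2` is a hypothetical helper, the agent is modeled as a plain map with `:id` and `:state`, and the thread under `:__thread__` is assumed to carry `:id` and `:rev` keys.

```elixir
defmodule CheckpointSketch do
  # Hypothetical helper: build a checkpoint map from an agent-like map.
  # Assumes the thread stored under :__thread__ carries :id and :rev keys.
  def build(%{id: id, state: state}, agent_module) do
    # Split the thread off from the rest of the state.
    {thread, state_without_thread} = Map.pop(state, :__thread__)

    # Keep only a pointer to the thread, never its entries.
    pointer =
      case thread do
        nil -> nil
        %{id: thread_id, rev: rev} -> %{id: thread_id, rev: rev}
      end

    %{
      version: 1,
      agent_module: agent_module,
      id: id,
      state: state_without_thread,
      thread: pointer
    }
  end
end
```

However many entries the thread holds, the resulting map contains only the two-key pointer, which is why checkpoint size stays flat.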
When you call Jido.Persist.thaw/3, the restore sequence runs:
1. Load the checkpoint with get_checkpoint/2.
2. Call agent_module.restore/2 if implemented, otherwise use default deserialization.
{:ok, agent} = Jido.Persist.thaw(MyApp.WorkflowAgent, agent_id, storage)
If the thread revision does not match the pointer, thaw returns an error. This catches cases where external processes modified the thread after the checkpoint was written.
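A minimal sketch of that revision check follows, assuming the pointer and the loaded thread both expose `:id` and `:rev`. The module name and the error terms (`:thread_mismatch`, `:thread_not_found`) are invented for illustration; the errors Jido actually returns may differ.

```elixir
defmodule ThawCheckSketch do
  # Hypothetical check, not Jido's internal code: compare the checkpoint's
  # thread pointer with the result of loading the thread from storage.

  # No pointer means the agent had no thread, so there is nothing to verify.
  def verify_thread(nil, _loaded), do: :ok

  # Same id and same revision: the thread is exactly as the checkpoint saw it.
  def verify_thread(%{id: id, rev: rev}, {:ok, %{id: id, rev: rev}}), do: :ok

  # Anything else that loaded successfully is a mismatch.
  def verify_thread(%{rev: expected}, {:ok, %{rev: actual}}),
    do: {:error, {:thread_mismatch, %{expected: expected, actual: actual}}}

  def verify_thread(_pointer, :not_found), do: {:error, :thread_not_found}
  def verify_thread(_pointer, {:error, reason}), do: {:error, reason}
end
```

Reusing `id` and `rev` in both arguments of the second clause makes the pattern match itself enforce equality, so no explicit comparison is needed.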
A checkpoint contains five fields:
%{
  version: 1,
  agent_module: MyApp.WorkflowAgent,
  id: "agent_abc123",
  state: %{score: 42, status: :active},
  thread: %{id: "thread_xyz789", rev: 42}
}
The state field holds the full agent state without :__thread__. The thread field is a pointer only, never the full thread data. If the agent has no thread, this field is nil.
This design means you can store thousands of checkpoints without duplicating thread content across them.
Thread operations support optimistic concurrency through the :expected_rev option on append_thread/3. This prevents two writers from silently overwriting each other’s entries.
{:ok, thread} = storage_mod.append_thread(
  thread_id,
  new_entries,
  expected_rev: 41
)
If another process advanced the thread past revision 41 before your write lands, the call returns {:error, :conflict}. Persist handles this gracefully during hibernate: if the stored revision is greater than or equal to the local revision, the conflict resolves silently because another writer already flushed the same or newer entries.
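The conflict rule described above can be written as a small decision function. This is a sketch under assumptions rather than the code Persist runs: `FlushSketch.resolve/3` is a hypothetical name, and `load_rev` is an assumed zero-arity function that returns the revision currently in storage.

```elixir
defmodule FlushSketch do
  # Hypothetical decision function: given the result of an append attempt and
  # the local thread revision, decide whether hibernate can proceed.
  # `load_rev` is a zero-arity function returning the stored revision.

  # The append succeeded, so there is nothing to resolve.
  def resolve({:ok, _thread}, _local_rev, _load_rev), do: :ok

  def resolve({:error, :conflict}, local_rev, load_rev) do
    stored_rev = load_rev.()

    # Storage already holds everything we have (or more): treat as flushed.
    if stored_rev >= local_rev, do: :ok, else: {:error, :conflict}
  end

  # Any other storage error is passed through unchanged.
  def resolve({:error, reason}, _local_rev, _load_rev), do: {:error, reason}
end
```

The stored revision is fetched lazily, so the extra read happens only on the conflict path.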
Agents can optionally implement two callbacks to control how their state serializes:
defmodule MyApp.WorkflowAgent do
  use Jido.Agent,
    name: "workflow_agent",
    schema: [
      score: [type: :integer, default: 0],
      db_conn: [type: :any]
    ]

  def checkpoint(agent, _ctx) do
    data = Map.drop(agent.state, [:db_conn])
    {:ok, data}
  end

  def restore(checkpoint_data, _ctx) do
    state = Map.put(checkpoint_data, :db_conn, MyApp.Repo.connection())
    {:ok, state}
  end
end
Use checkpoint/2 to strip non-serializable values like database connections or process references. Use restore/2 to rehydrate those values when the agent comes back.
If you do not implement these callbacks, Persist uses default serialization that captures the full agent state minus the thread.
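One way to picture the fallback is a dispatcher that checks whether the agent module exports checkpoint/2. The sketch below assumes that mechanism; `SerializeSketch`, `WithCallback`, and `WithoutCallback` are hypothetical modules, and Jido may detect the callback differently.

```elixir
defmodule SerializeSketch do
  # Hypothetical dispatcher: use the module's checkpoint/2 when it exists,
  # otherwise fall back to the agent state minus :__thread__.
  def serialize(agent_module, agent, ctx \\ %{}) do
    if Code.ensure_loaded?(agent_module) and
         function_exported?(agent_module, :checkpoint, 2) do
      agent_module.checkpoint(agent, ctx)
    else
      {:ok, Map.delete(agent.state, :__thread__)}
    end
  end
end

defmodule WithCallback do
  # Custom serialization: drop the non-serializable connection.
  def checkpoint(agent, _ctx), do: {:ok, Map.drop(agent.state, [:db_conn])}
end

defmodule WithoutCallback do
  # No checkpoint/2 here, so the default path applies.
end
```

Calling `Code.ensure_loaded?/1` first matters because `function_exported?/3` does not load a module on its own and returns false for one that has not been loaded yet.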
To build your own storage backend, implement the six Jido.Storage callbacks:
defmodule MyApp.RedisStorage do
  @behaviour Jido.Storage

  # Stubs: each returns {:error, :not_implemented} until the Redis call is written.

  @impl true
  def get_checkpoint(_key, _opts), do: {:error, :not_implemented} # fetch from Redis

  @impl true
  def put_checkpoint(_key, _data, _opts), do: {:error, :not_implemented} # write to Redis

  @impl true
  def delete_checkpoint(_key, _opts), do: {:error, :not_implemented} # delete from Redis

  @impl true
  def load_thread(_thread_id, _opts), do: {:error, :not_implemented} # load thread entries

  @impl true
  def append_thread(_thread_id, _entries, opts) do
    _expected = Keyword.get(opts, :expected_rev)
    # check revision, append entries, return {:ok, updated_thread} or {:error, :conflict}
    {:error, :not_implemented}
  end

  @impl true
  def delete_thread(_thread_id, _opts), do: {:error, :not_implemented} # delete thread
end
Each callback receives an opts keyword list for adapter-specific configuration. The adapter is responsible for honoring :expected_rev in append_thread/3 and returning {:error, :conflict} when the check fails.