Memory

Agents use a sliding-window memory to maintain conversation context. The system message is always preserved. Older messages are dropped when the window is full.

```text
[system]    Always preserved
[user]      message 1   <- may be dropped if window full
[assistant] reply 1     <- may be dropped if window full
...
[user]      latest      <- always kept
[assistant] latest      <- always kept
```
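Under the rules above (system message pinned, oldest messages evicted first), the trimming step could be sketched like this. The `trimWindow` helper and `ChatMessage` shape are illustrative assumptions, not part of the library's API:

```typescript
type ChatMessage = {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
};

// Hypothetical sketch of sliding-window eviction: keep the system message,
// then fill the remaining slots with the newest messages.
function trimWindow(messages: ChatMessage[], maxMessages: number): ChatMessage[] {
  if (messages.length <= maxMessages) return messages;
  const [system, ...rest] = messages;
  // Drop the oldest non-system messages until the window fits.
  return [system, ...rest.slice(rest.length - (maxMessages - 1))];
}
```

With `maxMessages: 3`, a five-message history trims to the system message plus the two most recent messages.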
```typescript
const agent = new Agent({
  name: "Chat Bot",
  instructions: "You are helpful.",
  memory: { maxMessages: 100 },
  transport: schift.transport,
});
```
| Option | Type | Default | Description |
| --- | --- | --- | --- |
| maxMessages | number | 50 | Maximum messages to retain |
| transformContext | (messages: ChatMessage[]) => ChatMessage[] | — | Transform messages before each LLM call |

Use transformContext to modify messages before they’re sent to the LLM. This is useful for summarizing older messages, injecting context, or filtering:

```typescript
const agent = new Agent({
  memory: {
    maxMessages: 100,
    transformContext: (messages) => {
      // Truncate long tool results to save tokens
      return messages.map((m) =>
        m.role === "tool" && m.content.length > 2000
          ? { ...m, content: m.content.slice(0, 2000) + "..." }
          : m,
      );
    },
  },
  // ...
});
```

Each agent.run() call creates a fresh memory. The system message and user input are added automatically.

For multi-turn conversations, manage conversation state externally:

```typescript
// Simple multi-turn pattern: keep the transcript outside the agent
const messages: string[] = [];

async function chat(userMessage: string) {
  messages.push(userMessage);
  const result = await agent.run(messages.join("\n---\n"));
  messages.push(result.output); // keep the reply so later turns can see it
  return result.output;
}
```

During a ReAct loop, tool calls and their results are stored in memory so the LLM can reference previous tool outputs:

```text
[system]    instructions
[user]      "What's the weather?"
[assistant] [Tool call: get_weather({"city": "Seoul"})]
[tool]      {"temperature": 22}
[assistant] "It's 22°C in Seoul."   <- final answer
```
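The bookkeeping behind that trace can be sketched as follows. This is a minimal illustration, not the library's internals: the `callLLM` and `runTool` helpers, and the shape of the LLM reply, are all assumptions:

```typescript
type ChatMessage = { role: string; content: string };

// Hypothetical model reply: either a tool call or final text.
type LLMReply = {
  toolCall?: { name: string; args: string };
  text?: string;
};

// Sketch of one ReAct iteration: the tool call and its result are both
// appended to memory, so later LLM calls can reference earlier tool outputs.
// Returns the final answer, or undefined if the loop should continue.
async function reactStep(
  memory: ChatMessage[],
  callLLM: (msgs: ChatMessage[]) => Promise<LLMReply>,
  runTool: (name: string, args: string) => Promise<string>,
): Promise<string | undefined> {
  const reply = await callLLM(memory);
  if (reply.toolCall) {
    const { name, args } = reply.toolCall;
    memory.push({ role: "assistant", content: `[Tool call: ${name}(${args})]` });
    memory.push({ role: "tool", content: await runTool(name, args) });
    return undefined; // not done yet; the loop calls the LLM again
  }
  memory.push({ role: "assistant", content: reply.text ?? "" });
  return reply.text; // final answer
}
```

After the loop finishes, memory holds the full trace shown above, in order.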