How AI Agents Work (Instructions, Memory, Aggregation) — with the NoETL ai-meta Example
This lesson explains how modern AI “agents” (Codex, Claude, and other LLM-powered assistants) work internally when they plan and execute chains of tasks in real engineering projects. We’ll use the NoETL ecosystem meta-repo (noetl/ai-meta) as a concrete example of how to structure an agent-friendly project.
Example repo: https://github.com/noetl/ai-meta
1) What an AI Agent is (vs. just “chatting with an LLM”)
A plain LLM chat is typically:
- You ask → model answers
- No durable state (beyond the current conversation)
- No tools, no execution environment, no continuity
An AI agent is typically a system built around an LLM that adds:
- Instructions / policy (what it must do, what it must not do)
- Tools (git, filesystem, CI, APIs, shells, browsers, etc.)
- Memory (short-term context + long-term storage)
- Planning / orchestration (break work into steps, execute in order)
- Feedback loops (evaluate results, retry, compact, update state)
2) The core internal loop (plan → act → observe → update)
2.1 How agents make decisions (if/then/else) — and what the LLM is actually doing
People often describe agents as if the LLM itself is executing explicit if/else logic internally. In practice:
- The agent program (outside the LLM) executes real if/then/else branching.
- The LLM generates text (plans, code, summaries) by predicting the next token given the provided context.
A) Where the real if/then/else lives: the agent controller
An agent is usually a controller loop that looks like this:
while not done:
    context = assemble_context(goal, memory, repo_state, tool_outputs)
    plan = LLM(context)
    for step in plan:
        result = run_tool(step)
        if result.failed:
            repair_or_retry()
            break
    if success_criteria_met():
        done = True
    else:
        update_memory_and_continue()
Those if result.failed / if success_criteria_met checks are explicit program logic.
The LLM advises (plans, writes patches), but the controller decides what happens next.
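The controller pseudocode above can be fleshed out into a small runnable sketch. The LLM call, the tools, and the success check are stubs here (assumptions for illustration), since the real versions depend on your model API and toolchain:

```python
from dataclasses import dataclass

@dataclass
class Result:
    failed: bool
    output: str

# Stubs standing in for a real model API and real tools (assumptions).
def llm_plan(context: str) -> list[str]:
    return ["read_file", "run_tests"]

def run_tool(step: str) -> Result:
    return Result(failed=False, output=f"{step}: ok")

def agent_loop(goal: str, max_iters: int = 5) -> list[str]:
    memory: list[str] = []
    for _ in range(max_iters):
        context = f"goal={goal}; memory={memory}"
        plan = llm_plan(context)
        for step in plan:
            result = run_tool(step)
            if result.failed:  # explicit branch lives in the controller
                memory.append(f"FAILED {step}")
                break
            memory.append(result.output)
        # Success criterion is ordinary program logic, not the LLM.
        if all(not m.startswith("FAILED") for m in memory):
            return memory
    return memory

print(agent_loop("fix CI"))
```

The point of the sketch is the division of labor: the LLM only fills in `plan`, while every branch and the termination condition are plain code.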
B) What the LLM is doing: pattern completion, not symbolic branching
Modern LLMs (transformers) work roughly like this:
- Convert input text into tokens
- Pass tokens through many layers that compute attention-based representations
- Output a probability distribution over the next token
- Sample/select the next token
- Repeat until completion
So when an LLM “decides” something, it is choosing tokens that best match patterns it learned from training data and the constraints in the prompt.
This can look like reasoning, but it is best understood as:
- statistical pattern matching + constrained generation
- guided by instructions and evidence present in the context window
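The token loop above can be made concrete with a toy model. A real transformer produces the next-token distribution; here a hand-written bigram table stands in for it (purely an assumption for illustration):

```python
import random

# Toy "model": probability of the next token given the current one.
BIGRAMS = {
    "the": {"agent": 0.6, "loop": 0.4},
    "agent": {"plans": 0.7, "<eos>": 0.3},
    "plans": {"<eos>": 1.0},
    "loop": {"<eos>": 1.0},
}

def next_token_distribution(token: str) -> dict[str, float]:
    return BIGRAMS.get(token, {"<eos>": 1.0})

def generate(start: str, greedy: bool = True) -> list[str]:
    tokens = [start]
    while tokens[-1] != "<eos>":
        dist = next_token_distribution(tokens[-1])
        if greedy:
            tokens.append(max(dist, key=dist.get))  # pick most likely token
        else:
            tokens.append(random.choices(list(dist), weights=list(dist.values()))[0])
    return tokens

print(generate("the"))  # greedy decoding: ['the', 'agent', 'plans', '<eos>']
```

Swapping `greedy=False` samples from the distribution instead, which is why the same prompt can yield different completions run to run.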
C) How agents turn text into decisions: “structured prompts + structured parsing”
To make LLM output actionable, agents commonly use:
- Templates (explicit output formats): JSON, YAML, bullet plans, checklists
- Parsing (turn text into structured objects)
- Validation (schema checks, required fields, allowed actions)
- Policy gates (block forbidden operations)
Example: ask the LLM to output steps like:
- action: read_file / run_tests / edit_file
- target: path
- expected: outcome
Then the controller parses that into executable steps.
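A minimal sketch of that parse-and-validate step, assuming the LLM was asked to emit JSON. The action whitelist and required fields are illustrative, not a fixed standard:

```python
import json

ALLOWED_ACTIONS = {"read_file", "run_tests", "edit_file"}
REQUIRED_FIELDS = {"action", "target", "expected"}

def parse_plan(raw: str) -> list[dict]:
    steps = json.loads(raw)  # raises on malformed LLM output
    for i, step in enumerate(steps):
        missing = REQUIRED_FIELDS - step.keys()
        if missing:  # schema check
            raise ValueError(f"step {i}: missing fields {sorted(missing)}")
        if step["action"] not in ALLOWED_ACTIONS:  # policy gate
            raise ValueError(f"step {i}: forbidden action {step['action']!r}")
    return steps

raw = '[{"action": "read_file", "target": "AGENTS.md", "expected": "rules loaded"}]'
print(parse_plan(raw))
```

Anything that fails validation never reaches a tool, which is how the controller keeps free-form text from turning into unsafe actions.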
D) Decision sources (what influences the branch)
When the controller chooses a branch, it usually uses:
- Tool results (tests passed/failed, build errors, API responses)
- Repo rules (AGENTS.md forbids secrets or rewriting history)
- State (what’s already done in this session)
- Memory (prior decisions, known pitfalls, current priorities)
- Cost/time constraints (limit retries, limit broad searches)
So the agent is “intelligent” because:
- it repeatedly gathers evidence (tools),
- applies constraints (instructions),
- and adapts its plan (LLM + controller loop).
E) Example: a concrete branch in NoETL ai-meta
Scenario: updating a workflow across submodules.
- If CI fails in repos/noetl-gateway → prioritize fixing the gateway workflow first.
- Else if the server workflow differs → align the server next.
- Else → write a sync note and bump submodule SHAs.
The LLM can propose these branches, but the controller makes the final call based on actual CI outputs.
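As controller code, that branch might look like the sketch below. The `ci_status` field names are hypothetical; real CI payloads differ:

```python
# Controller-side branch for the scenario above (field names are assumptions).
def choose_branch(ci_status: dict) -> str:
    if ci_status.get("gateway_failed"):
        return "fix gateway workflow first"
    elif ci_status.get("server_workflow_differs"):
        return "align server workflow next"
    else:
        return "write sync note and bump submodule SHAs"

print(choose_branch({"gateway_failed": True}))
print(choose_branch({"server_workflow_differs": True}))
print(choose_branch({}))
```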
Most practical agents run a loop like:
┌────────────────────────────────────────────────────────────┐
│ USER GOAL │
└───────────────┬─────────────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────────┐
│ 1) INSTRUCTIONS / POLICY │
│ - system rules, repo rules, safety rules │
└───────────────┬─────────────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────────┐
│ 2) CONTEXT ASSEMBLY │
│ - task prompt + relevant files + retrieved memories │
└───────────────┬─────────────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────────┐
│ 3) PLANNER │
│ - break goal into steps │
│ - select tools │
│ - decide what to read/write │
└───────────────┬─────────────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────────┐
│ 4) EXECUTION │
│ - call tools (git, shell, tests, APIs) │
│ - write files / commit changes │
└───────────────┬─────────────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────────┐
│ 5) OBSERVE + EVALUATE │
│ - read tool outputs │
│ - verify success criteria │
└───────────────┬─────────────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────────┐
│ 6) MEMORY UPDATE │
│ - store decisions / results │
│ - compact/summarize as needed │
└────────────────────────────────────────────────────────────┘
This loop is why agents can “keep going” on complex tasks: they continually rebuild context, execute actions, and record outcomes.
3) The mechanisms agents rely on (the “full list” you should know)
Below is a practical checklist of mechanisms used by agentic systems in real projects:
A. Instruction layers (the agent’s “constitution”)
- System instructions (highest priority): safety rules, tool constraints
- Repo-level rules: “how to work in this repo”
- Task-level instructions: your immediate goal / acceptance criteria
- Conventions: style guides, naming, commit format, directory layout
- Guardrails: “don’t touch X”, “never store secrets”, “don’t rewrite history”
In ai-meta, AGENTS.md is the repo-level “constitution” for AI work.
B. Context assembly (short-term memory)
Agents must fit work context into a limited prompt window. Context assembly is the packaging step that decides:
- what to include (files, notes, diffs, logs),
- how much to include (full text vs. excerpts),
- and in what order (rules first, then current state, then evidence).
Think of it as building a “mini-briefing” for the LLM, every time the agent asks it to plan or write.
B.1 The token budget problem (why context assembly exists)
LLMs have a finite context window (a maximum number of tokens they can “see” at once). Real projects exceed that immediately:
- thousands of files,
- long issue histories,
- many PRs,
- months of decisions,
- large logs.
So agents need a deterministic method to compress the world into a small set of relevant inputs.
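A minimal sketch of such a deterministic compression step: pack the highest-priority pieces first and drop whatever no longer fits the budget. Token counting is approximated as whitespace-split words here; a real agent would use the model's tokenizer:

```python
def fit_to_budget(pieces: list[tuple[str, int]], budget: int) -> list[str]:
    """pieces are (text, priority); lower priority number = more important."""
    kept, used = [], 0
    for text, _prio in sorted(pieces, key=lambda p: p[1]):
        cost = len(text.split())  # crude stand-in for a tokenizer
        if used + cost <= budget:
            kept.append(text)
            used += cost
    return kept

pieces = [
    ("rule: never commit secrets", 0),
    ("current: unify tag and bake publish", 1),
    ("log: " + "x " * 500, 2),  # a huge log that will not fit
]
print(fit_to_budget(pieces, budget=20))
```

Guardrails survive because they carry the highest priority; the oversized log is the first thing sacrificed.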
B.2 A practical context assembly recipe (what agents actually do)
1) Always include hard guardrails (highest priority)
   - AGENTS.md (repo rules)
   - any “do not” lists (secrets, forbidden paths, no history rewrites)
2) Include the current snapshot
   - memory/current.md (what matters now)
3) Retrieve task-relevant evidence
   - the few files that define the behavior you’re changing
   - recent sync notes for cross-repo impact
   - the latest compaction for relevant decisions
4) Include execution feedback
   - failing test output (trimmed to the key error)
   - lints/CI logs (only the relevant sections)
   - command outputs (only what’s needed to reason)
5) Summarize and pointerize
   - turn large bodies into short summaries
   - keep stable references to originals via paths and links
B.3 The four tactics — expanded in detail
1) Load the minimum necessary files
- Use path heuristics: “CI change → .github/workflows/**”, “DSL change → docs/reference/**”
- Use dependency hints: read the config file plus the code that consumes it
- Prefer entry points over internals first (e.g., “workflow file” before “helper modules”)
2) Add short summaries instead of full histories
- Summarize a PR thread into: intent, decision, outcome, follow-ups
- Summarize a long doc into: contract, constraints, examples
- Summaries are “lossy compression” that preserves what the agent needs to act.
3) Prefer pointers (links/paths) over copying content
- Include repos/noetl-gateway/.github/workflows/release.yml as a path pointer
- Quote only the 10–30 lines that matter (the snippet that is being changed)
- Pointers keep context small while remaining auditable and navigable.
4) Rehydrate context via search and retrieval
- If the plan needs more detail, the agent fetches more context:
  - rg "tag-and-bake" repos/noetl-gateway -n
  - search memory for “publish workflow”
  - read the latest sync note for “gateway release”
- This is a loop: retrieve → pack → ask LLM → act → observe → retrieve more.
B.4 A context assembly “packing order” that works well
1) Rules (AGENTS.md)
2) Current snapshot (memory/current.md)
3) Retrieved evidence (files/snippets/sync/compactions)
4) Tool outputs (errors, diffs, tests)
5) Task prompt (goal + acceptance criteria)
This order prevents the LLM from ignoring rules or missing “what matters now”.
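The packing order above can be sketched as a function. The section labels mirror B.4; the contents are placeholders:

```python
def pack_context(rules, snapshot, evidence, tool_outputs, task) -> str:
    # Rules first, task last — mirrors the B.4 packing order.
    sections = [
        ("RULES", rules),
        ("CURRENT SNAPSHOT", snapshot),
        ("EVIDENCE", "\n".join(evidence)),
        ("TOOL OUTPUTS", "\n".join(tool_outputs)),
        ("TASK", task),
    ]
    return "\n\n".join(f"## {name}\n{body}" for name, body in sections)

prompt = pack_context(
    rules="No secrets. No history rewrites.",
    snapshot="Active: unify publish workflow.",
    evidence=["snippet: release.yml lines 10-30"],
    tool_outputs=["CI: tag step failed"],
    task="Unify tag+bake publish across gateway and server.",
)
print(prompt.splitlines()[0])  # rules always lead the prompt
```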
B.5 Example: context assembly for a cross-repo workflow change
Goal: “Unify tag+bake publish across gateway and server.”
The agent would pack something like:
- AGENTS.md (constraints)
- memory/current.md (active priorities + known issues)
- sync/2026/03/20260303-... (latest cross-repo note, if any)
- snippet from repos/noetl-gateway/.github/workflows/...
- snippet from repos/noetl-server/.github/workflows/...
- last CI failure excerpt (if applicable)
Then it calls the LLM to plan edits, executes them, and repeats as needed.
C. Long memory (durable state)
Long memory is how an agent “remembers” beyond a single session:
- Append-only journal entries (what happened, decisions, outcomes)
- Indexes and timelines (so you can find things later)
- Compactions (summaries of many raw entries)
- Current working state (“what matters now”)
ai-meta implements this as a Git-tracked memory store in memory/.
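A minimal sketch of an append-only journal writer in that spirit. The filename scheme mirrors the memory/inbox/YYYY/MM/ convention used by ai-meta, but this function and its fields are illustrative, not the actual repo scripts:

```python
from datetime import datetime, timezone
from pathlib import Path

def memory_add(root: Path, title: str, summary: str, tags: list[str]) -> Path:
    now = datetime.now(timezone.utc)
    # Timestamped directory layout: inbox/YYYY/MM/YYYYMMDD-<slug>.md
    entry_dir = root / "inbox" / f"{now:%Y}" / f"{now:%m}"
    entry_dir.mkdir(parents=True, exist_ok=True)
    slug = title.lower().replace(" ", "-")
    path = entry_dir / f"{now:%Y%m%d}-{slug}.md"
    path.write_text(
        f"# {title}\n\nSummary: {summary}\nTags: {', '.join(tags)}\n"
    )
    return path

p = memory_add(Path("memory"), "release workflow unified",
               "Gateway and server now share one publish flow.",
               ["release", "ci"])
print(p.name)
```

Because entries are only ever appended, the journal stays an honest history; compaction (next section) is what keeps it navigable.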
D. Aggregation / compaction (keeping memory useful)
If you keep everything forever in raw form, memory becomes noisy and unusable.
So agents use aggregation:
- Compaction: summarize a batch of raw notes into a shorter artifact
- Promotion: move important facts into “current state”
- Archival: move processed raw entries out of the inbox
This is the exact flow in ai-meta:
memory/inbox/ → memory/compactions/ → updates memory/current.md and memory/timeline.md → move to memory/archive/
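A minimal sketch of that compaction flow, with summarization stubbed as a bullet list of filenames (a real agent would ask the LLM to write the summary). The demo-memory directory and latest.md name are illustrative, not the actual ai-meta script behavior:

```python
from pathlib import Path
import shutil

def compact(memory: Path) -> Path:
    # Compaction: summarize every inbox entry (stub: one bullet per filename).
    inbox = sorted((memory / "inbox").rglob("*.md"))
    summary = "\n".join(f"- {p.stem}" for p in inbox)
    compactions = memory / "compactions"
    compactions.mkdir(parents=True, exist_ok=True)
    out = compactions / "latest.md"
    out.write_text(f"# Compaction\n{summary}\n")
    # Promotion: point current state at the fresh compaction.
    (memory / "current.md").write_text("See compactions/latest.md\n")
    # Archival: move processed raw entries out of the inbox.
    archive = memory / "archive"
    archive.mkdir(exist_ok=True)
    for p in inbox:
        shutil.move(str(p), archive / p.name)
    return out

mem = Path("demo-memory")
(mem / "inbox" / "2026" / "03").mkdir(parents=True, exist_ok=True)
(mem / "inbox" / "2026" / "03" / "20260303-note.md").write_text("# note\n")
print(compact(mem).read_text())
```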
E. Retrieval (finding the right facts at the right time)
Common retrieval strategies:
- Path-based retrieval: “read AGENTS.md first”
- Keyword search: search the repo for terms
- Tag-based retrieval: memory entries tagged by topic
- Recency-based retrieval: “latest compaction first”
- Semantic retrieval (optional): embeddings/vector search across notes
Even without embeddings, good structure and naming enable effective retrieval.
E.1 Retrieval-Augmented Generation (RAG) — how it fits here
RAG is a pattern where the agent retrieves relevant external knowledge (documents, code, tickets, runbooks, memory entries) and injects it into the model’s context before asking the LLM to plan or generate outputs.
RAG matters because LLMs do not reliably “remember” your repo or history unless you provide it in the prompt window. RAG is how agents turn a large, long-lived project into a set of small, relevant snippets the LLM can use right now.
In an ai-meta-style workflow, RAG typically pulls from:
- Repo rules: AGENTS.md, agents/*.md
- Current state: memory/current.md
- Latest compactions: memory/compactions/<latest>.md
- Recent sync notes: sync/YYYY/MM/*.md
- Target code: files inside the relevant submodule(s) under repos/
RAG pipeline (minimal, practical form)
Query (goal) → Retrieve candidates → Rank/filter → Pack into context → LLM call
Where retrieval and ranking can be:
- Lexical: grep/ripgrep, filename/path heuristics, commit message search
- Structural: “always include current.md + AGENTS.md”
- Recency-based: “prefer newest compaction + newest sync notes”
- Semantic (optional): embeddings / vector search over memory and docs
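Even the simplest lexical variant of this pipeline works: score documents by query-term overlap and keep the top k. The documents here are in-memory strings for illustration; in practice they would be files under memory/ and repos/:

```python
def retrieve(query: str, docs: dict[str, str], k: int = 2) -> list[str]:
    # Lexical ranking: score = number of query terms present in the doc.
    terms = set(query.lower().split())
    scored = []
    for name, text in docs.items():
        score = len(terms & set(text.lower().split()))
        if score > 0:
            scored.append((score, name))
    scored.sort(reverse=True)
    return [name for _score, name in scored[:k]]

docs = {
    "memory/current.md": "unify publish workflow across gateway and server",
    "sync/20260303.md": "gateway release tag bake publish",
    "docs/intro.md": "what is noetl",
}
print(retrieve("gateway publish workflow", docs))
```

Embedding-based retrieval replaces the overlap score with vector similarity, but the retrieve → rank → pack shape stays the same.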
Why ai-meta is a good RAG substrate
Good RAG depends on high-signal artifacts. ai-meta creates those intentionally:
- Inbox entries are short, timestamped, and tagged → easy to retrieve.
- Compactions summarize many entries → fewer tokens, higher density.
- Current is an always-fresh snapshot → the “working memory”.
- Sync notes are scoped to cross-repo changes → targeted context.
- Commit message conventions act as searchable “labels” for state transitions.
RAG in practice: one concrete example
Goal: “Update gateway publish workflow; keep compatibility with server and docs.”
RAG inputs the agent would pack into context:
- AGENTS.md (guardrails)
- memory/current.md (what’s active right now)
- latest memory/compactions/... (recent decisions)
- most recent sync/... note referencing publish/tag/bake
- the relevant CI files inside repos/noetl-gateway/.github/workflows/...
Then the LLM plans, and tools execute the plan.
F. Planning, tool use, and feedback loops
Key patterns:
- Decomposition: break a goal into small verifiable steps
- Tool selection: choose git/tests/scripts/API calls
- Checkpoints: run tests, lint, validate outputs
- Retry / repair: if something fails, adjust and rerun
- Determinism: prefer reproducible commands and pinned SHAs
ai-meta explicitly aims for determinism by pinning submodule SHAs.
4) Why folder structure and naming conventions matter (a lot)
Agents are limited by:
- context window size
- time/attention (they must choose what to read)
- retrieval quality
So the project must be agent-readable.
Folder structure helps agents by:
- Creating semantic “zones” (“docs go here”, “memory goes here”, “scripts go here”)
- Making retrieval cheap (“look in memory/current.md for current state”)
- Enabling safe behavior (“product code doesn’t live here”)
Naming conventions help agents by:
- Encoding meaning in filenames: YYYYMMDD-topic.md
- Ensuring stable references: scripts and docs don’t move constantly
- Improving search and sorting (lexicographic order matches chronology)
“Small, consistent files” beat “one huge wiki”
Agents do better with:
- many small files with tight scope
- predictable, repeated templates
- an index (timeline.md) + a current snapshot (current.md)
5) NoETL ai-meta as an “agent-native” coordination repo
ai-meta is designed as a control plane for engineering work across many repos.
The role of each top-level folder (how it works)
ai-meta/
AGENTS.md # global AI rules (public/safe, no product code)
agents/ # per-agent guidance (Codex, Claude)
repos/ # Git submodules for all NoETL repos (pinned SHAs)
sync/ # cross-repo sync notes (what changed, PR links, SHAs)
playbooks/ # repeatable checklists (how to do X)
scripts/ # helpers to manage memory + automation
memory/ # long-memory store (inbox → compactions → current)
Why git submodules are valuable for agents
Submodules give you:
- a single workspace spanning many repos
- a deterministic “ecosystem snapshot” (exact SHAs)
- a safe boundary: code changes happen inside the correct submodule repo
An agent can coordinate a multi-repo change like:
- implement in each repo (submodule)
- merge PRs upstream
- bump submodule pointers in ai-meta with one coordination commit
- record the sync note + memory entry
6) Commit messages as agent-readable instructions
In agent workflows, commit messages become part of the “instruction layer” because they:
- summarize intent (“what did we do and why?”)
- label state transitions (“memory(add)”, “memory(compact)”, “chore(sync)”)
- provide an auditable chain agents can rehydrate later
Example patterns used in ai-meta
- memory(add): <title>
- memory(compact): <scope>
- chore(sync): bump <repo1>, <repo2> to merged SHAs
- docs(agents): update cross-repo workflow guidance
These are not just for humans — they also create machine-retrievable signals that an agent can search and follow.
7) The ai-meta long-memory pipeline (inbox → compaction → current)
Add memory (raw entry)
You capture a fact/decision with a consistent template:
./scripts/memory_add.sh "<title>" "<summary>" "<tags>"
git add memory
git commit -m "memory(add): <title>"
This creates a timestamped entry under memory/inbox/YYYY/MM/.
Compact memory (aggregate and clean up)
./scripts/memory_compact.sh
git add memory
git commit -m "memory(compact): <scope>"
Compaction does three important things:
- Generates a summary file in memory/compactions/
- Updates memory/current.md and memory/timeline.md
- Moves processed entries to memory/archive/
This keeps the “active brain” small while preserving a complete history.
8) Codex + Claude together (a practical division of labor)
You can use multiple agents/models productively if each has a clear role:
- Codex-style agent: strong at code edits, refactors, test-driven changes, repo navigation
- Claude-style agent: strong at summarization, documentation, reasoning about architecture, writing playbooks
In ai-meta, this division is natural:
- submodule code changes → Codex (in the relevant repo)
- orchestration docs, memory compaction, sync notes → Claude or Codex (both work)
What matters is that both follow the same instruction layer (AGENTS.md) and the same memory pipeline.
9) Text diagrams: an AI agent architecture for NoETL ecosystem work
A) “Control plane” view (where ai-meta fits)
┌──────────────────────────────────────────┐
│ ai-meta │
│ - rules (AGENTS.md) │
User Goal ───▶ │ - memory (current/timeline/compactions) │
│ - playbooks + sync notes │
│ - submodule pointers (ecosystem snapshot) │
└───────────────┬───────────────────────────┘
│
▼
┌────────────────────┐
│ repos/* (submodules)│
│ noetl, server, ui, │
│ gateway, worker... │
└─────────┬───────────┘
│
▼
PRs / merges / tags
B) “Agent loop” view (how it executes cross-repo tasks)
[Read AGENTS.md] → [Load current memory] → [Plan steps]
↓ ↓ ↓
[Select repo(s)] → [Edit in submodule] → [Run tests]
↓ ↓ ↓
[Open PRs] → [After merge, bump SHAs] → [Write sync note]
↓
[Add memory entry] → [Compact periodically] → [Update current state]
10) A worked example: orchestrating a NoETL ecosystem change
Goal: “Update release tagging+bake workflow across gateway + server + docs”
Agent plan:
- Identify repos involved (repos/noetl-gateway, repos/noetl-server, repos/docs)
- Make code changes inside each submodule, open PRs
- After merges:
  - update submodule pointers in ai-meta
  - commit: chore(sync): bump gateway, server, docs to merged SHAs
- Add a sync note under sync/YYYY/MM/
- Add memory entry:
  - title: “release workflow unified”
  - tags: release, ci, distribution
- Run compaction weekly to keep current.md small
This is how you get repeatable, auditable, multi-repo execution driven by AI agents.
11) Best practices for agent-friendly long-running projects
- One repo = one purpose
  ai-meta is coordination + memory, not product code.
- Make rules explicit
  Put repo-level constraints in AGENTS.md and keep it short and strict.
- Prefer append-only memory
  Don’t rewrite history; add new entries and compact.
- Use stable filenames and templates
  Timestamped entries + standard sections (“Summary”, “Actions”, “Repos”).
- Treat commits as structured signals
  A consistent commit grammar makes the project “queryable” by agents.
- Keep a current snapshot
  Always maintain memory/current.md as the “active brain” for new sessions.
12) Quick reference: what to tell contributors
- Start in AGENTS.md
- Work in submodules under repos/
- Record cross-repo outcomes in sync/
- Record decisions in memory/inbox/
- Compact regularly into memory/compactions/ + update memory/current.md
Optional homework (for your team)
- Add a sync/TEMPLATE.md and enforce it in PR reviews
- Add a weekly “memory compaction” ritual
- Add tags that match your roadmap areas (dsl, gateway, ui, release, runtime, docs)
- Introduce semantic search later (optional) once file naming and templates are stable