4. Tasks and Turns

From User Input to Request Construction

Overview

This document describes the internal request processing flow of Codex, focusing on how user input is transformed into Task and Turn abstractions, and how the final LLM prompt is constructed.

Input Processing: From Op::UserTurn to Task Creation

Input Entry Point

User interactions are encapsulated as Op::UserTurn (or the legacy Op::UserInput) and sent to the Daemon. Op::UserTurn contains the configuration required for a specific execution cycle:

// codex-rs/protocol/src/protocol.rs
pub enum Op {
    UserTurn {
        items: Vec<UserInput>,
        cwd: PathBuf,
        approval_policy: AskForApproval,
        sandbox_policy: SandboxPolicy,
        model: String,
        effort: Option<ReasoningEffortConfig>,
        summary: Option<ReasoningSummaryConfig>,
        service_tier: Option<Option<ServiceTier>>,
        final_output_json_schema: Option<Value>,
        collaboration_mode: Option<CollaborationMode>,
        personality: Option<Personality>,
    },
    // ...
}

Input Routing Logic

When a new input arrives, the Session attempts to route it using the following priority:

  1. Steering: The session calls steer_input. If a task is currently active and accepts the steering (i.e., the expected_turn_id matches or is None), the input is queued into the pending_input of the active turn.
  2. Task Spawning: If no task is active or steering fails, the session creates a new ActiveTurn and spawns a RegularTask.
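
The routing priority can be sketched as follows. This is a simplified stand-in for the real codex-rs/core types: `Session`, `ActiveTurn`, the `Routed` result, and the hard-coded turn id are all illustrative assumptions, and the real flow is async.

```rust
// Illustrative stand-ins for Session/ActiveTurn (not the real codex-rs types).
#[derive(Debug, PartialEq)]
enum Routed {
    Steered, // queued into the active turn's pending_input
    Spawned, // a new ActiveTurn + RegularTask was created
}

struct ActiveTurn {
    turn_id: String,
    pending_input: Vec<String>,
}

struct Session {
    active_turn: Option<ActiveTurn>,
}

impl Session {
    /// Try to steer input into the active turn; fall back to spawning a task.
    fn route_input(&mut self, input: String, expected_turn_id: Option<&str>) -> Routed {
        if let Some(turn) = self.active_turn.as_mut() {
            // Steering succeeds when no turn id was pinned, or the pinned
            // id matches the active turn.
            let accepts = expected_turn_id.map_or(true, |id| id == turn.turn_id);
            if accepts {
                turn.pending_input.push(input);
                return Routed::Steered;
            }
        }
        // No active task (or steering rejected): start a fresh turn.
        self.active_turn = Some(ActiveTurn {
            turn_id: "turn-1".into(),
            pending_input: vec![input],
        });
        Routed::Spawned
    }
}

fn main() {
    let mut s = Session { active_turn: None };
    assert_eq!(s.route_input("first".into(), None), Routed::Spawned);
    assert_eq!(s.route_input("more".into(), None), Routed::Steered);
}
```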

ContextManager: Centralized State History

The ContextManager (located in codex-rs/core/src/context_manager/history.rs) serves as the Single Source of Truth for the conversation history.

  • Storage: It maintains a Vec<ResponseItem> containing messages, tool calls, and outputs.
  • Filtering: The for_prompt method prepares the history for the LLM. It acts as a safety boundary, normalizing items and filtering out internal states (like compression snapshots) before they hit the API. (We will explore the details of this transformation pipeline in Chapter 7).
  • Token Tracking: It tracks token usage but does not trigger compaction; compaction is an external task triggered by the session logic.
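
A minimal sketch of this idea: history is the single source of truth, and for_prompt acts as the safety boundary. The `ResponseItem` variants here are simplified assumptions, not the real protocol type.

```rust
// Simplified stand-in for ResponseItem (the real type lives in codex-rs/protocol).
#[derive(Clone, Debug, PartialEq)]
enum ResponseItem {
    Message(String),
    ToolCall(String),
    ToolOutput(String),
    CompressionSnapshot, // internal state; must never reach the API
}

struct ContextManager {
    history: Vec<ResponseItem>,
}

impl ContextManager {
    fn record(&mut self, item: ResponseItem) {
        self.history.push(item);
    }

    /// Prepare history for the LLM: filter out internal-only states.
    fn for_prompt(&self) -> Vec<ResponseItem> {
        self.history
            .iter()
            .filter(|i| !matches!(i, ResponseItem::CompressionSnapshot))
            .cloned()
            .collect()
    }
}

fn main() {
    let mut cm = ContextManager { history: Vec::new() };
    cm.record(ResponseItem::Message("hi".into()));
    cm.record(ResponseItem::CompressionSnapshot);
    cm.record(ResponseItem::ToolCall("read_file".into()));
    // The snapshot is filtered out before the prompt is built.
    assert_eq!(cm.for_prompt().len(), 2);
}
```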

Abstractions: Task vs. Turn

Task: The Execution Controller

A Task is a unit of control flow, typically running as a tokio::spawn handle. It implements the SessionTask trait.

  • Responsibility: Manages the lifecycle (start, cancel, finish), handles CancellationToken propagation, and reports status to the Session.
  • Types: Includes RegularTask (standard turns), ReviewTask, CompactTask, and UndoTask.
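
A hedged sketch of the trait shape: each task type implements a common interface so the Session can name, run, and cancel it uniformly. The real trait is async and uses tokio's CancellationToken; this stand-in is synchronous and uses an AtomicBool to stay self-contained.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Illustrative stand-in for the real SessionTask trait in codex-rs/core.
trait SessionTask {
    fn kind(&self) -> &'static str;
    /// Run until done or cancelled; return the last agent message.
    fn run(&self, cancel: Arc<AtomicBool>) -> Option<String>;
}

struct RegularTask;

impl SessionTask for RegularTask {
    fn kind(&self) -> &'static str {
        "regular"
    }
    fn run(&self, cancel: Arc<AtomicBool>) -> Option<String> {
        if cancel.load(Ordering::SeqCst) {
            return None; // cancelled before sampling started
        }
        // Stands in for the run_turn sampling loop.
        Some("done".to_string())
    }
}

fn main() {
    let cancel = Arc::new(AtomicBool::new(false));
    let task = RegularTask;
    assert_eq!(task.kind(), "regular");
    assert_eq!(task.run(cancel), Some("done".to_string()));
}
```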

Turn: The Data Flow Executor

A Turn (represented by the run_turn function in codex-rs/core/src/codex.rs) is the logic execution unit within a task.

  • Responsibility: Executes the sampling loop, interacts with the ModelClient, handles tool execution, and records results into the ContextManager.
  • Events: Emits EventMsg (e.g., OutputTextDelta, ToolCallStarted) to the UI using the turn_id for scoping.

The run_turn Sampling Loop

Session State Transition

The Session tracks execution via active_turn. Only one ActiveTurn exists per user request cycle.

sequenceDiagram
    participant S as Session
    participant AT as ActiveTurn State
    participant T as Task (JoinHandle)
    participant R as run_turn Loop

    Note over S: Op::UserTurn received
    S->>S: Check active_turn
    alt No active_turn
        S->>AT: Create ActiveTurn
        S->>T: Spawn Task
        T->>R: Invoke run_turn()
        loop Sampling
            R->>R: Sampling & Tool Execution
        end
        R-->>T: Return last_agent_message
        T->>S: on_task_finished()
        S->>AT: Remove Task/Cleanup
        S->>S: active_turn = None
    else Steering
        S->>AT: Push to pending_input
    end

Sampling Logic

run_turn executes a while loop. In each iteration:

  1. Input Consumption: Consumes any pending_input queued via steering.
  2. Prompt Rebuild: Reconstructs the complete Prompt (history + tools + instructions).
  3. Model Interaction: Calls the LLM and streams responses.
  4. Follow-up Determination: Sets needs_follow_up to true if tool calls are pending or more input was steered.

Termination Criteria

The loop terminates when SamplingRequestResult.needs_follow_up is false.

  • false: Model returned only text, or a terminal error occurred.
  • true: Model requested tool calls, or token limits triggered an auto-compaction that requires a re-sampling.
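
The loop and its termination condition can be sketched under simplifying assumptions: the model is a closure returning either tool calls or final text, and prompt items are plain strings. The real loop in codex-rs/core/src/codex.rs is async and streaming; only the needs_follow_up shape is taken from the text.

```rust
// Simplified model output: either a final text or a batch of tool calls.
enum ModelOutput {
    Text(String),
    ToolCalls(usize), // number of tool calls the model requested
}

fn run_turn<M>(mut model: M, mut pending_input: Vec<String>) -> Option<String>
where
    M: FnMut(&[String]) -> ModelOutput,
{
    let mut last_agent_message = None;
    let mut prompt_items: Vec<String> = Vec::new();
    loop {
        // 1. Consume any input queued via steering.
        prompt_items.append(&mut pending_input);
        // 2-3. Rebuild the prompt and call the model.
        let needs_follow_up = match model(&prompt_items) {
            ModelOutput::ToolCalls(n) => {
                // Execute tools, record outputs, then sample again.
                prompt_items.push(format!("{n} tool output(s)"));
                true
            }
            ModelOutput::Text(t) => {
                last_agent_message = Some(t);
                false
            }
        };
        // 4. Terminate when no follow-up is required.
        if !needs_follow_up {
            return last_agent_message;
        }
    }
}

fn main() {
    // A mock model: one round of tool calls, then a final text answer.
    let mut round = 0;
    let out = run_turn(
        move |_items| {
            round += 1;
            if round == 1 {
                ModelOutput::ToolCalls(1)
            } else {
                ModelOutput::Text("final".into())
            }
        },
        vec!["user question".into()],
    );
    assert_eq!(out, Some("final".to_string()));
}
```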

Prompt Construction

The Prompt Structure

The Prompt (defined in codex-rs/core/src/client_common.rs) is the payload sent to the LLM:

pub struct Prompt {
    pub input: Vec<ResponseItem>,
    pub(crate) tools: Vec<ToolSpec>,
    pub(crate) parallel_tool_calls: bool,
    pub base_instructions: BaseInstructions,
    pub personality: Option<Personality>,
    pub output_schema: Option<Value>,
}

Tool Loading Mechanics

Tools are dynamically injected into each sampling request via the ToolRouter (located in codex-rs/core/src/tools/router.rs:L40). Codex avoids overloading the LLM context by loading MCP/App tools only when necessary.

Tool specification format sent to LLM (defined in codex-rs/protocol/src/models.rs):

{
  "type": "function",
  "function": {
    "name": "read_file",
    "description": "Read a file from the local filesystem with optional line range.",
    "parameters": {
      "type": "object",
      "properties": {
        "file_path": {
          "type": "string",
          "description": "Absolute path to the file"
        },
        "offset": {
          "type": "number",
          "description": "1-indexed line number to start; defaults to 1"
        },
        "limit": {
          "type": "number",
          "description": "Maximum lines to return; defaults to 2000"
        }
      },
      "required": ["file_path"]
    }
  }
}

Built-in Tools

Built-in tools such as local_shell and read_file are always loaded based on the session Config. (located in codex-rs/core/src/tools/spec.rs:L1712)

Explicit Mentions

If the user mentions a tool explicitly (e.g., @github), it is added to the active tool list via filter_connectors_for_input.

Implicit Discovery (BM25)

The search_tool_bm25 handler (located in codex-rs/core/src/tools/handlers/search_tool_bm25.rs) is always injected. If the LLM needs a tool that is not currently loaded, it calls search_tool_bm25, and Codex responds with the full schemas of the matching tools. The discovered tools are saved to the session state and become available in subsequent requests.

In other words, MCP/App tools are not passed to the LLM by default. When the LLM decides to use one, it must first invoke search_tool_bm25, then issue a follow-up tool call to invoke the real discovered tool (such as an MCP tool).
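
This two-round flow can be sketched as follows. The catalog, the substring match, and the tool names are invented stand-ins for illustration; the real handler scores candidates with a BM25 index.

```rust
use std::collections::HashSet;

// Illustrative stand-in for the session's tool-selection state.
struct Session {
    active_tools: HashSet<String>,
}

impl Session {
    /// Handle a search_tool_bm25 call: look up the catalog and persist hits.
    fn search_tools(&mut self, query: &str, catalog: &[&str]) -> Vec<String> {
        let hits: Vec<String> = catalog
            .iter()
            .filter(|name| name.contains(query)) // stand-in for BM25 scoring
            .map(|s| s.to_string())
            .collect();
        for h in &hits {
            // Saved to session state: available in the next sampling round.
            self.active_tools.insert(h.clone());
        }
        hits
    }

    /// Tools sent with each sampling request: built-ins plus discoveries.
    fn tools_for_prompt(&self) -> Vec<String> {
        let mut t = vec!["local_shell".to_string(), "search_tool_bm25".to_string()];
        t.extend(self.active_tools.iter().cloned());
        t
    }
}

fn main() {
    let mut s = Session { active_tools: HashSet::new() };
    // Round 1: the model searches; the hit is saved to the session.
    let hits = s.search_tools("jira", &["jira_create_issue", "github_create_pr"]);
    assert_eq!(hits, vec!["jira_create_issue".to_string()]);
    // Round 2: the discovered tool now rides along with the built-ins.
    assert!(s.tools_for_prompt().contains(&"jira_create_issue".to_string()));
}
```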

Full Diagram

sequenceDiagram
    participant Turn as run_turn
    participant Session as Session State
    participant LLM as Model Client

    Note over Turn: 1. Initialization
    Turn->>Turn: Load Built-in Tools (Shell, File ops)

    Note over Turn: 2. Resolve Active Tools
    Turn->>Session: Get user mentions & prior tool selections
    Session-->>Turn: Active tool identifiers (e.g., "@github")

    Note over Turn: 3. Final Assembly
    Turn->>Turn: Combine Built-in, Mentioned, and Selected tools
    Turn->>Turn: Always Add search_tool_bm25
    Turn->>LLM: Send Prompt (with Active Tools)

    opt 4. LLM Discovers New Tools
        LLM->>Turn: ToolCall: search_tool_bm25(query="jira")
        Turn->>Session: Search & Save "jira" to active selections
        Note over Turn,Session: Discovered tools will be included <br/>in the next sampling round automatically.
    end

Skill Loading and Injection

Skills differ from tools: they're markdown instruction packages, not executable functions. Loading happens in three stages:

Stage 1: Load Metadata only

During session init, Codex scans the skill directories and loads only {name, description, path} into SkillMetadata; no full content is read yet.

(located in codex-rs/core/src/skills/manager.rs:L76)

Stage 2: Render to Input

Metadata rendered to markdown and appended to user_instructions:

## Skills
### Available skills
- commit: Git commit helper (file: /Users/.agents/skills/commit/SKILL.md)
- test: Test runner (file: /Users/.agents/skills/test/SKILL.md)

### How to use skills
- Trigger rules: If user names a skill (`$SkillName`) OR task matches description, you must use that skill.
- Progressive disclosure: Open its `SKILL.md`, read only enough to follow the workflow.

(located in codex-rs/core/src/skills/render.rs:L3)
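
Stage 2 can be sketched as a small rendering function, assuming a minimal SkillMetadata shape; the output format follows the markdown shown above, while the function body is an illustrative guess, not the real render.rs code.

```rust
// Minimal stand-in for SkillMetadata ({name, description, path} from Stage 1).
struct SkillMetadata {
    name: String,
    description: String,
    path: String,
}

/// Render skill metadata into the markdown section appended to user_instructions.
fn render_skills_section(skills: &[SkillMetadata]) -> String {
    let mut out = String::from("## Skills\n### Available skills\n");
    for s in skills {
        out.push_str(&format!("- {}: {} (file: {})\n", s.name, s.description, s.path));
    }
    out.push_str("\n### How to use skills\n");
    out.push_str("- Trigger rules: If user names a skill (`$SkillName`) OR task matches description, you must use that skill.\n");
    out.push_str("- Progressive disclosure: Open its `SKILL.md`, read only enough to follow the workflow.\n");
    out
}

fn main() {
    let skills = vec![SkillMetadata {
        name: "commit".into(),
        description: "Git commit helper".into(),
        path: "/Users/.agents/skills/commit/SKILL.md".into(),
    }];
    let section = render_skills_section(&skills);
    assert!(section.starts_with("## Skills"));
    assert!(section.contains("- commit: Git commit helper"));
}
```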

Stage 3: Full Content Loading

Codex loads a skill's full content in two cases: explicit user invocation, or LLM-driven loading.

User invocation ($skill-name) means the user is explicitly asking Codex to use the skill. Codex reads the full SKILL.md and wraps it as:

<skill>
<name>commit</name>
<path>/Users/.agents/skills/commit/SKILL.md</path>
# Commit Skill
## Workflow
1. Run git status
...
</skill>

LLM-driven loading is also possible. As mentioned above, skills are initially loaded as metadata only, and the model can decide to load a skill as needed based on that metadata. This selection happens entirely on the model side (the exact algorithm, if any, is an internal detail of the LLM service). Unlike tools, Codex offers no BM25-style search tool for the LLM to query relevant skills; the LLM discovers and chooses skills based solely on the rendered metadata.

The LLM loads a skill by issuing a read_file tool call; there is no dedicated structure for skills in the model response. Skill content is simply part of the input context, essentially no different from a user input query.

Full Diagram

sequenceDiagram
    participant R as run_turn
    participant S as SkillsManager
    participant FS as File System
    participant CM as ContextManager

    Note over R: Stage 1: Metadata Loading
    R->>S: Scan directories for SKILL.md
    S->>FS: Read YAML frontmatter only
    FS-->>S: {name, description, path}
    S-->>R: Vec<SkillMetadata>

    Note over R: Stage 2: Render to Input
    R->>R: render_skills_section()
    R->>CM: Add to user_instructions

    opt User mentions $skill-name
        R->>S: collect_explicit_skill_mentions()
        S->>FS: Read full SKILL.md content
        FS-->>S: File content
        S-->>R: SkillInstructions
        R->>CM: record_conversation_items()
    end

    opt LLM decides to use skill
        LLM->>R: ToolCall: read_file(path="SKILL.md")
        R->>FS: Read file
        FS-->>R: Full content
        R-->>LLM: SKILL.md content
    end

Final Construction Flow

  1. Instructions: Fetch BaseInstructions from Session.
  2. History: Fetch filtered history from ContextManager.
  3. Tools: Build ToolRouter (Built-in + Mentioned MCP + BM25).
  4. Skills: Inject content of mentioned skills into history.
  5. Assembly: Combine all components into the Prompt and dispatch via ModelClient.
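
The five steps above can be sketched in one assembly function. Instructions, history items, tool specs, and skill content are reduced to plain strings here; the real assembly in codex-rs/core operates on the Prompt struct shown earlier and dispatches through ModelClient.

```rust
// Simplified Prompt: real fields hold ResponseItem/ToolSpec, not strings.
struct Prompt {
    base_instructions: String,
    input: Vec<String>, // filtered history + injected skill content
    tools: Vec<String>,
}

fn build_prompt(
    base_instructions: String,  // 1. from Session
    history: Vec<String>,       // 2. filtered via ContextManager::for_prompt
    mut tools: Vec<String>,     // 3. built-in + mentioned MCP tools
    skill_content: Option<String>, // 4. mentioned skills, if any
) -> Prompt {
    // 3. search_tool_bm25 always rides along with the active tools.
    tools.push("search_tool_bm25".to_string());
    // 4. Skill content is injected into the input, not the tool list.
    let mut input = history;
    if let Some(skill) = skill_content {
        input.push(skill);
    }
    // 5. Assemble the final payload for ModelClient.
    Prompt { base_instructions, input, tools }
}

fn main() {
    let p = build_prompt(
        "base instructions".into(),
        vec!["user: hello".into()],
        vec!["local_shell".into()],
        Some("<skill>...</skill>".into()),
    );
    assert!(p.tools.contains(&"search_tool_bm25".to_string()));
    assert_eq!(p.input.len(), 2); // history + injected skill content
    assert_eq!(p.base_instructions, "base instructions");
}
```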