
7. API Subset

Overview

In the previous posts, we followed the lifecycle of a user request from Op::UserTurn through the execution of ToolCalls, observing how Codex interacts with the OpenAI /responses API to achieve its agentic behavior.

If you are familiar with the OpenAI /responses API documentation, you may notice that it provides a vast array of configuration options. However, Codex only utilizes a specific subset of these features.

This post takes a static, code-level perspective, analyzing the differences between the OpenAI Responses API specification and the Codex Rust implementation (ResponsesApiRequest). We will explore the calculated reductions Codex makes and the engineering trade-offs behind these architectural decisions.

1. Architectural Positioning

Codex implements a subset of the OpenAI Responses API, primarily driven by two core architectural differences:

1.1 State Management: Client vs. Server

| Aspect | OpenAI API | Codex | Trade-off |
| --- | --- | --- | --- |
| Conversation state | Server-side (conversation, previous_response_id) | Client-side (ContextManager) | Codex requires full control for editing, compressing, and rolling back context; OpenAI's approach offers simpler client logic. |
| History transmission | Optional incremental | Full history required | The bandwidth cost of sending the full history is mitigated by using prompt_cache_key for prefix caching. |
| State persistence | OpenAI servers | Local storage | Keeps the server side stateless while allowing complex local context operations. |
| Caching | Built-in conversation cache | prompt_cache_key for prefix caching | Leverages OpenAI's prefix caching to avoid reprocessing unchanged history. |
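The client-side model in the table above can be sketched in a few lines of Rust. This is a hypothetical, trimmed-down illustration (the real ContextManager and ResponsesApiRequest in codex-rs carry far richer payloads); it only shows the core idea that every request carries the full history plus a stable prompt_cache_key:

```rust
// Hypothetical sketch: client-side history + a stable prefix-cache key.
// Names follow the post; field shapes are simplified for illustration.

#[derive(Clone, Debug, PartialEq)]
struct HistoryItem {
    role: String,
    content: String,
}

struct ContextManager {
    history: Vec<HistoryItem>, // full conversation, owned by the client
    prompt_cache_key: String,  // stable key so the server can reuse the cached prefix
}

struct ResponsesApiRequest {
    input: Vec<HistoryItem>, // full history on every request, never a delta
    prompt_cache_key: String,
}

impl ContextManager {
    fn new(cache_key: &str) -> Self {
        Self { history: Vec::new(), prompt_cache_key: cache_key.to_string() }
    }

    fn push(&mut self, role: &str, content: &str) {
        self.history.push(HistoryItem { role: role.into(), content: content.into() });
    }

    // Every request carries the entire history; the cache key lets the
    // server skip reprocessing the unchanged prefix.
    fn build_request(&self) -> ResponsesApiRequest {
        ResponsesApiRequest {
            input: self.history.clone(),
            prompt_cache_key: self.prompt_cache_key.clone(),
        }
    }
}

fn main() {
    let mut cm = ContextManager::new("conv-123");
    cm.push("user", "fix the bug");
    cm.push("assistant", "done");
    let req = cm.build_request();
    assert_eq!(req.input.len(), 2); // full history, not an increment
    assert_eq!(req.prompt_cache_key, "conv-123");
}
```

Because the client owns the Vec, operations like compression and rollback are ordinary in-memory edits; the server never needs to agree about state.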

1.2 Parameter Simplification

Codex removes several generative tuning parameters:

  • Sampling Controls: temperature, top_p, top_logprobs
  • Resource Limits: max_output_tokens, max_tool_calls
  • Stream Options: stream_options, truncation

Rationale: Codex is highly optimized for precise code generation, which relies heavily on the model's carefully tuned defaults. Exposing these parameters could degrade the model's reliability for code tasks. If fine-grained generative control is needed, developers should use the standard OpenAI SDK directly.
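One way to see this restriction is that the omitted knobs are simply absent from the request type, so overriding the model defaults is a compile error rather than a runtime policy. A hypothetical sketch (the field set is trimmed for illustration and the model id is invented):

```rust
// Hypothetical sketch: a request type that omits generative tuning knobs.
// Because temperature, top_p, etc. are not fields, they cannot be set.
struct ResponsesApiRequest {
    model: String,
    instructions: String,
    stream: bool,
    // no temperature, top_p, top_logprobs, max_output_tokens, max_tool_calls ...
}

fn main() {
    let req = ResponsesApiRequest {
        model: "some-model".into(), // hypothetical model id
        instructions: "You are a coding agent.".into(),
        stream: true,
    };
    // req.temperature = 0.2; // would not compile: no such field
    assert!(req.stream);
}
```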

2. API Schema Mapping

Here is a simplified JSON structure showing the OpenAI Responses API request fields alongside their Codex support status:

{
  // ===== Core Parameters =====
  "model": "[weak] String (no Enum constraint)",
  "instructions": "[full] string",
  "input": "[subset+extension] Vec<ResponseItem>, see Section 3",

  // ===== Tool Control =====
  "tools": "[weak] Vec<serde_json::Value> (avoids 12+ type union)",
  "tool_choice": "[simplified] string only (e.g. \"auto\"), no object constraint",
  "parallel_tool_calls": "[full] boolean",

  // ===== Generation Control =====
  "temperature": "[not supported] uses model default",
  "top_p": "[not supported] uses model default",
  "top_logprobs": "[not supported] uses model default",
  "max_output_tokens": "[not supported] uses model default",
  "max_tool_calls": "[not supported] uses model default",
  "reasoning": "[full] { effort?, summary? }",
  "text": "[partial] json_schema format only",
  "stream": "[full] SSE streaming",

  // ===== State Management (architectural difference) =====
  "conversation": "[not supported] Codex uses client-side state",
  "previous_response_id": "[not supported] Codex uses client-side state",
  "prompt": "[not supported] use input field instead",
  "prompt_cache_retention": "[not supported] uses model default",

  // ===== Optimization Options =====
  "prompt_cache_key": "[full] string key for prefix caching",
  "service_tier": "[weak] Option<String> (no Enum constraint)",
  "include": "[weak] Vec<String> (no enum validation)"

  // Metadata and other minor features are also not supported.
}

Coverage Analysis:

  • By field count: ~43% (13 of 30 fields)
  • Weighted by functionality: ~60% (core routing and tool execution features are complete; fine-tuning parameters and metadata are omitted)

3. Input Field Processing Pipeline

The OpenAI API accepts over 25 input types. Codex represents all conversation history as a Vec<ResponseItem> and applies a filtering and transformation pipeline before transmission.

3.1 Supported Standard Types

These types are mapped and sent directly to OpenAI:

  • Message
  • Reasoning
  • FunctionCall and FunctionCallOutput
  • WebSearchCall
  • ImageGenerationCall
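A minimal sketch of what such a ResponseItem union might look like; the real enum in codex-rs carries much richer payloads, and these variants are trimmed for illustration:

```rust
// Hypothetical, trimmed sketch of the ResponseItem union described above.
#[derive(Debug, PartialEq)]
enum ResponseItem {
    Message { role: String, content: String },
    Reasoning { summary: String },
    FunctionCall { call_id: String, name: String, arguments: String },
    FunctionCallOutput { call_id: String, output: String },
    WebSearchCall { query: String },
    ImageGenerationCall { prompt: String },
}

fn main() {
    // History is just a Vec of these items, in conversation order.
    let history = vec![
        ResponseItem::Message { role: "user".into(), content: "list files".into() },
        ResponseItem::FunctionCall {
            call_id: "call_1".into(),
            name: "shell".into(),
            arguments: "{\"cmd\":\"ls\"}".into(),
        },
        ResponseItem::FunctionCallOutput { call_id: "call_1".into(), output: "src main.rs".into() },
    ];
    assert_eq!(history.len(), 3);
}
```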

3.2 Codex-Specific Extensions and Filtering

Codex maintains several internal types in its ContextManager that are processed or removed before hitting the API boundary:

| Type | Pre-transmission Processing |
| --- | --- |
| LocalShellCall / CustomToolCall | Ensures each call_id has a corresponding output item to satisfy API strictness. |
| GhostSnapshot | Filtered out. Represents a compression snapshot (GhostCommit) and is never transmitted to the LLM. |
| Compaction | An encrypted compression summary, used for token estimation but not sent as active context. |

3.3 The Transformation Pipeline

Internal ResponseItem Vec
┌─────────────────────────────────────────┐
│ ContextManager::for_prompt()            │
│                                         │
│ 1. normalize_history()                  │
│    - ensure_call_outputs_present()      │
│    - remove_orphan_outputs()            │
│    - rewrite_image_generation_calls()   │
│    - strip_images_when_unsupported()    │
│                                         │
│ 2. retain(|item| !GhostSnapshot)        │
└─────────────────────────────────────────┘
Ready-for-OpenAI ResponseItem Vec
        │  JSON serialization
        ▼
OpenAI API Request
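The pipeline above can be sketched as a single function. This is a simplified, hypothetical reconstruction: the variant and step names follow the post, while the payload shapes and the placeholder output string are invented for illustration:

```rust
// Hypothetical sketch of the pre-transmission pipeline: synthesize missing
// call outputs so the API's call/output pairing rules hold, then drop
// client-only items at the API boundary.
#[derive(Clone, Debug, PartialEq)]
enum ResponseItem {
    Message(String),
    LocalShellCall { call_id: String },
    FunctionCallOutput { call_id: String, output: String },
    GhostSnapshot, // compression snapshot, never sent to the LLM
}

fn for_prompt(mut items: Vec<ResponseItem>) -> Vec<ResponseItem> {
    // Step 1 (ensure_call_outputs_present): every call needs an output item.
    let mut missing = Vec::new();
    for item in &items {
        if let ResponseItem::LocalShellCall { call_id } = item {
            let has_output = items.iter().any(|i| {
                matches!(i, ResponseItem::FunctionCallOutput { call_id: c, .. } if c == call_id)
            });
            if !has_output {
                missing.push(ResponseItem::FunctionCallOutput {
                    call_id: call_id.clone(),
                    output: "aborted".into(), // placeholder output, invented here
                });
            }
        }
    }
    items.extend(missing);

    // Step 2 (retain): strip client-only snapshot items before transmission.
    items.retain(|i| !matches!(i, ResponseItem::GhostSnapshot));
    items
}

fn main() {
    let history = vec![
        ResponseItem::Message("hi".into()),
        ResponseItem::LocalShellCall { call_id: "c1".into() },
        ResponseItem::GhostSnapshot,
    ];
    let ready = for_prompt(history);
    assert!(!ready.iter().any(|i| matches!(i, ResponseItem::GhostSnapshot)));
    assert!(ready.iter().any(|i| matches!(i, ResponseItem::FunctionCallOutput { .. })));
}
```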

4. Type Safety: The "Relaxed Exterior, Strict Interior" Strategy

Codex employs a fascinating typing strategy that balances safety with maintainability.

4.1 Weak Types at External Boundaries

Fields like model, service_tier, and include are typed as String rather than strict Enums. More notably, tools is typed as Vec<serde_json::Value> instead of managing the 12+ type union defined by OpenAI.

Benefits:

  • Forward Compatibility: when OpenAI adds a new model or tool type, Codex requires zero code changes.
  • Reduced Maintenance: avoids the burden of syncing with OpenAI's large and frequently updated type definitions.
  • Decoupled Upgrades: API specification changes don't break Codex compilation.

Costs:

  • Invalid values are rejected at runtime (by the server) rather than at compile time.
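This trade-off can be made concrete with a small sketch. Here plain JSON strings stand in for serde_json::Value (to keep the example dependency-free); the point is that a tool type OpenAI ships tomorrow passes through a weakly typed field without a Codex code change:

```rust
// Hypothetical sketch of the "weak exterior" idea: tools are carried as raw
// JSON (here plain strings standing in for serde_json::Value), not as a
// 12+-variant strongly typed enum.
struct ResponsesApiRequest {
    tools: Vec<String>, // raw JSON objects, passed through untouched
}

fn main() {
    // A known tool and a brand-new, never-seen tool type both pass through.
    let req = ResponsesApiRequest {
        tools: vec![
            r#"{"type":"function","name":"shell"}"#.to_string(),
            r#"{"type":"some_future_tool"}"#.to_string(), // unknown today, still fine
        ],
    };
    // The cost: a typo like {"type":"functoin"} also compiles and is only
    // rejected at runtime by the server.
    assert_eq!(req.tools.len(), 2);
}
```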

4.2 Strong Types Internally

Conversely, the input field uses the strict ResponseItem enum. This guarantees full type safety in memory, which is essential for the ContextManager to safely perform complex operations like editing, token compression, and rollback without breaking the data structure.

4.3 Engineering Conclusion

This layered strategy prioritizes flexibility and compatibility at system boundaries (the API interaction) while maintaining strict type safety and maintainability internally (state management). By owning its state and simplifying the API surface, Codex achieves a robust, specialized architecture optimized solely for the complexities of an AI coding agent.