
[LLM-Gen] Codex Responses API Subset Analysis

This document was generated entirely by a code agent. It aims to clarify how Codex uses the /responses API, since Codex relies on only a subset of its features.

This document analyzes the differences between the OpenAI Responses API specification and the Codex Rust implementation (ResponsesApiRequest in codex-rs/codex-api/src/common.rs).

Executive Summary

Codex implements a subset of the OpenAI Responses API. The implementation prioritizes client-side state management, uses relaxed typing for external interfaces, and omits advanced generative tuning controls. The primary architectural difference is that Codex manages conversation state locally rather than relying on OpenAI's server-side session management.


1. Architectural Differences

1.1 State Management

| Aspect | OpenAI API | Codex Implementation |
| --- | --- | --- |
| Conversation State | Server-side (conversation, previous_response_id) | Client-side (ContextManager) |
| History Transfer | Optional: send only new messages | Required: send full history |
| State Persistence | OpenAI servers | Local storage |
| Caching | Built-in conversation caching | prompt_cache_key for prefix caching |

Design Implications

  • OpenAI: Simplifies client logic, reduces bandwidth
  • Codex: Complete control over conversation content, enables complex operations (editing, compression), supports stateless architecture

Trade-offs

Codex must transmit full conversation history on each request. This is mitigated by prompt_cache_key, which leverages OpenAI's prefix caching to avoid reprocessing identical prefixes.
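The sketch below (illustrative names only, not codex-rs code) shows why prefix caching works here: each turn reuses the same prompt_cache_key and only appends to the rendered history, so every request payload is a prefix of the next and the server can reuse cached computation for the shared prefix.

```rust
// Illustrative stand-in for how a conversation is rendered into a request
// payload; the real serialization in codex-rs is structured JSON.
fn render_history(turns: &[&str]) -> String {
    turns.join("\n")
}

fn main() {
    let turn1 = render_history(&["system: instructions", "user: hello"]);
    let turn2 = render_history(&[
        "system: instructions",
        "user: hello",
        "assistant: hi",
        "user: next question",
    ]);
    // Because history is append-only, the earlier payload is a strict prefix
    // of the later one -- the property prompt_cache_key exploits.
    assert!(turn2.starts_with(&turn1));
}
```

The caching benefit disappears if earlier history is edited or compacted, which is why compaction trades cache reuse for a shorter context.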

1.2 Type Safety Strategy

| Field | OpenAI Type | Codex Type | Approach |
| --- | --- | --- | --- |
| input | 25+ types (polymorphic) | Vec<ResponseItem> | Subset: 8 types supported |
| tools | 12+ specific types (union) | Vec<serde_json::Value> | Dynamic typing |
| model | Enum | String | Relaxed typing |
| include | Array of enum strings | Vec<String> | Generic typing |
| service_tier | Enum | Option<String> | Weak typing |

Rationale

  • input: OpenAI supports 25+ input types (string, file, audio, etc.); ResponseItem implements only 8 (Message, Reasoning, FunctionCall, WebSearchCall, etc.) plus 4 Codex-specific types (LocalShellCall, GhostSnapshot, Compaction, CustomToolCall)
  • tools as dynamic: Avoids maintaining massive union types; delegates validation to server
  • model/service_tier as strings: Forward-compatible with new values without code changes
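The relaxed-typing trade-off can be illustrated with a small sketch (hypothetical types, not the actual codex-rs definitions): a strict enum rejects any model name added after compile time, while a plain String forwards it unchanged and leaves validation to the server.

```rust
// A strict enum would reject models released after this code was compiled.
#[derive(Debug, PartialEq)]
enum StrictModel {
    Gpt5,
}

fn parse_strict(name: &str) -> Option<StrictModel> {
    match name {
        "gpt-5" => Some(StrictModel::Gpt5),
        _ => None, // a newly released model fails here until the enum is updated
    }
}

// The relaxed approach just carries the string; the server validates it.
#[derive(Debug)]
struct RelaxedRequest {
    model: String,
    service_tier: Option<String>,
    include: Vec<String>,
}

fn build_relaxed(model: &str) -> RelaxedRequest {
    RelaxedRequest {
        model: model.to_string(),
        service_tier: None,
        include: Vec::new(),
    }
}

fn main() {
    // A hypothetical future model name: rejected by the strict enum,
    // passed through untouched by the relaxed type.
    assert!(parse_strict("gpt-6-preview").is_none());
    assert_eq!(build_relaxed("gpt-6-preview").model, "gpt-6-preview");
}
```

The cost, as noted below in section 5.2, is that typos in model names are only caught when the server rejects the request.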

2. Field Support Comparison

2.1 Table 1: What Codex Sends to OpenAI API

This table shows the actual content sent to OpenAI after Codex's internal transformations (via for_prompt() and normalize_history()).

| OpenAI Spec Field | OpenAI Type | Sent by Codex | Notes |
| --- | --- | --- | --- |
| instructions | string | String | System instructions |
| model | enum | String | Generic string for compatibility |
| parallel_tool_calls | boolean | bool | Parallel tool execution |
| prompt_cache_key | string | Option<String> | Caching optimization |
| reasoning | object | Option<Reasoning> | Supports effort and summary |
| store | boolean | bool | Response storage control |
| stream | boolean | bool | SSE streaming |
| include | array of string (enum) | Vec<String> | No enum constraints |
| input | string, array, or object (25+ types) | Vec<ResponseItem> (subset) | Only sends OpenAI-compatible types: Message, Reasoning, FunctionCall, FunctionCallOutput, WebSearchCall, ImageGenerationCall. Codex-internal types (GhostSnapshot, Compaction) are filtered out; LocalShellCall is converted to FunctionCallOutput; CustomToolCall may be transformed. |
| service_tier | enum | Option<String> | No enum constraints |
| text | object | Option<TextControls> | Only json_schema format supported |
| tool_choice | string or object | String | Simplified to string only |
| tools | array of object (union) | Vec<serde_json::Value> | Opaque JSON objects |

Missing Fields (not sent to OpenAI):

  • background, context_management, conversation, max_output_tokens, max_tool_calls, metadata, previous_response_id, prompt, prompt_cache_retention, safety_identifier, stream_options, temperature, top_logprobs, top_p, truncation, user

2.2 Table 2: Codex Internal Rust Representation

This table shows how Codex represents these fields internally in ResponsesApiRequest, before transformation.

| OpenAI Spec Field | Codex Internal Type | Internal Management |
| --- | --- | --- |
| input | Vec<ResponseItem> | Managed by ContextManager. Contains 12 ResponseItem variants: 7 OpenAI-compatible (Message, Reasoning, FunctionCall, FunctionCallOutput, WebSearchCall, ImageGenerationCall, etc.) plus 5 Codex-specific types (LocalShellCall, CustomToolCall, CustomToolCallOutput, GhostSnapshot, Compaction) for internal conversation management. Before sending to OpenAI, for_prompt() filters out GhostSnapshot and Compaction and converts LocalShellCall to FunctionCallOutput. |
| tools | Vec<serde_json::Value> | Stored as opaque JSON; validated by the OpenAI server at request time |
| model | String | Generic string for forward compatibility with new models |
| include | Vec<String> | Generic string vector; no enum validation |
| service_tier | Option<String> | Generic string; no enum constraints |
| tool_choice | String | Simplified to a string value (typically "auto") |
| text | Option<TextControls> | Wrapper for verbosity and JSON schema formatting |
| reasoning | Option<Reasoning> | Internal reasoning configuration struct |

Key Transformation Points:

  1. ContextManager::for_prompt():
    • Filters out: GhostSnapshot, Compaction (Codex-internal compression artifacts)
    • Converts: LocalShellCall to FunctionCallOutput with "aborted" status
    • Normalizes: ensures call/output pairs, removes orphan outputs

  2. build_responses_request():
    • Constructs ResponsesApiRequest from the internal Prompt struct
    • Converts internal tool definitions to JSON format for the OpenAI API
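The for_prompt() filtering step can be sketched as follows. This is a minimal, hypothetical model with simplified variants and payloads; the real ContextManager handles the full ResponseItem enum and also performs the call/output-pair normalization, which is omitted here.

```rust
// Simplified stand-in for ResponseItem; real variants carry richer payloads.
#[derive(Debug, Clone, PartialEq)]
enum Item {
    Message(String),
    FunctionCallOutput { call_id: String, output: String },
    LocalShellCall { call_id: String },
    GhostSnapshot,
    Compaction,
}

// Sketch of for_prompt(): drop internal artifacts, rewrite shell calls.
fn for_prompt(history: &[Item]) -> Vec<Item> {
    history
        .iter()
        .filter_map(|item| match item {
            // Codex-internal compression artifacts never reach OpenAI.
            Item::GhostSnapshot | Item::Compaction => None,
            // Local shell calls are rewritten as aborted function outputs.
            Item::LocalShellCall { call_id } => Some(Item::FunctionCallOutput {
                call_id: call_id.clone(),
                output: "aborted".to_string(),
            }),
            other => Some(other.clone()),
        })
        .collect()
}

fn main() {
    let history = vec![
        Item::Message("hello".to_string()),
        Item::GhostSnapshot,
        Item::LocalShellCall { call_id: "c1".to_string() },
    ];
    let sent = for_prompt(&history);
    // The snapshot is dropped and the shell call is converted, so only
    // OpenAI-compatible items remain.
    assert_eq!(sent.len(), 2);
    assert!(matches!(&sent[1],
        Item::FunctionCallOutput { output, .. } if output.as_str() == "aborted"));
}
```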

Summary

  • Total fields: 30
  • Implemented: 13 (43%) - fields sent to OpenAI
  • Partial: 6 (20%) - fields with simplified/weak typing
  • Missing: 11 (37%) - fields not implemented
  • Functionality-weighted coverage: approximately 60%

3. Missing Features

3.1 Sampling Controls

Missing fields: temperature, top_p, top_logprobs

Rationale: Codex uses model default sampling parameters optimized for code generation.

Impact: Cannot adjust model behavior for different scenarios (creative vs analytical tasks).

3.2 Server-Side State Management

Missing fields: conversation, previous_response_id

Rationale: Codex implements ContextManager for complete control over conversation history, enabling editing, compression, and custom truncation.

Impact: Must transmit full conversation history on each request. Mitigated by prompt_cache_key for prefix caching.

3.3 Resource Limits

Missing fields: max_output_tokens, max_tool_calls

Rationale: Relies on OpenAI server-side defaults and limits.

Impact: Cannot enforce token or tool call limits at the API level.

3.4 Complex Tool Choice

OpenAI API supports complex tool choice constraints (allowed tools, mode restrictions). Codex only supports simple string values (typically "auto").

Rationale: Codex primarily uses "auto" mode; complex constraints add implementation complexity without clear benefit.

Impact: Cannot restrict model to specific tool subsets for security or control.
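The gap can be sketched with hypothetical types (Codex does not actually perform this conversion; it simply never models the structured form): collapsing a structured tool_choice to a plain mode string discards the allowed-tools constraint.

```rust
// Hypothetical model of the OpenAI spec's tool_choice: either a bare mode
// string or a structured allowed-tools constraint.
#[derive(Debug)]
enum SpecToolChoice {
    Mode(String), // "auto", "none", "required"
    AllowedTools { mode: String, tools: Vec<String> },
}

// Flattening to Codex's representation: only the mode string survives,
// so the per-tool restriction is lost.
fn to_codex_tool_choice(choice: &SpecToolChoice) -> String {
    match choice {
        SpecToolChoice::Mode(m) => m.clone(),
        SpecToolChoice::AllowedTools { mode, .. } => mode.clone(),
    }
}

fn main() {
    let constrained = SpecToolChoice::AllowedTools {
        mode: "required".to_string(),
        tools: vec!["get_weather".to_string()],
    };
    // The allowed-tools list is gone after flattening.
    assert_eq!(to_codex_tool_choice(&constrained), "required");
    assert_eq!(
        to_codex_tool_choice(&SpecToolChoice::Mode("auto".to_string())),
        "auto"
    );
}
```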


4. Input Type Coverage

OpenAI API supports 25+ input types. Codex's ResponseItem enum implements a subset of these types plus Codex-specific extensions.

4.1 Supported OpenAI Input Types

| OpenAI Type | ResponseItem Variant | Status |
| --- | --- | --- |
| EasyInputMessage | Message | Implemented |
| ResponseOutputMessage | Message | Implemented |
| FunctionCall | FunctionCall | Implemented |
| FunctionCallOutput | FunctionCallOutput | Implemented |
| WebSearchCall | WebSearchCall | Implemented |
| ImageGenerationCall | ImageGenerationCall | Implemented |
| Reasoning | Reasoning | Implemented |

4.2 Missing OpenAI Input Types

| OpenAI Type | ResponseItem Variant | Status |
| --- | --- | --- |
| TextInput (string) | - | Missing |
| ResponseInputFile | - | Missing (PDF, etc.) |
| ResponseInputAudio | - | Missing |
| ResponseInputImage | - | Missing (inline in Message content) |
| ComputerCall | - | Missing |
| FileSearchCall | - | Missing |
| CodeInterpreterCall | - | Missing |
| 15+ other types | - | Missing |

4.3 Codex-Specific Extensions

| Type | Purpose | Notes |
| --- | --- | --- |
| LocalShellCall | Shell command execution | Not in OpenAI spec; executes local bash commands |
| CustomToolCall | Custom tool invocation | Not in OpenAI spec; calls Codex-specific tools |
| CustomToolCallOutput | Custom tool output | Not in OpenAI spec; returns from custom tools |
| GhostSnapshot | Compression snapshot | Codex-internal; marks compressed conversation history |
| Compaction | Compression summary | Codex-internal; encrypted summary of compressed content |
| Other | Fallback for unknown types | Forward compatibility |

Rationale

Codex focuses on the core conversation and tool-calling types required for AI coding assistance: file handling is simplified (no direct file inputs), there are no vision or audio inputs, and local command running is handled by the Codex-specific shell execution type.

Impact

Cannot use simplified string inputs, direct file uploads, or audio inputs. Must construct structured ResponseItem objects for all inputs. Gains type safety and support for Codex-specific operations (shell execution, compression).
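The structured-input requirement above can be shown with a minimal sketch (variant shapes are simplified relative to the real ResponseItem enum): with no TextInput variant, a plain string must be wrapped in a Message-shaped item before it can enter the history.

```rust
// Simplified stand-in for the Message variant of ResponseItem.
#[derive(Debug)]
enum ItemSketch {
    Message { role: String, content: String },
}

// A bare &str cannot be sent as-is; it travels as a structured item.
fn user_text(content: &str) -> ItemSketch {
    ItemSketch::Message {
        role: "user".to_string(),
        content: content.to_string(),
    }
}

fn main() {
    let item = user_text("Fix this bug");
    // Single-variant enum, so the pattern is irrefutable.
    let ItemSketch::Message { role, content } = item;
    assert_eq!(role, "user");
    assert_eq!(content, "Fix this bug");
}
```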


5. Simplified Features

5.1 Text Format

Codex supports only the json_schema format; the text and json_object modes are not supported.

Rationale: JSON Schema is the modern approach for structured outputs; simplifies implementation.

Impact: Cannot use plain text or legacy JSON modes.

5.2 Weak Typing

The service_tier, include, and model fields use generic strings instead of enums.

Rationale: Forward compatibility with new values without code changes; delegates validation to server.

Impact: Invalid values caught at runtime instead of compile-time; no autocomplete support.

5.3 Tool Definitions

Tools represented as Vec<serde_json::Value> instead of typed union.

Rationale: Avoids maintaining complex 12+ type union; provides insulation against schema changes.

Impact: Lost compile-time type safety; invalid tool definitions caught at runtime.
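The runtime-only validation trade-off can be sketched as follows. Here a raw JSON string stands in for serde_json::Value (to keep the sketch dependency-free); the point is the same: a malformed tool definition compiles fine and only fails when the server rejects the request.

```rust
// Build an opaque function-tool definition as raw JSON. Nothing here checks
// that the shape matches what the server expects -- a typo in a key name
// would also compile and only surface as a runtime API error.
fn make_function_tool(name: &str, description: &str) -> String {
    format!(
        r#"{{"type":"function","name":"{}","description":"{}","parameters":{{"type":"object","properties":{{}}}}}}"#,
        name, description
    )
}

fn main() {
    let tool = make_function_tool("get_weather", "Look up the weather");
    assert!(tool.contains(r#""type":"function""#));
    assert!(tool.contains(r#""name":"get_weather""#));
}
```

A typed union would catch such mistakes at compile time, at the cost of tracking every schema change in the 12+ tool types.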


6. Implemented Features

6.1 Core Conversation

  • model: Model selection (generic string)
  • instructions: System instructions
  • input: Conversation history (structured ResponseItem, subset of 8 OpenAI types plus 4 Codex-specific types)
  • stream: SSE streaming
  • store: Response storage control

6.2 Tools

  • tools: Tool definitions (opaque JSON)
  • parallel_tool_calls: Parallel execution control
  • tool_choice: String-based selection ("auto")

6.3 Reasoning

  • reasoning.effort: Effort level (none/minimal/low/medium/high/xhigh)
  • reasoning.summary: Summary level (auto/concise/detailed)

6.4 Text Controls

  • text.verbosity: Verbosity control (low/medium/high)
  • text.format: JSON Schema structured output

6.5 Optimization

  • prompt_cache_key: Prefix caching
  • service_tier: Service tier selection
  • include: Additional data requests

7. Usage Guidelines

7.1 Appropriate Use Cases

Codex implementation is suitable for:

  • AI coding assistants
  • Applications requiring complete control over conversation history
  • Scenarios with complex context manipulation (editing, compression)
  • Stateless architectures requiring testing and parallelization
  • Teams prioritizing simplicity and maintainability

7.2 Inappropriate Use Cases

The full OpenAI API is required for:

  • Fine-grained sampling control (temperature, top_p)
  • Resource limits (max_output_tokens, max_tool_calls)
  • Complex tool choice constraints
  • Multiple text format modes
  • Server-side conversation state management
  • Background execution
  • Advanced stream options

7.3 Migration Options

If missing features are required:

  1. Use OpenAI SDK directly
  2. Extend ResponsesApiRequest:
    pub struct ResponsesApiRequest {
        // existing fields...
        pub temperature: Option<f32>,
        pub top_p: Option<f32>,
        pub max_output_tokens: Option<u32>,
        // other missing fields...
    }
    
  3. Hybrid approach: Use Codex for core workflow, call OpenAI SDK for specialized needs

8. Statistical Summary

8.1 By Category

| Category | Implemented | Partial | Missing | Total | Rate |
| --- | --- | --- | --- | --- | --- |
| Core Parameters | 6 | 2 | 0 | 8 | 100% |
| State Management | 1 | 0 | 4 | 5 | 20% |
| Output Control | 2 | 2 | 1 | 5 | 80% |
| Sampling Controls | 0 | 0 | 3 | 3 | 0% |
| Resource Limits | 0 | 0 | 2 | 2 | 0% |
| Metadata | 0 | 0 | 2 | 2 | 0% |
| Context Management | 0 | 0 | 2 | 2 | 0% |
| Reasoning | 2 | 0 | 0 | 2 | 100% |
| Service Config | 0 | 1 | 2 | 3 | 33% |
| Overall | 11 | 5 | 16 | 32 | 50% |

8.2 By Priority

| Priority | Features | Coverage |
| --- | --- | --- |
| Critical | model, input, instructions, tools, reasoning | 80-100% |
| Important | stream, store, prompt_cache_key, text controls | 70-90% |
| Useful | service_tier, include, parallel_tool_calls | 50-70% |
| Advanced | temperature, top_p, resource limits | 0% |
| Specialized | conversation, background, metadata | N/A (different architecture) |

9. Code Examples

9.1 OpenAI Full API

const response = await openai.responses.create({
  model: "gpt-5",
  conversation: "conv_abc123",
  input: "New user message",
  temperature: 0.8,
  max_output_tokens: 4000,
  tool_choice: {
    type: "allowed_tools",
    mode: "required",
    tools: [
      { type: "function", name: "get_weather" }
    ]
  }
});

9.2 Codex Implementation

let request = ResponsesApiRequest {
    model: "gpt-5".to_string(),
    instructions: "System instructions".to_string(),
    input: vec![/* Full conversation history */],
    tools: vec![/* Tool definitions */],
    tool_choice: "auto".to_string(),
    parallel_tool_calls: true,
    reasoning: Some(Reasoning {
        effort: Some(ReasoningEffortConfig::Medium),
        summary: Some(ReasoningSummaryConfig::Auto),
    }),
    store: false,
    stream: true,
    prompt_cache_key: Some(conversation_id),
    // No temperature, top_p, max_output_tokens
    // No conversation, previous_response_id
};

10. Conclusion

Codex implements a subset of the OpenAI Responses API through four strategies:

  1. Stripping Generative Tuning: Removes temperature, top_p, max_output_tokens
    • Benefit: Simplified configuration
    • Trade-off: No sampling control

  2. Dynamic/Relaxed Types: Uses serde_json::Value for tools, String for enum-like fields
    • Benefit: Forward compatibility, reduced maintenance
    • Trade-off: Weaker compile-time guarantees

  3. Strict Internal Types: Vec<ResponseItem> for input
    • Benefit: Type safety in memory manipulation
    • Trade-off: Less input flexibility

  4. Client-Side State Management: ContextManager instead of conversation
    • Benefit: Complete control, complex operations
    • Trade-off: Full history transmission (mitigated by caching)

Assessment

The implementation is optimized for AI coding assistant use cases. It prioritizes simplicity, control, and maintainability over feature completeness. Suitable for applications requiring strong context control but not appropriate for scenarios needing fine-grained sampling control, server-side state management, or complex tool constraints.


Document Version: 3.0 (Simplified)
Generated: 2025-03-10
Based on: spec.md and codex-rs/codex-api/src/common.rs