# Responses API

> Build conversational video experiences with entity-backed knowledge, streaming, and multi-turn support

The Responses API is Cloudglue's next-generation conversational video interface. Compatible with the OpenAI Responses API format, it provides richer annotations, built-in multi-turn conversations, system instructions, streaming, and background processing — all grounded in your video content.

<Tip>
  The Responses API works with both Media Description Collections (speech,
  visual, text) and Entity Collections (structured extracted data). Combine both
  for the richest video understanding.
</Tip>

## When to Use Responses API vs Chat Completions

| Feature                  | Chat Completions        | Responses API                                |
| ------------------------ | ----------------------- | -------------------------------------------- |
| Multi-turn conversations | Manual message history  | Built-in message array                       |
| Streaming                | Not available           | SSE streaming                                |
| Background processing    | Not available           | `background: true` with polling              |
| System instructions      | System message in array | Dedicated `instructions` param               |
| Entity-backed knowledge  | Not available           | `nimbus-002-preview` with entity collections |
| Annotations/Citations    | `citations` array       | Rich `annotations` with timestamps           |
| Models                   | `nimbus-001`            | `nimbus-001`, `nimbus-002-preview`           |
| API compatibility        | Custom format           | OpenAI Responses-compatible                  |

**Use Chat Completions** if you have an existing integration and only need basic Q\&A over media description collections.

**Use Responses API** for new projects, streaming UIs, entity-backed reasoning, or when you need background processing.

## Model Selection

### nimbus-001

A fast, general-purpose question-answering model. It works with media description collections for speech, visual, and text understanding, and suits straightforward Q\&A and summarization.

### nimbus-002-preview

A light reasoning model capable of multi-step reasoning and of inspecting your video assets from several angles. In addition to media description collections, it supports **entity-backed knowledge**, which combines structured entity data with unstructured video descriptions for richer, more precise answers.

<Warning>
  `nimbus-002-preview` is a preview model. Behavior may change as we iterate.
</Warning>

**When to use which:**

* **nimbus-001** — Fast Q\&A, summarization, general questions over video content
* **nimbus-002-preview** — Multi-step reasoning, cross-video synthesis, queries that need structured + unstructured data together

## Basic Response (Sync)

The simplest usage — send a question and get a complete response.

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    from cloudglue import Cloudglue

    client = Cloudglue()

    response = client.responses.create(
        input="What techniques are discussed in these videos?",
        collections=["COLLECTION_ID"],
        model="nimbus-001",
    )

    print(response.output[0].content[0].text)
    ```
  </Tab>

  <Tab title="Node">
    ```typescript theme={null}
    import { Cloudglue } from '@cloudglue/cloudglue-js';

    const client = new Cloudglue();

    const response = await client.responses.createResponse({
      model: 'nimbus-001',
      input: 'What techniques are discussed in these videos?',
      knowledge_base: { collections: ['COLLECTION_ID'] },
    });

    console.log(response.output?.[0]?.content?.[0]?.text);
    ```
  </Tab>
</Tabs>

<Note>
  The Python SDK uses a top-level `collections` parameter, while the TypeScript SDK nests it under `knowledge_base: { collections: [...] }`. Both achieve the same result — the difference is SDK-specific.
</Note>

## Streaming Responses

For real-time UIs, stream the response as it's generated via Server-Sent Events (SSE). The JavaScript SDK uses web-standard APIs (`fetch` + `ReadableStream`) internally, so streaming works in both Node.js 18+ and modern browsers — including Next.js client components, React apps, and other browser environments.

The stream emits three event types:

* `response.output_text.delta` — Incremental text chunks
* `response.completed` — Final event with the full response object (including annotations)
* `error` — Error event if something goes wrong

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    from cloudglue import Cloudglue

    client = Cloudglue()

    events = client.responses.create(
        input="What are the key topics in these videos?",
        collections=["COLLECTION_ID"],
        model="nimbus-002-preview",
        stream=True,
    )

    for event in events:
        evt_type = event.get("event")
        data = event.get("data")

        if evt_type == "response.output_text.delta" and isinstance(data, dict):
            print(data.get("delta", ""), end="", flush=True)
        elif evt_type == "response.completed":
            print()  # final newline
        elif evt_type == "error":
            print(f"Error: {data}")
    ```
  </Tab>

  <Tab title="Node">
    ```typescript theme={null}
    import { Cloudglue } from '@cloudglue/cloudglue-js';

    const client = new Cloudglue();

    const stream = await client.responses.createStreamingResponse({
      model: 'nimbus-002-preview',
      input: 'What are the key topics in these videos?',
      knowledge_base: { collections: ['COLLECTION_ID'] },
    });

    for await (const event of stream) {
      if (event.type === 'response.output_text.delta') {
        process.stdout.write(event.delta);
      } else if (event.type === 'response.completed') {
        console.log('\n');
        // Access annotations from the completed response
        const annotations = event.response?.output?.[0]?.content?.[0]?.annotations;
        if (annotations?.length) {
          console.log(`[${annotations.length} citation(s)]`);
        }
      } else if (event.type === 'error') {
        console.error(`Error: ${event.error.message}`);
      }
    }
    ```
  </Tab>
</Tabs>
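
If you need the full text after streaming (for logging or caching, say), the deltas can be accumulated as they arrive. The sketch below assumes events shaped like the dictionaries in the Python example above (`event` and `data` keys); `collect_stream` is a hypothetical helper, not part of the SDK, and the `status` field on the simulated completion payload is an assumption for illustration only.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple


def collect_stream(
    events: Iterable[Dict[str, Any]],
) -> Tuple[str, Optional[Dict[str, Any]]]:
    """Join delta chunks into the full text and return the completed payload, if any."""
    chunks: List[str] = []
    completed: Optional[Dict[str, Any]] = None

    for event in events:
        evt_type = event.get("event")
        data = event.get("data")

        if evt_type == "response.output_text.delta" and isinstance(data, dict):
            chunks.append(data.get("delta", ""))
        elif evt_type == "response.completed":
            completed = data if isinstance(data, dict) else None
        elif evt_type == "error":
            # Surface stream errors to the caller instead of printing them
            raise RuntimeError(f"Stream failed: {data}")

    return "".join(chunks), completed


# Simulated events; a real run would pass the iterator returned with stream=True
fake_events = [
    {"event": "response.output_text.delta", "data": {"delta": "Knife "}},
    {"event": "response.output_text.delta", "data": {"delta": "skills"}},
    {"event": "response.completed", "data": {"status": "completed"}},
]
text, final = collect_stream(fake_events)
print(text)
```

Raising on `error` events rather than printing them lets the calling code decide whether to retry or show a message to the user.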

<Warning>
  Streaming and background mode cannot be used together. Setting both
  `stream: true` and `background: true` returns a 400 error.
</Warning>
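
To catch this conflict before a request is sent, a small client-side guard can help. This is a minimal sketch; `check_delivery_mode` is a hypothetical helper that operates on a plain dictionary of request parameters, not something the SDK provides.

```python
from typing import Any, Dict


def check_delivery_mode(params: Dict[str, Any]) -> Dict[str, Any]:
    """Raise early if a request asks for both streaming and background mode."""
    if params.get("stream") and params.get("background"):
        raise ValueError("stream and background are mutually exclusive")
    return params


# A streaming-only request passes through unchanged
ok = check_delivery_mode({"model": "nimbus-001", "stream": True})

# Requesting both modes is rejected locally instead of by a 400 from the API
try:
    check_delivery_mode({"stream": True, "background": True})
    rejected = False
except ValueError:
    rejected = True
```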

## Entity-Backed Knowledge

This is the key differentiator of `nimbus-002-preview`. Entity-backed knowledge lets the model reason over **both** your structured entity data (extracted schemas) and unstructured media descriptions simultaneously.

For example, if you have an entity collection with extracted recipe schemas (ingredients, cook times, difficulty) and a media description collection with the full video transcripts, the model can answer questions like "Which beginner recipes take under 30 minutes?" by combining both data sources.

### How It Works

1. Create an **entity collection** with your extraction schema (see [Entity Collections](/core-concepts/entity-collection))
2. Create a **media description collection** with your videos
3. Pass both to the Responses API by setting the knowledge base type to `entity_backed_knowledge` and supplying an `entity_backed_knowledge_config`

<Tabs>
  <Tab title="Python">
    The Python SDK provides helper methods for building entity-backed knowledge configs:

    ```python theme={null}
    from cloudglue import Cloudglue

    client = Cloudglue()

    # Build entity collection config using helper method
    entity_config = client.responses.create_entity_backed_knowledge_config(
        entity_collections=[
            client.responses.create_entity_collection_config(
                collection_id="ENTITY_COLLECTION_ID",
                name="recipes",
                description="Recipe details including ingredients, cook time, and difficulty",
            )
        ],
        description="Cooking videos with structured recipe data",
    )

    response = client.responses.create(
        input="Which recipes require the fewest ingredients?",
        collections=["MEDIA_DESCRIPTION_COLLECTION_ID"],
        model="nimbus-002-preview",
        knowledge_base_type="entity_backed_knowledge",
        entity_backed_knowledge_config=entity_config,
    )

    print(response.output[0].content[0].text)
    ```
  </Tab>

  <Tab title="Node">
    In TypeScript, pass entity configuration as a raw object in `knowledge_base`:

    ```typescript theme={null}
    import { Cloudglue } from '@cloudglue/cloudglue-js';

    const client = new Cloudglue();

    const response = await client.responses.createResponse({
      model: 'nimbus-002-preview',
      input: 'Which recipes require the fewest ingredients?',
      knowledge_base: {
        type: 'entity_backed_knowledge',
        collections: ['MEDIA_DESCRIPTION_COLLECTION_ID'],
        entity_backed_knowledge_config: {
          description: 'Cooking videos with structured recipe data',
          entity_collections: [
            {
              name: 'recipes',
              description: 'Recipe details including ingredients, cook time, and difficulty',
              collection_id: 'ENTITY_COLLECTION_ID',
            },
          ],
        },
      },
    });

    console.log(response.output?.[0]?.content?.[0]?.text);
    ```
  </Tab>
</Tabs>

### Configuration Fields

#### `entity_backed_knowledge_config` (top-level)

| Field                | Required | Description                                                                                                                             |
| -------------------- | -------- | --------------------------------------------------------------------------------------------------------------------------------------- |
| `entity_collections` | Yes      | Array of entity collection configs (see below)                                                                                          |
| `description`        | No       | Describes the overall knowledge base — gives the model context on what these collections represent and how they should be used together |

<Tip>
  The top-level `description` on `entity_backed_knowledge_config` is important
  for guiding the model's reasoning. For example,
  `"Sales call recordings from Q4 2024 with deal outcomes and customer feedback"`
  helps the model understand the domain and tailor its analysis. Without it, the
  model only sees individual collection descriptions and may miss the bigger picture.
</Tip>

#### Entity collection config (per-collection)

| Field           | Required | Description                                                                                                                                              |
| --------------- | -------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `collection_id` | Yes      | ID of the entity collection                                                                                                                              |
| `name`          | No       | Short identifier for the collection (e.g., `"recipes"`, `"speakers"`). Falls back to the collection's stored name if omitted                             |
| `description`   | No       | Describes what entities are in this collection — helps the model understand when to use it. Falls back to the collection's stored description if omitted |
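
The two tables above can be mirrored in plain Python to validate a config before sending it. The functions below are hypothetical local stand-ins (they are not the SDK helpers shown earlier) that produce the same dictionary shape as the TypeScript example. Rejecting an empty `entity_collections` list is an assumption; the tables only state that the field is required.

```python
from typing import Any, Dict, List, Optional


def entity_collection(
    collection_id: str,
    name: Optional[str] = None,
    description: Optional[str] = None,
) -> Dict[str, str]:
    """Per-collection config: collection_id is required, the rest is optional."""
    if not collection_id:
        raise ValueError("collection_id is required")
    config = {"collection_id": collection_id}
    # Omitted fields fall back to the values stored on the collection
    if name:
        config["name"] = name
    if description:
        config["description"] = description
    return config


def entity_backed_knowledge_config(
    entity_collections: List[Dict[str, str]],
    description: Optional[str] = None,
) -> Dict[str, Any]:
    """Top-level config: entity_collections is required, description is optional."""
    if not entity_collections:
        # Assumption: an empty list is treated as a missing required field
        raise ValueError("entity_collections must contain at least one entry")
    config: Dict[str, Any] = {"entity_collections": entity_collections}
    if description:
        config["description"] = description
    return config


cfg = entity_backed_knowledge_config(
    [entity_collection("ENTITY_COLLECTION_ID", name="recipes")],
    description="Cooking videos with structured recipe data",
)
print(cfg)
```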

### Entity-Backed Knowledge + Streaming

You can combine entity-backed knowledge with streaming for real-time entity-aware responses:

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    from cloudglue import Cloudglue

    client = Cloudglue()

    entity_config = client.responses.create_entity_backed_knowledge_config(
        entity_collections=[
            client.responses.create_entity_collection_config(
                collection_id="ENTITY_COLLECTION_ID",
                name="recipes",
                description="Structured recipe data from cooking videos",
            )
        ],
        description="Cooking videos with recipes and techniques",
    )

    events = client.responses.create(
        input="What dishes are mentioned and who discussed them?",
        collections=["COLLECTION_ID"],
        model="nimbus-002-preview",
        stream=True,
        knowledge_base_type="entity_backed_knowledge",
        entity_backed_knowledge_config=entity_config,
    )

    for event in events:
        evt_type = event.get("event")
        data = event.get("data")
        if evt_type == "response.output_text.delta" and isinstance(data, dict):
            print(data.get("delta", ""), end="", flush=True)
        elif evt_type == "response.completed":
            print()
    ```
  </Tab>

  <Tab title="Node">
    ```typescript theme={null}
    import { Cloudglue } from '@cloudglue/cloudglue-js';

    const client = new Cloudglue();

    const stream = await client.responses.createStreamingResponse({
      model: 'nimbus-002-preview',
      input: 'What dishes are mentioned and who discussed them?',
      knowledge_base: {
        type: 'entity_backed_knowledge',
        collections: ['COLLECTION_ID'],
        entity_backed_knowledge_config: {
          description: 'Cooking videos with recipes and techniques',
          entity_collections: [
            {
              name: 'recipes',
              description: 'Structured recipe data from cooking videos',
              collection_id: 'ENTITY_COLLECTION_ID',
            },
          ],
        },
      },
    });

    for await (const event of stream) {
      if (event.type === 'response.output_text.delta') {
        process.stdout.write(event.delta);
      } else if (event.type === 'response.completed') {
        console.log('\n');
      }
    }
    ```
  </Tab>
</Tabs>

## Multi-Turn Conversations

The `input` parameter accepts either a string (single question) or a message array (multi-turn conversation). For multi-turn, include the full conversation history.

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    from cloudglue import Cloudglue

    client = Cloudglue()

    # Turn 1: initial question
    resp1 = client.responses.create(
        input="What topics are discussed in these videos?",
        collections=["COLLECTION_ID"],
        model="nimbus-002-preview",
    )
    turn1_text = resp1.output[0].content[0].text
    print("Turn 1:", turn1_text)

    # Turn 2: follow-up with conversation history
    resp2 = client.responses.create(
        input=[
            {"role": "user", "content": "What topics are discussed in these videos?"},
            {"role": "assistant", "content": turn1_text},
            {"role": "user", "content": "Can you go into more detail about the first topic?"},
        ],
        collections=["COLLECTION_ID"],
        model="nimbus-002-preview",
    )
    print("Turn 2:", resp2.output[0].content[0].text)
    ```
  </Tab>

  <Tab title="Node">
    ```typescript theme={null}
    import { Cloudglue } from '@cloudglue/cloudglue-js';

    const client = new Cloudglue();

    // Turn 1: initial question
    const turn1 = await client.responses.createResponse({
      model: 'nimbus-002-preview',
      input: 'What topics are discussed in these videos?',
      knowledge_base: { collections: ['COLLECTION_ID'] },
    });
    const turn1Text = turn1.output?.[0]?.content?.[0]?.text ?? '';
    console.log('Turn 1:', turn1Text);

    // Turn 2: follow-up with conversation history
    const turn2 = await client.responses.createResponse({
      model: 'nimbus-002-preview',
      input: [
        {
          type: 'message',
          role: 'user',
          content: [{ type: 'input_text', text: 'What topics are discussed in these videos?' }],
        },
        {
          type: 'message',
          role: 'assistant',
          content: [{ type: 'input_text', text: turn1Text }],
        },
        {
          type: 'message',
          role: 'user',
          content: [{ type: 'input_text', text: 'Can you go into more detail about the first topic?' }],
        },
      ],
      knowledge_base: { collections: ['COLLECTION_ID'] },
    });
    console.log('Turn 2:', turn2.output?.[0]?.content?.[0]?.text);
    ```
  </Tab>
</Tabs>

<Note>
  The Python SDK uses a simplified message format (`{"role": ..., "content": ...}`) while the TypeScript SDK uses the full OpenAI Responses format with `type: 'message'` and structured content arrays.
</Note>
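
Because every request carries the full history, it can help to keep that history in one place. The following sketch uses the simplified Python message format described above; `add_turn` and `next_input` are hypothetical helper names, and the assistant answer is a made-up placeholder rather than real API output.

```python
from typing import Dict, List

Message = Dict[str, str]


def add_turn(history: List[Message], question: str, answer: str) -> List[Message]:
    """Return a new history with one completed user/assistant exchange appended."""
    return history + [
        {"role": "user", "content": question},
        {"role": "assistant", "content": answer},
    ]


def next_input(history: List[Message], follow_up: str) -> List[Message]:
    """Build the `input` array for the next request: full history plus the new question."""
    return history + [{"role": "user", "content": follow_up}]


# Placeholder answer; in practice this is the text returned by the first request
history = add_turn(
    [], "What topics are discussed in these videos?", "Knife skills and sauces."
)
messages = next_input(history, "Can you go into more detail about the first topic?")
print([m["role"] for m in messages])
```

Both helpers return new lists instead of mutating their arguments, so a failed request does not leave a dangling user message in the stored history.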

## Instructions

Use the `instructions` parameter to control response behavior — set a persona, language, format, or domain constraints.

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    from cloudglue import Cloudglue

    client = Cloudglue()

    response = client.responses.create(
        input="What techniques are discussed?",
        collections=["COLLECTION_ID"],
        model="nimbus-002-preview",
        instructions="Always respond in Spanish. Use bullet points for lists.",
    )
    ```
  </Tab>

  <Tab title="Node">
    ```typescript theme={null}
    import { Cloudglue } from '@cloudglue/cloudglue-js';

    const client = new Cloudglue();

    const response = await client.responses.createResponse({
      model: 'nimbus-002-preview',
      input: 'What techniques are discussed?',
      knowledge_base: { collections: ['COLLECTION_ID'] },
      instructions: 'Always respond in Spanish. Use bullet points for lists.',
    });
    ```
  </Tab>
</Tabs>

## Background Processing + Polling

For long-running queries or batch workflows, use `background: true`. The response returns immediately with status `in_progress`, and you poll for completion.

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    import time
    from cloudglue import Cloudglue

    client = Cloudglue()

    # Start background response
    response = client.responses.create(
        input="Provide a comprehensive analysis of all techniques shown.",
        collections=["COLLECTION_ID"],
        model="nimbus-002-preview",
        background=True,
    )
    print(f"Started: {response.id}, status: {response.status}")  # "in_progress"

    # Poll until complete
    max_attempts = 30
    for attempt in range(max_attempts):
        result = client.responses.get(response.id)
        if result.status == "completed":
            print(result.output[0].content[0].text)
            break
        elif result.status in ("failed", "cancelled"):
            print(f"Response ended with status: {result.status}")
            break
        time.sleep(2)
    else:
        print(f"Polling timed out after {max_attempts} attempts")

    # Or cancel if needed
    # cancelled = client.responses.cancel(response.id)
    ```
  </Tab>

  <Tab title="Node">
    ```typescript theme={null}
    import { Cloudglue } from '@cloudglue/cloudglue-js';

    const client = new Cloudglue();

    // Start background response
    const response = await client.responses.createResponse({
      model: 'nimbus-002-preview',
      input: 'Provide a comprehensive analysis of all techniques shown.',
      knowledge_base: { collections: ['COLLECTION_ID'] },
      background: true,
    });
    console.log(`Started: ${response.id}, status: ${response.status}`); // "in_progress"

    // Poll using the SDK helper
    const result = await client.responses.waitForReady(response.id, {
      pollingInterval: 2000,
      maxAttempts: 30,
    });
    console.log(result.output?.[0]?.content?.[0]?.text);

    // Or cancel if needed
    // const cancelled = await client.responses.cancelResponse(response.id);
    ```
  </Tab>
</Tabs>
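
The fixed two-second interval in the Python loop can be swapped for exponential backoff, which polls quickly at first and then eases off. This is a sketch under stated assumptions: `poll_until_done` is a hypothetical helper, `fetch` is any callable returning a dictionary with a `status` key (a real one might wrap `client.responses.get(response.id)` and copy out the status), and the delay values are arbitrary.

```python
from typing import Any, Callable, Dict

# Terminal statuses taken from the polling example above
TERMINAL = {"completed", "failed", "cancelled"}


def poll_until_done(
    fetch: Callable[[], Dict[str, Any]],
    sleep: Callable[[float], None],
    max_attempts: int = 30,
    initial_delay: float = 1.0,
    max_delay: float = 10.0,
) -> Dict[str, Any]:
    """Call fetch() until it reports a terminal status, doubling the wait each time."""
    delay = initial_delay
    for _ in range(max_attempts):
        result = fetch()
        if result.get("status") in TERMINAL:
            return result
        sleep(delay)
        delay = min(delay * 2, max_delay)
    raise TimeoutError(f"No terminal status after {max_attempts} attempts")


# Simulated status sequence; the recorded waits stand in for time.sleep
statuses = iter(["in_progress", "in_progress", "completed"])
waits = []
result = poll_until_done(lambda: {"status": next(statuses)}, sleep=waits.append)
print(result["status"], waits)
```

Passing `sleep` as a parameter keeps the helper testable; in production code you would pass `time.sleep`.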

## Rich Citations

Request detailed media description annotations on citations by passing the `include` parameter. This returns speech transcripts, visual scene descriptions, and scene text for each cited segment.

<Tabs>
  <Tab title="Python">
    ```python theme={null}
    from cloudglue import Cloudglue

    client = Cloudglue()

    response = client.responses.create(
        input="What techniques are discussed?",
        collections=["COLLECTION_ID"],
        model="nimbus-002-preview",
        include=["cloudglue_citations.media_descriptions"],
    )

    text = response.output[0].content[0].text
    print(text)
    annotations = response.output[0].content[0].annotations

    for ann in annotations:
        print(f"[{ann.start_time}s - {ann.end_time}s] {ann.file_id}")
        if ann.speech:
            for s in ann.speech:
                print(f"  Speaker {s.speaker}: {s.text}")
        if ann.visual_scene_description:
            for v in ann.visual_scene_description:
                print(f"  Visual: {v.text}")
    ```
  </Tab>

  <Tab title="Node">
    ```typescript theme={null}
    import { Cloudglue } from '@cloudglue/cloudglue-js';

    const client = new Cloudglue();

    const response = await client.responses.createResponse({
      model: 'nimbus-002-preview',
      input: 'What techniques are discussed?',
      knowledge_base: { collections: ['COLLECTION_ID'] },
      include: ['cloudglue_citations.media_descriptions'],
    });

    const annotations = response.output?.[0]?.content?.[0]?.annotations ?? [];

    for (const ann of annotations) {
      console.log(`[${ann.start_time}s - ${ann.end_time}s] ${ann.file_id}`);
      ann.speech?.forEach((s: any) => {
        console.log(`  Speaker ${s.speaker}: ${s.text}`);
      });
      ann.visual_scene_description?.forEach((v: any) => {
        console.log(`  Visual: ${v.text}`);
      });
    }
    ```
  </Tab>
</Tabs>
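
For a "jump to source" UI, the annotation timestamps usually need to be turned into readable labels. The sketch below assumes annotations have been converted to plain dictionaries with `file_id`, `start_time`, and `end_time` in seconds (matching the fields printed above); `format_timestamp` and `citation_labels` are hypothetical helpers and the sample values are invented.

```python
from typing import Any, Dict, List


def format_timestamp(seconds: float) -> str:
    """Render seconds as M:SS, or H:MM:SS once past the hour mark."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def citation_labels(annotations: List[Dict[str, Any]]) -> List[str]:
    """Build one label per citation, ordered by file and then by start time."""
    labels = []
    for ann in sorted(annotations, key=lambda a: (a["file_id"], a["start_time"])):
        start = format_timestamp(ann["start_time"])
        end = format_timestamp(ann["end_time"])
        labels.append(f"{ann['file_id']} @ {start}-{end}")
    return labels


# Invented sample data in the assumed dictionary shape
sample = [
    {"file_id": "file_b", "start_time": 3725.0, "end_time": 3740.5},
    {"file_id": "file_a", "start_time": 65.2, "end_time": 92.9},
]
labels = citation_labels(sample)
print(labels)
```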

## Filters

Constrain which videos are searched using metadata, file, or video info filters.

<Tabs>
  <Tab title="Python">
    The Python SDK provides a `create_filter()` helper:

    ```python theme={null}
    from cloudglue import Cloudglue

    client = Cloudglue()

    # Filter by file metadata
    search_filter = client.responses.create_filter(
        metadata=[
            {"path": "topic", "operator": "Equal", "valueText": "cooking"}
        ]
    )

    response = client.responses.create(
        input="What techniques are discussed?",
        collections=["COLLECTION_ID"],
        model="nimbus-001",
        filter=search_filter,
    )

    # Combine multiple filter types (pass the result as `filter=` in the same way)
    combined_filter = client.responses.create_filter(
        metadata=[
            {"path": "cuisine", "operator": "Equal", "valueText": "Italian"}
        ],
        video_info=[
            {"path": "duration_seconds", "operator": "LessThan", "valueText": "600"}
        ],
    )
    ```
  </Tab>

  <Tab title="Node">
    In TypeScript, pass filter objects directly:

    ```typescript theme={null}
    import { Cloudglue } from '@cloudglue/cloudglue-js';

    const client = new Cloudglue();

    const response = await client.responses.createResponse({
      model: 'nimbus-001',
      input: 'What techniques are discussed?',
      knowledge_base: { collections: ['COLLECTION_ID'] },
      filter: {
        metadata: [
          { path: 'topic', operator: 'Equal', valueText: 'cooking' },
        ],
      },
    });
    ```
  </Tab>
</Tabs>

### Supported Filter Operations

| Operator                      | Description                   | Value Field      |
| ----------------------------- | ----------------------------- | ---------------- |
| `Equal` / `NotEqual`          | Exact match                   | `valueText`      |
| `LessThan` / `GreaterThan`    | Numeric comparison            | `valueText`      |
| `In`                          | Value in comma-separated list | `valueText`      |
| `ContainsAny` / `ContainsAll` | Array operations              | `valueTextArray` |

### Filter Categories

* **`metadata`** — Filter on custom metadata fields (e.g., `metadata.topic`)
* **`file`** — Filter on file properties (e.g., `id`)
* **`video_info`** — Filter on video properties (e.g., `duration_seconds`, `has_audio`)
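
Since each operator expects a specific value field, a small builder can keep filter conditions consistent with the operator table above. This is a sketch only: `condition` is a hypothetical helper that returns plain dictionaries, and joining `In` values with a bare comma (no spaces) is an assumption about the expected list format.

```python
from typing import Any, Dict, List, Union

TEXT_OPERATORS = {"Equal", "NotEqual", "LessThan", "GreaterThan", "In"}
ARRAY_OPERATORS = {"ContainsAny", "ContainsAll"}


def condition(path: str, operator: str, value: Union[str, List[str]]) -> Dict[str, Any]:
    """Build one filter condition, choosing valueText or valueTextArray by operator."""
    if operator in TEXT_OPERATORS:
        if isinstance(value, list):
            if operator != "In":
                raise ValueError(f"{operator} expects a single value")
            # Assumption: `In` takes its list as comma-separated text
            value = ",".join(value)
        return {"path": path, "operator": operator, "valueText": str(value)}
    if operator in ARRAY_OPERATORS:
        if not isinstance(value, list):
            raise ValueError(f"{operator} expects a list of values")
        return {"path": path, "operator": operator, "valueTextArray": value}
    raise ValueError(f"Unsupported operator: {operator}")


# Same structure as the filter objects in the examples above
search_filter = {
    "metadata": [condition("cuisine", "In", ["Italian", "French"])],
    "video_info": [condition("duration_seconds", "LessThan", "600")],
}
print(search_filter)
```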

## Best Practices

### Model Selection

* Start with **nimbus-001** for fast Q\&A and summarization
* Use **nimbus-002-preview** when you need multi-step reasoning, cross-video synthesis, or entity-backed knowledge
* Both models support streaming, background processing, multi-turn, and instructions

### Streaming vs Background

* **Streaming** for interactive UIs where users see text appear in real-time
* **Background** for batch processing, long-running analysis, or server-to-server workflows
* You cannot use both simultaneously

### Entity Collections

* Set a top-level `description` on `entity_backed_knowledge_config` to give the model overall context — e.g., `"Customer support calls from enterprise accounts in Q1 2025"` helps the model frame its reasoning across all collections
* Give each entity collection descriptive `name` and `description` values — the model uses these to decide when and how to query each collection
* A `name` like `"recipes"` with a `description` like `"Ingredients, cook times, and difficulty levels for each dish"` is more helpful than `"data"` with no description

### Multi-Turn Conversations

* Include the full conversation history in each request for proper context
* For long conversations, consider trimming older turns to stay within token limits
* Use `instructions` to set consistent behavior across turns rather than repeating guidance in each message
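
One way to trim older turns is to keep only the most recent messages that fit a size budget. The sketch below uses character count as a rough stand-in for tokens, which is an assumption; `trim_history` and `max_chars` are hypothetical names, and the actual token limits are not specified here.

```python
from typing import Dict, List

Message = Dict[str, str]


def trim_history(messages: List[Message], max_chars: int) -> List[Message]:
    """Keep the most recent messages whose combined content fits within max_chars."""
    kept: List[Message] = []
    used = 0
    # Walk backwards so the newest messages are kept first
    for message in reversed(messages):
        size = len(message["content"])
        if kept and used + size > max_chars:
            break
        kept.append(message)
        used += size
    kept.reverse()
    # Drop a leading assistant message so the history starts with a user turn
    while len(kept) > 1 and kept[0]["role"] == "assistant":
        kept.pop(0)
    return kept


conversation = [
    {"role": "user", "content": "a" * 50},
    {"role": "assistant", "content": "b" * 50},
    {"role": "user", "content": "c" * 50},
    {"role": "assistant", "content": "d" * 50},
    {"role": "user", "content": "e" * 10},
]
trimmed = trim_history(conversation, max_chars=120)
print(len(trimmed))
```

The latest message is always kept, even if it alone exceeds the budget, so the current question is never dropped.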

### Citations

* Use `include: ["cloudglue_citations.media_descriptions"]` when you need to display source segments to users or verify answers
* Citation annotations include timestamps, speech, visual descriptions, and scene text — useful for building "jump to source" features

## Try It Out

Ready to build with the Responses API?

* **Quickstart**: [Responses API Quickstart](/getting-started/responses-api)
* **API Reference**: [Create Response](/api-reference/endpoint/responses/post)
* **SDKs**: [JavaScript](/sdks/javascript) | [Python](/sdks/python)
* **Playground**: [Try in browser](https://app.cloudglue.dev/home/playground)