# Introduction

> Turn video data into a reusable, multimodal context layer for AI agents & applications.

## What is Cloudglue?

[**Cloudglue**](https://cloudglue.dev?ref=docs) is the video context layer for agents and applications.

Cloudglue lets developers search, ask questions about, analyze, and extract structured data from individual videos, or query thousands of hours of video as a single unified corpus.

Designed for simplicity, scale, and fidelity, Cloudglue unifies speech, diarization, visual understanding, sound, and on-screen text into simple, composable APIs. You can enable **video Q\&A**, **semantic search**, and **structured data extraction with full citations** in just a few lines of code, without building your own video-understanding stack.

Whether you’re building **AI agent workflows** or **creative tools**, or **analyzing meeting recordings**, Cloudglue makes video queryable and actionable for any AI system. Cloudglue handles the infrastructure so you can focus on building features that matter to your users.

## Get Started in 3 Steps

<Steps>
  <Step title="Get your API key">
    [Sign up for free](https://app.cloudglue.dev/auth/sign-up) and get your API
    key from the dashboard.
  </Step>

  <Step title="Install an SDK, or use MCP or Playground">
    <CardGroup cols={2}>
      <Card icon="python" href="/sdks/python" title="Python SDK">
        Install our Python SDK
      </Card>

      <Card icon="js" href="/sdks/javascript" title="JavaScript SDK">
        Install our JavaScript SDK
      </Card>

      <Card icon="robot" href="/getting-started/mcp-server" title="MCP Server">
        Use the MCP Server for Claude, Cursor, or Windsurf integration.
      </Card>

      <Card icon="square-dashed-circle-plus" href="https://app.cloudglue.dev/home/playground" title="Playground">
        Test directly in
        [Playground](https://app.cloudglue.dev/home/playground).
      </Card>
    </CardGroup>
  </Step>

  <Step title="Upload video and extract">
    Upload your first video, then extract structured data or chat across your
    videos in minutes.
  </Step>
</Steps>

## Core Features

### Video Document Parsing

Foundational APIs that transform unstructured video and audio into structured, queryable context.

<CardGroup cols={3}>
  <Card icon="sparkles" href="/api-reference/endpoint/describe" title="Describe">
    Get a comprehensive moment-by-moment description of a video, including
    transcript, diarization, visual descriptions, audio descriptions, sound,
    on-screen text, and more. Perfect for capturing every detail of a video.
  </Card>

  <Card icon="table" href="/api-reference/endpoint/extract/post" title="Extract">
    Extract structured data from videos at scale, across modalities, using a
    prompt or custom schema. This makes videos easy to program against, query,
    and categorize in your application.
  </Card>

  <Card icon="scissors" href="/api-reference/endpoint/segments/post" title="Segment">
    Split videos into meaningful parts with segmentation options such as
    intelligent shot detection and narrative (chapter) segmentation, turning
    videos into logical sequences.
  </Card>
</CardGroup>

### Video Reasoning

Higher-level APIs that enable multimodal search, chat, and reasoning directly over video content.

<CardGroup cols={3}>
  <Card icon="magnifying-glass" href="/api-reference/endpoint/search/post" title="Search">
    Add semantic search over videos and segments with natural-language queries.
    Enable this in your application with just a few lines of code.
  </Card>

  <Card icon="message-bot" href="/api-reference/endpoint/chat/completions" title="Chat Completion">
    Add conversational AI that can query, compare, and reason across hundreds of
    videos, complete with full citations, with just a few lines of code.
  </Card>

  <Card icon="comments" href="/deep-dives/responses-api" title="Responses API">
    Next-gen conversational API with streaming, entity-backed knowledge,
    multi-turn support, and background processing.
  </Card>
</CardGroup>

## What Makes Cloudglue Different

* **Multimodal AI**: We don't just transcribe speech. We build understanding from the full context of a video, including visual content, audio descriptions, on-screen text, and diarization.
  * *Prefer speech-only?* You can disable multimodality and use transcripts alone.
* **Scale**: Built to handle hour-long (or longer) videos, and reason across hundreds or even thousands of videos at once, all using the same simple primitives.
* **Developer-First**: Clean APIs, comprehensive SDKs, and tools built for developers.
* **Robust**: Designed for production workloads with reliable performance across large video datasets.
* **Real-time Integration**: Rich partner ecosystem for building integrations, including MCP server support for direct AI assistant integration.
* **Backed by Research**: Cloudglue continuously integrates the latest advancements in multimodal AI, so as foundation models improve, your application benefits from those improvements too. Our infrastructure is built and maintained by a team that actively publishes research in large-scale video and audio understanding.

## Quick Example

Here's how easy it is to extract structured data from any video:

<CodeGroup>
  ```python Python theme={null}
  from cloudglue import CloudGlue

  client = CloudGlue()

  # Upload a local video and wait until processing finishes
  uploaded = client.files.upload(
      'path/to/local/video.mp4',
      wait_until_finish=True,
  )

  # Extract structured data using a prompt and a schema
  extraction = client.extract.run(
      url=uploaded.uri,
      prompt="Extract all speakers and main topics discussed",
      schema={"speakers": ["string"], "topics": ["string"]},
  )

  print(extraction.data)
  # Example output:
  # {"speakers": ["John Smith", "Sarah Johnson"], "topics": ["AI", "Marketing"]}
  ```

  ```typescript JavaScript theme={null}
  /// <reference lib="dom" />
  import { CloudGlue } from '@aviaryhq/cloudglue-js';
  import * as fs from 'fs';
  import * as path from 'path';

  const client = new CloudGlue();
  const filePath = 'path/to/video.mp4';

  const fileBuffer = await fs.promises.readFile(filePath);
  const file = new File([fileBuffer], path.basename(filePath));
  const uploadResult = await client.files.uploadFile({ file });
  const fileDetails = await client.files.waitForReady(uploadResult.id);

  const extraction = await client.extract.createExtract(fileDetails.uri, {
    prompt: 'Extract all speakers and main topics discussed',
    schema: { speakers: ['string'], topics: ['string'] },
  });
  const extractionResult = await client.extract.waitForReady(extraction.job_id);

  console.log(extractionResult.data);
  ```
</CodeGroup>
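
Once you have extracted data, you may want to confirm that it has the shape you asked for before passing it downstream. The sketch below is one way to do that in plain Python. `matches_schema` is a hypothetical helper, not part of the Cloudglue SDK, and it assumes the shorthand schema notation used in the example above: a type name for a scalar, a one-element list for a list of items, and a dict for an object. The `"number"` and `"boolean"` type names are assumptions added for illustration.

```python
from typing import Any

# Hypothetical helper for illustration; not part of the Cloudglue SDK.
# Assumed scalar type names ("number" and "boolean" are illustrative guesses).
SCALAR_TYPES = {"string": str, "number": (int, float), "boolean": bool}


def matches_schema(data: Any, schema: Any) -> bool:
    """Return True if `data` has the shape described by the shorthand `schema`."""
    if isinstance(schema, str):
        expected = SCALAR_TYPES.get(schema)
        if expected is None:
            return False  # unknown type name
        # bool is a subclass of int in Python, so reject it for "number"
        if schema == "number" and isinstance(data, bool):
            return False
        return isinstance(data, expected)
    if isinstance(schema, list):
        # A one-element list describes a list whose items all match that element
        return (
            len(schema) == 1
            and isinstance(data, list)
            and all(matches_schema(item, schema[0]) for item in data)
        )
    if isinstance(schema, dict):
        # Every key named in the schema must be present and match its sub-schema
        return isinstance(data, dict) and all(
            key in data and matches_schema(data[key], sub)
            for key, sub in schema.items()
        )
    return False


schema = {"speakers": ["string"], "topics": ["string"]}
sample = {"speakers": ["John Smith", "Sarah Johnson"], "topics": ["AI", "Marketing"]}

print(matches_schema(sample, schema))                      # True
print(matches_schema({"speakers": "John Smith"}, schema))  # False
```

In practice you would pass `extraction.data` in place of `sample`. Extra keys in the data are tolerated here; tighten the dict branch if you need an exact match.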

## Popular Use Cases

<CardGroup cols={3}>
  <Card icon="message-bot" href="/use-cases/build-a-video-qa-bot" title="Video Q&A Chatbots">
    Build intelligent chatbots that can answer questions about video content,
    perfect for training materials, meetings, and educational content.
  </Card>

  <Card icon="database" href="/use-cases/extract-structured-data-from-your-videos" title="Structured Data Extraction">
    Extract specific information like product details, people, locations, or any
    custom data schema from video content at scale.
  </Card>

  <Card icon="books" href="/use-cases/build-a-video-knowledge-base" title="Video Knowledge Bases">
    Create searchable knowledge bases from video recordings, making hours of
    content instantly accessible and queryable.
  </Card>
</CardGroup>

## Capabilities at a Glance

### Collections & Organization

* [**Entity Collections**](/core-concepts/entity-collection) - Process multiple videos with consistent schemas
* [**Media Description Collections**](/core-concepts/media-description-collection) - Organize videos for searchable multimodal transcriptions
* [**Collection Chat**](/core-concepts/chat-completions) - Have conversations across entire video libraries

### Integrations & Tools

* [**MCP Server**](/getting-started/mcp-server) - Direct integration with Claude Desktop and Cursor
* [**Playground**](https://app.cloudglue.dev/home/playground) - Test and experiment with video processing
* [**Schema Builder**](https://app.cloudglue.dev/tools/extract-schema-helper) - Visual tool for creating extraction schemas
* [**Webhooks**](/getting-started/webhooks) - Real-time processing notifications

### Multimodal Understanding

* **Speech Transcription** - Accurate speech-to-text with speaker identification
* **Visual Scene Analysis** - Detailed descriptions of what's happening visually
* **Scene Text Recognition** - Extract text visible on screen (captions, presentations, etc.)
* **Media Integrations** - Process videos directly from YouTube, TikTok, and Loom URLs
* **Audio Description** - Extract audio descriptions from video content
* **Face Detection and Matching** - Find videos with a given face

## Next Steps

Choose your path based on what you want to accomplish:

<CardGroup cols={2}>
  <Card href="/getting-started/setup-cloudglue" title="New to Cloudglue?">
    Start with our setup guide to get your API key and SDK installed.
  </Card>

  <Card href="/use-cases/build-a-video-qa-bot" title="See Examples">
    Explore detailed use cases with step-by-step implementations.
  </Card>

  <Card href="/api-reference/introduction" title="API Reference">
    Dive into the full API documentation and endpoint details.
  </Card>

  <Card href="https://app.cloudglue.dev/home/playground" title="Try the Playground">
    Test video processing directly in your browser without any code.
  </Card>
</CardGroup>

## Resources

* **API Documentation**: [Full API Reference](https://docs.cloudglue.dev/api-reference)
* **SDKs**: [JavaScript](https://docs.cloudglue.dev/sdks/javascript) • [Python](https://docs.cloudglue.dev/sdks/python)
* **Tools**: [Playground](https://app.cloudglue.dev/home/playground) • [Schema Builder](https://app.cloudglue.dev/tools/extract-schema-helper)
* **Need an SDK or integration?** [Let us know!](mailto:hello@cloudglue.dev)
