The Cloudglue Python SDK is a wrapper around the Cloudglue API that makes it easy to turn video into LLM-ready data.
You can find the package on PyPI.
To install the Cloudglue Python SDK, use pip. To authenticate, set the CLOUDGLUE_API_KEY environment variable, or pass the API key in as a parameter to the CloudGlue class constructor. Here’s an example of how to create a Cloudglue client:
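A minimal sketch of client creation, assuming the package is published on PyPI under the name cloudglue and that the import path, class name, and api_key parameter are as described above (none of these are confirmed by this page; check the SDK reference):

```python
import os

ENV_VAR = "CLOUDGLUE_API_KEY"  # environment variable the SDK reads

try:
    from cloudglue import CloudGlue  # assumed import path

    # Option 1: rely on the environment variable.
    client = CloudGlue()

    # Option 2: pass the key explicitly (parameter name is an assumption).
    client = CloudGlue(api_key=os.environ.get(ENV_VAR))
except ImportError:
    client = None  # SDK not installed in this environment
```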
To use most Cloudglue APIs, you’ll operate on a file that has been uploaded to Cloudglue. Below are the basics of working with files:
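A sketch of the basics, uploading a local video and listing uploaded files. The method names (client.files.upload, client.files.list) and the file path are assumptions for illustration; consult the SDK reference for the real signatures:

```python
video_path = "videos/dinner-review.mp4"  # hypothetical local file

try:
    from cloudglue import CloudGlue  # assumed import path

    client = CloudGlue()  # reads CLOUDGLUE_API_KEY from the environment
    # Method names below are assumptions; check the SDK reference.
    uploaded = client.files.upload(video_path)
    print(uploaded.id)

    # Enumerate the files you have uploaded so far.
    for f in client.files.list():
        print(f.id)
except ImportError:
    pass  # SDK not installed in this environment
```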
Note: When uploading files, make sure to:
You can use our Transcribe API to generate multimodal transcripts from videos. Getting started is easy: just specify the video you want to transcribe and the transcription configuration.
YouTube URLs are supported as input to this API. Note that YouTube videos are currently limited to speech- and metadata-level understanding; for fully fledged multimodal video understanding, upload a file to the Cloudglue Files API and use that as input instead.
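A sketch of a transcription request. The configuration field names and the client.transcribe.run method are assumptions, and the YouTube URL is a placeholder; check the SDK reference for the exact API:

```python
# Transcription request configuration (field names are illustrative).
config = {
    "url": "https://www.youtube.com/watch?v=example",  # hypothetical video
    "enable_summary": True,  # assumed flag name
}

try:
    from cloudglue import CloudGlue  # assumed import path

    client = CloudGlue()  # reads CLOUDGLUE_API_KEY from the environment
    transcript = client.transcribe.run(**config)  # assumed method name
    print(transcript)
except ImportError:
    pass  # SDK not installed in this environment
```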
Organizing information into structured entity schemas allows for response types that are easy to program AI applications against. Let’s get started with extracting structured entity information from videos.
Below we’ll show how to extract information from a video using natural language to guide the entire process. This is particularly helpful during the exploratory phase, when your entity structure may not be completely known yet.
YouTube URLs are supported as input to this API. Note that YouTube videos are currently limited to speech- and metadata-level understanding; for fully fledged multimodal video understanding, upload a file to the Cloudglue Files API and use that as input instead.
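A sketch of prompt-guided extraction. The prompt wording is illustrative, and the client.extract.run method and its parameter names are assumptions; consult the SDK reference:

```python
# A natural-language prompt guiding what to extract (wording is illustrative).
prompt = "List each restaurant mentioned and what the reviewer thought of it."

try:
    from cloudglue import CloudGlue  # assumed import path

    client = CloudGlue()  # reads CLOUDGLUE_API_KEY from the environment
    # Method and parameter names are assumptions; check the SDK reference.
    result = client.extract.run(
        url="https://www.youtube.com/watch?v=example",  # hypothetical video
        prompt=prompt,
    )
    print(result)
except ImportError:
    pass  # SDK not installed in this environment
```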
In Cloudglue you can direct the extraction process to return data in a specific format, which is helpful if your downstream application requires programming against specific fields or storing data in a database with a specific structure.
We allow users to specify schemas either as an example/abbreviated JSON object or as a fully fledged JSON schema specification. For convenience, we also provide a graphical entity schema builder.
For example, in food review videos, let’s say we really want to know the restaurant name and a short review blurb for our table; in that case your entity schema might look something like this:
Now let’s extract data using this schema:
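A sketch of both steps, assuming the schema can be given as an abbreviated example object whose keys name the fields and whose values indicate the expected types. The field names are illustrative, and the client.extract.run method and its schema parameter are assumptions:

```python
# Abbreviated example-object schema: restaurant name plus a short review blurb.
entity_schema = {
    "restaurant_name": "string",
    "review_blurb": "string",
}

try:
    from cloudglue import CloudGlue  # assumed import path

    client = CloudGlue()  # reads CLOUDGLUE_API_KEY from the environment
    result = client.extract.run(  # assumed method name
        url="https://www.youtube.com/watch?v=example",  # hypothetical video
        schema=entity_schema,
    )
    print(result)
except ImportError:
    pass  # SDK not installed in this environment
```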
Our abstraction for this is called “Collections.” In a collection, not only can you logically store related resources together under a single umbrella, you can also give the platform guidance on the types of information you want described or extracted as entities at rest for later use.
Below are the basics of working with a collection:
Once files are processed into a collection, their video entities become available for future reference.
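A sketch of the collection workflow: create a collection, add a previously uploaded file, and read back its entities. Every method name here (collections.create, collections.add_file, collections.list_entities) is an assumption for illustration, as is the collection name; check the SDK reference:

```python
collection_name = "food-reviews"  # hypothetical collection name

try:
    from cloudglue import CloudGlue  # assumed import path

    client = CloudGlue()  # reads CLOUDGLUE_API_KEY from the environment
    # Method names and parameters below are assumptions.
    collection = client.collections.create(name=collection_name)
    client.collections.add_file(collection.id, file_id="...")  # a previously uploaded file

    # After processing, entities extracted at rest are available for reference.
    for entity in client.collections.list_entities(collection.id):
        print(entity)
except ImportError:
    pass  # SDK not installed in this environment
```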
Similar to how ChatGPT or Claude operate, we expose a chat completions API with a couple of extra parameters that let you interact with your video collections. Namely, we expose a collections list parameter to specify which video collections to talk to, along with flags such as force_search, which hints to the underlying LLM that it should always execute a search for the incoming message, and include_citations, which tells the system to provide references for the information described by the chat generation.
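A sketch of such a request. The parameter names collections, force_search, and include_citations come from the prose above, but the model name, collection id, and the client.chat.completions.create method path are assumptions for illustration:

```python
# Chat request parameters (shape follows the prose; exact signature is assumed).
params = {
    "model": "nimbus-001",  # hypothetical model name
    "messages": [
        {"role": "user", "content": "Which restaurants got a rave review?"},
    ],
    "collections": ["my_collection_id"],  # hypothetical collection id
    "force_search": True,       # always run a search for the incoming message
    "include_citations": True,  # return references for generated statements
}

try:
    from cloudglue import CloudGlue  # assumed import path

    client = CloudGlue()  # reads CLOUDGLUE_API_KEY from the environment
    response = client.chat.completions.create(**params)  # assumed method path
    print(response)
except ImportError:
    pass  # SDK not installed in this environment
```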