JavaScript
The Cloudglue JavaScript SDK is a wrapper around the Cloudglue API that makes it easy to turn video into LLM-ready data.
Installation
To install the Cloudglue TypeScript/JavaScript SDK, use npm:
Usage
- Get an API key from cloudglue.dev
- Set the API key as an environment variable named CLOUDGLUE_API_KEY, or pass the API key as a parameter to the CloudGlue class constructor
Here’s an example of how to create a CloudGlue client:
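As a minimal sketch of the key-resolution step that comes before constructing the client: an explicitly passed key takes priority, and the CLOUDGLUE_API_KEY environment variable is the fallback. The helper name `resolveApiKey` is hypothetical and not part of the SDK, and the constructor call in the closing comment assumes an options object with an `apiKey` field, which should be checked against the SDK reference.

```javascript
// Hypothetical helper (not part of the SDK): pick the API key to use.
// An explicit argument wins; otherwise fall back to CLOUDGLUE_API_KEY.
function resolveApiKey(explicitKey, env = process.env) {
  const key = explicitKey ?? env.CLOUDGLUE_API_KEY;
  if (typeof key !== 'string' || key.trim() === '') {
    throw new Error(
      'Missing API key: set CLOUDGLUE_API_KEY or pass a key explicitly.'
    );
  }
  return key.trim();
}

// Assumed usage; the option name `apiKey` is a guess, not confirmed here:
// const client = new CloudGlue({ apiKey: resolveApiKey() });
```

Failing early with a clear message tends to be easier to debug than an authentication error returned by the first API call.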
Working with Files
To use most CloudGlue APIs you’ll need to operate on a file uploaded into CloudGlue.
Below are the basics for working with files:
Note: When uploading files, make sure to:
- Handle the file reading properly with error checking
- Set the correct MIME type for your video file (e.g., ‘video/mp4’, ‘video/quicktime’, etc.)
- Include relevant metadata about the upload
- Handle potential upload errors appropriately
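The checklist above can be sketched as a small preparation step that runs before any upload call. Everything here is a hypothetical helper rather than SDK code: `mimeTypeFor`, `prepareUpload`, the extension table, and the shape of the returned object are all assumptions made for illustration.

```javascript
// Hypothetical extension-to-MIME table; extend it for the formats you use.
const VIDEO_MIME_TYPES = new Map([
  ['mp4', 'video/mp4'],
  ['mov', 'video/quicktime'],
  ['webm', 'video/webm'],
]);

// Map a file name to a video MIME type, failing loudly on unknown extensions.
function mimeTypeFor(fileName) {
  const ext = fileName.includes('.')
    ? fileName.split('.').pop().toLowerCase()
    : '';
  const mime = VIDEO_MIME_TYPES.get(ext);
  if (!mime) throw new Error(`Unsupported video extension: "${ext}"`);
  return mime;
}

// Read the file with error checking and bundle it with its MIME type
// and caller-supplied metadata, ready to hand to an upload call.
async function prepareUpload(filePath, metadata = {}) {
  const { readFile } = await import('node:fs/promises');
  const fileName = filePath.split(/[\\/]/).pop();
  const mimeType = mimeTypeFor(fileName); // validate before touching the disk
  let bytes;
  try {
    bytes = await readFile(filePath);
  } catch (err) {
    throw new Error(`Could not read ${filePath}: ${err.message}`);
  }
  if (bytes.length === 0) throw new Error(`${filePath} is empty`);
  return { fileName, mimeType, bytes, metadata };
}
```

The upload call itself should still be wrapped in its own try/catch, since network and API errors are separate from local file errors.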
Extracting Structured Information from Videos
Organizing information in structured entity schemas allows for response types that are easy to program AI applications against. Let’s get started with extracting structured entity information from videos.
Prompt Only Extraction
Below we’ll show how to extract information from a video using natural language to guide the entire process. This is particularly helpful during the exploratory phase, when your entity structure may not be completely known yet.
YouTube URLs are also supported as input to this API. Note that YouTube videos are currently limited to speech- and metadata-level understanding. For fully fledged multimodal video understanding, upload the file to the CloudGlue Files API and use the uploaded file as input instead.
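Because the depth of understanding depends on the input type, it can help to classify an input before calling the API. The sketch below is an illustration only: `isYouTubeUrl` and `expectedUnderstanding` are hypothetical helpers, the host list is an assumption, and any non-YouTube input is assumed to refer to a file uploaded through the Files API.

```javascript
// Hypothetical helper: does this input look like a YouTube URL?
function isYouTubeUrl(input) {
  let host;
  try {
    host = new URL(input).hostname.toLowerCase();
  } catch {
    return false; // not a parseable absolute URL
  }
  return (
    host === 'youtu.be' ||
    host === 'youtube.com' ||
    host.endsWith('.youtube.com')
  );
}

// Summarize the level of understanding to expect, per the note above.
// Assumes anything that is not a YouTube URL is an uploaded file.
function expectedUnderstanding(input) {
  return isYouTubeUrl(input) ? 'speech-and-metadata' : 'multimodal';
}
```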
Schema Driven Extraction
In CloudGlue, you can direct the extraction process to return data in a specific format, which is helpful if your downstream application needs to program against specific fields or store data in a database with a specific structure.
We allow users to specify schemas either as an example/abbreviated JSON object or as a fully fledged JSON object specification. For convenience, we provide a graphical entity schema builder.
For example, suppose that for food review videos we want to record the restaurant name and a short review blurb in our table. In that case, your entity schema might look something like this:
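As a hedged sketch of the two schema styles mentioned above, the block below writes the food review schema in an abbreviated example-object form and expands it into a fuller JSON Schema style object. The field names `restaurant_name` and `review_blurb`, the convention that each value is a short description, and the `toJsonSchema` helper are all assumptions for illustration; the exact format the API accepts should be checked against the Cloudglue reference.

```javascript
// Abbreviated, example-style entity schema (assumed convention: each key
// is a field to extract, each value describes what belongs in it).
const foodReviewSchema = {
  restaurant_name: 'Name of the restaurant being reviewed',
  review_blurb: "One or two sentences summarizing the reviewer's opinion",
};

// Hypothetical helper: expand the abbreviated form into a JSON Schema
// style object in which every field is a required string.
function toJsonSchema(example) {
  const properties = {};
  for (const [field, description] of Object.entries(example)) {
    properties[field] = { type: 'string', description };
  }
  return { type: 'object', properties, required: Object.keys(example) };
}

const foodReviewJsonSchema = toJsonSchema(foodReviewSchema);
```

The abbreviated form is quicker to write while exploring; the fuller form makes types and required fields explicit, which suits data headed for a database table.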
Working with Collections
Our abstraction for grouping related resources is called “Collections”. In a collection, not only can you logically store related resources together under a single umbrella, you can also give the platform guidance on the types of information you want extracted as entities at rest for later usage.
Below are the basics for working with a collection:
Once files are processed into a collection, their video entities become available for future reference.
Talking with Videos
Similar to how ChatGPT or Claude operate, we expose a chat completion API with a couple of extra parameters that let you interact with your video collections.
Namely, we expose a collections list parameter to specify which video collections to talk to, along with flags such as force_search, which hints to the underlying LLM that it should always execute a search for the incoming message, and include_citations, which tells the system to provide references for the information in the generated response.
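To make the parameters concrete, here is a minimal sketch of how a request body for such a chat call might be assembled. The names `collections`, `force_search`, and `include_citations` come from the description above; the `buildChatRequest` helper, the `messages` array shape (borrowed from common chat completion APIs), and the default values are assumptions, not the SDK's confirmed interface.

```javascript
// Hypothetical builder for a chat completion request body.
function buildChatRequest(collectionIds, userMessage, options = {}) {
  if (!Array.isArray(collectionIds) || collectionIds.length === 0) {
    throw new Error('Provide at least one collection ID.');
  }
  // Defaults are assumptions: search on every message and cite sources.
  const { forceSearch = true, includeCitations = true } = options;
  return {
    messages: [{ role: 'user', content: userMessage }],
    collections: [...collectionIds], // which video collections to talk to
    force_search: forceSearch, // hint: always search for this message
    include_citations: includeCitations, // ask for references in the reply
  };
}

// Example with a placeholder collection ID:
const chatRequest = buildChatRequest(
  ['my-collection-id'],
  'Which restaurants were reviewed most favorably?'
);
```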