Video

Text-to-video, transcription, and automatic clip extraction.

The Video superskill is a complete AI video pipeline. Generate videos from text prompts or reference images using Veo 3.1, Kling V3 Pro, or Sora 2 Pro — in resolutions up to 4K. Upload existing videos and they are automatically transcribed, then the AI analyzes the transcript to identify and extract highlight clips. Clips can be reframed to portrait orientation for social media stories and reels, all handled by your agent end-to-end.

What your agent can do

Capabilities available out of the box.

  • Generate videos from text prompts using Veo 3.1, Kling V3 Pro, or Sora 2 Pro
  • Create image-to-video content with a reference image as the starting frame
  • Automatic transcription with timestamped segments on every upload
  • Automatically identify and extract highlight clips from longer videos
  • Reframe clips to portrait orientation for social media stories and reels
  • Choose resolution up to 4K with configurable duration and aspect ratio

Integrations

Platforms and services your agent connects to.

Veo 3.1Veo 3.1
Kling V3 ProKling V3 Pro
CloudinaryCloudinary
fal.aifal.ai

Commands

CLI commands available with this superskill.

List Videos

List videos in your library.

clawpow videos list
Get Video

Get details of a specific video including transcription status and metadata.

clawpow videos get <videoId>
Upload Video

Upload a video or audio file. Transcription starts automatically.

clawpow videos upload <file>
Remove Video

Delete a video and its clips from the library.

clawpow videos delete <videoId>
Create Clips

Analyze a transcription and create short-form clips with optional portrait reframing.

clawpow videos clip <videoId>
List Clips

List clips for a video with optional status filter.

clawpow videos clips <videoId>
Generate Video

Generate a video from a text prompt or reference image using AI.

clawpow videos generate
Get Generation

Get details and status of a video generation.

clawpow videos generation <generationId>
List Generations

List all video generations.

clawpow videos generations
Create Project

Create a new video project for composing timeline-based videos.

clawpow videos projects create
List Projects

List all video projects.

clawpow videos projects list
Get Project

Get project details including timeline elements.

clawpow videos projects get <projectId>
Delete Project

Delete a video project.

clawpow videos projects delete <projectId>
Add Element

Add a video, caption, graphic, or soundtrack element to a project timeline.

clawpow videos projects add <projectId>
Remove Element

Remove an element from a project timeline.

clawpow videos projects remove-element <projectId>
Update Element

Update properties of an existing element on the timeline.

clawpow videos projects update-element <projectId>
Render Project

Render a video project into a final MP4 file via Remotion Lambda.

clawpow videos projects render <projectId>
Create Actor

Generate a new AI actor with portrait (9:16) and landscape (16:9) images.

clawpow videos actors create
List Actors

List all available actors (platform and user-created).

clawpow videos actors list
Get Actor

Get details of a specific actor including all image URLs.

clawpow videos actors get <actorId>
Delete Actor

Delete a user-created actor.

clawpow videos actors delete <actorId>
Skill Reference
# Video

Upload videos, transcribe them, extract highlight clips, or generate entirely new videos from text prompts.

## When to Use
- User wants to upload and manage video content
- User needs a video transcribed
- User wants to extract highlight clips from a longer video
- User wants to generate a video from a text prompt or image
- User wants to repurpose long-form video into shorter clips for social media

## How It Connects
- **Social Media** → Attach video clips to social posts using `--media` with the asset ID.
- **Documents** → Video transcriptions are saved as document pages for reference and search.
- **Design** → For static images, use the design skill instead. Video handles moving content.
- **Blog** → Embed video content in blog articles or create written summaries from transcripts.
- **Newsletter** → Include video links or thumbnails in newsletter video blocks.

## Quick Start

List your videos:
```bash
clawpow videos list --json
```

Upload a video:
```bash
clawpow videos upload ./my-video.mp4 --title "My Video" --json
```

## Commands

### List Videos
```bash
clawpow videos list --json
```
List all videos in your library.

**Returns:** Array of `{_id, fileName, durationMs, transcriptionStatus, assetId, _creationTime}`

### Get Video
```bash
clawpow videos get <videoId> --json
```
Get details of a specific video including transcription status and metadata.

**Options:**
- `<videoId>` (required, positional) -- Video ID

**Returns:** Full video object with `{_id, fileName, mimeType, durationMs, transcriptionStatus, detectedLanguage, transcript, assetId, url, studioUrl, _creationTime}`

### Upload Video
```bash
clawpow videos upload ./my-video.mp4 --title "My Video" --json
clawpow videos upload ./podcast.mp3 --title "Episode 42" --json
```
Upload a video or audio file to your library. Transcription starts automatically after upload.

**Options:**
- `<file>` (required, positional) -- Path to video or audio file
- `--title <text>` (required) -- Title for the video
- `--language <code>` -- Language hint for transcription (e.g. "en", "es"). Auto-detected if omitted.

**Supported formats:** mp4, mov, webm, mpeg, avi, mp3, wav, m4a, ogg, flac

**Returns:** `{"videoId": "..."}`

### Remove Video
```bash
clawpow videos delete <videoId> --json
```
Delete a video and its clips from the library.

**Options:**
- `<videoId>` (required, positional) -- Video ID

**Returns:** `{"deleted": true, "videoId": "..."}`

### Create Clips
```bash
clawpow videos clip <videoId> --json
clawpow videos clip <videoId> --format portrait --max-clips 3 --json
clawpow videos clip <videoId> --min-clips 2 --max-clips 5 --max-duration 60 --json
```
Analyze a transcription and create short-form clips with optional portrait reframing. Requires the video to be transcribed first.

**Options:**
- `<videoId>` (required, positional) -- Video ID (must be transcribed)
- `--format <format>` -- Output format: portrait or landscape (default: portrait)
- `--min-clips <n>` -- Minimum number of clips to generate (1-10, default: 1)
- `--max-clips <n>` -- Maximum number of clips to generate (1-10, default: 5)
- `--max-duration <seconds>` -- Maximum clip duration in seconds (10-180, default: 90)

**Returns:** `{"videoId": "...", "status": "clipping", "nextCommand": "clawpow videos clips <videoId>"}`

### List Clips
```bash
clawpow videos clips <videoId> --json
clawpow videos clips <videoId> --status ready --json
```
List clips for a video with optional status filter.

**Options:**
- `<videoId>` (required, positional) -- Video ID
- `--status <status>` -- Filter by clip status

**Returns:** Array of `{_id, videoId, title, startTime, endTime, durationMs, format, status, assetId, url, _creationTime}`

### Generate Video
```bash
clawpow videos generate --prompt "A drone shot over mountains at sunset" --json
clawpow videos generate --prompt "Product reveal" --image ./product.png --model kling-v3-pro --json
clawpow videos generate --prompt "City timelapse" --model sora-2-pro --duration 15 --json
clawpow videos generate --prompt "Abstract art" --model veo-3.1 --resolution 4k --no-audio --json
```
Generate a video from a text prompt, or from a text prompt plus a reference image.

**Options:**
- `--prompt <text>` (required) -- Text prompt describing the video
- `--model <name>` -- Model: veo-3.1 (default), kling-v3-pro, sora-2-pro
- `--image <path>` -- Reference image (switches to image-to-video mode)
- `--resolution <res>` -- 720p, 1080p (default), 4k (veo-3.1 only)
- `--duration <seconds>` -- Duration in seconds (default: 5, clamped to model max)
- `--aspect-ratio <ratio>` -- 16:9 (default), 9:16, 1:1
- `--no-audio` -- Disable audio generation
- `--json` -- Structured output

**Returns:** `{"generationId": "...", "status": "queued", "studioUrl": "...", "nextCommand": "clawpow videos generation <generationId>"}`

**Model limits:**
| Model | Max Resolution | Max Duration |
|-------|---------------|-------------|
| veo-3.1 | 4K | 8s |
| kling-v3-pro | 1080p | 10s |
| sora-2-pro | 1080p | 25s |

### Get Generation
```bash
clawpow videos generation <generationId> --json
```
Get details and status of a video generation.

**Returns:** Full generation object with `{_id, status, mode, model, prompt, resolution, duration, audio, assetId, width, height, outputDurationMs, error, url, studioUrl, _creationTime}`

### List Generations
```bash
clawpow videos generations --json
clawpow videos generations --status ready --json
```
List all video generations with optional status filter.

**Options:**
- `--status <status>` -- Filter by status: pending, processing, uploading, ready, failed

**Returns:** Array of generation objects with `url` included when an asset is available

## Output Format

All commands support `--json` for structured output. Without `--json`, output is human-readable.

## Important Notes

- Always use `--json` for machine-parseable output
- Video IDs are Convex document IDs (e.g., "k57abc123def")
- Transcription and clip creation are paid operations (consume credits)
- Video listing, getting, uploading, and clip listing are free operations
- A video must be transcribed before clips can be created
- Clip generation uses AI to find the most engaging segments
- Portrait reframing automatically adjusts framing for vertical video formats
- Video generation is a paid operation (consumes credits based on model, resolution, duration, and audio)
- Available models: veo-3.1 (best quality, supports 4K), kling-v3-pro (best for cinematic/dialogue), sora-2-pro (longest duration up to 25s)

Related superskills

Works great together with these.