Multimedia Processing
Generate images, produce video, embed audio, and analyze any media — all inside your workspace
Opulent works across every media type. Generate original images on demand, produce short video clips, embed audio and YouTube content into documents, and extract insights from images and video — all woven directly into your agent workflows and Document panel.
Capabilities Overview
| Capability | What It Does | Example Use |
|---|---|---|
| Image Generation | Create custom images from text descriptions | Product mockups, illustrations, slide visuals |
| Image Understanding | Analyze and extract info from any image | Document scanning, chart re-creation, visual QA |
| Video Generation | Generate short video clips from a description | Product teasers, social content, explainer clips |
| Audio Embed | Embed playable audio clips into documents | Narration, music, recorded walkthroughs |
| Voice Output | Convert text to natural spoken audio | Voiceovers, accessibility, podcast drafts |
| Speech to Text | Transcribe audio or video to text | Meeting notes, interview transcripts |
| YouTube Embed | Embed any YouTube video inline | Reference material, tutorials, demos |
| Video Understanding | Extract insights from uploaded video | Competitor analysis, meeting summaries |
Image Generation
Quick Start
"Generate a product mockup showing our dashboard on a MacBook Pro, clean white background, professional photography style"
"Create a diagram showing our customer journey from first visit to paying customer, flat design, brand colors"
"Generate a hero image for our landing page: abstract, dark theme, glowing blue accents, wide format 1920x1080"
Common Uses
Product Visuals:
- Product mockups and prototypes
- Feature illustrations and UI concepts
- App store screenshots and demos
Marketing Assets:
- Social media graphics (square, portrait, landscape)
- Blog post header images
- Ad creatives with specified aspect ratios
Presentations:
- Custom slide backgrounds per section
- Concept illustrations for abstract ideas
- Visual metaphors that replace stock photos
Diagrams & Architecture:
- System architecture diagrams
- Process and workflow flows
- Infographics combining icons and data
Tips for Better Images
Be specific about style:
- ✅ "Flat design illustration, bright colors, icon-style"
- ✅ "Minimalist, modern, professional photography, shallow depth of field"
- ❌ "Make it look good"
Describe composition:
- ✅ "Centered subject, blurred background, natural window lighting"
- ❌ "A picture of our app"
Specify use case and format:
- ✅ "For Instagram post, square format, bold text overlay space at bottom"
- ✅ "For presentation slide, 16:9 wide format, subtle texture background"
Image Understanding
Quick Start
"Analyze this screenshot and extract all visible text into a document" (Upload image)
"What products are shown in this catalog page? Extract names, prices, and any SKU numbers visible." (Upload image)
"Recreate this chart as an editable data table with the exact values shown" (Upload image)
Common Uses
Document Processing:
- Extract text from screenshots and scans
- Parse receipts, invoices, and purchase orders
- Read handwritten notes and whiteboard photos
Visual Analysis:
- Re-create charts and graphs as live data
- Identify objects, logos, and UI elements
- Describe and annotate image content
Quality Control:
- Compare two product photos for differences
- Verify design specs are met in screenshots
- Check marketing assets for brand compliance
Example Tasks
"Extract all text from these 10 product images and organize into a spreadsheet"
"Analyze this architecture diagram and create a written description of every component and how they connect"
"Compare these two app screenshots and list every UI difference"
Video Generation
Opulent supports text-to-video generation for short clips and animations.
Quick Start
"Generate a 5-second product teaser showing our logo animating in with a dark background and particle effects"
"Create a short explainer clip for this feature: professional style, no text overlay, abstract motion graphics"
Common Uses
- Product teasers and social media clips
- Abstract motion graphics for presentations
- Animated diagrams and process illustrations
- Brand intro sequences
Checking Video Status
Video generation runs asynchronously. After requesting a video:
"Check the status of my video generation"
Once complete, the video is available in AI Drive and can be embedded directly into any Document.
Audio Embeds
Embed playable audio directly into documents and presentations — no external tools needed.
Quick Start
"Embed the audio file at this URL as an audio player in the document: https://example.com/narration.mp3"
"Add background music to this document and label it 'Product Demo Walkthrough'"
Using the toolbar: open the Document panel → click the ♪ Music icon → enter the audio URL.
Common Uses
- Meeting recordings embedded alongside notes
- Product demo narration in proposals
- Podcast clips embedded in research documents
- Accessibility versions of written content
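Embedding expects a direct audio URL, and the Common Questions section names mp3, wav, and ogg. As a small pre-check before pasting a link, the sketch below tests whether a URL is http(s) and ends in one of those extensions. The function name is hypothetical, and a URL that fails this check may still embed if it serves audio without a file extension.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions this page names for embeddable audio URLs.
DIRECT_AUDIO_EXTENSIONS = {".mp3", ".wav", ".ogg"}


def looks_like_direct_audio_url(url: str) -> bool:
    """Return True if the URL is http(s) and its path ends in a listed extension."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    # Only the path is inspected, so query strings do not affect the result.
    suffix = PurePosixPath(parsed.path).suffix.lower()
    return suffix in DIRECT_AUDIO_EXTENSIONS


print(looks_like_direct_audio_url("https://example.com/narration.mp3"))  # True
print(looks_like_direct_audio_url("https://example.com/episode"))        # False
```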
Voice Output
Convert any text to a natural audio narration.
Quick Start
"Convert this blog post to an audio narration file, professional tone"
"Create a voiceover for this presentation script: friendly, energetic, moderate pace"
"Generate audio descriptions of these 10 products for our website"
Common Uses
Content Creation:
- Blog posts converted to audio
- Podcast scripts produced as audio files
- Presentation narration tracks
Accessibility:
- Audio versions of long documents
- Voiceover tracks for screen recordings
Marketing:
- Ad voiceover copy
- Product demo narration
- Social media audio content
Voice Options
- Tone: Professional, friendly, casual, energetic, calm
- Pace: Fast, moderate, slow
- Style: Conversational, formal, educational, promotional
Speech to Text
Transcribe any audio or video file to text with speaker labels and timestamps.
Quick Start
"Transcribe this interview recording" (Upload audio file)
"Convert this 1-hour webinar recording to text. Include:
- Full transcript
- Executive summary
- Key takeaways
- Q&A section extracted separately"
"Transcribe these 20 customer support calls and identify the most common issues mentioned across all of them"
Common Uses
Meeting Notes:
- Transcribe calls and standup recordings
- Extract action items automatically
- Build searchable meeting archives
Content Repurposing:
- Convert podcasts to blog posts
- Create show notes from audio
- Extract quotable moments with timestamps
Research:
- Transcribe user interview recordings
- Analyze customer support calls at scale
- Process focus group audio
Features
- Speaker identification — Distinguish between multiple speakers
- Timestamps — Mark when each segment was said
- Formatting — Proper punctuation, paragraphs, and capitalization
- Accuracy — High accuracy across accents, background noise, and technical terminology
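To make speaker labels and timestamps concrete, here is one way a transcript could be modeled once you have it as data. This page does not specify the transcript's output format, so the `Segment` shape, the field names, and the `[HH:MM:SS] Speaker: text` layout are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcript segment; this shape is assumed, not a documented format."""
    speaker: str
    start_s: float  # offset from the start of the recording, in seconds
    text: str


def timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS, dropping fractions of a second."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render(segments: list[Segment]) -> str:
    """Render one '[HH:MM:SS] Speaker: text' line per segment."""
    return "\n".join(f"[{timestamp(s.start_s)}] {s.speaker}: {s.text}" for s in segments)


def lines_by_speaker(segments: list[Segment]) -> dict[str, list[str]]:
    """Group segment text by speaker, in order of first appearance."""
    grouped: dict[str, list[str]] = {}
    for s in segments:
        grouped.setdefault(s.speaker, []).append(s.text)
    return grouped


demo = [
    Segment("Speaker 1", 0.0, "Thanks for joining."),
    Segment("Speaker 2", 4.2, "Happy to be here."),
    Segment("Speaker 1", 3725.9, "Let's wrap up."),
]
print(render(demo))
```

Keeping the start time as a number and formatting it only for display makes it straightforward to sort segments or pull quotes with their timestamps.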
YouTube Embeds
Embed any YouTube video inline in documents and presentations.
Quick Start
"Embed this YouTube video in the document: https://youtu.be/VIDEO_ID"
Using the toolbar: open the Document panel → click the ▶ YouTube icon → paste the URL or video ID.
Common Uses
- Embed tutorial videos alongside written instructions
- Add reference demos to proposals
- Include competitor videos in research documents
- Add explainer videos to onboarding documents
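The toolbar accepts either a URL or a video ID. If you collect links in bulk, a sketch like the one below can normalize them to IDs first. It assumes the common 11-character ID form and handles only `youtu.be` short links and `watch?v=` URLs; the function name is hypothetical, and nothing here reflects how Opulent itself parses input.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters from this alphabet, as seen in practice.
_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{11}")


def youtube_video_id(value: str) -> Optional[str]:
    """Return the video ID from a bare ID, a youtu.be link, or a watch URL."""
    value = value.strip()
    if _ID_PATTERN.fullmatch(value):
        return value  # already a bare ID

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in ("youtube.com", "www.youtube.com", "m.youtube.com"):
        # Watch URLs carry the ID in the "v" query parameter.
        candidate = parse_qs(parsed.query).get("v", [""])[0]
    else:
        return None
    return candidate if _ID_PATTERN.fullmatch(candidate) else None


print(youtube_video_id("https://youtu.be/dQw4w9WgXcQ"))  # dQw4w9WgXcQ
print(youtube_video_id("https://example.com/video"))     # None
```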
Combining Multiple Modes
Opulent can combine all media capabilities in a single workflow:
Example 1: Video to Document
"Watch this product demo video, transcribe it, extract key features and pricing mentioned, generate a summary diagram, and write a blog post with all findings embedded in the Document panel"
Example 2: Presentation with Full Media
"Create a 10-slide presentation about our Q4 results. Generate a custom illustration for each section. Add a voiceover narration track for the full deck. Embed a reference YouTube tutorial in the appendix slide."
Example 3: Bulk Image Analysis to Report
"Analyze these 50 product photos, extract visible text and product details from each, generate a comparison chart, and create a slide deck with findings and side-by-side image comparisons"
Quick Use Cases
| Use Case | Input | Output |
|---|---|---|
| Product Mockups | Description | Generated image |
| Meeting Notes | Video / audio recording | Transcript + summary |
| Blog Audio | Text article | Audio narration file |
| Document Scanning | Photo of document | Extracted text |
| Video Analysis | Competitor video URL | Feature comparison |
| Podcast Show Notes | Audio file | Transcript + summary |
| Slide Visuals | Topic description | Custom generated images |
| Demo Narration | Presentation script | Voiceover audio |
| YouTube Reference | URL | Embedded player in doc |
Common Questions
What image formats are supported?
PNG, JPG, WEBP, GIF, and SVG for uploads. Generated images are delivered as PNG or JPEG at the requested dimensions.
What video formats work for upload?
MP4, MOV, WEBM, and most other common formats. Generated videos are delivered as MP4.
What audio formats work for transcription?
MP3, WAV, M4A, WEBM, OGG, and most other common audio formats.
Can I generate images in specific sizes?
Yes. Specify dimensions: "Generate a 1920x1080 image..." or "Square format, 1:1 ratio for Instagram".
How accurate is speech transcription?
Accuracy is high, including on recordings with multiple speakers, accents, and technical vocabulary.
Can I embed audio from an external URL?
Yes. Any direct audio URL (mp3, wav, ogg) can be embedded as an audio player in Documents.
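When preparing a batch of uploads, it can help to sort files by the formats named in the answers above. The lookup below is a sketch built only from those lists; `LISTED_FORMATS` and `listed_media_kinds` are hypothetical names. Since the answers also say most common formats work, an empty result means "not explicitly listed", not "unsupported".

```python
from pathlib import PurePath

# Extensions named explicitly in the answers above. Absence from this table
# is not a rejection, because other common formats are said to work too.
LISTED_FORMATS = {
    "image": {".png", ".jpg", ".webp", ".gif", ".svg"},
    "video": {".mp4", ".mov", ".webm"},
    "audio": {".mp3", ".wav", ".m4a", ".webm", ".ogg"},
}


def listed_media_kinds(filename: str) -> list[str]:
    """Return every media kind whose listed formats include this extension."""
    suffix = PurePath(filename).suffix.lower()
    return [kind for kind, extensions in LISTED_FORMATS.items() if suffix in extensions]


print(listed_media_kinds("demo.MOV"))   # ['video']
print(listed_media_kinds("call.webm"))  # ['video', 'audio']
print(listed_media_kinds("notes.txt"))  # []
```

WEBM appears under both video upload and audio transcription, which is why the function returns a list and not a single kind.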
Bottom line: Opulent handles every media type natively — generate images, produce video, embed audio, transcribe speech, and analyze visual content — all integrated into your documents, slides, and agent workflows without leaving the platform.