roadmap · parked

Visual Pairing System.

Audio → visual identity layer for Hook Objects. Each visual pairing should make the song feel more immediate, more artist-authored, and more shareable.

status: parked

owner: product + eng

last updated: 2026-05-11

Casset's thesis is artist homes for audiovisual identity in the generative media era. Visual Pairing is useful only when it strengthens the Hook Object: the audio moment, lyric timing, visual identity, share artifact, provenance, and return path. It should not become a standalone visual marketplace or generated-content layer before the world format is loved.

1. Product overview

What it is

A system that lets every track on Casset carry multiple visual interpretations. Today a track has one cover image; going forward it has a first-class library of visuals: short vertical loops (5–30s) that can be paired with a hook and exported as a 9:16 video.

Sources of visuals:

Artist-uploaded: the artist's own aesthetic (photos, stills, disposable-camera imagery, BTS, short clips).
Release-native: cover art, saved stills, and Shader Lab treatments already attached to the hook.
Community-created (future): filmmakers, motion designers, and 3D artists submit visuals to a track or to a marketplace.
Feature-flagged generation (future): the architecture stays parked for later premium packs or alternate pressings and does not define V1.

Core artist actions

Browse visuals associated with their track.
Select a visual as the "active" pairing.
Export the hook as a 1080×1920 video using that visual.
Share — TikTok / IG / X / Casset link.

Why it matters

Identity over decoration. Most artists don't have a director on retainer. A cover image is thin. A moving visual is a brand.
Share-worthy by default. Vertical video is the unit of distribution. A static cover on a hook clip loses to any competitor with motion.
Creator flywheel. Visual makers currently have no way to build a career around music. Casset becomes the exchange.
Network effect per track. Every track becomes surface area for multiple creators → more shares, more variants, more discovery.

2. Foundation — already shipped

We are not building this from scratch. The pairing system is mostly a composition of systems already in production. Every row below is a primitive we reuse in V1–V3.

| Primitive | Where it lives | Status |
| --- | --- | --- |
| Track upload + hook selection | Track.previewStartSec, lib/hook-constants.ts, components/hooks/ | shipped |
| 1080×1920 client-side video export | lib/tiktok-video.ts: generateTikTokVideo(). Canvas + WebCodecs + mp4-muxer + Meyda. No server ffmpeg. | shipped |
| Waveform rendering pipeline | Meyda-driven 48-bar visualizer with accent gradient; analyzeAudioAsync() + renderFrame() in lib/tiktok-video.ts | shipped |
| Quality presets (high/standard/low) | pickQualityPreset() in lib/tiktok-video.ts, with auto-fallback on mobile | shipped |
| Export progress UI | app/preview/ExportProgressModal.tsx | shipped |
| Session cache for exports | videoCache keyed on trackId:start:duration[:shareUrl] | shipped |
| Profile-based identity | Artist, User models; /u/[username]; preview pages | shipped |
| Stripe monetization + Connect payouts | Artist.stripeAccountId, User.stripeAccountId, /api/checkout/*, bounty payout pipeline in lib/bounty-payout.ts | shipped |
| Sharing + short URLs | HookShare model, /s/{shareId}, referral attribution | shipped |
| Generic media upload (video accepted) | /api/upload/media: mp4/mov/webm up to 500MB, magic-byte validation in lib/magic-bytes.ts | shipped |
| Artist bonus media (pattern) | ArtistMedia model: per-artist; useful precedent, not the target surface | shipped |
| Fan-submitted video pattern | HookVideoSubmission: per-track, PENDING/APPROVED/REJECTED moderation + upload source + attribution. Schema-level blueprint for community visuals. | shipped |
| Winner selection + Stripe transfer | Bounty pipeline: one-winner-per-track, Stripe Connect transfer. Direct analog to future visual-creator rev share. | shipped |
| Generation pipeline abstraction (parked) | VisualGenerationJob, visual-pack metadata, and Casset Studios runtime hooks remain feature-flagged for a later expansion path. | shipped |

Implication: V1 of Visual Pairing is a schema addition + a ~150-line extension to the existing generateTikTokVideo() loop + a single upload UI. No new infra primitives required.

3. User flows

A. Artist flow

Artist opens track in Studio
  → "Visuals" tab (new)
  → V1: Upload one visual (MP4/MOV/WebM, ≤ 30s, 9:16 recommended)
  → V2+: Browse attached visuals (own + community + AI) — filter by mood/BPM/tag
  → Tap visual to preview with the hook (real-time, in-app)
  → Select "Use this visual" → becomes active pairing for the track
  → Export hook video (reuses existing TikTok-ready pipeline, now with visual layer)
  → Share sheet → TikTok / IG / X / copy link / Casset share URL

B. Visual creator flow (V3)

Creator signs up → claims handle → completes VisualCreator profile
  → Upload a visual (loop) — mp4/mov/webm, ≤ 30s
  → Tag: mood (chill / dark / euphoric / nostalgic / ...), BPM range, genre affinity,
         aesthetic (film / 3D / anime / generative)
  → Either:
     (a) Attach directly to a specific track (if invited / open submission)
     (b) Submit to marketplace — discoverable by any artist
  → Optional: set licensing tier (free / paid / rev-share)
  → Dashboard: impressions, pairings, exports, shares, earnings

C. Viewer / fan flow

Viewer lands on a hook preview (feed card, share link, profile page)
  → Sees the artist's currently active visual looping behind the hook
  → Taps the "visual switcher" affordance (bottom-left chip, say)
  → Swipes through alternative visuals for this track
  → Can share "this version" — the share URL encodes the visual choice
  → Recipient opens the link → same track, same hook, that visual → exportable as their own share

Fans don't edit — they curate. The act of sharing a specific pairing is a feature.

4. Feature breakdown (V1 → V3)

V1: MVP (ship fast, 2–3 weeks)

Goal: prove the creative loop (upload visual → export video → share) end-to-end with minimum surface.

Artist can upload a single visual per track. One VisualAsset row, kind = ARTIST, isActive = true.
Visual replaces the static cover background in the export pipeline. Waveform bars, avatar, title typography, watermark all remain.
Preview plays the visual behind the audio hook on the track detail page and in the export modal.
Fallback: tracks without a visual render exactly as today (no regression).
No marketplace, no AI, no tags, no moderation queue — artist-only uploads are implicitly trusted (same trust model as their audio upload today).
Storage: Vercel Blob via existing /api/upload/media. MP4/MOV/WebM, ≤ 30s, ≤ 500MB already enforced.
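
The V1 upload limits above can be checked with a small pure function before a VisualAsset row is created. This is a minimal sketch, not the existing /api/upload/media logic: the function name validateVisualUpload, the MIME strings, and reading "500MB" as 500 × 1024 × 1024 bytes are all assumptions made for illustration.

```typescript
// Hypothetical pre-create check for a V1 visual upload.
// Limits come from the V1 list; MIME strings and the binary reading of
// "500MB" are assumptions of this sketch.
const MAX_DURATION_SEC = 30;
const MAX_BYTES = 500 * 1024 * 1024;
const ALLOWED_MIME = new Set(["video/mp4", "video/quicktime", "video/webm"]);

interface VisualUploadMeta {
  mime: string;
  sizeBytes: number;
  durationSec: number;
}

// Returns a list of human-readable problems; an empty list means "accept".
function validateVisualUpload(meta: VisualUploadMeta): string[] {
  const errors: string[] = [];
  if (!ALLOWED_MIME.has(meta.mime)) {
    errors.push(`unsupported type: ${meta.mime}`);
  }
  if (meta.sizeBytes > MAX_BYTES) {
    errors.push("file exceeds 500MB");
  }
  // The negated comparison also rejects NaN durations.
  if (!(meta.durationSec > 0) || meta.durationSec > MAX_DURATION_SEC) {
    errors.push("duration must be greater than 0 and at most 30 seconds");
  }
  return errors;
}
```

Returning a list instead of throwing lets the upload UI show every problem at once.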

Out of scope for V1: multiple visuals per track, fan-facing switcher, generated visuals, any revenue share, moderation UI.

V2: Multi-visual + browse

Goal: each track becomes an actual library, and the fan switcher exists.

Multiple VisualAsset rows per track; one marked isActive.
Artist UI: "Add visual" → upload or select from their previous uploads. Reorder, archive.
Tag system (lightweight): mood, genre, optional bpmMin/bpmMax. Free-text allowed but normalized via a small allowlist server-side.
Fan visual switcher on preview + feed cards. Swipe between visuals without interrupting audio.
Share URL encodes visual selection. ?v={visualAssetId} resolves server-side so the exported video inherits the chosen visual.
Export cache key updated to include visual ID.
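
The ?v={visualAssetId} convention can be sketched with the standard URL API. The helper names withVisualChoice and readVisualChoice and the casset.example host in the usage note are hypothetical; only the v parameter comes from the list above.

```typescript
// Hypothetical helpers for encoding a fan's visual choice in a share URL.
const VISUAL_PARAM = "v";

// Returns the share URL with ?v=<visualAssetId> set (replacing any prior value).
function withVisualChoice(shareUrl: string, visualAssetId: string): string {
  const url = new URL(shareUrl);
  url.searchParams.set(VISUAL_PARAM, visualAssetId);
  return url.toString();
}

// Returns the chosen visual ID, or null when the link carries no choice.
// A null result means "fall back to the track's active visual".
function readVisualChoice(shareUrl: string): string | null {
  const id = new URL(shareUrl).searchParams.get(VISUAL_PARAM);
  return id !== null && id.length > 0 ? id : null;
}
```

For example, withVisualChoice("https://casset.example/s/abc123", "vis_1") yields a link whose recipient resolves to that visual; the server would still need to confirm the asset is READY and belongs to the track.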

V3: Marketplace layer

Goal: creator ecosystem with attribution + optional payout.

VisualCreatorProfile — distinct identity surface (reuses User; adds profile fields + optional Stripe payout via existing User.stripeAccountId).
VisualSubmission — creator-submitted visuals with PENDING/APPROVED/REJECTED (modeled directly on HookVideoSubmission).
Marketplace browse — artists search visuals by tag, mood, BPM, creator. Attach with one tap → creates a VisualAsset referencing the submission.
Attribution: exported video frames carry a subtle "visuals by @handle" watermark line below the existing watermark. It is always shown and cannot be removed.
Metrics per visual: impressions (fan views), pairings (times selected by an artist), exports, shares.
Optional rev-share — creators mark a visual as paid (one-time unlock cents) or rev-share (% of future bounty pool). Payouts ride the existing Stripe Connect transfer pipeline (lib/bounty-payout.ts).
Feature-flagged visual generation future: optional alternate visual packs seeded by hook identity, palette, lyrics, and artist visual DNA. Stored with provenance metadata, but kept out of the primary Studios workflow until quality and positioning are ready.

5. Data model

All new tables. Prisma-shaped — aligned to existing Casset conventions (cuid, createdAt/updatedAt, camelCase, @@index where queried, app-layer enforcement of single-active matching the Bounty winner pattern).

enum VisualKind {
  ARTIST       // uploaded by the track's artist
  COMMUNITY    // submitted by a VisualCreator, approved
  AI           // feature-flagged future generation provider
}

enum VisualStatus {
  PROCESSING   // upload/render in flight
  READY        // usable in pairings + exports
  FAILED       // transcode or generation error
  ARCHIVED     // soft-hidden by owner
}

enum VisualSubmissionStatus { PENDING  APPROVED  REJECTED }

model VisualAsset {
  id            String       @id @default(cuid())
  trackId       String
  track         Track        @relation(fields: [trackId], references: [id], onDelete: Cascade)

  kind          VisualKind
  status        VisualStatus @default(PROCESSING)

  // Source
  uploaderUserId    String?
  uploader          User?    @relation("VisualUploader", fields: [uploaderUserId], references: [id], onDelete: SetNull)
  submissionId      String?  @unique
  submission        VisualSubmission? @relation(fields: [submissionId], references: [id])

  // AI provenance (kind = AI only)
  sourceModel       String?
  sourcePromptHash  String?
  sourcePromptText  String?  @db.VarChar(2000)

  // Media
  videoUrl          String   @db.VarChar(2048)
  posterImageUrl    String?  @db.VarChar(2048)
  durationSec       Float
  widthPx           Int
  heightPx          Int
  fps               Int?

  // Pairing
  isActive          Boolean  @default(false)

  // Metrics (denormalized; authoritative counts via events)
  impressions       Int      @default(0)
  pairings          Int      @default(0)
  exports           Int      @default(0)
  shares            Int      @default(0)

  tags              VisualTag[] @relation("VisualAssetTags")

  createdAt         DateTime @default(now())
  updatedAt         DateTime @updatedAt

  @@index([trackId, status])
  @@index([trackId, isActive])
  @@index([uploaderUserId, createdAt])
  @@index([kind, status, createdAt])
}

model VisualTag {
  id        String  @id @default(cuid())
  slug      String  @unique       // "chill", "dark", "film", "3d", ...
  label     String
  category  String                 // "mood" | "aesthetic" | "genre"
  assets    VisualAsset[] @relation("VisualAssetTags")

  @@index([category])
}

model VisualSubmission {
  id               String                  @id @default(cuid())
  submitterUserId  String
  submitter        User                    @relation(fields: [submitterUserId], references: [id], onDelete: Cascade)

  targetTrackId    String?
  targetTrack      Track?                  @relation(fields: [targetTrackId], references: [id], onDelete: SetNull)

  videoUrl         String  @db.VarChar(2048)
  posterImageUrl   String? @db.VarChar(2048)
  durationSec      Float
  widthPx          Int
  heightPx         Int

  note             String? @db.VarChar(280)
  pitchTagsJson    Json?

  status           VisualSubmissionStatus @default(PENDING)
  reviewerUserId   String?
  reviewedAt       DateTime?
  rejectedReason   String? @db.VarChar(280)

  // Licensing
  licenseKind      String  @default("FREE")  // "FREE" | "PAID_ONE_TIME" | "REV_SHARE"
  priceCents       Int?
  revSharePct      Int?

  createdAt        DateTime @default(now())
  updatedAt        DateTime @updatedAt

  asset            VisualAsset?

  @@index([status, createdAt])
  @@index([targetTrackId, status])
  @@index([submitterUserId, createdAt])
}

model VisualCreatorProfile {
  id                String   @id @default(cuid())
  userId            String   @unique
  user              User     @relation(fields: [userId], references: [id], onDelete: Cascade)

  displayName       String?
  bio               String?  @db.VarChar(500)
  toolsJson         Json?
  websiteUrl        String?
  reelUrl           String?

  // Aggregates (computed, not source-of-truth)
  totalSubmissions   Int   @default(0)
  totalApprovals     Int   @default(0)
  totalPairings      Int   @default(0)
  totalExports       Int   @default(0)
  totalEarningsCents Int   @default(0)

  createdAt          DateTime @default(now())
  updatedAt          DateTime @updatedAt
}

Relationships

Track 1…N VisualAsset (cascade on delete)
Track 1…N VisualSubmission (nullable target for marketplace-only submissions)
VisualAsset N…N VisualTag
VisualSubmission 1…0/1 VisualAsset (approval creates the asset)
User 1…N VisualAsset (as uploader)
User 1…0/1 VisualCreatorProfile

Why not reuse `ArtistMedia` or `HookVideoSubmission`?

ArtistMedia is per-artist, not per-track, and is positioned as bonus download material — conceptually and relationally wrong surface.
HookVideoSubmission is per-track but represents fan-made clips of the hook (i.e. another artifact — people reacting to the track), not reusable visual beds. Keeping them separate avoids overloaded semantics in queries, metrics, and payouts.

6. API design

All routes under /api/visuals/* and /api/tracks/[trackId]/visuals/*. Auth + rate limits follow existing patterns (getUserIdFromRequest, rateLimit).

Upload + create

POST /api/upload/media                          ← already exists, reused as-is
  → returns { url, sizeBytes, mime, category: "video" }

POST /api/tracks/:trackId/visuals               ← new
  body: { videoUrl, posterImageUrl?, durationSec, widthPx, heightPx,
          kind: "ARTIST", tagSlugs?: string[] }
  → creates VisualAsset (PROCESSING → READY after server-side probe)
  → 403 if caller is not the track's artist (unless submission flow)

Browse / fetch

GET  /api/tracks/:trackId/visuals               ← new
  ?kind=ARTIST|COMMUNITY|AI&active=true|false&limit=&cursor=
  → paginated VisualAssets (READY only) with tags, creator, metrics
  → anonymous callers allowed (same visibility as hook preview)

GET  /api/visuals/:id                           ← new
  → single asset with creator profile expanded

Selection (set active pairing)

PATCH /api/tracks/:trackId/visuals/:id/active   ← new
  body: { active: true }
  → artist-only; transactionally clears previous active, sets this one active
  → pattern mirrors /api/hooks/submissions/[id]/winner/route.ts
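
The single-active invariant behind this route can be modeled as a pure function over in-memory rows. This is a sketch of the rule only: a real handler would presumably do the clear-then-set inside one database transaction, and both the setActiveVisual name and the requirement that the target be READY are assumptions not stated in the route description.

```typescript
// In-memory model of "clear previous active, set this one active".
type VisualStatus = "PROCESSING" | "READY" | "FAILED" | "ARCHIVED";

interface VisualAssetRow {
  id: string;
  trackId: string;
  status: VisualStatus;
  isActive: boolean;
}

// Returns a new array where exactly one row on the track is active.
// Rows on other tracks are returned untouched.
function setActiveVisual(
  rows: readonly VisualAssetRow[],
  trackId: string,
  assetId: string,
): VisualAssetRow[] {
  const target = rows.find((r) => r.id === assetId && r.trackId === trackId);
  if (!target) {
    throw new Error("visual not found on this track");
  }
  // Assumption: only READY assets may become the active pairing.
  if (target.status !== "READY") {
    throw new Error("only READY visuals can be activated");
  }
  return rows.map((r) =>
    r.trackId === trackId ? { ...r, isActive: r.id === assetId } : r,
  );
}
```

Because the function rebuilds every row on the track, it also repairs a track that somehow ended up with two active rows.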

Export

POST /api/hooks/tiktok-video                    ← already exists
  body: { trackId, hookStartSec, hookDurationSec, visualAssetId?, shareUrl? }
  → server returns signed URLs + metadata; encoding runs client-side
    via generateTikTokVideo()

Submission flow (V3)

POST /api/visuals/submissions                   ← new (marketplace OR targeted)
  body: { targetTrackId?, videoUrl, ..., licenseKind,
          priceCents?, revSharePct?, pitchTagsJson }

GET  /api/visuals/submissions                   ← new (admin + submitter's own list)
POST /api/visuals/submissions/:id/approve       ← new (admin) → creates VisualAsset
POST /api/visuals/submissions/:id/reject        ← new (admin) → sets rejectedReason

Attach (artist accepts a community visual)

POST /api/tracks/:trackId/visuals/attach        ← new (V3)
  body: { submissionId }
  → creates a VisualAsset linked to the submission
  → if licenseKind=PAID_ONE_TIME: creates a Stripe PaymentIntent (artist pays)
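  → if REV_SHARE: records the rev-share terms for the future payout split (section 5 defines licenseKind on VisualSubmission only, so the asset would either need its own field or read the terms through submissionId)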

Metrics ingestion

POST /api/visuals/:id/event                     ← new, fire-and-forget
  body: { type: "impression" | "pairing" | "export" | "share" }
  → rate-limited, client-fired on actual UX events
  → increments denormalized counters, emits to an analytics sink
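
A minimal sketch of the validate-then-increment step for this endpoint, using the four event types from the body above and the counter fields from the VisualAsset model. The names applyVisualEvent and COUNTER_FOR_EVENT are hypothetical, and rate limiting and the analytics sink are left out.

```typescript
// Event types from the endpoint body; counter names from VisualAsset.
type VisualEventType = "impression" | "pairing" | "export" | "share";

interface VisualCounters {
  impressions: number;
  pairings: number;
  exports: number;
  shares: number;
}

// Maps each event type to the denormalized counter it bumps.
const COUNTER_FOR_EVENT: Record<VisualEventType, keyof VisualCounters> = {
  impression: "impressions",
  pairing: "pairings",
  export: "exports",
  share: "shares",
};

// Type guard for untrusted request bodies.
function isVisualEventType(value: unknown): value is VisualEventType {
  return (
    typeof value === "string" &&
    Object.prototype.hasOwnProperty.call(COUNTER_FOR_EVENT, value)
  );
}

// Returns a copy of the counters with the matching field incremented by one.
function applyVisualEvent(
  counters: VisualCounters,
  rawType: unknown,
): VisualCounters {
  if (!isVisualEventType(rawType)) {
    throw new Error("unknown visual event type");
  }
  const field = COUNTER_FOR_EVENT[rawType];
  return { ...counters, [field]: counters[field] + 1 };
}
```

Keeping the mapping in one table means adding an event type later is a change to the type and one table entry.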

7. Export system integration

The highest-leverage integration point. The existing pipeline in lib/tiktok-video.ts already handles audio decode, waveform analysis, overlay rendering, encoding, and muxing. We're not rebuilding it; V1 swaps the background draw call.

Current architecture

generateTikTokVideo() runs entirely in the browser:

Fetch cover image + avatar + audio in parallel.
Decode audio via OfflineAudioContext.
Run Meyda on the mono downmix to produce per-frame waveform bars.
Pre-render a static background canvas (cover image + gradient + overlays).
Per frame: drawImage(bgCanvas) → draw waveform → draw avatar/text/watermark.
Encode via VideoEncoder, mux to MP4 via mp4-muxer.

The static background is the extension point.

V1 integration: video background

Accept optional visualAssetId in TikTokVideoParams.
Resolve to a videoUrl server-side (signed if needed); pass to the client.
Client-side: instead of pre-rendering a static bg canvas, load an offscreen <video> element, seek to 0, play muted + looped, and drawImage(videoEl, ...) on each frame in the encoding loop.
The waveform, avatar, typography, and watermark all layer on top unchanged.
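
If the encoding loop does not run in real time, a free-playing looped video may drift relative to the encoded frames. One alternative is to derive the source time from the frame index and seek to it before each draw. This is a swapped-in technique for illustration, not the play-and-draw approach described above, and loopedSourceTime is a hypothetical helper.

```typescript
// Maps an output frame index to a timestamp inside a looping visual.
// frameIndex: 0-based index of the frame being encoded
// fps: output frame rate of the export
// visualDurationSec: length of the source loop in seconds
function loopedSourceTime(
  frameIndex: number,
  fps: number,
  visualDurationSec: number,
): number {
  if (fps <= 0 || visualDurationSec <= 0) {
    throw new RangeError("fps and visualDurationSec must be positive");
  }
  const outputTimeSec = frameIndex / fps;
  // Modulo wraps the output clock back into the loop's own timeline.
  return outputTimeSec % visualDurationSec;
}
```

With a 4-second loop at 30 fps, frame 150 (5 seconds into the export) maps back to 1 second into the loop. Seeking per frame costs more than free playback, so this trades export speed for repeatable output.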

Aspect ratio handling (9:16)

Target surface is 1080×1920 logical. If the source visual is:
9:16 already → cover-fill, no crop.
16:9 or 1:1 → center-crop with a slight zoom (preserve the "identity" feel vs letterboxing).
Portrait but not 9:16 → fit with blurred backdrop extension (reuse the cover-image treatment we already have).
Decision function lives in lib/visuals/fit.ts (new), returns { sx, sy, sw, sh, dx, dy, dw, dh } consumed by drawImage.
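
The three cases above could be sketched as follows. The return shape matches the drawImage arguments named in the text; the needsBackdrop flag, the computeFit name, the aspect tolerance, and the default zoom of 1.05 are assumptions, since the text only says "slight zoom".

```typescript
// Hypothetical shape for lib/visuals/fit.ts.
interface FitResult {
  sx: number; sy: number; sw: number; sh: number; // source rectangle
  dx: number; dy: number; dw: number; dh: number; // destination rectangle
  needsBackdrop: boolean; // true when a blurred backdrop must fill the gaps
}

const TARGET_W = 1080;
const TARGET_H = 1920;
const ASPECT_TOLERANCE = 0.01; // assumption: how close counts as "already 9:16"

function computeFit(srcW: number, srcH: number, zoom = 1.05): FitResult {
  if (srcW <= 0 || srcH <= 0 || zoom < 1) {
    throw new RangeError("source size must be positive and zoom >= 1");
  }
  const targetAspect = TARGET_W / TARGET_H;
  const srcAspect = srcW / srcH;
  const fullDest = { dx: 0, dy: 0, dw: TARGET_W, dh: TARGET_H };

  // Case 1: already 9:16, so scale the whole frame with no crop.
  if (Math.abs(srcAspect - targetAspect) < ASPECT_TOLERANCE) {
    return { sx: 0, sy: 0, sw: srcW, sh: srcH, ...fullDest, needsBackdrop: false };
  }
  // Case 2: landscape or square, so center-crop a 9:16 window (zoomed in).
  if (srcAspect >= 1) {
    const sh = srcH / zoom;
    const sw = sh * targetAspect;
    return {
      sx: (srcW - sw) / 2, sy: (srcH - sh) / 2, sw, sh,
      ...fullDest, needsBackdrop: false,
    };
  }
  // Case 3: portrait but not 9:16, so fit inside and let a backdrop fill.
  const scale = Math.min(TARGET_W / srcW, TARGET_H / srcH);
  const dw = srcW * scale;
  const dh = srcH * scale;
  return {
    sx: 0, sy: 0, sw: srcW, sh: srcH,
    dx: (TARGET_W - dw) / 2, dy: (TARGET_H - dh) / 2, dw, dh,
    needsBackdrop: true,
  };
}
```

The caller would pass the eight numbers straight to drawImage(video, sx, sy, sw, sh, dx, dy, dw, dh) and draw the blurred backdrop first when needsBackdrop is true.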

Watermarking

Existing watermark (Casset logo + optional share URL) stays unchanged.
V3 only: when the active visual is kind != ARTIST, append a second line: visuals by @creatorHandle, rendered in the same watermark block at slightly reduced opacity. Always present — this is the attribution guarantee that keeps creators contributing.

Waveform rendering

No changes to Meyda analysis.
Contrast guard: sample the visual's mean luminance across the hook duration. If the visual is very bright, darken the waveform container with a 30–40% bottom gradient scrim. Keeps bars legible without muddying the visual.
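
The contrast guard could be sketched as two pure functions over sampled RGBA pixels, such as the data of a canvas getImageData() call on a few downscaled frames. The Rec. 709 luma weights are standard; applying them directly to gamma-encoded values is an approximation, and the 0.7 brightness threshold and both function names are assumptions.

```typescript
// Mean luma of an RGBA buffer, normalized to 0..1.
// Uses Rec. 709 weights on gamma-encoded values (an approximation).
function meanLuminance(rgba: Uint8ClampedArray): number {
  const pixelCount = Math.floor(rgba.length / 4);
  if (pixelCount === 0) return 0;
  let sum = 0;
  for (let i = 0; i < pixelCount * 4; i += 4) {
    sum += 0.2126 * rgba[i] + 0.7152 * rgba[i + 1] + 0.0722 * rgba[i + 2];
  }
  return sum / (pixelCount * 255);
}

// Maps brightness to a scrim opacity: 0 below the threshold, then a linear
// ramp from 0.30 to 0.40 as the visual approaches pure white.
function scrimOpacity(meanLuma: number, brightThreshold = 0.7): number {
  if (meanLuma < brightThreshold) return 0;
  const t = Math.min(1, (meanLuma - brightThreshold) / (1 - brightThreshold));
  return 0.3 + 0.1 * t;
}
```

Averaging meanLuminance over a handful of frames spread across the hook duration would approximate "mean luminance across the hook" without decoding every frame.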

Performance

Per-frame drawImage on a decoded video is cheap on modern hardware (WebCodecs + VideoFrame from HTMLVideoElement is the fast path).
Existing quality presets (high/standard/low) already auto-downgrade on mobile — no new branching.
Cache key gains visualAssetId: ${trackId}:${start}:${duration}:${visualAssetId}:${shareUrl}.
Budget target: export time stays under 30s on mid-tier mobile for a 30s hook with visual.
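
A small builder keeps the cache key stable when optional parts are missing. This is a sketch under assumptions: buildExportCacheKey is a hypothetical name, and the "none" placeholder for tracks without a visual is an illustrative choice so the literal string "undefined" never lands in a key.

```typescript
interface ExportCacheKeyParts {
  trackId: string;
  start: number;
  duration: number;
  visualAssetId?: string | null;
  shareUrl?: string | null;
}

// Builds trackId:start:duration:visualAssetId[:shareUrl].
// shareUrl goes last because URLs contain ":" themselves.
function buildExportCacheKey(parts: ExportCacheKeyParts): string {
  const visual = parts.visualAssetId ?? "none"; // placeholder is an assumption
  const segments = [
    parts.trackId,
    String(parts.start),
    String(parts.duration),
    visual,
  ];
  if (parts.shareUrl) {
    segments.push(parts.shareUrl);
  }
  return segments.join(":");
}
```

Switching the active visual then produces a different key, so a cached export of the old pairing is never served for the new one.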

Fallback

Track with no active visual → existing codepath, zero behavioral change.
Visual fails to decode / load → falls back to static cover bg (same codepath as today), non-fatal. Log + surface a toast.

8. Matching system (future-facing)

We start dumb, we end smart. The V1 data model must not close the door on V3 matching.

Phase 1 — Tags (V2)

Artist visuals and community submissions both carry tags.
UI exposes tag filters: mood, aesthetic, genre, BPM range.
"Suggested visuals" on a track = intersection(track genre tags, visual tags) ranked by recency + pairing count.
Zero ML. Zero embeddings. Ship this.
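
The tag-intersection suggestion can be sketched in a few lines. The text says "ranked by recency + pairing count" without giving a formula, so the score below (log of pairings plus a recency term that halves at 30 days) is purely an assumed placeholder, as are the suggestVisuals and rankScore names.

```typescript
interface VisualCandidate {
  id: string;
  tagSlugs: string[];
  pairings: number;
  createdAt: Date;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Assumed ranking: log-damped pairing count plus a recency bonus in (0, 1].
function rankScore(visual: VisualCandidate, now: Date): number {
  const ageDays = Math.max(
    0,
    (now.getTime() - visual.createdAt.getTime()) / MS_PER_DAY,
  );
  return Math.log1p(visual.pairings) + 1 / (1 + ageDays / 30);
}

// Keeps visuals sharing at least one tag with the track, best score first.
function suggestVisuals(
  trackTags: string[],
  visuals: VisualCandidate[],
  now: Date = new Date(),
): VisualCandidate[] {
  const wanted = new Set(trackTags);
  return visuals
    .filter((v) => v.tagSlugs.some((tag) => wanted.has(tag)))
    .sort((a, b) => rankScore(b, now) - rankScore(a, now));
}
```

The log damping keeps one heavily paired visual from permanently outranking everything new; the weights would need tuning against real usage.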

Phase 2 — Audio-feature suggestion (V2.5)

Reuse Meyda to extract per-track features on upload (RMS, spectral centroid, tempo estimate, mode).
Persist a compact TrackFeatures row (separate from this doc's schema).
Score VisualAsset tags against track features with a hand-tuned weight table. Good enough for "this is lofi → surface muted, slow visuals".

Phase 3 — Embedding-based matching (V3+)

Embed visuals via a CLIP-style model (frames sampled every 1s → mean-pooled).
Embed tracks via an audio embedding model (prototype with existing features → swap for a dedicated API when available).
Joint match: score = cosine(trackEmb, visualEmb) + tag_prior + recency_boost + creator_quality.
Recommender endpoint: GET /api/tracks/:id/visuals/recommendations.
Design constraint: every recommendation must be explainable to the artist ("matched on: moody, nocturnal, 80–90 BPM"). No black-box.
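
The joint score above is a plain sum, so it can be sketched with a cosine helper and a breakdown object. Returning each term separately is one way to serve the explainability constraint; the scoreMatch name and the MatchPriors shape are hypothetical, and how the priors are computed is left open.

```typescript
// Cosine similarity of two equal-length vectors; 0 if either is all zeros.
function cosine(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new RangeError("embeddings must be non-empty and equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

interface MatchPriors {
  tagPrior: number;
  recencyBoost: number;
  creatorQuality: number;
}

interface MatchScore extends MatchPriors {
  cosine: number;
  total: number;
}

// score = cosine(trackEmb, visualEmb) + tag_prior + recency_boost + creator_quality
function scoreMatch(
  trackEmb: readonly number[],
  visualEmb: readonly number[],
  priors: MatchPriors,
): MatchScore {
  const similarity = cosine(trackEmb, visualEmb);
  const total =
    similarity + priors.tagPrior + priors.recencyBoost + priors.creatorQuality;
  return { cosine: similarity, ...priors, total };
}
```

Because the terms are returned alongside the total, the UI can say which term dominated a recommendation rather than showing only a number.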

Reusing the Intelligence Layer

Casset already has lib/intelligence/ with an OpenAI + rule-based fallback and a coach-voice tone system. The matching recommender fits this layer cleanly — same fallback ladder, same UI voice.

9. Risks & constraints

Content moderation

Risk: community visuals and any future generated packs can carry NSFW, copyrighted, or violent content.
V1 (artist-only) has the same trust profile as existing audio uploads → no new surface.
V2: no community uploads yet.
V3: mandatory moderation queue (VisualSubmission.status = PENDING), modeled on HookVideoSubmission. Auto-screen with a provider (AWS Rekognition / Hive / Cloudflare Images moderation), human review for edge cases.
Report-to-moderation action on every played visual. One-strike takedown.

Copyright / ownership

Risk: creators submit visuals containing third-party footage they don't own.
Upload agreement (ToS checkbox) — explicit representation of ownership/license.
DMCA handler endpoint + takedown workflow (reuse the bounty takedown pattern).
V3: payout release delayed 7 days post-approval to allow challenges before funds move.

Low-quality spam visuals

Soft rate limit: 3 pending submissions per creator.
Require tag completeness + poster frame.
Auto-dedupe on sourcePromptHash for AI kind; perceptual-hash on video for uploads.
Creator reputation (reuse PromoterReputation pattern): approval rate, pairing rate, flag count.

Performance / cost of video rendering

Client-side path (V1–V3) reuses the existing budget. Only new cost is the extra drawImage(video) per frame — negligible vs encode.
AI generation is the cost center. Gate behind credits (Casset already has CreditLedger).
Server-side export considered only if WebCodecs coverage drops or the quality ceiling becomes binding. Not on the roadmap.

Storage

Vercel Blob. Visuals ≤ 30s at reasonable bitrate (≤ 10MB typical). A track with 20 visuals of that size is about 200MB.
Lifecycle rule: ARCHIVED assets drop to cold storage after 30 days; FAILED assets purge after 7.

Export file size

Existing MAX_OUTPUT_BYTES = 15 * 1024 * 1024 (15MB) ceiling in lib/tiktok-video.ts already bounds output. Adding a video bg doesn't change this — dynamic bitrate already adapts to duration.

10. UX principles

Visuals are identity, not decoration. The pairing UI lives next to the track, not under "extras". The export preview shows the visual as the dominant surface, not a thumbnail.
One-tap everything. Selection = tap. Preview = instant. Export = one tap → progress modal (already built) → share sheet.
Motion is the default. A track without a visual shows the existing static cover. A track with a visual should feel like a new object category on Casset.
Attribution is non-negotiable. Community visuals always carry the creator's handle on the exported video. This is the trust contract with creators.
Never block the artist. Failed visual load → fallback to cover. Unmoderated submission → invisible to fans but visible to admin + submitter. The artist's export always works.
Swipe, don't menu. Fan switcher on feed cards is a horizontal swipe, not a dropdown. Mirrors TikTok intuition.
Explainable matching. When we recommend visuals, we say why — in plain language, using the same coach voice as the Intelligence Layer.

11. Strategic impact

Increases sharing

We expect a hook video with motion to outperform a static cover in social feeds. We already export 1080×1920, so the visual is the multiplier, not the foundation.
The visual switcher creates multiple distinct share artifacts per track. One track, N videos, N share chances.

Creates a creator ecosystem

Visual artists (motion designers, VJs, 3D artists, AI operators) have no native home in music today. Album art is one-shot and mostly static. TikTok effects are for memes.
Casset becomes the first platform where a filmmaker's reel has a direct revenue path via music pairings.
Reputation compounds — the same primitives we already have for promoters (PromoterReputation, trust badges, streaks) port directly to visual creators.

Differentiates from feed + music platforms

Linktree / Beacons: no world layer, no audio-native participation.
Spotify / Apple Music: one cover per track, no identity surface, no fan-side mutation.
Bandcamp: static artifacts, no short-video pipeline.
TikTok / Reels: consumption layer only. No identity layer for the artist. No creator-to-artist pairing market.
Casset + Visual Pairing: Hook Object + identity + share artifact in one loop.

Second-order effects

New acquisition surface: visual creators import their audience to Casset.
New engagement loop: fans who can't make music but can make visuals now have a reason to sign up.
New Drop variants: artists can run "visual drops" — crowdsource the visual bed, pick a winner, ride the existing bounty rail.

Appendices

A. Rollout

Pre-V1: feature flag visual_pairing_v1 scoped to internal + 10 launch-partner artists.
V1 launch: open to all artists. No fan-facing UI beyond the export carrying the visual.
V2 launch: fan switcher behind visual_pairing_v2 flag; graduate on retention delta.
V3 launch: marketplace behind visual_marketplace flag; graduate once moderation SLA + payout flow are green for 2 weeks.

B. Success metrics

V1: % of active artists with at least one visual attached; export completion rate with vs without visual; share-through rate from exported video.
V2: visuals-per-track distribution; fan switcher interaction rate; share URL visual-retention (does the recipient keep the selected visual).
V3: approved submissions / week; creator activation (first pairing → first export); rev-share payout volume; creator retention week-4.

C. Non-goals (explicit)

Full video editor in-app (trimming, effects, transitions) — not a Casset product.
User-generated visuals on top of other users' exported videos — this is a remix feature, separately scoped.
Live / real-time visuals tied to playback — out of scope for Visual Pairing V1, not for Casset's broader Hook Object runtime.

Markdown source: docs/roadmap/visual-pairing-system.md. This page and the markdown stay in sync — edit one, edit both.

← Back to roadmap → Docs home

roadmap · parked

Visual Pairing System.

Audio → visual identity layer for Hook Objects. Each visual pairing should make the song feel more immediate, more artist-authored, and more shareable.

statusparked

ownerproduct + eng

last updated2026-05-11

1. Product overview

What it is

Sources of visuals:

Artist-uploaded — the artist's own aesthetic (photos, stills, disposable-camera imagery, BTS, short clips).
Release-native — cover art, saved stills, and Shader Lab treatments already attached to the hook.
Community-created future — filmmakers, motion designers, and 3D artists submit visuals to a track or to a marketplace.
Feature-flagged generation future — architecture stays parked for later premium packs or alternate pressings, but does not define V1.

Core artist actions

Browse visuals associated with their track.
Select a visual as the "active" pairing.
Export the hook as a 1080×1920 video using that visual.
Share — TikTok / IG / X / Casset link.

Why it matters

Identity over decoration. Most artists don't have a director on retainer. A cover image is thin. A moving visual is a brand.
Share-worthy by default. Vertical video is the unit of distribution. A static cover on a hook clip loses to any competitor with motion.
Creator flywheel. Visual makers currently have no way to build a career around music. Casset becomes the exchange.
Network effect per track. Every track becomes surface area for multiple creators → more shares, more variants, more discovery.

2. Foundation — already shipped

We are not building this from scratch. The pairing system is mostly a composition of systems already in production. Every row below is a primitive we reuse in V1–V3.

Track upload + hook selection

Track.previewStartSec, lib/hook-constants.ts, components/hooks/

shipped

1080×1920 client-side video export

lib/tiktok-video.ts — generateTikTokVideo(). Canvas + WebCodecs + mp4-muxer + Meyda. No server ffmpeg.

shipped

Waveform rendering pipeline

Meyda-driven 48-bar visualizer with accent gradient; analyzeAudioAsync() + renderFrame() in lib/tiktok-video.ts

shipped

Quality presets (high/standard/low)

pickQualityPreset() in lib/tiktok-video.ts — auto-fallback on mobile

shipped

Export progress UI

app/preview/ExportProgressModal.tsx

shipped

Session cache for exports

videoCache keyed on trackId:start:duration[:shareUrl]

shipped

Profile-based identity

Artist, User models; /u/[username]; preview pages

shipped

Stripe monetization + Connect payouts

Artist.stripeAccountId, User.stripeAccountId, /api/checkout/*, bounty payout pipeline in lib/bounty-payout.ts

shipped

Sharing + short URLs

HookShare model, /s/{shareId}, referral attribution

shipped

Generic media upload (video accepted)

/api/upload/media — mp4/mov/webm up to 500MB, magic-byte validation in lib/magic-bytes.ts

shipped

Artist bonus media (pattern)

ArtistMedia model — per-artist; useful precedent, not the target surface

shipped

Fan-submitted video pattern

HookVideoSubmission — per-track, PENDING/APPROVED/REJECTED moderation + upload source + attribution. Schema-level blueprint for community visuals.

shipped

Winner selection + Stripe transfer

Bounty pipeline — one-winner-per-track, Stripe Connect transfer. Direct analog to future visual-creator rev share.

shipped

Generation pipeline abstraction (parked)

VisualGenerationJob, visual-pack metadata, and Casset Studios runtime hooks remain feature-flagged for a later expansion path.

shipped

Implication: V1 of Visual Pairing is a schema addition + a ~150-line extension to the existing generateTikTokVideo() loop + a single upload UI. No new infra primitives required.

3. User flows

A. Artist flow

Artist opens track in Studio
  → "Visuals" tab (new)
  → V1: Upload one visual (MP4/MOV/WebM, ≤ 30s, 9:16 recommended)
  → V2+: Browse attached visuals (own + community + AI) — filter by mood/BPM/tag
  → Tap visual to preview with the hook (real-time, in-app)
  → Select "Use this visual" → becomes active pairing for the track
  → Export hook video (reuses existing TikTok-ready pipeline, now with visual layer)
  → Share sheet → TikTok / IG / X / copy link / Casset share URL

B. Visual creator flow (V3)

Creator signs up → claims handle → completes VisualCreator profile
  → Upload a visual (loop) — mp4/mov/webm, ≤ 30s
  → Tag: mood (chill / dark / euphoric / nostalgic / ...), BPM range, genre affinity,
         aesthetic (film / 3D / anime / generative)
  → Either:
     (a) Attach directly to a specific track (if invited / open submission)
     (b) Submit to marketplace — discoverable by any artist
  → Optional: set licensing tier (free / paid / rev-share)
  → Dashboard: impressions, pairings, exports, shares, earnings

C. Viewer / fan flow

Viewer lands on a hook preview (feed card, share link, profile page)
  → Sees the artist's currently active visual looping behind the hook
  → Taps the "visual switcher" affordance (bottom-left chip, say)
  → Swipes through alternative visuals for this track
  → Can share "this version" — the share URL encodes the visual choice
  → Recipient opens the link → same track, same hook, that visual → exportable as their own share

Fans don't edit — they curate. The act of sharing a specific pairing is a feature.

4. Feature breakdown (V1 → V3)

V1 — MVP (ship fast, 2–3 weeks) MVP

Goal: prove the creative loop (upload visual → export video → share) end-to-end with minimum surface.

Artist can upload a single visual per track. One VisualAsset row, kind = ARTIST, isActive = true.
Visual replaces the static cover background in the export pipeline. Waveform bars, avatar, title typography, watermark all remain.
Preview plays the visual behind the audio hook on the track detail page and in the export modal.
Fallback: tracks without a visual render exactly as today (no regression).
No marketplace, no AI, no tags, no moderation queue — artist-only uploads are implicitly trusted (same trust model as their audio upload today).
Storage: Vercel Blob via existing /api/upload/media. MP4/MOV/WebM, ≤ 30s, ≤ 500MB already enforced.

Out of scope for V1: multiple visuals per track, fan-facing switcher, generated visuals, any revenue share, moderation UI.

V2 — Multi-visual + browse

Goal: each track becomes an actual library, and the fan switcher exists.

Multiple VisualAsset rows per track; one marked isActive.
Artist UI: "Add visual" → upload or select from their previous uploads. Reorder, archive.
Tag system (lightweight): mood, genre, optional bpmMin/bpmMax. Free-text allowed but normalized via a small allowlist server-side.
Fan visual switcher on preview + feed cards. Swipe between visuals without interrupting audio.
Share URL encodes visual selection. ?v={visualAssetId} resolves server-side so the exported video inherits the chosen visual.
Export cache key updated to include visual ID.
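
The tag rule above (free text in, allowlisted slugs out) could look roughly like this. It is a sketch: the allowlist contents are assumed from the example moods and aesthetics in this doc, and the function names are hypothetical.

```typescript
// Assumed allowlist; real slugs would come from the VisualTag table.
const TAG_ALLOWLIST = new Set([
  "chill", "dark", "euphoric", "nostalgic", "film", "3d", "anime", "generative",
]);

// Lowercases, trims, and collapses anything non-alphanumeric into single dashes.
function toSlug(raw: string): string {
  return raw
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Keeps only allowlisted slugs, de-duplicated, in first-seen order.
function normalizeTags(input: string[]): string[] {
  const seen = new Set<string>();
  for (const raw of input) {
    const slug = toSlug(raw);
    if (TAG_ALLOWLIST.has(slug)) seen.add(slug);
  }
  return Array.from(seen);
}
```

Unknown tags are dropped silently here; rejecting the request instead would be an equally valid choice.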

V3 — Marketplace layer

Goal: creator ecosystem with attribution + optional payout.

VisualCreatorProfile — distinct identity surface (reuses User; adds profile fields + optional Stripe payout via existing User.stripeAccountId).
VisualSubmission — creator-submitted visuals with PENDING/APPROVED/REJECTED (modeled directly on HookVideoSubmission).
Marketplace browse — artists search visuals by tag, mood, BPM, creator. Attach with one tap → creates a VisualAsset referencing the submission.
Attribution: exported video frames carry a subtle "visuals by @handle" watermark line below the existing watermark. It is always shown and cannot be removed.
Metrics per visual: impressions (fan views), pairings (times selected by an artist), exports, shares.
Optional rev-share: creators mark a visual as paid (one-time unlock, priced in cents) or rev-share (% of future bounty pool). Payouts ride the existing Stripe Connect transfer pipeline (lib/bounty-payout.ts).
Feature-flagged visual generation (future): optional alternate visual packs seeded by hook identity, palette, lyrics, and artist visual DNA. Stored with provenance metadata, but kept out of the primary Studio workflow until quality and positioning are ready.

5. Data model

enum VisualKind {
  ARTIST       // uploaded by the track's artist
  COMMUNITY    // submitted by a VisualCreator, approved
  AI           // feature-flagged future generation provider
}

enum VisualStatus {
  PROCESSING   // upload/render in flight
  READY        // usable in pairings + exports
  FAILED       // transcode or generation error
  ARCHIVED     // soft-hidden by owner
}

enum VisualSubmissionStatus { PENDING  APPROVED  REJECTED }

model VisualAsset {
  id            String       @id @default(cuid())
  trackId       String
  track         Track        @relation(fields: [trackId], references: [id], onDelete: Cascade)

  kind          VisualKind
  status        VisualStatus @default(PROCESSING)

  // Source
  uploaderUserId    String?
  uploader          User?    @relation("VisualUploader", fields: [uploaderUserId], references: [id], onDelete: SetNull)
  submissionId      String?  @unique
  submission        VisualSubmission? @relation(fields: [submissionId], references: [id])

  // AI provenance (kind = AI only)
  sourceModel       String?
  sourcePromptHash  String?
  sourcePromptText  String?  @db.VarChar(2000)

  // Media
  videoUrl          String   @db.VarChar(2048)
  posterImageUrl    String?  @db.VarChar(2048)
  durationSec       Float
  widthPx           Int
  heightPx          Int
  fps               Int?

  // Pairing
  isActive          Boolean  @default(false)

  // Metrics (denormalized; authoritative counts via events)
  impressions       Int      @default(0)
  pairings          Int      @default(0)
  exports           Int      @default(0)
  shares            Int      @default(0)

  tags              VisualTag[] @relation("VisualAssetTags")

  createdAt         DateTime @default(now())
  updatedAt         DateTime @updatedAt

  @@index([trackId, status])
  @@index([trackId, isActive])
  @@index([uploaderUserId, createdAt])
  @@index([kind, status, createdAt])
}

model VisualTag {
  id        String  @id @default(cuid())
  slug      String  @unique       // "chill", "dark", "film", "3d", ...
  label     String
  category  String                 // "mood" | "aesthetic" | "genre"
  assets    VisualAsset[] @relation("VisualAssetTags")

  @@index([category])
}

model VisualSubmission {
  id               String                  @id @default(cuid())
  submitterUserId  String
  submitter        User                    @relation(fields: [submitterUserId], references: [id], onDelete: Cascade)

  targetTrackId    String?
  targetTrack      Track?                  @relation(fields: [targetTrackId], references: [id], onDelete: SetNull)

  videoUrl         String  @db.VarChar(2048)
  posterImageUrl   String? @db.VarChar(2048)
  durationSec      Float
  widthPx          Int
  heightPx         Int

  note             String? @db.VarChar(280)
  pitchTagsJson    Json?

  status           VisualSubmissionStatus @default(PENDING)
  reviewerUserId   String?
  reviewedAt       DateTime?
  rejectedReason   String? @db.VarChar(280)

  // Licensing
  licenseKind      String  @default("FREE")  // "FREE" | "PAID_ONE_TIME" | "REV_SHARE"
  priceCents       Int?
  revSharePct      Int?

  createdAt        DateTime @default(now())
  updatedAt        DateTime @updatedAt

  asset            VisualAsset?

  @@index([status, createdAt])
  @@index([targetTrackId, status])
  @@index([submitterUserId, createdAt])
}

model VisualCreatorProfile {
  id                String   @id @default(cuid())
  userId            String   @unique
  user              User     @relation(fields: [userId], references: [id], onDelete: Cascade)

  displayName       String?
  bio               String?  @db.VarChar(500)
  toolsJson         Json?
  websiteUrl        String?
  reelUrl           String?

  // Aggregates (computed, not source-of-truth)
  totalSubmissions   Int   @default(0)
  totalApprovals     Int   @default(0)
  totalPairings      Int   @default(0)
  totalExports       Int   @default(0)
  totalEarningsCents Int   @default(0)

  createdAt          DateTime @default(now())
  updatedAt          DateTime @updatedAt
}

Relationships

Track 1…N VisualAsset (cascade on delete)
Track 1…N VisualSubmission (nullable target for marketplace-only submissions)
VisualAsset N…N VisualTag
VisualSubmission 1…0/1 VisualAsset (approval creates the asset)
User 1…N VisualAsset (as uploader)
User 1…0/1 VisualCreatorProfile

Why not reuse `ArtistMedia` or `HookVideoSubmission`?

ArtistMedia is per-artist, not per-track, and is positioned as bonus download material — conceptually and relationally wrong surface.
HookVideoSubmission is per-track but represents fan-made clips of the hook (i.e. another artifact — people reacting to the track), not reusable visual beds. Keeping them separate avoids overloaded semantics in queries, metrics, and payouts.

6. API design

All new routes live under /api/visuals/* and /api/tracks/[trackId]/visuals/*; the existing upload and export routes are reused. Auth + rate limits follow existing patterns (getUserIdFromRequest, rateLimit).

Upload + create

POST /api/upload/media                          ← already exists, reused as-is
  → returns { url, sizeBytes, mime, category: "video" }

POST /api/tracks/:trackId/visuals               ← new
  body: { videoUrl, posterImageUrl?, durationSec, widthPx, heightPx,
          kind: "ARTIST", tagSlugs?: string[] }
  → creates VisualAsset (PROCESSING → READY after server-side probe)
  → 403 if caller is not the track's artist (unless submission flow)
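
A sketch of the body validation the create route could run before the server-side probe. Field names follow the request body above; the limits mirror the V1 rule (≤ 30s) and the VarChar(2048) column. The function name is hypothetical, and client-declared dimensions would still need to be confirmed by the probe.

```typescript
interface CreateVisualBody {
  videoUrl: string;
  posterImageUrl?: string;
  durationSec: number;
  widthPx: number;
  heightPx: number;
  kind: "ARTIST";
  tagSlugs?: string[];
}

const MAX_DURATION_SEC = 30; // V1 upload rule
const MAX_URL_LENGTH = 2048; // matches @db.VarChar(2048)

function isPositiveInt(n: number): boolean {
  return Number.isInteger(n) && n > 0;
}

// Returns human-readable problems; an empty list means the body is acceptable.
function validateCreateVisual(body: CreateVisualBody): string[] {
  const errors: string[] = [];
  if (body.videoUrl.length === 0 || body.videoUrl.length > MAX_URL_LENGTH) {
    errors.push("videoUrl must be between 1 and 2048 characters");
  }
  if (!(body.durationSec > 0 && body.durationSec <= MAX_DURATION_SEC)) {
    errors.push("durationSec must be greater than 0 and at most 30");
  }
  if (!isPositiveInt(body.widthPx) || !isPositiveInt(body.heightPx)) {
    errors.push("widthPx and heightPx must be positive integers");
  }
  return errors;
}
```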

Browse / fetch

GET  /api/tracks/:trackId/visuals               ← new
  ?kind=ARTIST|COMMUNITY|AI&active=true|false&limit=&cursor=
  → paginated VisualAssets (READY only) with tags, creator, metrics
  → anonymous callers allowed (same visibility as hook preview)

GET  /api/visuals/:id                           ← new
  → single asset with creator profile expanded

Selection (set active pairing)

PATCH /api/tracks/:trackId/visuals/:id/active   ← new
  body: { active: true }
  → artist-only; transactionally clears previous active, sets this one active
  → pattern mirrors /api/hooks/submissions/[id]/winner/route.ts
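
The invariant behind this route is "at most one active visual per track". A minimal in-memory sketch of that rule, assuming only READY assets can be activated (inferred from the VisualStatus comments). In the real route this would run inside a database transaction; the function name is hypothetical.

```typescript
interface AssetRow {
  id: string;
  trackId: string;
  status: "PROCESSING" | "READY" | "FAILED" | "ARCHIVED";
  isActive: boolean;
}

// Returns a new list where assetId is the only active visual on trackId.
// Rows belonging to other tracks are returned untouched.
function setActiveVisual(rows: AssetRow[], trackId: string, assetId: string): AssetRow[] {
  const target = rows.find((r) => r.id === assetId && r.trackId === trackId);
  if (!target) throw new Error("visual not found on this track");
  if (target.status !== "READY") throw new Error("only READY visuals can be activated");
  return rows.map((r) =>
    r.trackId !== trackId ? r : { ...r, isActive: r.id === assetId }
  );
}
```

Clearing and setting in one pass is what keeps a reader from ever observing two active rows, or none, for the same track.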

Export

POST /api/hooks/tiktok-video                    ← already exists
  body: { trackId, hookStartSec, hookDurationSec, visualAssetId?, shareUrl? }
  → server returns signed URLs + metadata; encoding runs client-side
    via generateTikTokVideo()

Submission flow (V3)

POST /api/visuals/submissions                   ← new (marketplace OR targeted)
  body: { targetTrackId?, videoUrl, ..., licenseKind,
          priceCents?, revSharePct?, pitchTagsJson }

GET  /api/visuals/submissions                   ← new (admin + submitter's own list)
POST /api/visuals/submissions/:id/approve       ← new (admin) → creates VisualAsset
POST /api/visuals/submissions/:id/reject        ← new (admin) → sets rejectedReason

Attach (artist accepts a community visual)

POST /api/tracks/:trackId/visuals/attach        ← new (V3)
  body: { submissionId }
  → creates a VisualAsset linked to the submission
  → if licenseKind=PAID_ONE_TIME: creates a Stripe PaymentIntent (artist pays)
  → if REV_SHARE: records licenseKind on the asset for future payout split

Metrics ingestion

POST /api/visuals/:id/event                     ← new, fire-and-forget
  body: { type: "impression" | "pairing" | "export" | "share" }
  → rate-limited, client-fired on actual UX events
  → increments denormalized counters, emits to an analytics sink
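
A sketch of how the four event types could map onto the denormalized counters on VisualAsset. The counter names come from the schema above; the helper names are hypothetical, and a real handler would issue an atomic increment rather than copy an object.

```typescript
type VisualEventType = "impression" | "pairing" | "export" | "share";

interface VisualCounters {
  impressions: number;
  pairings: number;
  exports: number;
  shares: number;
}

const COUNTER_FOR_EVENT: Record<VisualEventType, keyof VisualCounters> = {
  impression: "impressions",
  pairing: "pairings",
  export: "exports",
  share: "shares",
};

// Narrowing guard for untrusted request bodies.
function isVisualEventType(value: string): value is VisualEventType {
  return Object.prototype.hasOwnProperty.call(COUNTER_FOR_EVENT, value);
}

// Returns a copy of the counters with the matching field incremented by one.
function applyEvent(counters: VisualCounters, type: VisualEventType): VisualCounters {
  const field = COUNTER_FOR_EVENT[type];
  return { ...counters, [field]: counters[field] + 1 };
}
```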

7. Export system integration

This is the highest-leverage integration point. The existing pipeline in lib/tiktok-video.ts already does most of the work: we are not rebuilding it, only swapping the background draw call.

Current architecture

generateTikTokVideo() runs entirely in the browser:

Fetch cover image + avatar + audio in parallel.
Decode audio via OfflineAudioContext.
Run Meyda on the mono downmix to produce per-frame waveform bars.
Pre-render a static background canvas (cover image + gradient + overlays).
Per frame: drawImage(bgCanvas) → draw waveform → draw avatar/text/watermark.
Encode via VideoEncoder, mux to MP4 via mp4-muxer.

The static background is the extension point.

V1 integration: video background

Accept optional visualAssetId in TikTokVideoParams.
Resolve to a videoUrl server-side (signed if needed); pass to the client.
Client-side: instead of pre-rendering a static bg canvas, load an offscreen <video> element, seek to 0, play muted + looped, and drawImage(videoEl, ...) on each frame in the encoding loop.
The waveform, avatar, typography, and watermark all layer on top unchanged.
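
One detail the steps above leave open is how the looped visual stays aligned with output frames if the encoding loop does not run at playback speed. A minimal sketch of one option, assuming the loop can seek the video per frame: compute the source timestamp from the output frame index and wrap it at the visual's duration. The function name is hypothetical.

```typescript
// Maps an output frame index to a timestamp inside the looping visual.
// Returns 0 for degenerate inputs so the caller can still draw the first frame.
function visualTimeForFrame(
  frameIndex: number,
  fps: number,
  visualDurationSec: number
): number {
  if (fps <= 0 || visualDurationSec <= 0) return 0;
  const outputTimeSec = frameIndex / fps;
  return outputTimeSec % visualDurationSec;
}
```

With a 4s loop at 30 fps, frame 150 (5s into the hook) maps to 1s into the visual, the same source frame as output frame 30.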

Aspect ratio handling (9:16)

Target surface is 1080×1920 logical. If the source visual is:
9:16 already → cover-fill, no crop.
16:9 or 1:1 → center-crop with a slight zoom (preserve the "identity" feel vs letterboxing).
Portrait but not 9:16 → fit with blurred backdrop extension (reuse the cover-image treatment we already have).
Decision function lives in lib/visuals/fit.ts (new), returns { sx, sy, sw, sh, dx, dy, dw, dh } consumed by drawImage.
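
A sketch of what the decision function in lib/visuals/fit.ts could look like. The three branches follow the rules above; the aspect tolerance, the 1.05 zoom factor, and the extra `backdrop` flag are assumptions added for illustration.

```typescript
interface FitRect {
  sx: number; sy: number; sw: number; sh: number; // source rectangle
  dx: number; dy: number; dw: number; dh: number; // destination rectangle
  backdrop: boolean; // true when the caller should paint a blurred backdrop first
}

const TARGET_W = 1080;
const TARGET_H = 1920;
const ASPECT_TOLERANCE = 0.01; // assumed: how close counts as "already 9:16"
const CROP_ZOOM = 1.05; // assumed value for the "slight zoom"

function computeFit(srcW: number, srcH: number, dstW = TARGET_W, dstH = TARGET_H): FitRect {
  const srcAspect = srcW / srcH;
  const dstAspect = dstW / dstH;

  // Already 9:16: draw the whole frame onto the whole canvas.
  if (Math.abs(srcAspect - dstAspect) <= ASPECT_TOLERANCE) {
    return { sx: 0, sy: 0, sw: srcW, sh: srcH, dx: 0, dy: 0, dw: dstW, dh: dstH, backdrop: false };
  }

  // Landscape or square: center-crop to the target aspect with a slight zoom.
  if (srcAspect >= 1) {
    const sh = srcH / CROP_ZOOM;
    const sw = sh * dstAspect;
    return {
      sx: (srcW - sw) / 2, sy: (srcH - sh) / 2, sw, sh,
      dx: 0, dy: 0, dw: dstW, dh: dstH, backdrop: false,
    };
  }

  // Portrait but not 9:16: show the full frame, centered, over a backdrop.
  const scale = Math.min(dstW / srcW, dstH / srcH);
  const dw = srcW * scale;
  const dh = srcH * scale;
  return {
    sx: 0, sy: 0, sw: srcW, sh: srcH,
    dx: (dstW - dw) / 2, dy: (dstH - dh) / 2, dw, dh, backdrop: true,
  };
}
```

The eight numbers are in the same order as the nine-argument form of drawImage, so the caller can pass them straight through after the image source.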

Watermarking

Existing watermark (Casset logo + optional share URL) stays unchanged.
V3 only: when the active visual is kind != ARTIST, append a second line: visuals by @creatorHandle, rendered in the same watermark block at slightly reduced opacity. Always present — this is the attribution guarantee that keeps creators contributing.

Waveform rendering

No changes to Meyda analysis.
Contrast guard: sample the visual's mean luminance across the hook duration. If the visual is very bright, darken the waveform container with a 30–40% bottom gradient scrim. Keeps bars legible without muddying the visual.
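
A sketch of the contrast guard, assuming sampled frames arrive as RGBA pixel data. The 0.6 brightness threshold is an assumption; the 30–40% range comes from the rule above. Luma is computed on gamma-encoded values with Rec. 709 weights, which is a cheap approximation rather than true linear luminance.

```typescript
const BRIGHT_THRESHOLD = 0.6; // assumed cut-off for "very bright"
const MIN_SCRIM = 0.3;
const MAX_SCRIM = 0.4;

// Mean luma in the range 0..1 for RGBA pixel data (alpha is ignored).
function meanLuma(rgba: ArrayLike<number>): number {
  const pixelCount = Math.floor(rgba.length / 4);
  if (pixelCount === 0) return 0;
  let sum = 0;
  for (let i = 0; i < pixelCount * 4; i += 4) {
    sum += 0.2126 * rgba[i] + 0.7152 * rgba[i + 1] + 0.0722 * rgba[i + 2];
  }
  return sum / (pixelCount * 255);
}

// No scrim below the threshold; otherwise scale linearly from 30% to 40%.
function scrimOpacity(luma: number): number {
  if (luma < BRIGHT_THRESHOLD) return 0;
  const t = Math.min(1, (luma - BRIGHT_THRESHOLD) / (1 - BRIGHT_THRESHOLD));
  return MIN_SCRIM + (MAX_SCRIM - MIN_SCRIM) * t;
}
```

Averaging `meanLuma` over a handful of frames sampled across the hook gives one opacity for the whole export, which avoids a flickering scrim.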

Performance

Per-frame drawImage on a decoded video is cheap on modern hardware (WebCodecs + VideoFrame from HTMLVideoElement is the fast path).
Existing quality presets (high/standard/low) already auto-downgrade on mobile — no new branching.
Cache key gains visualAssetId: ${trackId}:${start}:${duration}:${visualAssetId}:${shareUrl}.
Budget target: export time stays under 30s on mid-tier mobile for a 30s hook with visual.
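
A sketch of the cache key builder, following the template above. The "none" placeholder for exports without a visual is an assumption, as is the function name.

```typescript
interface ExportCacheInput {
  trackId: string;
  hookStartSec: number;
  hookDurationSec: number;
  visualAssetId?: string;
  shareUrl?: string;
}

const NO_VISUAL = "none"; // assumed placeholder when no visual is paired

// Builds the key in the order trackId:start:duration:visualAssetId:shareUrl.
function exportCacheKey(input: ExportCacheInput): string {
  const visual = input.visualAssetId ?? NO_VISUAL;
  const share = input.shareUrl ?? "";
  return `${input.trackId}:${input.hookStartSec}:${input.hookDurationSec}:${visual}:${share}`;
}
```

Because the visual ID is part of the key, switching the pairing can never serve a stale export rendered with the previous visual.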

Fallback

Track with no active visual → existing codepath, zero behavioral change.
Visual fails to decode / load → falls back to static cover bg (same codepath as today), non-fatal. Log + surface a toast.
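
The two fallback rules can be sketched as a single decision. The types, function name, and toast copy are hypothetical; how the load failure is detected is left to the caller.

```typescript
type Background =
  | { kind: "video"; url: string }
  | { kind: "cover"; url: string };

interface BackgroundChoice {
  background: Background;
  toast: string | null; // non-null only when a visual was expected but failed
}

function chooseBackground(
  coverUrl: string,
  visualUrl: string | null,
  visualLoadedOk: boolean
): BackgroundChoice {
  // No active visual: existing codepath, nothing to report.
  if (visualUrl === null) {
    return { background: { kind: "cover", url: coverUrl }, toast: null };
  }
  // Visual failed to load or decode: non-fatal, fall back and tell the artist.
  if (!visualLoadedOk) {
    return {
      background: { kind: "cover", url: coverUrl },
      toast: "Visual could not be loaded. Exporting with the cover instead.",
    };
  }
  return { background: { kind: "video", url: visualUrl }, toast: null };
}
```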

8. Matching system (future-facing)

We start dumb, we end smart. The V1 data model must not close the door on V3 matching.

Phase 1 — Tags (V2)

Artist visuals and community submissions both carry tags.
UI exposes tag filters: mood, aesthetic, genre, BPM range.
"Suggested visuals" on a track = intersection(track genre tags, visual tags) ranked by recency + pairing count.
Zero ML. Zero embeddings. Ship this.
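
A sketch of the Phase 1 suggestion rule. "Ranked by recency + pairing count" is read here as a single additive score; the weighting constant and the type and function names are assumptions.

```typescript
interface VisualCandidate {
  id: string;
  tagSlugs: string[];
  pairings: number;
  createdAtMs: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;
const RECENCY_WEIGHT = 10; // assumed: a brand-new visual is worth about 10 pairings

// Keeps visuals that share at least one tag with the track, best score first.
function suggestVisuals(
  trackTags: string[],
  visuals: VisualCandidate[],
  nowMs: number
): VisualCandidate[] {
  const wanted = new Set(trackTags);
  const score = (v: VisualCandidate): number => {
    const ageDays = Math.max(0, (nowMs - v.createdAtMs) / DAY_MS);
    return v.pairings + RECENCY_WEIGHT / (1 + ageDays);
  };
  return visuals
    .filter((v) => v.tagSlugs.some((t) => wanted.has(t)))
    .sort((a, b) => score(b) - score(a));
}
```

Passing `nowMs` in, rather than reading the clock inside, keeps the ranking deterministic and easy to test.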

Phase 2 — Audio-feature suggestion (V2.5)

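Reuse Meyda to extract per-track features on upload (RMS, spectral centroid). Tempo estimate and mode are not built-in Meyda extractors, so they would be derived separately, for example from onset energy and chroma.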
Persist a compact TrackFeatures row (separate from this doc's schema).
Score VisualAsset tags against track features with a hand-tuned weight table. Good enough for "this is lofi → surface muted, slow visuals".
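
A sketch of the hand-tuned weight table, assuming track features have already been normalized to the range -1..1. The three feature names, the tags, and every weight are illustrative placeholders, not tuned values.

```typescript
interface TrackFeatureVector {
  energy: number;     // e.g. derived from RMS
  brightness: number; // e.g. derived from spectral centroid
  tempo: number;      // e.g. derived from a tempo estimate
}

// Assumed weights: positive means "this tag suits tracks high in this feature".
const TAG_WEIGHTS: Record<string, TrackFeatureVector> = {
  chill: { energy: -1, brightness: -0.5, tempo: -1 },
  dark: { energy: 0, brightness: -1, tempo: 0 },
  euphoric: { energy: 1, brightness: 0.5, tempo: 1 },
};

// Sums a dot product per known tag; unknown tags contribute nothing.
function scoreVisualTags(features: TrackFeatureVector, tagSlugs: string[]): number {
  let total = 0;
  for (const slug of tagSlugs) {
    const w = TAG_WEIGHTS[slug];
    if (!w) continue;
    total +=
      w.energy * features.energy +
      w.brightness * features.brightness +
      w.tempo * features.tempo;
  }
  return total;
}
```

A quiet, dull, slow track scores positively against "chill" and negatively against "euphoric", which is the "lofi surfaces muted, slow visuals" behavior described above.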

Phase 3 — Embedding-based matching (V3+)

Embed visuals via a CLIP-style model (frames sampled every 1s → mean-pooled).
Embed tracks via an audio embedding model (prototype with existing features → swap for a dedicated API when available).
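Joint match: score = cosine(trackEmb, visualEmb) + tag_prior + recency_boost + creator_quality. The cosine term assumes both embeddings live in a shared space, so separately trained audio and CLIP-style models would need a learned projection first.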
Recommender endpoint: GET /api/tracks/:id/visuals/recommendations.
Design constraint: every recommendation must be explainable to the artist ("matched on: moody, nocturnal, 80–90 BPM"). No black-box.
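
A sketch of the Phase 3 score with its explanation attached, assuming both embeddings already share a space and have equal dimensions. The input shape and function names are hypothetical.

```typescript
// Cosine similarity; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

interface MatchInput {
  trackEmb: number[];
  visualEmb: number[];
  tagPrior: number;
  recencyBoost: number;
  creatorQuality: number;
  sharedTags: string[]; // used only for the explanation
}

// Returns the additive score plus a plain-language reason for the artist.
function scoreMatch(m: MatchInput): { score: number; explanation: string } {
  const score =
    cosine(m.trackEmb, m.visualEmb) + m.tagPrior + m.recencyBoost + m.creatorQuality;
  const explanation =
    m.sharedTags.length > 0
      ? `matched on: ${m.sharedTags.join(", ")}`
      : "matched on: overall audio-visual similarity";
  return { score, explanation };
}
```

Returning the explanation from the same function as the score makes it hard to ship a recommendation without a reason.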

Reusing the Intelligence Layer

Casset already has lib/intelligence/ with an OpenAI + rule-based fallback and a coach-voice tone system. The matching recommender fits this layer cleanly — same fallback ladder, same UI voice.

9. Risks & constraints

Content moderation

Risk: community visuals and any future generated packs can carry NSFW, copyrighted, or violent content.
V1 (artist-only) has the same trust profile as existing audio uploads → no new surface.
V2: no community uploads yet.
V3: mandatory moderation queue (VisualSubmission.status = PENDING), modeled on HookVideoSubmission. Auto-screen with a provider (AWS Rekognition / Hive / Cloudflare Images moderation), human review for edge cases.
Report-to-moderation action on every played visual. One-strike takedown.

Copyright / ownership

Risk: creators submit visuals containing third-party footage they don't own.
Upload agreement (ToS checkbox) — explicit representation of ownership/license.
DMCA handler endpoint + takedown workflow (reuse the bounty takedown pattern).
V3: payout release delayed 7 days post-approval to allow challenges before funds move.

Low-quality spam visuals

Soft rate limit: 3 pending submissions per creator.
Require tag completeness + poster frame.
Auto-dedupe on sourcePromptHash for AI kind; perceptual-hash on video for uploads.
Creator reputation (reuse PromoterReputation pattern): approval rate, pairing rate, flag count.

Performance / cost of video rendering

Client-side path (V1–V3) reuses the existing budget. Only new cost is the extra drawImage(video) per frame — negligible vs encode.
AI generation is the cost center. Gate behind credits (Casset already has CreditLedger).
Server-side export considered only if WebCodecs coverage drops or the quality ceiling becomes binding. Not on the roadmap.

Storage

Vercel Blob. Visuals ≤ 30s at reasonable bitrate (≤ 10MB typical). A track with 20 visuals is typically ≤ 200MB; the hard per-file cap remains the 500MB upload limit.
Lifecycle rule: ARCHIVED assets drop to cold storage after 30 days; FAILED assets purge after 7.

Export file size

Existing MAX_OUTPUT_BYTES = 15 * 1024 * 1024 (15MB) ceiling in lib/tiktok-video.ts already bounds output. Adding a video bg doesn't change this — dynamic bitrate already adapts to duration.
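
To make "bitrate adapts to duration" concrete, here is a sketch of how a video bitrate could be derived from the output ceiling. This is not the logic in lib/tiktok-video.ts, which is not shown in this doc; the 128 kbps audio allowance is an assumption and container overhead is ignored.

```typescript
const MAX_OUTPUT_BYTES = 15 * 1024 * 1024; // ceiling quoted above
const AUDIO_BPS = 128000; // assumed audio bitrate, in bits per second

// Video bitrate (bits per second) that keeps a clip of this length under the ceiling.
function videoBitrateFor(durationSec: number): number {
  if (durationSec <= 0) throw new Error("duration must be positive");
  const totalBps = Math.floor((MAX_OUTPUT_BYTES * 8) / durationSec);
  return Math.max(0, totalBps - AUDIO_BPS);
}
```

Under these assumptions a 30s hook gets about 4.07 Mbps of video and a 15s hook about 8.26 Mbps, so the background source does not affect output size.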

10. UX principles

Visuals are identity, not decoration. The pairing UI lives next to the track, not under "extras". The export preview shows the visual as the dominant surface, not a thumbnail.
One-tap everything. Selection = tap. Preview = instant. Export = one tap → progress modal (already built) → share sheet.
Motion is the default. A track without a visual shows the existing static cover. A track with a visual should feel like a new object category on Casset.
Attribution is non-negotiable. Community visuals always carry the creator's handle on the exported video. This is the trust contract with creators.
Never block the artist. Failed visual load → fallback to cover. Unmoderated submission → invisible to fans but visible to admin + submitter. The artist's export always works.
Swipe, don't menu. Fan switcher on feed cards is a horizontal swipe, not a dropdown. Mirrors TikTok intuition.
Explainable matching. When we recommend visuals, we say why — in plain language, using the same coach voice as the Intelligence Layer.

11. Strategic impact

Increases sharing

A hook video with motion is expected to outperform a static cover in social feeds that favor video. We already export 1080×1920, so this is the multiplier, not the foundation.
The visual switcher creates multiple distinct share artifacts per track. One track, N videos, N share chances.

Creates a creator ecosystem

Visual artists (motion designers, VJs, 3D artists, AI operators) have no native home in music today. Album art is one-shot and mostly static. TikTok effects are for memes.
Casset becomes the first platform where a filmmaker's reel has a direct revenue path via music pairings.
Reputation compounds — the same primitives we already have for promoters (PromoterReputation, trust badges, streaks) port directly to visual creators.

Differentiates from feed + music platforms

Linktree / Beacons: no world layer, no audio-native participation.
Spotify / Apple Music: one cover per track, no identity surface, no fan-side mutation.
Bandcamp: static artifacts, no short-video pipeline.
TikTok / Reels: consumption layer only. No identity layer for the artist. No creator-to-artist pairing market.
Casset + Visual Pairing: Hook Object + identity + share artifact in one loop.

Second-order effects

New acquisition surface: visual creators import their audience to Casset.
New engagement loop: fans who can't make music but can make visuals now have a reason to sign up.
New Drop variants: artists can run "visual drops" — crowdsource the visual bed, pick a winner, ride the existing bounty rail.

Appendices

A. Rollout

Pre-V1: feature flag visual_pairing_v1 scoped to internal + 10 launch-partner artists.
V1 launch: open to all artists. No fan-facing UI beyond the export carrying the visual.
V2 launch: fan switcher behind visual_pairing_v2 flag; graduate on retention delta.
V3 launch: marketplace behind visual_marketplace flag; graduate once moderation SLA + payout flow are green for 2 weeks.

B. Success metrics

V1: % of active artists with at least one visual attached; export completion rate with vs without visual; share-through rate from exported video.
V2: visuals-per-track distribution; fan switcher interaction rate; share URL visual-retention (does the recipient keep the selected visual?).
V3: approved submissions / week; creator activation (first pairing → first export); rev-share payout volume; creator retention week-4.

C. Non-goals (explicit)

Full video editor in-app (trimming, effects, transitions) — not a Casset product.
User-generated visuals on top of other users' exported videos — this is a remix feature, separately scoped.
Live / real-time visuals tied to playback — out of scope for Visual Pairing V1, not for Casset's broader Hook Object runtime.

Markdown source: docs/roadmap/visual-pairing-system.md. This page and the markdown stay in sync — edit one, edit both.
