casset/docs
the shape of the product

How hooks work.

The hook is the doorway into a Hook Object. Here's exactly how it's stored, served, rendered, and shared.

Every track on Casset has two views: the full song (gated until the listener unlocks it) and the hook, a 30-second window the artist chooses that plays instantly for any visitor. The product direction is to turn that hook into a Hook Object: audio, lyrics, visuals, provenance, presence, participation, release context, and proof of being early.

Anatomy of a hook

A hook is just two numbers on a track record:

  • previewStartSec: where the hook window begins in the full track. If unset, it defaults to 35s in (past most intros).
  • hookDurationSec: optional per-track override of the window length. Defaults to HOOK_DURATION_SEC (30s) from lib/hook-constants.ts.

The artist picks these in edit mode by dragging the preview scrubber. The numbers are persisted to the DB, cached on the artist record, and served back with the rest of the casset data.
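
As a rough illustration of how those two numbers resolve into a playable window, here is a minimal sketch. The TrackRecord shape, the DEFAULT_PREVIEW_START_SEC name, and the clamping against track length are assumptions made for the example; only previewStartSec, hookDurationSec, and HOOK_DURATION_SEC come from the text above.

```typescript
// Values taken from the text: 30s default duration, 35s default start.
const HOOK_DURATION_SEC = 30;
const DEFAULT_PREVIEW_START_SEC = 35; // hypothetical name for the 35s default

// Hypothetical track shape; only the two optional hook fields are documented.
interface TrackRecord {
  durationSec: number;
  previewStartSec?: number;
  hookDurationSec?: number;
}

interface HookWindow {
  startSec: number;
  endSec: number;
}

// Resolve the hook window, falling back to defaults when fields are unset.
// Clamping to the track length is an assumption, not documented behavior.
function resolveHookWindow(track: TrackRecord): HookWindow {
  const duration = track.hookDurationSec ?? HOOK_DURATION_SEC;
  const requestedStart = track.previewStartSec ?? DEFAULT_PREVIEW_START_SEC;
  // Latest start that still leaves room for a full window (0 for short tracks).
  const latestStart = Math.max(0, track.durationSec - duration);
  const startSec = Math.min(Math.max(0, requestedStart), latestStart);
  const endSec = Math.min(track.durationSec, startSec + duration);
  return { startSec, endSec };
}
```

With no overrides, a 200-second track resolves to a window from 35s to 65s. A track shorter than the window resolves to the whole track.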

Hook vs. Hook Object

The hook is the audio window. The Hook Object is the full experience built around it: synchronized lyric moments, realtime visual state, shader treatment, provenance, room presence, fan traces, unlock state, release-context references, and share/export identity.

That distinction keeps the product honest. A hook can be copied by any preview player. A Hook Object is the artist-owned audiovisual object where the song gets meaning before and after release, while the canonical release layer keeps manifest, permission, and provenance state underneath it.

Playback gating

Public visitors can only hear the hook. When a casset page loads for a visitor who hasn’t unlocked it:

  • The audio element is pointed at /api/audio/[trackId] with a short-lived signed token.
  • Playback seeks to previewStartSec and loops back when it reaches previewStartSec + hookDurationSec.
  • When the loop wraps, we do a smooth fade-out/seek-to-start so the hook feels intentional rather than truncated.

Unlocked listeners (those who paid via Apple Pay or joined a free casset) get the full track through the same token endpoint; the server validates entitlement before signing.

The fade-and-loop

The "loops back" step in the previous section is more careful than it sounds. Cutting audio mid-word is the fastest way to make a preview feel broken, so the player does a micro-fade around every wrap:

  • Watch timeupdate events. When the current offset gets within ~0.5s of the end of the hook window, pre-empt the wrap.
  • Pin audio.volume to 0, seek() to the hook start, then ramp volume back over ~50ms.
  • The audiovisual surface keeps its resolved frame through the fade, so lyrics and atmosphere do not flicker — the loop reads as "a restart", not a crash.

On tracks shorter than the hook window, we skip the fade and rely on the browser's native loop attribute.
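
The wrap-and-fade steps above can be sketched as a small state machine. This is an illustration under assumptions, not the player's actual code: the AudioLike interface, the tick function, and the state object are hypothetical, and the 0.5s lead and 50ms ramp are the approximate figures quoted above.

```typescript
// Minimal structural type so the sketch does not depend on DOM typings;
// an HTMLAudioElement satisfies it.
interface AudioLike {
  currentTime: number;
  volume: number;
}

const WRAP_LEAD_SEC = 0.5; // pre-empt the wrap this close to the window end
const FADE_IN_MS = 50;     // length of the volume ramp after the seek

// True when playback is close enough to the end of the window to wrap early.
function shouldWrap(currentTime: number, endSec: number): boolean {
  return currentTime >= endSec - WRAP_LEAD_SEC;
}

// Linear ramp: 0 at the moment of the seek, 1 once fadeMs has elapsed.
function rampVolume(elapsedMs: number, fadeMs: number = FADE_IN_MS): number {
  return Math.min(1, Math.max(0, elapsedMs / fadeMs));
}

// Hypothetical per-tick handler. nowMs is a millisecond clock reading,
// for example performance.now() in a browser.
function tick(
  audio: AudioLike,
  startSec: number,
  endSec: number,
  state: { wrappedAtMs: number | null },
  nowMs: number
): void {
  if (shouldWrap(audio.currentTime, endSec)) {
    audio.volume = 0;             // pin volume to 0 before the jump
    audio.currentTime = startSec; // seek back to the hook start
    state.wrappedAtMs = nowMs;
    return;
  }
  if (state.wrappedAtMs !== null) {
    audio.volume = rampVolume(nowMs - state.wrappedAtMs);
    if (audio.volume >= 1) state.wrappedAtMs = null; // ramp finished
  }
}
```

Because timeupdate fires only a few times per second, a 50ms ramp would probably be driven from requestAnimationFrame or a short timer, with timeupdate used only to detect the approaching wrap.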

The waveform endpoint

The preview scrubber would be painful to use without a waveform, and decoding audio on every page load would be expensive. So we pre-compute:

  • On upload, a background job decodes the file and downsamples it to ~500 normalized peaks using RMS per bucket.
  • Peaks are stored alongside the track and served from GET /api/audio/waveform/[trackId].
  • The scrubber renders them as vertical bars and overlays the 30s hook window as an iridescent band — the same visual treatment used in the marketing hero, so it feels consistent.

Waveform JSON is small (~2 KB) and cached aggressively on the edge. The scrubber paints it immediately, even before the audio element has loaded metadata.
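
For a sense of what the pre-compute step involves, here is a minimal sketch of RMS-per-bucket downsampling. It assumes the audio has already been decoded to mono samples in the range [-1, 1]. The computePeaks name and the normalization against the loudest bucket are illustrative choices, not the background job's actual code.

```typescript
// Downsample decoded mono samples to at most bucketCount RMS values,
// normalized so the loudest bucket is 1.
function computePeaks(samples: Float32Array, bucketCount: number = 500): number[] {
  // Never create more buckets than there are samples.
  const buckets = Math.min(bucketCount, samples.length);
  if (buckets === 0) return [];

  const rms: number[] = [];
  for (let b = 0; b < buckets; b++) {
    // Integer bucket boundaries that together cover every sample exactly once.
    const start = Math.floor((b * samples.length) / buckets);
    const end = Math.floor(((b + 1) * samples.length) / buckets);
    let sumSquares = 0;
    for (let i = start; i < end; i++) {
      sumSquares += samples[i] * samples[i];
    }
    rms.push(Math.sqrt(sumSquares / (end - start)));
  }

  // Normalize against the loudest bucket; leave silence as all zeros.
  const max = Math.max(...rms);
  return max === 0 ? rms : rms.map((v) => v / max);
}
```

RMS tracks perceived loudness more closely than raw sample maxima, which tends to produce a waveform that is easier to scrub against.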

Proxy window (enforcing the 30s on the wire)

Client-side playback gating is for UX. Server-side byte-range truncation is for security. Two constants in lib/audio-access.ts shape what the server will serve to an unentitled listener:

  • PREVIEW_START_SEC (= 35): default offset if previewStartSec isn't set.
  • AUDIO_PROXY_WINDOW_SEC (= 30): max seconds past the start that the server will proxy. Same as the hook duration by design, but kept separate so we can open it up for experiments.

Converting seconds to bytes uses ESTIMATED_BITRATE_BYTES_PER_SEC (~16 KB/s for 128 kbps MP3). Even an aggressive Range: bytes=0- request gets a clipped response. Full detail is in the audio pipeline doc.
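
The arithmetic can be sketched as follows. The three constants and their values are the ones named above. The function names, the inclusive ByteRange shape, and the choice to clamp both ends of the request to the hook window are assumptions for illustration; the real rules live in lib/audio-access.ts.

```typescript
const PREVIEW_START_SEC = 35;
const AUDIO_PROXY_WINDOW_SEC = 30;
const ESTIMATED_BITRATE_BYTES_PER_SEC = 16000; // 128 kbps / 8 bits per byte

// Inclusive byte range, matching HTTP Range semantics.
interface ByteRange {
  start: number;
  end: number;
}

// Bytes an unentitled listener may receive, or null if the file is too short
// to contain any part of the window.
function allowedPreviewRange(
  fileSize: number,
  previewStartSec: number = PREVIEW_START_SEC
): ByteRange | null {
  const start = Math.floor(previewStartSec * ESTIMATED_BITRATE_BYTES_PER_SEC);
  const windowEnd = Math.floor(
    (previewStartSec + AUDIO_PROXY_WINDOW_SEC) * ESTIMATED_BITRATE_BYTES_PER_SEC
  );
  const end = Math.min(fileSize, windowEnd) - 1;
  return start <= end ? { start, end } : null;
}

// Intersect a requested range (open-ended when end is omitted, as in
// "bytes=0-") with the allowed range. Null means nothing can be served.
function clampRange(
  requested: { start: number; end?: number },
  allowed: ByteRange
): ByteRange | null {
  const start = Math.max(requested.start, allowed.start);
  const end = Math.min(requested.end ?? allowed.end, allowed.end);
  return start <= end ? { start, end } : null;
}
```

Under these assumptions, a 4 MB file with the default start allows bytes 560000 through 1039999, which is 480,000 bytes, or 30 seconds at the estimated bitrate.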

The audiovisual surface

The hook does not drive a separate decorative visualizer. It drives a resolved playback frame: lyric cue, phrase arrival, beat intensity, vocal energy, atmosphere, transition intensity, and visual ramp.

  • lib/playback-clock.ts reads the global audio element through a drift-corrected clock.
  • lib/hook-playback-timeline.ts precomputes phrase, word, breath, and waveform events for the hook window.
  • ShaderLab visuals, lyric subtitles, atmosphere, and audio-reactive treatments all read from that same frame.

The difference is subtle but important: Casset is not asking the page to "react" to sound after the fact. It is resolving where the listener is inside the phrase and letting the environment move with that moment. Full detail lives in the audiovisual playback doc.
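
To make "resolving where the listener is inside the phrase" concrete, here is a minimal sketch of looking up the active phrase in a precomputed, time-sorted list. The Phrase and ResolvedFrame shapes and the resolveFrame name are hypothetical. The real frame carries more fields (beat intensity, vocal energy, atmosphere), and this is not the code in lib/hook-playback-timeline.ts.

```typescript
// Hypothetical precomputed phrase entry, sorted by startSec, non-overlapping.
interface Phrase {
  startSec: number;
  endSec: number;
  text: string;
}

// Hypothetical slice of a resolved frame: the active phrase and how far
// through it the listener is (0 at phrase arrival, approaching 1 at its end).
interface ResolvedFrame {
  phrase: Phrase | null;
  progress: number;
}

function resolveFrame(phrases: Phrase[], tSec: number): ResolvedFrame {
  // Binary search for the last phrase that starts at or before tSec.
  let lo = 0;
  let hi = phrases.length - 1;
  let found = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (phrases[mid].startSec <= tSec) {
      found = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  // Before the first phrase, or in the gap after the candidate has ended.
  if (found === -1 || tSec >= phrases[found].endSec) {
    return { phrase: null, progress: 0 };
  }
  const p = phrases[found];
  return { phrase: p, progress: (tSec - p.startSec) / (p.endSec - p.startSec) };
}
```

Because every consumer reads the same resolved frame for a given clock time, lyrics and visuals cannot disagree about which phrase is active.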

How hooks travel

Preview cards

When someone pastes casset.fm/yourname into iMessage, Slack, or Twitter, the social crawler hits /[slug]/opengraph-image, which renders a 1200×630 branded card (profile picture + display name + first track title) on the fly using next/og.

The card is cached for an hour and regenerates when the artist edits their casset, so previews always reflect the current state.

TikTok share videos

Tap the share icon on any track. The client uses MediaRecorder to burn a 30-second 1080×1920 vertical video with:

  • The hook audio (from the signed endpoint)
  • The album art + artist display name
  • A waveform + progress ring synced to playback
  • A “use this sound → casset.fm/yourname” CTA baked in

The full pipeline: request a one-shot audio token from POST /api/audio/token; draw frame-by-frame onto a canvas; combine canvas.captureStream() with the audio element’s captureStream(); feed both into MediaRecorder for 30s; write the resulting blob to a download link. No server video encoding, no queue — everything happens in the listener’s browser.
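
The recording half of that pipeline can be sketched with standard browser APIs. This is an illustration under assumptions, not the share feature's actual code: pickMimeType and recordStreams are hypothetical names, the two streams are assumed to come from canvas.captureStream() and the audio element's captureStream(), and token fetching and frame drawing are left out.

```typescript
// Pick the first container the browser can record. isSupported is injected
// so the helper stays pure; in a browser, pass
// (t) => MediaRecorder.isTypeSupported(t).
function pickMimeType(
  candidates: string[],
  isSupported: (type: string) => boolean
): string | undefined {
  return candidates.find(isSupported);
}

// Combine a video stream and an audio stream, record for durationMs,
// and resolve with the finished blob. Browser-only.
function recordStreams(
  video: MediaStream,
  audio: MediaStream,
  durationMs: number,
  mimeType?: string
): Promise<Blob> {
  const combined = new MediaStream([
    ...video.getVideoTracks(),
    ...audio.getAudioTracks(),
  ]);
  const recorder = new MediaRecorder(combined, mimeType ? { mimeType } : undefined);
  const chunks: Blob[] = [];

  return new Promise<Blob>((resolve, reject) => {
    recorder.ondataavailable = (e) => {
      if (e.data.size > 0) chunks.push(e.data);
    };
    recorder.onerror = () => reject(new Error("recording failed"));
    recorder.onstop = () => resolve(new Blob(chunks, { type: recorder.mimeType }));
    recorder.start();
    // Stop after the hook duration; onstop then assembles the blob.
    setTimeout(() => recorder.stop(), durationMs);
  });
}
```

A caller would typically pass something like pickMimeType(["video/mp4", "video/webm"], (t) => MediaRecorder.isTypeSupported(t)), since container support differs between browsers, then hand the resulting blob to URL.createObjectURL for the download link.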

The result downloads to the user’s device as an MP4 where the browser’s MediaRecorder supports that container (some browsers record WebM instead). Upload straight to TikTok / Reels / Shorts.

Copy link

The plain link copy button always writes the canonical URL (casset.fm/<slug>). We explicitly avoid /share/<slug> for chat-app previews, because some crawlers don’t follow redirects and you end up with a bare URL instead of a card.

Why we don’t ship raw audio

The raw audio file never appears in the HTML or in any public URL. A listener’s browser requests /api/audio/[trackId], we check their entitlement (cookie-based session → purchase record), mint a short-lived signed URL for the object store, and proxy the bytes back.

GET /api/audio/<trackId>       # proxies the audio bytes back
Cookie: session=<jwt>          # must belong to an entitled user
                               # (or anyone, for preview-only playback
                               # which truncates server-side at the
                               # hook window)

This means a determined visitor can only ever capture the 30-second hook, not the full file. For unlocked listeners, the token expires in under a minute, so hotlinking is pointless.
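
The entitlement decision described above (session → purchase record → full or preview-only access) can be sketched as a pure function. The Session, Purchase, and Access shapes and the resolveAccess name are hypothetical, and the sketch leaves out token signing and the lookup against the real purchase store.

```typescript
const AUDIO_PROXY_WINDOW_SEC = 30; // preview cap, as named in the proxy section

// Hypothetical shapes for the example.
interface Session {
  userId: string;
}

interface Purchase {
  userId: string;
  trackId: string;
}

type Access =
  | { mode: "full" }
  | { mode: "preview"; maxSeconds: number };

// Entitled users get the full track; everyone else, including visitors
// with no session at all, gets preview-only access capped server-side.
function resolveAccess(
  session: Session | null,
  purchases: Purchase[],
  trackId: string
): Access {
  const userId = session?.userId;
  const entitled =
    userId !== undefined &&
    purchases.some((p) => p.userId === userId && p.trackId === trackId);
  return entitled
    ? { mode: "full" }
    : { mode: "preview", maxSeconds: AUDIO_PROXY_WINDOW_SEC };
}
```

Keeping preview as the fallback branch means a missing or invalid session degrades to the hook window instead of producing an error.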

Why 30 seconds

The whole product is built around one observation: songs don’t go viral because someone streamed them — they go viral because someone heard the right 30 seconds and couldn’t stop thinking about it.

Giving artists a first-class place to curate which 30 seconds is the wedge. The larger bet is that the hook can become an audiovisual object fans enter, help shape, support, share, and return to.

© Casset 2026