How hooks work.
The hook is the doorway into a Hook Object. Here's exactly how it's stored, served, rendered, and shared.
Every track on Casset has two views: the full song (gated until a listener unlocks it) and the hook, a 30-second window the artist chooses that plays instantly for any visitor. The product direction is to turn that hook into a Hook Object: audio, lyrics, visuals, provenance, presence, participation, release context, and proof of being early.
Anatomy of a hook
A hook is just two numbers on a track record:
- previewStartSec: where the 30-second window begins in the full track. If unset, it defaults to ~35s in (past most intros).
- hookDurationSec: optional per-track override. Defaults to HOOK_DURATION_SEC (30s) from lib/hook-constants.ts.
The artist picks these in edit mode by dragging the preview scrubber. The numbers are persisted to the DB, cached on the artist record, and served back with the rest of the casset data.
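As a rough illustration of how those two numbers and their defaults could resolve into a playable window, here is a minimal sketch. The `TrackHookFields` shape, the `durationSec` field, and the clamping that keeps the window inside the track are assumptions made for the example; only the two field names and the default values come from this page.

```typescript
// Defaults as described above. HOOK_DURATION_SEC lives in lib/hook-constants.ts;
// the 35s default start is the "~35s in" mentioned in the list.
const HOOK_DURATION_SEC = 30;
const PREVIEW_START_SEC = 35;

// Hypothetical record shape: only the two hook fields plus the track length.
interface TrackHookFields {
  durationSec: number;
  previewStartSec?: number | null;
  hookDurationSec?: number | null;
}

interface HookWindow {
  startSec: number;
  endSec: number;
}

function resolveHookWindow(track: TrackHookFields): HookWindow {
  const length = track.hookDurationSec ?? HOOK_DURATION_SEC;
  const requestedStart = track.previewStartSec ?? PREVIEW_START_SEC;

  // Assumption: a start too close to the end is pulled back so the window still fits.
  const latestStart = Math.max(0, track.durationSec - length);
  const startSec = Math.min(Math.max(0, requestedStart), latestStart);
  const endSec = Math.min(startSec + length, track.durationSec);

  return { startSec, endSec };
}
```

A track with neither field set and a length of 200 seconds would resolve to the window from 35s to 65s under these assumptions.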
Hook vs. Hook Object
The hook is the audio window. The Hook Object is the full experience built around it: synchronized lyric moments, realtime visual state, shader treatment, provenance, room presence, fan traces, unlock state, release-context references, and share/export identity.
That distinction keeps the product honest. A hook can be copied by any preview player. A Hook Object is the artist-owned audiovisual object where the song gets meaning before and after release, while the canonical release layer keeps manifest, permission, and provenance state underneath it.
Playback gating
Public visitors can only hear the hook. When a casset page loads for a visitor who hasn’t unlocked it:
- The audio element is pointed at /api/audio/[trackId] with a short-lived signed token.
- Playback seeks to previewStartSec and loops back when it reaches previewStartSec + hookDurationSec.
- When the loop wraps, we do a smooth fade-out/seek-to-start so the hook feels intentional rather than truncated.
Unlocked listeners (those who paid via Apple Pay or joined a free casset) get the full track with the same token endpoint — the server validates entitlement before signing.
The fade-and-loop
The "loops back" step in the previous section is more careful than it sounds. Cutting audio mid-word is the fastest way to make a preview feel broken, so the player does a micro-fade around every wrap:
- Watch timeupdate events. When the current offset gets within ~0.5s of the end of the hook window, pre-empt the wrap.
- Pin audio.volume to 0, seek() to the hook start, then ramp volume back over ~50ms.
- The audiovisual surface keeps its resolved frame through the fade, so lyrics and atmosphere do not flicker: the loop reads as "a restart", not a crash.
On tracks shorter than the hook window, we skip the fade and rely on the browser's native loop attribute.
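The wrap logic described in this section might look roughly like the sketch below. The function names and the `AudioLike` type are hypothetical (an HTMLAudioElement satisfies `AudioLike`), the ramp is assumed to be linear, and the seek is expressed by assigning `currentTime`, which is how a media element is repositioned.

```typescript
// Minimal structural type so the sketch does not depend on browser typings.
interface AudioLike {
  currentTime: number;
  volume: number;
}

const WRAP_LEAD_SEC = 0.5; // "~0.5s" from the text
const RAMP_MS = 50;        // "~50ms" from the text

// True once playback is close enough to the end of the hook window to pre-empt the wrap.
function shouldWrap(currentTime: number, hookEndSec: number): boolean {
  return currentTime >= hookEndSec - WRAP_LEAD_SEC;
}

// Volume to apply `elapsedMs` after the seek: a linear ramp from 0 up to targetVolume.
function rampVolume(elapsedMs: number, targetVolume: number): number {
  const t = Math.min(Math.max(elapsedMs / RAMP_MS, 0), 1);
  return targetVolume * t;
}

// Intended to be called from a timeupdate handler. Returns true when a wrap happened,
// so the caller can start driving rampVolume (for example from requestAnimationFrame).
function wrapIfNeeded(audio: AudioLike, hookStartSec: number, hookEndSec: number): boolean {
  if (!shouldWrap(audio.currentTime, hookEndSec)) return false;
  audio.volume = 0;                 // pin volume before the jump
  audio.currentTime = hookStartSec; // seek back to the start of the hook
  return true;
}
```

Keeping the decision and the ramp as pure functions makes the timing easy to check without a real audio element; the event wiring stays in the player.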
The waveform endpoint
The preview scrubber would be painful to use without a waveform, and decoding audio on every page load would be expensive. So we pre-compute:
- On upload, a background job decodes the file and downsamples it to ~500 normalized peaks using RMS per bucket.
- Peaks are stored alongside the track and served from GET /api/audio/waveform/[trackId].
- The scrubber renders them as vertical bars and overlays the 30s hook window as an iridescent band, the same visual treatment used in the marketing hero, so it feels consistent.
Waveform JSON is small (~2 KB) and cached aggressively on the edge. The scrubber paints it immediately, even before the audio element has loaded metadata.
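To make the "RMS per bucket" step concrete, here is a minimal sketch of the downsampling. It assumes the decoded audio is already available as mono samples in a Float32Array; the function name and the normalize-by-loudest-bucket choice are assumptions, not the actual background job.

```typescript
// Downsample decoded samples to at most `bucketCount` peaks, each the RMS of its bucket,
// then scale so the loudest bucket is 1.
function computePeaks(samples: Float32Array, bucketCount = 500): number[] {
  const buckets = Math.min(bucketCount, samples.length);
  if (buckets === 0) return [];

  const rms: number[] = [];
  for (let b = 0; b < buckets; b++) {
    // Integer bucket boundaries that cover every sample exactly once.
    const start = Math.floor((b * samples.length) / buckets);
    const end = Math.floor(((b + 1) * samples.length) / buckets);
    let sumSquares = 0;
    for (let i = start; i < end; i++) sumSquares += samples[i] * samples[i];
    rms.push(Math.sqrt(sumSquares / (end - start)));
  }

  const max = Math.max(...rms);
  // A silent file has max 0; return the zeros unchanged rather than dividing by zero.
  return max === 0 ? rms : rms.map((v) => v / max);
}
```

RMS tracks perceived loudness more closely than the absolute maximum per bucket, which tends to make the bars read as the shape of the song rather than a row of transient spikes.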
Proxy window (enforcing the 30s on the wire)
Client-side playback gating is for UX. Server-side byte-range truncation is for security. Two constants in lib/audio-access.ts shape what the server will serve to an unentitled listener:
- PREVIEW_START_SEC (= 35): default offset if previewStartSec isn't set.
- AUDIO_PROXY_WINDOW_SEC (= 30): max seconds past the start that the server will proxy. Same as the hook duration by design, but kept separate so we can open it up for experiments.
Seconds are converted to bytes with ESTIMATED_BITRATE_BYTES_PER_SEC (~16 KB/s for 128 kbps MP3). Even an aggressive Range: bytes=0- request gets a clipped response. Full detail in the audio pipeline doc.
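The arithmetic behind that clipping can be sketched as follows. The three constant names come from this section; the value 16,000 bytes per second (128 kbps divided by 8), the `clampRange` helper, and the decision to cap only the upper end of the range are assumptions for illustration, and the real handler in lib/audio-access.ts may differ.

```typescript
const PREVIEW_START_SEC = 35;
const AUDIO_PROXY_WINDOW_SEC = 30;
const ESTIMATED_BITRATE_BYTES_PER_SEC = 16_000; // assumed: 128,000 bits/s / 8

// Last byte (inclusive) an unentitled listener may receive.
function maxPreviewByte(previewStartSec?: number | null): number {
  const startSec = previewStartSec ?? PREVIEW_START_SEC;
  return (startSec + AUDIO_PROXY_WINDOW_SEC) * ESTIMATED_BITRATE_BYTES_PER_SEC - 1;
}

interface ByteRange {
  start: number;
  end: number; // inclusive, as in an HTTP Range header
}

// Clamp a requested range to the preview limit. `end` is omitted for open-ended
// requests such as "bytes=0-". Returns null when nothing in the range is allowed.
function clampRange(
  requested: { start: number; end?: number },
  fileSize: number,
  previewStartSec?: number | null,
): ByteRange | null {
  const limit = Math.min(maxPreviewByte(previewStartSec), fileSize - 1);
  if (requested.start > limit) return null;
  const end = Math.min(requested.end ?? limit, limit);
  return { start: requested.start, end };
}
```

With the defaults, an open-ended request against a 5 MB file is cut at byte 1,039,999, that is (35 + 30) × 16,000 − 1. Because the bitrate is an estimate, the cut lands near the end of the hook rather than exactly on it.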
The audiovisual surface
The hook does not drive a separate decorative visualizer. It drives a resolved playback frame: lyric cue, phrase arrival, beat intensity, vocal energy, atmosphere, transition intensity, and visual ramp.
- lib/playback-clock.ts reads the global audio element through a drift-corrected clock.
- lib/hook-playback-timeline.ts precomputes phrase, word, breath, and waveform events for the hook window.
- ShaderLab visuals, lyric subtitles, atmosphere, and audio-reactive treatments all read from that same frame.
The difference is subtle but important: Casset is not asking the page to "react" to sound after the fact. It is resolving where the listener is inside the phrase and letting the environment move with that moment. Full detail lives in the audiovisual playback doc.
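One way to picture "resolving where the listener is inside the phrase" is a lookup against the precomputed timeline. The sketch below is purely illustrative: the `PhraseEvent` and `ResolvedFrame` shapes and the `resolveFrame` function are hypothetical and cover only the phrase part of the frame, not the real contents of lib/hook-playback-timeline.ts.

```typescript
// Hypothetical event shape; times are seconds relative to the start of the hook.
interface PhraseEvent {
  startSec: number;
  endSec: number;
  text: string;
}

interface ResolvedFrame {
  hookTimeSec: number;
  phrase: PhraseEvent | null; // null between phrases
  phraseProgress: number;     // 0..1 position inside the active phrase
}

// `phrases` must be sorted by startSec and non-overlapping.
function resolveFrame(phrases: PhraseEvent[], hookTimeSec: number): ResolvedFrame {
  // Binary search for the last phrase that starts at or before hookTimeSec.
  let lo = 0;
  let hi = phrases.length - 1;
  let idx = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (phrases[mid].startSec <= hookTimeSec) {
      idx = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }

  const candidate = idx >= 0 ? phrases[idx] : null;
  const phrase = candidate && hookTimeSec < candidate.endSec ? candidate : null;
  const phraseProgress = phrase
    ? (hookTimeSec - phrase.startSec) / (phrase.endSec - phrase.startSec)
    : 0;

  return { hookTimeSec, phrase, phraseProgress };
}
```

Every consumer (subtitles, shaders, atmosphere) would read the same returned object for a given clock tick, which is what keeps them in step with each other.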
How hooks travel
Preview cards
When someone pastes casset.fm/yourname into iMessage, Slack, or Twitter, the social crawler hits /[slug]/opengraph-image, which renders a 1200×630 branded card (pfp + display name + first track title) on the fly using next/og.
The card is cached for an hour and regenerates when the artist edits their casset, so previews always reflect the current state.
TikTok share videos
Tap the share icon on any track. The client uses MediaRecorder to burn a 30-second 1080×1920 vertical video with:
- The hook audio (from the signed endpoint)
- The album art + artist display name
- A waveform + progress ring synced to playback
- A “use this sound → casset.fm/yourname” CTA baked in
The full pipeline: request a one-shot audio token from POST /api/audio/token; draw frame-by-frame onto a canvas; combine canvas.captureStream() with the audio element’s captureStream(); feed both into MediaRecorder for 30s; write the resulting blob to a download link. No server video encoding, no queue — everything happens in the listener’s browser.
The result downloads to the user’s device as an MP4. Upload straight to TikTok / Reels / Shorts.
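One detail of the recording step that is easy to overlook is the container format, since what MediaRecorder can produce varies by browser. The helper below is a small sketch of picking a supported type before constructing the recorder. The candidate list and the WebM fallbacks are assumptions; MP4 is listed first because that is the format named above.

```typescript
// Preference order is an assumption: MP4 first, then WebM variants as a fallback.
const CANDIDATE_MIME_TYPES = [
  "video/mp4",
  "video/webm;codecs=vp9,opus",
  "video/webm",
];

// `isTypeSupported` is injected so the helper can be exercised outside a browser.
// In the browser, pass: (type) => MediaRecorder.isTypeSupported(type)
function pickRecorderMimeType(
  isTypeSupported: (mimeType: string) => boolean,
): string | null {
  return CANDIDATE_MIME_TYPES.find((type) => isTypeSupported(type)) ?? null;
}
```

In the browser, the chosen string would then be passed as the `mimeType` option when constructing the MediaRecorder over the combined canvas and audio stream.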
Copy link
The plain link copy button always writes the canonical URL (casset.fm/<slug>). We explicitly avoid /share/<slug> for chat-app previews, because some crawlers don’t follow redirects and you end up with a bare URL instead of a card.
Why we don’t ship raw audio
The raw audio file never appears in the HTML or in any public URL. A listener’s browser requests /api/audio/[trackId], we check their entitlement (cookie-based session → purchase record), mint a short-lived signed URL for the object store, and proxy the bytes back.
GET /api/audio/<trackId>     # returns signed stream URL
Cookie: session=<jwt>        # must belong to an entitled user
                             # (or anyone, for preview-only playback
                             # which truncates server-side at the
                             # hook window)
This means a determined visitor can only ever capture the 30-second hook, not the full file. For unlocked listeners, the token expires in under a minute, so hotlinking is pointless.
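To show why a short-lived signed token makes hotlinking pointless, here is a minimal sketch of minting and verifying one with an HMAC. This page does not say how Casset signs its tokens, so the token layout, the 45-second lifetime, and both function names are assumptions; only `createHmac` and `timingSafeEqual` from Node's crypto module are real APIs.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const TOKEN_TTL_SEC = 45; // assumed value for "under a minute"

function sign(payload: string, secret: string): string {
  return createHmac("sha256", secret).update(payload).digest("base64url");
}

// Assumed token layout: trackId.scope.expiresAtSec.signature (trackId must not contain dots).
function mintAudioToken(
  trackId: string,
  scope: "preview" | "full",
  secret: string,
  nowSec: number,
): string {
  const payload = `${trackId}.${scope}.${nowSec + TOKEN_TTL_SEC}`;
  return `${payload}.${sign(payload, secret)}`;
}

// Returns the claims when the signature matches and the token has not expired, else null.
function verifyAudioToken(
  token: string,
  secret: string,
  nowSec: number,
): { trackId: string; scope: string } | null {
  const parts = token.split(".");
  if (parts.length !== 4) return null;
  const [trackId, scope, expires, signature] = parts;

  const expected = Buffer.from(sign(`${trackId}.${scope}.${expires}`, secret));
  const given = Buffer.from(signature);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  if (Number(expires) <= nowSec) return null;

  return { trackId, scope };
}
```

Because the scope is part of the signed payload, a preview token cannot be edited into a full-track token without invalidating the signature, and a copied URL stops working once the expiry passes.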
Why 30 seconds
The whole product is built around one observation: songs don’t go viral because someone streamed them — they go viral because someone heard the right 30 seconds and couldn’t stop thinking about it.
Giving artists a first-class place to curate which 30 seconds is the wedge. The larger bet is that the hook can become an audiovisual object fans enter, help shape, support, share, and return to.