AI content system
Multi-modal AI content platform with SSE streaming, credit-based payments, and admin panel.
AI content generation platform with a multi-modal composer, chat/studio UX, credit-based usage, real-time SSE streaming, media uploads, and multi-gateway payment integration.
- Role: Founding Engineer / Frontend Architect
- Year: 2026
- Status: building
- Website: Private project
01
Problem
The client needed a production-ready AI content generation platform built from scratch: multi-provider AI routing (Anthropic, OpenAI, Gemini), real-time token streaming, credit-based monetization with dual payment gateways, media storage, and a full admin panel. The stack had to be modern and maintainable: a full-stack TypeScript monorepo with type-safe DB access, shipped to production.
02
What I built
- 1. SSE streaming AI composer: a NestJS interceptor streams token chunks; a React client hook manages the EventSource lifecycle, handles cancellation, and appends tokens to a ref-backed buffer so the UI does not re-render on every token.
- 2. Credit-based usage system with Paddle (international) and YooKassa (CIS): webhook handling, balance reconciliation, and an optimistic UI that deducts the balance when a generation starts and reconciles it after the stream completes.
- 3. Admin panel with model management (enable/disable providers, set per-model pricing), usage analytics by user and model, and credit adjustment tooling.
- 4. Cloudflare R2 media pipeline: the API generates a presigned PUT URL, the client uploads directly to R2, and metadata is stored in Postgres after upload confirmation.
- 5. Multi-modal composer UI: compose mode (structured prompt builder), chat mode (multi-turn), and a history panel, all sharing one generation state machine.
- 6. Full-stack monorepo: TanStack Start frontend + NestJS API, sharing a Drizzle ORM schema package with type-safe migrations and inferred query types.
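The shared generation state machine from item 5 could look roughly like the sketch below. The state names, event names, and the `transition` function are hypothetical, chosen for illustration; the source does not show the real machine. The point is that compose mode, chat mode, and history can all drive the same pure reducer.

```typescript
// Hypothetical state and event names; not taken from the project's code.
type GenerationState =
  | { status: "idle" }
  | { status: "streaming"; id: string; output: string }
  | { status: "completed"; id: string; output: string }
  | { status: "cancelled"; id: string; output: string } // partial output is kept
  | { status: "failed"; id: string; error: string };

type GenerationEvent =
  | { type: "START"; id: string }
  | { type: "TOKEN"; text: string }
  | { type: "DONE" }
  | { type: "CANCEL" }
  | { type: "ERROR"; error: string };

function transition(state: GenerationState, event: GenerationEvent): GenerationState {
  if (event.type === "START") {
    // Only one generation runs at a time; START is ignored while streaming.
    return state.status === "streaming"
      ? state
      : { status: "streaming", id: event.id, output: "" };
  }
  // Every other event is only meaningful while a stream is active.
  if (state.status !== "streaming") return state;
  switch (event.type) {
    case "TOKEN":
      return { status: "streaming", id: state.id, output: state.output + event.text };
    case "DONE":
      return { status: "completed", id: state.id, output: state.output };
    case "CANCEL":
      return { status: "cancelled", id: state.id, output: state.output };
    case "ERROR":
      return { status: "failed", id: state.id, error: event.error };
  }
}
```

Because the reducer is pure, it can sit inside a store or a hook unchanged, and late tokens that arrive after a cancel are simply dropped.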
03
Architecture
- 1. TanStack Start (SSR + SPA hybrid) over Next.js for its more explicit data loading model and tighter TanStack Query integration. This avoids the RSC mental model, where server/client data boundaries are implicit.
- 2. Generation lifecycle managed by a Zustand store (not React Query): it tracks whether a generation is running, its ID, and its cancellation state. React Query handles only server state: history, user data, balance.
- 3. SSE stream state lives in a React ref, not useState, which prevents 30+ re-renders per second as tokens arrive. The ref syncs to display state once per animation frame via requestAnimationFrame batching.
- 4. Credit atomicity: Postgres advisory locks prevent double-spend when the same user opens multiple tabs and starts concurrent generations.
- 5. Drizzle ORM for type-safe queries and migrations: a thin query builder with full TypeScript inference on joined queries, without the overhead of a heavier ORM abstraction.
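The frame-batching idea in item 3 can be sketched without React. In this minimal version a closure variable stands in for the React ref, and the scheduler is injected so the logic can run outside a browser. `createTokenBuffer`, `Schedule`, and `onFlush` are hypothetical names, not the project's API.

```typescript
// A scheduler runs the callback "later"; in a browser this would wrap requestAnimationFrame.
type Schedule = (callback: () => void) => void;

function createTokenBuffer(onFlush: (text: string) => void, schedule: Schedule) {
  let buffer = "";        // stands in for the React ref: mutated without re-rendering
  let scheduled = false;  // true while a flush is already queued for the next frame

  return {
    push(token: string): void {
      buffer += token;
      if (scheduled) return; // coalesce: at most one flush per frame
      scheduled = true;
      schedule(() => {
        scheduled = false;
        onFlush(buffer);     // one display-state update with everything received so far
      });
    },
    current(): string {
      return buffer;
    },
  };
}
```

In a component, `onFlush` would be the display-state setter and `schedule` would be `(cb) => requestAnimationFrame(cb)`, so any number of tokens arriving within one frame produce a single render.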
04
UX decisions
- 1. Optimistic credit deduction on generation start: the balance updates immediately in the UI and reconciles after the stream ends. This avoids the awkward "checking balance…" state users would otherwise see on every generation.
- 2. Streaming output renders progressively, with paragraphs reflowing as tokens arrive. While streaming is in progress, the composer shows a skeleton for the next expected content block.
- 3. Cancel support: the user can abort mid-generation; partial output is preserved and saved as a "partial" history entry with a retry affordance.
- 4. Separate compose and chat modes with different affordances: compose is structured (template slots, output format selector), chat is conversational. Both share the same streaming infrastructure.
- 5. Admin analytics served from read-only materialized views, so the dashboard never runs live queries against the generations table, which can hold millions of rows.
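The optimistic deduction in item 1 can be modeled as a confirmed balance plus a set of per-generation holds. The sketch below is one possible shape, not the project's code: `BalanceView`, `reserve`, and `reconcile` are hypothetical names, and it assumes the server reports the authoritative balance once a stream ends.

```typescript
interface BalanceView {
  confirmed: number;              // last balance reported by the server
  holds: Record<string, number>;  // estimated cost per in-flight generation
}

// What the UI shows: the confirmed balance minus all optimistic holds.
function displayedBalance(view: BalanceView): number {
  return Object.keys(view.holds).reduce(
    (total, id) => total - (view.holds[id] ?? 0),
    view.confirmed,
  );
}

// On generation start: deduct immediately by recording a hold.
function reserve(view: BalanceView, generationId: string, estimate: number): BalanceView {
  return { confirmed: view.confirmed, holds: { ...view.holds, [generationId]: estimate } };
}

// After the stream ends: drop the hold and trust the server's balance.
function reconcile(view: BalanceView, generationId: string, serverBalance: number): BalanceView {
  const holds = { ...view.holds };
  delete holds[generationId];
  return { confirmed: serverBalance, holds };
}
```

Keeping holds keyed by generation ID means a cancelled or failed generation can release its own hold without disturbing others that are still streaming.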
05
Challenges
- 1. The browser's EventSource reconnects automatically after a network drop, so NestJS had to track active connection IDs and skip re-sending tokens that had already been delivered. Solved with a per-generation event cursor stored in Redis.
- 2. TanStack Start was in public beta during development, and its public API changed twice. An adapter layer abstracting router-specific data fetching kept the codebase forward compatible.
- 3. Multi-tab credit consistency: two tabs can start generations at the same time. Advisory locks handle DB-level atomicity; the UI subscribes to credit update events via a background SSE channel separate from the generation streams.
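The reconnect handling in item 1 comes down to numbering events and replaying only those past the client's cursor. In the sketch below an in-memory array stands in for the Redis-backed cursor, and it assumes the cursor arrives via the standard `Last-Event-ID` header that EventSource sends on reconnect when events carry an `id` field. `GenerationEventLog`, `parseLastEventId`, and `toSseFrame` are hypothetical names.

```typescript
interface StreamEvent {
  id: number;   // monotonically increasing per generation
  data: string; // one token chunk; assumed to contain no newlines
}

// In-memory stand-in for the per-generation log kept in Redis.
class GenerationEventLog {
  private events: StreamEvent[] = [];

  append(data: string): StreamEvent {
    const event: StreamEvent = { id: this.events.length + 1, data };
    this.events.push(event);
    return event;
  }

  // Return only the events the client has not yet received.
  replayAfter(lastEventId: number): StreamEvent[] {
    return this.events.filter((event) => event.id > lastEventId);
  }
}

// A missing or malformed header means "start from the beginning".
function parseLastEventId(header: string | undefined): number {
  const n = Number(header);
  return isFinite(n) && Math.floor(n) === n && n > 0 ? n : 0;
}

// Serialize one event in the SSE wire format, including the id the browser echoes back.
function toSseFrame(event: StreamEvent): string {
  return `id: ${event.id}\ndata: ${event.data}\n\n`;
}
```

On reconnect, a handler would call `replayAfter(parseLastEventId(header))`, write the missed frames, and then resume live streaming from the same log.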
06
Outcome
Platform shipped to production. Handles multi-tenant AI generation workflows across web and API channels. Credit system processes payments via Paddle (international) and YooKassa (CIS markets).