Inside voltage-kalshi: Building a Kalshi BTC Volatility Bot

Kalshi is a regulated prediction market. You're not trading BTC itself — you're trading binary contracts on whether BTC closes above or below a specified level within a given window. Most participants treat these like sports bets. I treat them like a structured volatility product with extractable edge.

The key insight: forecasting volatility is a different problem from forecasting price. A directional model asks "will BTC go up or down?" A volatility model asks "will BTC move more than X% in the next N hours?" The second question has more stable signal, because volatility clusters: large moves tend to follow large moves, whichever way they point. Markets also tend to be less efficient at pricing volatility events than directional bets, and the asymmetry in contract payoffs creates exploitable opportunities.

This is what voltage-kalshi is built around.

The Data Pipeline

You need two data streams running in parallel: Kalshi's orderbook and Kraken's OHLCV feed.

Kalshi exposes a WebSocket API for real-time orderbook updates. Each tick carries best bid, best ask, last price, and volume on both sides. The relevant signal here is order flow imbalance: when buyers consistently lift the ask, it suggests someone informed is positioning.

Kraken's WebSocket streams OHLCV candle updates for BTC/USD (XBT/USD in Kraken's naming). I subscribe to both the 1-minute and 5-minute channels. The subscription message looks like this:

Kraken WS subscription (interval is in minutes; 1 = 1-minute candles):
{
  "event": "subscribe",
  "pair": ["XBT/USD"],
  "subscription": {
    "name": "ohlc",
    "interval": 1
  }
}

The first engineering problem: WebSocket connections drop. Kraken will silently disconnect under load. The fix is a reconnection manager that tracks last heartbeat timestamp and reconnects with exponential backoff. If you miss candles during reconnection, you backfill via the REST API before resuming live subscriptions. Never assume the stream is healthy.
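The reconnection logic above can be sketched as three small pure functions. This is a minimal illustration, not the bot's actual code: the names backoff_delay, is_stale, and missing_candles, the 10-second heartbeat timeout, and the 60-second cap are all assumptions chosen for the example.

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  jitter: float = 0.0) -> float:
    """Seconds to wait before reconnect attempt `attempt` (0-based)."""
    # Clamp the exponent so very large attempt counts cannot overflow.
    delay = min(cap, base * (2 ** min(attempt, 32)))
    # Optional jitter spreads out reconnects; 0.0 keeps the delay deterministic.
    return delay + random.uniform(0.0, jitter * delay)

def is_stale(last_heartbeat: float, now: float, timeout: float = 10.0) -> bool:
    """True when no heartbeat has arrived within `timeout` seconds."""
    return (now - last_heartbeat) > timeout

def missing_candles(last_seen_ts: int, now_ts: int,
                    interval_s: int = 60) -> list[int]:
    """Start timestamps of fully closed candles missed since `last_seen_ts`."""
    first = last_seen_ts + interval_s
    # A candle starting at t is closed once t + interval_s <= now_ts.
    return list(range(first, now_ts - interval_s + 1, interval_s))
```

A watchdog would call is_stale on a timer, tear down the socket when it returns True, sleep for backoff_delay(attempt) between attempts, and request the timestamps from missing_candles over REST before resubscribing.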

Feature Engineering for LightGBM

LightGBM needs tabular features. I compute these in a rolling window over the incoming OHLCV stream:

Realised volatility — Parkinson estimator over 5m, 15m, and 1h windows. More efficient than close-to-close vol because it uses high/low information.
Return momentum — signed 5m and 15m price returns, and their ratio (momentum persistence).
Volume Z-score — current volume normalised against the rolling 20-period mean and std. Volume spikes precede volatility spikes.
Order flow imbalance — from the Kalshi orderbook: (bid_size - ask_size) / (bid_size + ask_size). Values near ±1 signal strong directional conviction.
Time features — hour of day, day of week, encoded cyclically with sin/cos. BTC volatility has strong intraday seasonality around US market open and London overlap.
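Three of the features above are simple enough to sketch with the standard library. The function names are hypothetical, and two details are assumptions the text does not settle: the rolling window here excludes the current candle, and the standard deviation is the population form.

```python
import math
from statistics import mean, pstdev

def volume_zscore(volumes: list[float], window: int = 20) -> float:
    """Z-score of the latest volume against the preceding `window` values."""
    history, current = volumes[-window - 1:-1], volumes[-1]
    if len(history) < 2:
        return 0.0  # not enough history to estimate a spread
    sd = pstdev(history)
    return 0.0 if sd == 0 else (current - mean(history)) / sd

def order_flow_imbalance(bid_size: float, ask_size: float) -> float:
    """(bid_size - ask_size) / (bid_size + ask_size), in [-1, 1]."""
    total = bid_size + ask_size
    return 0.0 if total == 0 else (bid_size - ask_size) / total

def cyclical(value: float, period: float) -> tuple[float, float]:
    """Encode a periodic value (hour of day, day of week) as (sin, cos)."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)
```

The sin/cos pair keeps 23:00 and 00:00 adjacent in feature space, which a raw hour integer would not.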

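Why Parkinson vol? Close-to-close volatility wastes information. The Parkinson estimator averages (ln(H/L))² / (4 ln 2) over the candles in the window, so it uses the full high-low range. Under its assumptions (driftless geometric Brownian motion, continuously observed) it is roughly 5x more efficient than the close-to-close estimator. Those assumptions hold only approximately for 1-minute crypto candles, but for short windows the gain still matters.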
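As a concrete reading of that formula, here is a minimal sketch that averages the per-candle term over a window and takes the square root. The function names are illustrative, and the result is per-candle volatility with no annualisation applied.

```python
import math

def parkinson_variance(highs: list[float], lows: list[float]) -> float:
    """Mean of (ln(H/L))^2 / (4 ln 2) over the candles in the window."""
    k = 4.0 * math.log(2.0)
    terms = [math.log(h / l) ** 2 / k for h, l in zip(highs, lows)]
    return sum(terms) / len(terms)

def parkinson_vol(highs: list[float], lows: list[float]) -> float:
    """Per-candle Parkinson volatility (square root of the variance)."""
    return math.sqrt(parkinson_variance(highs, lows))
```

Passing the last 5, 15, or 60 one-minute candles gives the three window lengths listed in the feature set.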

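LightGBM trains in minutes, handles missing values natively, and outputs probabilities directly with objective: binary + is_unbalance: true. One caveat: is_unbalance reweights the classes, which helps ranking metrics but skews the raw probability estimates, so they need recalibrating on held-out data before they feed position sizing. With that step in place, it's the right tool for this feature set.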

PatchTST: The Transformer Component

LightGBM is good at cross-sectional patterns but captures temporal structure only indirectly through lagged features. PatchTST — Patch Time Series Transformer — addresses this directly.

The key idea: instead of feeding the transformer one timestep at a time, you divide the sequence into patches (like ViT does for image patches) and apply attention over those patches. Patches are non-overlapping when the stride equals the patch length and overlapping when it is smaller, as with the defaults in the code below. This cuts the number of tokens by roughly a factor of the stride and lets the model capture long-range dependencies that lagged features would miss.

# Patch sequence construction
import torch

def patchify(x, patch_len=16, stride=8):
    # x: [batch, seq_len, features]
    # Trailing timesteps that don't fill a whole patch are dropped
    n_patches = (x.shape[1] - patch_len) // stride + 1
    patches = torch.stack([
        x[:, i*stride : i*stride + patch_len, :]
        for i in range(n_patches)
    ], dim=1)
    # stacked:   [batch, n_patches, patch_len, features]
    # flattened: [batch, n_patches, patch_len * features]
    return patches.flatten(2)

I train PatchTST on 60-candle sequences (1 hour of 1-min data) predicting a binary volatility event label. The model outputs a probability independently for each channel (price, volume, order flow) and they're averaged in the output head.
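It is worth checking the patch arithmetic for these numbers. With 60 candles, patch_len=16, and stride=8, the patching formula yields 6 patches and leaves the 4 most recent candles outside any patch. The PatchTST paper handles this by repeating the last value stride times at the end of the sequence before patching. The helper names below are hypothetical, and whether voltage-kalshi pads this way is not stated in the text.

```python
def patch_starts(seq_len: int, patch_len: int = 16, stride: int = 8,
                 pad_end: bool = False) -> list[int]:
    """Start index of every patch; pad_end mimics PatchTST end padding."""
    if pad_end:
        seq_len += stride  # last value repeated `stride` times
    n_patches = (seq_len - patch_len) // stride + 1
    return [i * stride for i in range(n_patches)]

def uncovered(seq_len: int, patch_len: int = 16, stride: int = 8) -> list[int]:
    """Timestep indices that fall outside every patch when there is no padding."""
    starts = patch_starts(seq_len, patch_len, stride)
    covered_until = starts[-1] + patch_len if starts else 0
    return list(range(covered_until, seq_len))
```

For a volatility signal the dropped candles are the freshest ones, so either pad the end or pick a sequence length for which (seq_len - patch_len) is a multiple of the stride, such as 64.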

Ensemble and Position Sizing

At inference time, LightGBM and PatchTST each produce a probability estimate for the volatility event. I calibrate each model's output with isotonic regression on a held-out validation set, then combine the calibrated probabilities with a learned weighted average. Typically the weights end up around 0.55 LightGBM / 0.45 PatchTST, though this shifts during periods of rapid regime change.
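A minimal sketch of the blending step, assuming both inputs are already calibrated probabilities. The weight search here is a plain grid search on validation log loss, swapped in for illustration; the text does not say how the weights are actually learned, and the function names are hypothetical.

```python
import math

def blend(p_lgbm: float, p_patch: float, w_lgbm: float = 0.55) -> float:
    """Weighted average of the two model probabilities."""
    return w_lgbm * p_lgbm + (1.0 - w_lgbm) * p_patch

def log_loss(probs: list[float], labels: list[int], eps: float = 1e-12) -> float:
    """Mean binary cross-entropy, with probabilities clipped away from 0 and 1."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

def fit_weight(p_lgbm: list[float], p_patch: list[float],
               labels: list[int], steps: int = 20) -> float:
    """LightGBM weight in [0, 1] that minimises validation log loss."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda w: log_loss(
        [blend(a, b, w) for a, b in zip(p_lgbm, p_patch)], labels))
```

Refitting the weight on a recent validation slice is one way to track the drift in weights that shows up when regimes change.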

Position sizing uses the Kelly criterion:

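// Kelly fraction
// b = net odds (contract payout ratio); for a contract bought at `price`
//     that pays 1, b = (1 - price) / price before fees
// p = model's estimated P(win)
// q = 1 - p
const q = 1 - p;
const kelly = (b * p - q) / b;

// Use fractional Kelly (0.25x) to account for model uncertainty.
// Clamp at zero: a negative Kelly fraction means no edge, so no trade.
const position = bankroll * Math.max(0, kelly) * 0.25;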

Full Kelly is theoretically optimal but practically suicidal when your model is wrong. 0.25x Kelly gives up some expected growth in exchange for dramatically lower drawdowns. With a real capital account, surviving a bad week matters more than maximising a good one.
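A worked example makes the sizing concrete. This sketch assumes a binary contract bought at price (between 0 and 1) that pays 1 on a win, so the net odds are b = (1 - price) / price. Fees are ignored, which is a simplification, and the function names are illustrative.

```python
def kelly_fraction(p: float, price: float) -> float:
    """Kelly fraction (b*p - q) / b for a contract bought at `price` paying 1."""
    b = (1.0 - price) / price   # net odds, before fees
    q = 1.0 - p
    return (b * p - q) / b

def position_size(bankroll: float, p: float, price: float,
                  multiplier: float = 0.25) -> float:
    """Fractional-Kelly stake; zero when the model sees no edge."""
    return bankroll * max(0.0, kelly_fraction(p, price)) * multiplier
```

With a 1,000 bankroll, a model probability of 0.60, and a contract priced at 0.50, full Kelly is 20% of bankroll and the 0.25x stake is about 50. Algebraically the fraction reduces to (p - price) / (1 - price), so the stake scales with the gap between the model's probability and the market's.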

What Breaks in Production

The gap between a backtest and a live system is the gap between theory and grief. Here's what I encountered that no test suite caught:

Stale orderbook state. If the Kalshi WS lags even 200ms, your order flow imbalance feature is wrong. Add a staleness check before inference — if the last orderbook update is older than 500ms, skip the trade.
Kalshi API rate limits. The order submission endpoint throttles at 10 req/s. Under fast market conditions you'll hit this and miss entries. Queue submissions behind a rate limiter rather than firing and forgetting.
Model drift. BTC volatility regimes change. A model trained on a low-vol period will be miscalibrated when the regime shifts to high vol, typically underestimating event probabilities. I retrain weekly on a rolling 90-day window and monitor calibration error daily.
The TypeScript async trap. Race conditions in async position management are hard to reproduce. Use a single async queue for all order mutations — never let two concurrent paths touch the same position state.
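Three of these fixes (the staleness gate, the rate limiter, and the single mutation queue) compose naturally into one worker. The sketch below uses Python's asyncio rather than TypeScript, purely to stay consistent with the other examples. OrderQueue, submit, and book_ts are hypothetical names, and submit stands in for whatever coroutine actually sends an order. The 10 req/s and 500ms defaults come from the text.

```python
import asyncio
import time

class OrderQueue:
    """Single consumer for all order mutations, paced to a max request rate."""

    def __init__(self, submit, max_per_sec=10.0, max_book_age=0.5,
                 clock=time.monotonic):
        self._submit = submit              # hypothetical async callable
        self._min_gap = 1.0 / max_per_sec  # minimum seconds between sends
        self._max_book_age = max_book_age  # staleness threshold in seconds
        self._clock = clock
        self._queue = asyncio.Queue()
        self.skipped = 0

    async def put(self, order, book_ts):
        """Enqueue an order with the clock time of the orderbook it was based on."""
        await self._queue.put((order, book_ts))

    async def join(self):
        """Wait until every queued order has been sent or skipped."""
        await self._queue.join()

    async def run(self):
        last_sent = float("-inf")
        while True:
            order, book_ts = await self._queue.get()
            try:
                # Staleness gate: drop orders whose book snapshot is too old.
                if self._clock() - book_ts > self._max_book_age:
                    self.skipped += 1
                    continue
                # Pace submissions so the rate limit is never exceeded.
                wait = self._min_gap - (self._clock() - last_sent)
                if wait > 0:
                    await asyncio.sleep(wait)
                await self._submit(order)
                last_sent = self._clock()
            finally:
                self._queue.task_done()
```

Because only run() ever calls submit, no two code paths can mutate position state concurrently. Create the OrderQueue inside a running coroutine and start run() with asyncio.create_task. The staleness check happens at dequeue time on purpose, since an order that waited behind the rate limiter may have outlived its signal.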

The most important lesson: live trading is a distributed systems problem disguised as a machine learning problem. The model is 30% of the work. The operational infrastructure is 70%.