> WHATS_NEW
> TECH_KUDOS
The stack that fits all of this on one tiny VPS:
- .NET 9 — the C# whale ingestor
- SQLite — the single-file durable store
- systemd — service orchestration + the timers that drive the watchdog and the 4-hour surface publisher
- nginx — TLS termination + reverse proxy
- Python + Flask + gunicorn — the small private ops view
- Pushover — the actual page-to-phone delivery
- Cursor + Claude — the pair-programmer that wrote most of the watchdog while I argued about thresholds
> WHAT_THIS_IS
WhaleFlows is a hobby project that surfaces large on-chain ETH, BTC, and LINK transactions plus net exchange flows across Binance, Coinbase, Kraken, and Bitfinex hot/cold wallets (with Bybit and Gemini live on the ETH side), then self-publishes a snapshot to the surface log on THE_DECK every four hours on the UTC clock. The data is stated; the interpretation is yours. There is intentionally no compose box and no algorithm in the middle.
Sources
All on-chain & price feeds, end to end. No news, no social, no off-chain sentiment shaping anything on the deck.
- Hero-strip ticker prices — live median of three US-regulated spot venues, refreshed every 60s: api.binance.us latest-trade ticker, api.coinbase.com /v2/prices/spot, and api.kraken.com /0/public/Ticker. Each ticker cell shows which venues responded this tick + their individual prices, so the median is auditable inline. Partial-venue ticks still render: median is computed over whichever venues responded.
- Price candles (history + charts + 1h/4h/1D deltas) — Binance.US public klines REST endpoint, 1-minute granularity, polled every 60s by the .NET worker into price_candles. We store binance.us as the single historical source because charting median candles would require minute-aligned synchronization across three venues; not worth the complexity vs. clarity.
- Ethereum whale txs (≥ 1,000 ETH) — Etherscan block data, scanned forward from the last cursor each cycle.
- Bitcoin whale txs (≥ 100 BTC) — mempool.space block data, scanned forward from the last cursor each cycle.
- ERC-20 whale txs — LINK first (≥ 50,000 LINK, ~$1M) — Etherscan eth_getLogs on the token's Transfer event, with a per-token block cursor. Adding the next ERC-20 (UNI, MKR, AAVE, …) is one config row in App.Whales.Erc20Tokens — no code change. LINK whales contribute to the SURFACE_LOG/TREND grid today; a dedicated LINK panel on NET_24H_EXCHANGE_FLOWS lands once we’ve mapped enough verified ETH-side LINK custody addresses for the net-flow math to mean something.
- Address-history stream — Etherscan account/txlist for ETH addresses and mempool.space /address/{a}/txs for BTC addresses, swept every 6h for every address in the cex_address_book. NO whale threshold — captures every tx in/out of every tagged wallet. Each address anchors at the chain tip on first sight, no historical backfill.
- Fear & Greed Index — alternative.me daily reading.
- Exchange-wallet labels — a single committed JSON address book (currently 27 ETH + 7 BTC addresses across 8 exchanges) cross-referenced from public block-explorer tags (Etherscan, Bitinfocharts, BlockExplorer, Spark Privacy Audit, Arkham) and each exchange's published proof-of-reserves attestations. Every entry carries a source note and a verified flag — unverified rows are tag-only and held out of the net-flow math.
- Weather (background mood only) — open-meteo.com current-weather endpoint, queried with the viewer's browser geolocation (only after the viewer grants permission — if denied, the background just falls back to a moonlit-ocean default). 30-minute cache. Affects only the page background; data sections never change wording based on weather.
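The partial-venue median behind the hero-strip prices can be sketched in a few lines. This is a minimal illustration, not the production code: the function name hero_price, the venue keys, and the convention that a non-responding venue is represented as None are all assumptions made for the example.

```python
from statistics import median
from typing import Optional


def hero_price(quotes: dict[str, Optional[float]]) -> tuple[Optional[float], dict[str, float]]:
    """Return (median price, venues that responded) for one 60s tick.

    `quotes` maps a venue name to its latest price, or None if that venue
    did not respond this tick (assumed convention for this sketch).
    """
    responded = {venue: px for venue, px in quotes.items() if px is not None}
    if not responded:
        # No venue answered: nothing honest to show for this tick.
        return None, responded
    # Median over whichever venues responded; with two venues this is
    # simply the midpoint of the two prices.
    return median(responded.values()), responded


# Kraken is missing this tick, so the median covers two venues only.
tick = {"binance.us": 100_010.0, "coinbase": 100_000.0, "kraken": None}
price, venues = hero_price(tick)
```

Returning the responding venues alongside the median is what lets a ticker cell show the individual prices next to the headline number.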
Why two ticker tiers? (and do they change?)
The hero strip splits tickers into two visual tiers because they're backed by two very different levels of coverage. The badges do change — price-only tickers graduate to flow+price as we ship per-chain on-chain coverage. It's a one-way escalator: a coin never drops back.
- flow+price (BTC, ETH) — full pipeline: whale-tx ingestion, exchange-wallet tagging, net-flow aggregation, address-history stream, AND historical 1m candles powering 1h/4h/1D deltas + the candle charts. These two also feed the NET_24H_EXCHANGE_FLOWS card and the TREND sparklines.
- flow+price (partial) (LINK) — LINK graduated in pass 11.0c: a per-token Etherscan getLogs ingestor (erc20.whales) is now live with a 50,000-LINK whale threshold (~$1M), and LINK whales feed the SURFACE_LOG/TREND grid (1h, 4h, and 24h windows) on every four-hour publish. It’s only "partial" because the verified ERC-20 custody address book is still small: net-flow math needs hand-confirmed exchange wallets per token before LINK gets its own card on NET_24H_EXCHANGE_FLOWS. Adding the next ERC-20 token is a single config row, not a code push.
- price-only (SOL, XRP, DOGE, ADA, AVAX) — we pull median spot from the same three venues every 60s, plus 1h/4h/1D candles from Binance.US, so the chart + price are honest. But we do NOT yet run a chain explorer for those networks — no whale ingestion, no exchange-wallet tagging, no entry in the net-flow card. The next chain we light up flips its tier to flow+price the moment its ingestor goes live in production.
What's stopping us from doing flow tracking on more chains? Honestly, just dev time per chain — each one is a ~2–4 hour build (explorer integration + chain entry in cex_address_book + per-chain whale ingestor + per-chain address-history sweep). The schema is already chain-keyed for this; adding rows for SOL/XRP/etc. is a one-liner once the corresponding ingestor is online. Order roughly follows market cap + how cleanly the public explorer APIs cooperate.
What about NET_24H_EXCHANGE_FLOWS + TREND for other coins?
Honest answer: partially auto-populated, partially still gated. The TREND sparklines already cover BTC, ETH, AND LINK across 1h / 4h / 24h windows on every four-hour publish — pass 11.0c lit up an ERC-20 whale ingestor (erc20.whales) that scans Etherscan’s getLogs endpoint per-token, and the trend grid is data-driven, so adding the next ERC-20 token = one config row. The NET_24H_EXCHANGE_FLOWS card is more conservative: it only shows ETH + BTC today because those two chains have a hand-verified exchange address book sized for net-flow math to be honest. LINK breaches feed TREND but won’t join the flow card until we’ve hand-confirmed enough verified ERC-20 custody addresses per CEX. New chains (SOL via solscan.io, XRP via xrpscan.com, …) land in both surfaces once their per-chain ingestor + verified address book ship. So the wait is build-time per chain, not a config flip.
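For readers curious what an ERC-20 whale check on a Transfer log involves, here is a rough Python sketch. It is not the C# erc20.whales ingestor. It assumes the log arrives in the standard Ethereum JSON shape (a topics list plus a hex data field) and that LINK uses 18 decimals; the function names decode_transfer and is_link_whale are hypothetical.

```python
from typing import Optional

# keccak256("Transfer(address,address,uint256)"), the standard ERC-20 event signature.
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"
LINK_DECIMALS = 18          # assumption stated in the lead-in
LINK_THRESHOLD = 50_000     # whole LINK, per the whale policy


def decode_transfer(log: dict) -> Optional[dict]:
    """Decode one Transfer log into from/to addresses and the raw token amount."""
    topics = log["topics"]
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC:
        return None  # not a standard ERC-20 Transfer event
    return {
        # Indexed addresses are left-padded to 32 bytes; keep the last 20 bytes.
        "from": "0x" + topics[1][-40:],
        "to": "0x" + topics[2][-40:],
        # The unindexed uint256 value lives in the data field, in base units.
        "raw_value": int(log["data"], 16),
    }


def is_link_whale(transfer: dict) -> bool:
    """Compare in integer base units so no float rounding is involved."""
    return transfer["raw_value"] >= LINK_THRESHOLD * 10**LINK_DECIMALS


# Hand-built sample log: 60,000 LINK between two made-up addresses.
sample_log = {
    "topics": [
        TRANSFER_TOPIC,
        "0x" + "00" * 12 + "11" * 20,
        "0x" + "00" * 12 + "22" * 20,
    ],
    "data": "0x" + format(60_000 * 10**LINK_DECIMALS, "064x"),
}
decoded = decode_transfer(sample_log)
```

Keeping the comparison in base units means a threshold for another token would only differ in its decimals and its cut, which fits the "one config row per token" idea described above.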
What counts as a whale
Per-asset thresholds are tunable in appsettings.json and visible inline on each NET_24H_EXCHANGE_FLOWS card. Today’s policy:
- BTC: ≥ 100 BTC per tx — ~$10M at $100k/BTC. Source: mempool.space.
- ETH: ≥ 1,000 ETH per tx — ~$3.5M at $3.5k/ETH. Source: Etherscan block scan.
- LINK: ≥ 50,000 LINK per tx — ~$1M at $20/LINK. Source: Etherscan getLogs on the LINK Transfer event.
These are static cuts today — sane defaults that catch the meaningful tail without drowning the pipeline. Roadmap: a data-driven analyzer that proposes per-asset thresholds from the observed transfer-size distribution (so BTC’s threshold tightens automatically as the mean tx size grows), plus runtime tuning dials on the Bridge so changes don’t require a worker rebuild. Tracked in NEXT.md.
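As a rough idea of what a distribution-based proposal could look like, the sketch below picks a percentile of observed transfer sizes using the nearest-rank method. The analyzer does not exist yet, so everything here is an assumption: the function name propose_threshold, the choice of nearest-rank, and the 95th percentile default are illustrative only.

```python
import math


def propose_threshold(sizes: list[float], pct: float = 95.0) -> float:
    """Propose a whale cut as the pct-th percentile of observed transfer sizes.

    Uses the nearest-rank definition: the smallest observed value such that
    at least pct percent of the observations are less than or equal to it.
    """
    if not sizes:
        raise ValueError("need at least one observed transfer")
    ordered = sorted(sizes)
    # 1-based nearest rank, clamped so very small percentiles still pick an element.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Toy distribution: transfer sizes 1.0 .. 100.0 in arbitrary units.
observed = [float(n) for n in range(1, 101)]
proposal = propose_threshold(observed)
```

A proposal like this would still go past a human before being applied, since a single burst of unusually large transfers can drag a percentile around.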
Honest caveats
- Exchange flow coverage is Binance + Coinbase + Kraken + Bitfinex on both chains, plus Bybit and Gemini on ETH. Several additional wallets at those venues — plus initial entries for OKX and Bitstamp — are currently in tag-only mode: their txs are labeled at ingest for row-level context but they're held out of the net-flow math until each address has been hand-confirmed against Etherscan / mempool.space. We'd rather under-report than mis-attribute. Each wallet graduates from tag-only → counted as soon as a human walks its explorer page and flips the verified flag — the same row keeps its history, it just starts contributing to the net-flow card from that moment forward. So this number grows every time we ship a verification batch.
- Coinbase BTC tagging is intentionally shallow. Coinbase rotates ~36 million BTC deposit and holding addresses (per Arkham), and a static address list cannot keep up. We tag a single well-documented historic Coinbase BTC hot wallet for directional analysis on the txs we do catch; real Coinbase BTC flow is materially larger than what shows up here. ETH side is denser because Coinbase's ETH custody concentrates in fewer addresses.
- LINK whales surface in TREND, not yet in the flow card. The erc20.whales ingestor catches every LINK tx ≥ 50,000 LINK and tags it against the existing ETH-side cex_address_book, so directional context is there at the row level. But the flow card needs a verified per-token custody mapping to do the inflow-minus-outflow math honestly — many CEX hot wallets that hold native ETH don’t hold LINK and vice-versa. Until that mapping ships, LINK lives on the TREND grid only.
- "Biggest 24h whale tx" with no destination means the from/to addresses didn't match any wallet we currently tag — not that the tx is anonymous, just that our address book hasn't been told who they are yet.
- 1m / 5m candles are noisy. Useful for live tape-watching, useless for trading decisions at our 4-hour cadence.
- This is not financial advice and the site never posts calls or predictions — surfacings are stats roll-ups, not signals. We'll never DM you about an "investment opportunity"; if someone claiming to be us does, it's a scam.
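The tag-only versus counted distinction in the caveats above comes down to a filter in the inflow-minus-outflow sum. A minimal sketch, assuming a simplified address book keyed by address and transactions given as (from, to, amount) tuples; the BookEntry type and net_flows function are hypothetical names for illustration, not the real schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BookEntry:
    exchange: str
    verified: bool  # tag-only rows have verified=False


def net_flows(txs, book: dict[str, BookEntry]) -> dict[str, float]:
    """Net flow per exchange = inflow minus outflow, verified wallets only."""
    totals: dict[str, float] = {}
    for src, dst, amount in txs:
        dst_entry = book.get(dst)
        src_entry = book.get(src)
        # Funds arriving at a verified exchange wallet count as inflow.
        if dst_entry is not None and dst_entry.verified:
            totals[dst_entry.exchange] = totals.get(dst_entry.exchange, 0.0) + amount
        # Funds leaving a verified exchange wallet count as outflow.
        if src_entry is not None and src_entry.verified:
            totals[src_entry.exchange] = totals.get(src_entry.exchange, 0.0) - amount
    return totals


book = {
    "0xaaa": BookEntry("Binance", verified=True),
    "0xbbb": BookEntry("OKX", verified=False),  # tag-only: labeled, not counted
}
txs = [
    ("0xuser", "0xaaa", 1500.0),   # inflow to Binance
    ("0xaaa", "0xother", 400.0),   # outflow from Binance
    ("0xuser", "0xbbb", 2000.0),   # tagged as OKX but held out of the math
]
flows = net_flows(txs, book)
```

Because a transfer between two verified wallets of the same exchange adds and subtracts the same amount, internal shuffles net to zero in this sketch.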
Roadmap
New features land roughly weekly. Near-term targets (newest priority first):
- >_ d33p th0u9ht (LIVE) — your AI co-captain is alive and on the bridge. Running on its own server, ready to pilot logged-in captains to their destination as fast and efficiently as possible. Ask it anything about the on-chain whale data — it surfaced every number on this page and can explain all of it. Ask it how the site works; it built it. Use /whale, /flows, /fng for live data snapshots. Report a bug and it queues a fix for review. Board now → (login required)
- Google OAuth & SendGrid production smoke. Google sign-in routes are live; final smoke test confirming https://whaleflows.io/auth/google/callback round-trips cleanly in the production browser session. SendGrid transactional email (password-reset, welcome) needs a live-send confirmation against the DNS-configured domain identity.
- Cross-VPS chat server logs on the Bridge. The admin bridge can already read every chat message from the d33p th0u9ht node via the internal API. Next: surface the raw application logs (gunicorn errors, Flask traces) from the chat server directly inside the Bridge — filterable by user — so debugging a bad session never requires SSHing into a second machine.
- Whale-threshold tuning dials on the Bridge. Today the per-asset cuts (100 BTC, 1,000 ETH, 50,000 LINK) live in appsettings.json and require a worker restart to change. Step 1: a Bridge card with a per-asset numeric input + apply button that rewrites the live config. Step 2: a data-driven analyzer that scans the recent observed transfer-size distribution and proposes a sane threshold (e.g. “P95 over the last 30 days = 87 BTC; tighten to that?”). Removes the last “edit JSON, restart worker” step from operating the whale book.
- Verify the OKX / Bitstamp / extra Bitfinex / Bybit / Gemini wallets currently in tag-only mode and promote confirmed ones into the net-flow math.
- Surface a daily count of address_transactions rows ingested so the sub-whale address-history stream is visibly doing its job.
- Per-channel reliability badge. If a venue, chain explorer, or feed is stale, rate-limited, or returning suspicious data, the relevant panel turns amber with a hover/tap tooltip: which feed, why flagged, expected recovery. Quiet honesty over silent corruption.
- Per-exchange flow breakdown card (Binance vs Coinbase vs Kraken vs Bitfinex etc. at a glance) after the expanded address book settles.
- LINK net-flow card on the deck once enough verified ERC-20 custody addresses are mapped per exchange for the math to be honest.
- Down the road: extend the same address-book + ingestor pattern to more chains (SOL, XRP, …) so the site is a single place for every cryptocurrency’s whale picture.
Contact
Feedback, feature requests, "this site sucks and here's why" — all welcome at culewis.j@gmail.com. Replies aren't guaranteed but everything gets read.