cyberhybridhub/TODO.md
2026-05-31 12:40:54 -05:00

# TODO — Rolling 7-Day Market Data Window + Cleanup
> **Next milestone:** ET morning/afternoon session half bars (195-minute aggregates);
> see [`TODO-SESSION-HALF-BARS.md`](./TODO-SESSION-HALF-BARS.md).

Companion to [`TRADING_DEVELOPMENT_PLAN.md`](./TRADING_DEVELOPMENT_PLAN.md) and
[`TRADING_TDD_PLAN.md`](./TRADING_TDD_PLAN.md).

**Goal:** maintain a rolling **7-day history** of market data for **all active
tradable assets** so the question pipeline can generate obfuscated
*guessing-game* questions about market movement, while pruning (or archiving)
anything older than the window.

**TDD rhythm (mandatory for every step):**
1. **Red** — write the failing test(s) first; commit if you like.
2. **Green** — minimum implementation that turns every test in this step green.
3. **Refactor** — tidy without changing behavior; rerun tests.
4. **Confirm** — run the full step-level confirm command listed in the step.
5. **Log** — check the box, add a row to [§12 Progress log](#12-progress-log).
> Do not skip the Red phase. Do not start the next step while any test in the
> current step is failing or pending. No live Alpaca calls in default
> `dart test` jobs — guard with `@Tags(['alpaca'])`.
---
## 0. Scope & design constraints
- **Window:** rolling 7 calendar days, UTC. Configurable via
`MARKET_HISTORY_WINDOW_DAYS` (default `7`).
- **Granularity (Phase 1):** `1Day` bars for every active tradable, plus the
existing `last_trade` / `prev_close` snapshots for watchlist symbols.
- **Granularity (Phase 2):** `1Hour` bars for the union of all enabled users'
watchlist symbols (≤30 on Alpaca Basic).
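- **Universe source of truth:** Alpaca `/v2/assets?status=active&asset_class=us_equity`
(the request tested in §2.1), with `tradable=true` applied when symbols are read
(§2.2 `listActiveTradableSymbols()`); refreshed daily, cached in Postgres
(`tradable_assets`).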
- **Idempotency:** repeated backfill of the same
`(symbol, metric, timeframe, as_of)` MUST NOT create duplicate rows.
- **Cleanup vs. archive:** rows with `as_of < now() - window` are either
hard-deleted (Phase 1) or moved to `market_data_archive` (Phase 2).
- **Worker isolation:** historical sync + cleanup run on their own cadence
(default once per day), not on every 60s per-user tick.
- **Rate-limit safety:** batch symbols (Alpaca `bars` accepts multi-symbol);
cap concurrent symbols; never call Alpaca in tests
(`QUESTION_PIPELINE_TEST_MODE=true`).
- **No Flutter changes** required for this milestone.
---
## 1. Schema additions (migration `005_market_history.sql`)
### 1.1 Red — failing tests first
- [x] Create `server/test/integration/market_history_schema_test.dart`:
- [x] Test: `INSERT` two snapshots with the same
`(symbol, metric, timeframe, as_of)` → second one **upserts**,
does not duplicate (current schema lacks the unique constraint, so
this MUST fail Red).
- [x] Test: `timeframe` defaults to `'tick'` for existing rows; new rows
accept `'1Min' | '1Hour' | '1Day'`.
- [x] Test: `tradable_assets` PK rejects duplicate symbol; query by
`(status='active', tradable=true)` uses the new index (verify via
`EXPLAIN` returning `Index Scan`).
- [x] Test: `market_data_sync_runs` records `kind`, `started_at`,
`finished_at`, `rows_written`, `rows_removed`, `error` shape.
### 1.2 Green — minimum migration
- [x] Write `server/migrations/005_market_history.sql`:
- [x] `ALTER TABLE market_data_snapshots ADD COLUMN timeframe TEXT NOT NULL
DEFAULT 'tick'`.
- [x] `ALTER TABLE market_data_snapshots ADD CONSTRAINT
market_data_snapshots_unique_obs UNIQUE
(symbol, metric, timeframe, as_of)`.
- [x] `CREATE INDEX market_data_snapshots_asof_idx
ON market_data_snapshots (as_of DESC)`.
- [x] `CREATE TABLE tradable_assets (…)` with columns
`symbol PK, asset_class, exchange, name, tradable, fractionable,
status, raw JSONB, refreshed_at`.
- [x] `CREATE INDEX tradable_assets_status_idx
ON tradable_assets (status, tradable)`.
- [x] `CREATE TABLE market_data_sync_runs (…)` (columns per the §1.1 test).
- [ ] (Phase 2 stub, commented) `CREATE TABLE market_data_archive (…)` —
deferred to §4.2 (the migration runner splits on `;`, which would
slice a commented stub mid-block; the archive table will be added
in §4.2.2 when it is actually wired up).
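
The deferral note above can be illustrated with a minimal sketch. It assumes the runner does nothing smarter than splitting the file on `;`. `splitStatements` and the sample SQL are hypothetical stand-ins, not the project's runner or migration.

```dart
/// Hypothetical stand-in for a migration runner that splits on `;`.
List<String> splitStatements(String sql) => sql
    .split(';')
    .map((statement) => statement.trim())
    .where((statement) => statement.isNotEmpty)
    .toList();

/// Illustrative migration with a block-commented stub (not the real 005 file).
const migrationWithStub = '''
CREATE TABLE a (id INT);
/*
CREATE TABLE market_data_archive (id INT);
CREATE INDEX archive_idx ON market_data_archive (id);
*/
CREATE TABLE b (id INT);
''';
```

Splitting `migrationWithStub` yields four fragments. The second opens a comment that never closes, and the third is the "commented-out" `CREATE INDEX`, which would be sent to the database as a live statement.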
### 1.3 Refactor
- [x] Confirm `MarketDataDb._rowToSnapshot` still reads correctly with the
new column (read-side back-compat — no test changes needed, just verify
existing `market_data_db_test.dart` still passes).
- [x] Move shared SQL fragments into the migration runner if duplication
appeared. _(none observed in 005; nothing to extract yet.)_
### 1.4 Confirm
- [x] `cd server && dart test test/integration/migration_test.dart
test/integration/market_history_schema_test.dart` — green.
- [x] `psql cyberhybridhub_test -c '\d market_data_snapshots'` shows the
unique constraint and new column.
---
## 2. Tradable-asset universe sync
**Files (new):** `server/lib/alpaca/alpaca_assets_client.dart`,
`server/lib/trading/tradable_assets_db.dart`,
`server/lib/trading/tradable_assets_sync.dart`.
### 2.1 Alpaca assets client
#### 2.1.1 Red
- [x] Add fixture `server/test/fixtures/alpaca_assets_active.json` (≥5
representative assets, mix of `tradable=true/false` and
`fractionable=true/false`).
- [x] Add `server/test/alpaca/alpaca_assets_client_test.dart`:
- [x] Test: `listActiveTradable()` issues `GET` to
`${tradingBaseUrl}/v2/assets?status=active&asset_class=us_equity`
with `APCA-API-KEY-ID` + `APCA-API-SECRET-KEY` headers.
- [x] Test: parses fixture into `List<AlpacaAsset>` — verifies symbol,
exchange, fractionable, tradable, status fields.
- [x] Test: 401 / 500 → throws `AlpacaAssetsException` with status code
and body in the message.
- [x] Test: empty response array → returns `[]`, does not throw.
#### 2.1.2 Green
- [x] Add `AlpacaAsset` model in `server/lib/alpaca/alpaca_models.dart`.
- [x] Implement `AlpacaAssetsClient` with injectable `http.Client`
(mirror `AlpacaMarketDataClient` shape).
- [x] Add `AlpacaAssetsException`.
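
A minimal sketch of the parsing half of the client, assuming the response body is a JSON array whose objects carry `symbol`, `exchange`, `status`, `tradable`, and `fractionable`. The class shape and the `parseAssets` helper are illustrative. The real `AlpacaAsset` model may differ, and the HTTP call is left out so the block needs only `dart:convert`.

```dart
import 'dart:convert';

/// Illustrative model; field names follow the keys checked in §2.1.1.
class AlpacaAsset {
  const AlpacaAsset({
    required this.symbol,
    required this.exchange,
    required this.status,
    required this.tradable,
    required this.fractionable,
  });

  factory AlpacaAsset.fromJson(Map<String, dynamic> json) => AlpacaAsset(
        symbol: json['symbol'] as String,
        exchange: json['exchange'] as String? ?? '',
        status: json['status'] as String? ?? '',
        tradable: json['tradable'] as bool? ?? false,
        fractionable: json['fractionable'] as bool? ?? false,
      );

  final String symbol;
  final String exchange;
  final String status;
  final bool tradable;
  final bool fractionable;

  /// Same predicate as `tradable=true AND status='active'` in §2.2.1.
  bool get isActiveTradable => tradable && status == 'active';
}

/// Hypothetical helper: decodes a response body into assets.
/// An empty array yields an empty list rather than throwing.
List<AlpacaAsset> parseAssets(String body) {
  final decoded = jsonDecode(body);
  if (decoded is! List) {
    throw const FormatException('expected a JSON array of assets');
  }
  return [
    for (final item in decoded)
      AlpacaAsset.fromJson(item as Map<String, dynamic>),
  ];
}
```

Keeping the parse step separate from the transport makes the fixture-based tests above independent of any HTTP mock.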
#### 2.1.3 Refactor
- [x] Extract a private `_authHeaders` helper if duplicated across
Alpaca clients (DRY — but only if you actually duplicate). _(Lifted to
`AlpacaEnv.authHeaders`; now reused by all three Alpaca clients.)_
#### 2.1.4 Confirm
- [x] `dart test test/alpaca/alpaca_assets_client_test.dart` — green.
- [x] Tagged live test
`server/test/alpaca/alpaca_assets_live_test.dart`
(`@Tags(['alpaca'])`) — returns >100 symbols when keys present;
skipped otherwise. Run manually:
`dart test --tags=alpaca test/alpaca/alpaca_assets_live_test.dart`.
### 2.2 Universe persistence + diff
#### 2.2.1 Red
- [x] Create `server/test/integration/tradable_assets_db_test.dart`:
- [x] Test: `upsertAll([A, B, C])` inserts 3 rows.
- [x] Test: re-running `upsertAll([B*, C, D])` updates `B`, leaves `C`
unchanged-by-content but `refreshed_at` bumped, inserts `D`, and
marks `A` as `tradable=false, status='inactive'` (we never delete
history).
- [x] Test: `listActiveTradableSymbols()` returns only
`tradable=true AND status='active'`.
#### 2.2.2 Red — sync orchestration
- [x] Create `server/test/integration/tradable_assets_sync_test.dart`:
- [x] Test: `TradableAssetsSync.runOnce()` with mocked client returning
the fixture → DB rows match; one row in `market_data_sync_runs`
with `kind='universe'` and non-null `finished_at`.
- [x] Test: client throws → sync run row recorded with `error` populated,
`finished_at` non-null, and `rows_written = 0`.
- [x] Test: two consecutive runs are safe (idempotent counts).
#### 2.2.3 Green
- [x] Implement `TradableAssetsDb.upsertAll`,
`TradableAssetsDb.listActiveTradableSymbols`.
- [x] Implement `TradableAssetsSync.runOnce()` that writes a
`market_data_sync_runs` row around the upsert.
#### 2.2.4 Refactor
- [x] Pull "wrap a closure with a `sync_runs` audit row" into a small
helper (`SyncRunRecorder.record(kind, body)`); reuse in §3 and §4.
_(Landed at `server/lib/trading/sync_run_recorder.dart`;
`TradableAssetsSync` already consumes it.)_
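
The "wrap a closure with an audit row" idea can be sketched in memory. This is not the code in `sync_run_recorder.dart`: the class names are hypothetical, rows live in a list instead of `market_data_sync_runs`, and the body is synchronous for brevity where the real one is presumably async.

```dart
/// In-memory stand-in for one `market_data_sync_runs` row.
class SyncRun {
  SyncRun(this.kind, this.startedAt);

  final String kind;
  final DateTime startedAt;
  DateTime? finishedAt;
  int rowsWritten = 0;
  String? error;
}

/// Hypothetical recorder: runs [body], then stamps the outcome on the row.
class InMemorySyncRunRecorder {
  InMemorySyncRunRecorder(this._now);

  final DateTime Function() _now;
  final List<SyncRun> runs = [];

  SyncRun record(String kind, int Function() body) {
    final run = SyncRun(kind, _now());
    runs.add(run);
    try {
      run.rowsWritten = body();
    } catch (e) {
      // A failure is recorded, not rethrown; rowsWritten stays at 0.
      run.error = e.toString();
    } finally {
      // finishedAt is set on success and on failure alike.
      run.finishedAt = _now();
    }
    return run;
  }
}
```

Injecting the clock keeps `started_at` and `finished_at` deterministic in tests, which matters for the scheduler in §5.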
#### 2.2.5 Confirm
- [x] `dart test test/integration/tradable_assets_db_test.dart
test/integration/tradable_assets_sync_test.dart` — green.
---
## 3. Historical backfill (1Day bars × 7 days)
**Files:** extend `server/lib/alpaca/alpaca_market_data_client.dart`,
new `server/lib/trading/market_data_history.dart`,
extend `server/lib/trading/market_data_db.dart`.
### 3.1 Alpaca client — time-range bars with pagination
#### 3.1.1 Red
- [x] Add fixtures:
- [x] `server/test/fixtures/alpaca_bars_7d_multi_page1.json` — includes
`next_page_token`.
- [x] `server/test/fixtures/alpaca_bars_7d_multi_page2.json` — final
page, `next_page_token: null`.
- [x] Extend `server/test/alpaca/alpaca_market_data_client_test.dart`:
- [x] Test: `getBarsRange(['SPY','AAPL'], timeframe: '1Day',
start, end)` builds correct query string (`start`, `end`,
`timeframe`, `feed`, `symbols`, `limit`).
- [x] Test: follows pagination — when page1 returns
`next_page_token='abc'`, client issues second request with
`page_token=abc`; merges both pages' bars per symbol.
- [x] Test: stops after a configurable `maxPages` (default 20) to
prevent runaway loops.
- [x] Test: 429 → throws `AlpacaMarketDataException` containing the
word `rate` (so caller can detect & back off).
#### 3.1.2 Green
- [x] Implement `Future<AlpacaBarsResponse> getBarsRange({
required List<String> symbols, required String timeframe,
required DateTime start, required DateTime end,
int maxPages = 20})` on `AlpacaMarketDataClient`.
- [x] Extend `AlpacaBarsResponse` with a `merge(AlpacaBarsResponse other)`
method so paginated chunks combine cleanly.
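
A minimal sketch of the follow-the-token loop with a page cap, under simplifying assumptions: the page fetcher is an injected synchronous callback (the real client would await an HTTP call), and `BarsPage` and `fetchAllBars` are hypothetical names, not `AlpacaBarsResponse` or `getBarsRange`.

```dart
typedef Bar = Map<String, Object?>;

/// One page of a bars response: bars per symbol plus an optional next token.
class BarsPage {
  BarsPage(this.barsBySymbol, {this.nextPageToken});

  final Map<String, List<Bar>> barsBySymbol;
  final String? nextPageToken;
}

/// Follows `next_page_token` until it is null or [maxPages] pages were read,
/// merging each page's bars per symbol in arrival order.
Map<String, List<Bar>> fetchAllBars(
  BarsPage Function(String? pageToken) fetchPage, {
  int maxPages = 20,
}) {
  final merged = <String, List<Bar>>{};
  String? token;
  for (var page = 0; page < maxPages; page++) {
    final result = fetchPage(token);
    result.barsBySymbol.forEach((symbol, bars) {
      merged.putIfAbsent(symbol, () => []).addAll(bars);
    });
    token = result.nextPageToken;
    if (token == null) break;
  }
  return merged;
}
```

The bounded `for` loop is what makes the `maxPages` test cheap to write: a fetcher that always returns a token is called exactly `maxPages` times.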
#### 3.1.3 Refactor
- [x] If the pagination loop is non-trivial, extract a private
`_paginate<T>(initialUri, parsePage)` generic to reuse later for
orders/positions endpoints. _(Loop kept inline in `getBarsRange` —
~25 lines, clear enough; extract when a second consumer appears.)_
#### 3.1.4 Confirm
- [x] `dart test test/alpaca/alpaca_market_data_client_test.dart` — green.
- [x] Tagged live test
`server/test/alpaca/alpaca_market_data_history_live_test.dart`
fetches 7-day bars for `SPY` and asserts ≥3 bars.
### 3.2 `MarketDataDb` — idempotent upsert + range query
#### 3.2.1 Red
- [x] Extend `server/test/integration/market_data_db_test.dart`:
- [x] Test: `upsertSnapshot(symbol:'SPY', metric:'bar',
timeframe:'1Day', as_of:T, price:500)` then re-upsert with
`price:505` → exactly **one** row remains; price is `505`; `raw`
is overwritten (volume also overwritten).
- [x] Test: `barsForSymbol(symbol, timeframe, since, until)` returns
rows ordered by `as_of ASC`; range is inclusive of `since`,
exclusive of `until`.
- [x] Test: `barsForSymbol` returns `[]` when no rows match; does not
throw.
- [x] Test: `latestSyncedAsOf(symbol, timeframe)` returns the newest
`as_of` or `null`.
#### 3.2.2 Green
- [x] Implement `MarketDataDb.upsertSnapshot(...)` using
`ON CONFLICT (symbol, metric, timeframe, as_of) DO UPDATE
SET price = EXCLUDED.price, volume = EXCLUDED.volume,
raw = EXCLUDED.raw`.
- [x] Implement `MarketDataDb.barsForSymbol(...)` and
`MarketDataDb.latestSyncedAsOf(...)`.
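
The upsert and range semantics tested in §3.2.1 can be modelled in memory. This sketch assumes Dart 3 (records as map keys) and uses hypothetical names. It mirrors the `ON CONFLICT … DO UPDATE` behavior and the half-open `[since, until)` range, not the actual SQL path.

```dart
/// Mirrors the unique key `(symbol, metric, timeframe, as_of)`.
typedef ObsKey = (String symbol, String metric, String timeframe, DateTime asOf);

class Snapshot {
  const Snapshot(this.price, this.volume);

  final double price;
  final int volume;
}

/// Hypothetical in-memory model of `market_data_snapshots`.
class InMemorySnapshots {
  final Map<ObsKey, Snapshot> _rows = {};

  /// Last write wins for an existing key, like `DO UPDATE SET … = EXCLUDED.…`.
  void upsert(ObsKey key, Snapshot value) => _rows[key] = value;

  int get rowCount => _rows.length;

  /// Rows with `since <= as_of < until`, ordered by `as_of` ascending.
  List<(DateTime, Snapshot)> barsForSymbol(
    String symbol,
    String timeframe, {
    required DateTime since,
    required DateTime until,
  }) {
    final hits = [
      for (final entry in _rows.entries)
        if (entry.key.$1 == symbol &&
            entry.key.$3 == timeframe &&
            !entry.key.$4.isBefore(since) &&
            entry.key.$4.isBefore(until))
          (entry.key.$4, entry.value),
    ];
    hits.sort((a, b) => a.$1.compareTo(b.$1));
    return hits;
  }
}
```

Because the key is the whole tuple, a tick row and a `1Day` bar for the same symbol and timestamp stay distinct, which is what lets `insertSnapshot` call sites move to `upsertSnapshot` unchanged.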
#### 3.2.3 Refactor
- [x] Replace existing `insertSnapshot` call sites in
`market_data_ingest.dart` with `upsertSnapshot` (tick data has
`timeframe='tick'`; same call shape). Re-run
`test/integration/market_data_ingest_test.dart` — still green.
#### 3.2.4 Confirm
- [x] `dart test test/integration/market_data_db_test.dart
test/integration/market_data_ingest_test.dart` — green.
### 3.3 `MarketDataHistorySync`
#### 3.3.1 Red
- [x] Add fixture
`server/test/fixtures/alpaca_bars_7d_3symbols.json` — 7 bars × 3
symbols (SPY/AAPL/MSFT), realistic timestamps.
- [x] Add `server/test/integration/market_data_history_sync_test.dart`:
- [x] Test: with mocked Alpaca returning the fixture → 21 rows upserted
with `metric='bar'`, `timeframe='1Day'`; sync run row written.
- [x] Test: re-running with the same fixture → still 21 rows; zero
duplicates; `rows_written` reflects rows touched (not inserted).
- [x] Test: partial outage — Alpaca returns 200 for batch 1
(AAPL/MSFT), 500 for batch 2 (SPY) → AAPL/MSFT rows persisted;
sync run row has `error` mentioning SPY; method does NOT throw.
- [x] Test: respects `HISTORY_SYNC_MAX_SYMBOLS` cap (set to 2 → only
first 2 symbols fetched).
- [x] Test: batching — with `HISTORY_SYNC_BATCH_SIZE=2` and 5 symbols,
Alpaca is called 3 times (mock call counter).
#### 3.3.2 Green
- [x] Implement `MarketDataHistorySync.runOnce({int windowDays = 7})`:
- [x] Reads symbols from
`TradableAssetsDb.listActiveTradableSymbols()`.
- [x] Batches into `HISTORY_SYNC_BATCH_SIZE` groups; calls
`getBarsRange` per batch.
- [x] Upserts via `MarketDataDb.upsertSnapshot`.
- [x] Captures per-batch errors without aborting; aggregates them into
the sync run row (`SyncRunCounts.error`).
#### 3.3.3 Refactor
- [x] Extract batching helper if used by §3.4 incremental path too.
_(Landed as `chunkList` in `market_data_history.dart`.)_
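
A minimal sketch of the batching helper. The name `chunkList` comes from the note above, but the body here is a guess at its shape, not a copy of `market_data_history.dart`.

```dart
/// Splits [items] into consecutive batches of at most [size] elements.
List<List<T>> chunkList<T>(List<T> items, int size) {
  if (size <= 0) {
    throw ArgumentError.value(size, 'size', 'must be positive');
  }
  return [
    for (var i = 0; i < items.length; i += size)
      items.sublist(i, i + size > items.length ? items.length : i + size),
  ];
}
```

With five symbols and a batch size of 2 this yields batches of 2, 2, and 1, which is where the "called 3 times" expectation in §3.3.1 comes from. Applying a symbol cap first (for example `symbols.take(maxSymbols).toList()`) keeps the cap and the batching independent.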
#### 3.3.4 Confirm
- [x] `dart test test/integration/market_data_history_sync_test.dart`
— green.
### 3.4 Incremental daily catch-up
#### 3.4.1 Red
- [x] Extend `market_data_history_sync_test.dart`:
- [x] Test: with prior `latestSyncedAsOf(symbol)` = `T-2d`, sync issues
bars with `start = T-2d` (not `T-7d`); mock HTTP call records
the requested start.
- [x] Test: with prior sync `T-10d` (outside window), `start` is
clamped to `T-windowDays`.
- [x] Test: cold start (no prior sync) → `start = T-windowDays`.
#### 3.4.2 Green
- [x] Compute per-symbol `start` using `latestSyncedAsOf`; pass to
`getBarsRange`.
#### 3.4.3 Refactor
- [x] If per-symbol starts vary inside a batch, fall back to
`min(starts)` for the batched call and let `upsertSnapshot`
dedupe the overlap — document the tradeoff in a code comment.
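
The start-time rules in §3.4 reduce to a clamp plus a minimum. The sketch below uses hypothetical helper names and assumes all timestamps are UTC.

```dart
/// Start of the fetch range for one symbol: resume from the last synced bar,
/// but never reach further back than the window floor.
DateTime syncStart(
  DateTime now,
  DateTime? latestSyncedAsOf, {
  int windowDays = 7,
}) {
  final floor = now.toUtc().subtract(Duration(days: windowDays));
  if (latestSyncedAsOf == null || latestSyncedAsOf.isBefore(floor)) {
    return floor; // cold start, or prior sync fell outside the window
  }
  return latestSyncedAsOf.toUtc();
}

/// Start for a batched call: the earliest per-symbol start, so no symbol
/// misses bars. Overlap is left to the idempotent upsert.
/// [latestPerSymbol] must not be empty.
DateTime batchStart(
  DateTime now,
  Iterable<DateTime?> latestPerSymbol, {
  int windowDays = 7,
}) =>
    latestPerSymbol
        .map((latest) => syncStart(now, latest, windowDays: windowDays))
        .reduce((a, b) => a.isBefore(b) ? a : b);
```

The tradeoff is extra rows fetched for symbols that were already up to date, in exchange for one request per batch instead of one per symbol.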
#### 3.4.4 Confirm
- [x] `dart test test/integration/market_data_history_sync_test.dart`
— green.
---
## 4. Retention & cleanup (older than 7 days)
**Files (new):** `server/lib/trading/market_data_retention.dart`.
### 4.1 Hard-delete (Phase 1)
#### 4.1.1 Red
- [x] Create `server/test/integration/market_data_retention_test.dart`:
- [x] Test: seed 10 snapshots spanning 14 days →
`runCleanup({windowDays: 7})` deletes rows with
`as_of < now() - 7d`, keeps the rest; returns `rowsRemoved`
matching deleted count.
- [x] Test: empty table → returns `rowsRemoved = 0`, does not throw.
- [x] Test: `batchSize` honored — with 5000 rows older than window and
`batchSize=1000`, the underlying `DELETE` is issued ≥5 times
(use a counting wrapper around `_connection.execute`).
- [x] Test: each invocation appends a `market_data_sync_runs` row
with `kind='cleanup'`, `rows_removed` populated.
- [x] Test: rows within window are NEVER touched (assert specific IDs
survive).
#### 4.1.2 Green
- [x] Implement
`MarketDataRetention.runCleanup({int windowDays = 7,
int batchSize = 5000})`:
- [x] Loop: Postgres `DELETE` has no `LIMIT` clause, so select up to
`$batchSize` expired rows (`as_of < $cutoff`) in a CTE or subquery and
delete by key; return rows removed; repeat until 0.
- [x] Write a `market_data_sync_runs` row around the operation.
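
A minimal sketch of the batched-delete loop. The SQL constant shows one CTE shape that bounds each batch. The `id` column and the `@cutoff` / `@batchSize` placeholders are assumptions about the schema and driver. `cleanupLoop` is a hypothetical name, and the per-batch delete is an injected synchronous callback so the loop can be read on its own.

```dart
/// One possible batch statement; column and placeholder names are assumed.
const deleteBatchSql = '''
WITH doomed AS (
  SELECT id FROM market_data_snapshots
  WHERE as_of < @cutoff
  LIMIT @batchSize
)
DELETE FROM market_data_snapshots
WHERE id IN (SELECT id FROM doomed)
''';

/// Calls [deleteBatch] until it reports zero rows; returns the total removed.
int cleanupLoop(
  int Function(int batchSize) deleteBatch, {
  int batchSize = 5000,
}) {
  var total = 0;
  while (true) {
    final removed = deleteBatch(batchSize);
    total += removed;
    if (removed == 0) return total;
  }
}
```

With 5000 expired rows and `batchSize` 1000 this loop issues six deletes: five that remove rows and a final one that finds nothing, which satisfies the "≥5 times" assertion in §4.1.1.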
#### 4.1.3 Refactor
- [x] Reuse `SyncRunRecorder` from §2.2.4.
#### 4.1.4 Confirm
- [x] `dart test test/integration/market_data_retention_test.dart`
— green.
### 4.2 Archive (Phase 2 — opt-in)
#### 4.2.1 Red
- [x] Extend `market_data_retention_test.dart`:
- [x] Test: with `archiveEnabled: true`, expired rows are copied into
`market_data_archive` with `archived_at = now()` BEFORE being
deleted; archive count grows by exactly `rowsRemoved`.
- [x] Test: archive run is transactional — if archive `INSERT` fails,
no `DELETE` happens; sync run row records the error.
- [x] Test: `archiveEnabled: false` (default) → archive table
untouched.
#### 4.2.2 Green
- [x] Uncomment `market_data_archive` table in migration 005 (or add it
now if you deferred it). _(Added `006_market_data_archive.sql`.)_
- [x] Implement
`MarketDataRetention.runArchiveAndCleanup({int windowDays = 7})`
with explicit `BEGIN; INSERT SELECT …; DELETE …; COMMIT`.
#### 4.2.3 Refactor
- [x] Consider a single unified entry point
`MarketDataRetention.run({int windowDays = 7, bool archive = false})`
that dispatches; only do this if it doesn't muddy the failure-isolation
story.
#### 4.2.4 Confirm
- [x] `dart test test/integration/market_data_retention_test.dart`
— green.
---
## 5. Scheduler — daily cadence inside the worker
**Files:** new `server/lib/workers/market_history_scheduler.dart`,
extend `server/lib/workers/question_background_worker.dart`,
extend `server/bin/server.dart`.
### 5.1 Red
- [x] Add `server/test/integration/market_history_scheduler_test.dart`:
- [x] Test: cold start → `runIfDue(now=T0)` runs all 3 stages
(`universe`, `backfill`, `cleanup`) in that order;
`market_data_sync_runs` has 3 rows.
- [x] Test: same-day re-run (`now=T0 + 1h`) → no stages run; zero new
sync rows.
- [x] Test: next day (`now=T0 + 24h`) → all 3 stages run again.
- [x] Test: per-stage cadence — set
`MARKET_UNIVERSE_REFRESH_HOURS=48`, `MARKET_HISTORY_SYNC_HOURS=24`,
`MARKET_HISTORY_CLEANUP_HOURS=24`; at T0+24h only backfill +
cleanup run.
- [x] Test: failure isolation — backfill throws → cleanup still runs;
both stages logged in `market_data_sync_runs` (one with `error`,
one without).
- [x] Test: `MARKET_HISTORY_SYNC_HOUR_UTC=10` (optional alignment) →
scheduler runs only when the current UTC hour ≥ 10 AND the last run was
on a prior UTC day.
- [x] Add `server/test/integration/market_history_worker_wireup_test.dart`:
- [x] Test: `QuestionBackgroundWorker._tick` invokes
`MarketHistoryScheduler.runIfDue` **before** the
`TradingOrchestrator` per-user loop. Use a spy scheduler that
records the call order.
- [x] Test: scheduler exception is caught — worker tick continues into
the orchestrator loop; stderr contains the error.
### 5.2 Green
- [x] Implement `MarketHistoryScheduler` with `runIfDue(DateTime now)`,
reading the last `finished_at` per `kind` from
`market_data_sync_runs`.
- [x] Wire `QuestionBackgroundWorker` to accept an optional
`MarketHistoryScheduler` and call it at the top of `_tick`.
- [x] Wire `bin/server.dart` to construct the scheduler only when
`MARKET_HISTORY_SYNC_ENABLED=true && TRADING_ENABLED=true`.
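
The due check behind `runIfDue` can be sketched as a pure function. `isStageDue` is a hypothetical helper, and treating the alignment hour as a replacement for the cadence check (not an extra condition on top of it) is one reading of the §5.1 tests, not a confirmed design.

```dart
/// Whether a stage should run at [now], given when it last finished.
bool isStageDue({
  required DateTime now,
  required DateTime? lastFinishedAt,
  required int cadenceHours,
  int? syncHourUtc,
}) {
  final nowUtc = now.toUtc();
  if (syncHourUtc != null) {
    // Aligned mode: wait for the hour, then run once per UTC day.
    if (nowUtc.hour < syncHourUtc) return false;
    if (lastFinishedAt == null) return true;
    final last = lastFinishedAt.toUtc();
    final lastDay = DateTime.utc(last.year, last.month, last.day);
    final today = DateTime.utc(nowUtc.year, nowUtc.month, nowUtc.day);
    return lastDay.isBefore(today);
  }
  if (lastFinishedAt == null) return true; // cold start
  return nowUtc.difference(lastFinishedAt.toUtc()) >=
      Duration(hours: cadenceHours);
}
```

Evaluating this per `kind` gives the per-stage cadence case directly: with a 48-hour universe cadence, the universe stage is not due at T0+24h while backfill and cleanup are.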
### 5.3 Refactor
- [x] If the three stages each need similar before/after logic, abstract
a small `_runStage(kind, body)` inside the scheduler.
_(`_maybeRunStage` covers this; no further refactor needed.)_
### 5.4 Confirm
- [x] `dart test test/integration/market_history_scheduler_test.dart
test/integration/market_history_worker_wireup_test.dart` — green.
---
## 6. Question pipeline — "guess the move" rule
**Files:** extend `server/lib/trading/rule_engine.dart`,
extend `server/lib/trading/trading_pipeline.dart`,
new `server/lib/trading/market_history_query.dart`.

The guessing game uses the rolling 7-day window: questions must reveal
just enough for the user to guess (obfuscated symbol/price/direction).
**No trade is placed for this rule**; answers feed scoring only.
### 6.1 Red — `MarketHistoryQuery`
- [x] Add `server/test/integration/market_history_query_test.dart`:
- [x] Test: `weeklyMovers({minBars: 5, asOf})` returns only symbols
with ≥5 daily bars in the window; each entry exposes
`(symbol, openClose, currentClose, days)`.
- [x] Test: deterministic — supply a `random: Random(42)` and assert a
stable selection order across runs.
- [x] Test: symbols with stale data (newest bar > 2d old) are
excluded.
### 6.2 Red — rule engine extension
- [x] Add `server/test/trading/rule_engine_guess_weekly_move_test.dart`:
- [x] Test: rule kind `guess_weekly_move` with mocked
`MarketHistoryQuery` returning SPY {ref=500, current=510, days=5}
→ produces a `RuleEvaluation` with:
- obfuscated `symbol_token='ASSET_A'`,
- `correct_answer = 10` (up direction),
- `question_text` substituting `{{token}}`, `{{ref_price}}`,
`{{ref_days_ago}}`, NEVER `{{symbol}}`.
- [x] Test: down move (ref=510, current=500) → `correct_answer = -10`.
- [x] Test: insufficient bars → no fire.
- [x] Test: `questions.metadata.guess_symbol` is set to real symbol
(server-side only) when the question is created in §6.3.
### 6.3 Red — pipeline wiring
- [x] Add
`server/test/integration/trading_pipeline_guess_weekly_move_test.dart`:
- [x] Test: end-to-end with seeded 7 daily bars for SPY → pipeline
creates a question with obfuscated text; `metadata.guess_symbol
= 'SPY'`; `pipeline_key='trading'`,
`pipeline_step='guess_weekly_move:await_answer'`.
- [x] Test: `onAnswerSubmitted` with matching direction (e.g., +10 on
an up move) records `score_delta = +1` in
`user_trading_state.context.guess_score`; non-matching records
`score_delta = -1`.
- [x] Test: `TradeActuator.processPendingOrders` is **NEVER called**
for `guess_weekly_move` answers (assert via spy).
- [x] Test: cooldown — after a fire, the same symbol is not re-picked
for `GUESS_COOLDOWN_HOURS` (default 24).
### 6.4 Green
- [x] Implement `MarketHistoryQuery.weeklyMovers({...})`.
- [x] Add rule kind to `RuleEngine` with the new template tokens.
- [x] Extend `TradingPipeline.evaluate` + `onAnswerSubmitted` for the
new rule kind, including the cooldown bookkeeping.
### 6.5 Refactor
- [x] If the token mapping (real symbol ↔ `ASSET_A`/`ASSET_B`/…) is used
in multiple places, lift it into a `SymbolObfuscator` helper with
its own focused unit test.
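
A minimal sketch of the obfuscation, templating, and scoring pieces described in §6.2 and §6.3. All names are hypothetical, and the `SymbolObfuscator` that actually landed may be shaped differently. The template placeholders are the ones listed above. The treatment of a guess or move of exactly zero is not specified in this plan and is left as whatever the sign comparison gives.

```dart
/// Hands out ASSET_A, ASSET_B, … in first-seen order; stable per symbol.
class SymbolObfuscatorSketch {
  final Map<String, String> _tokens = {};

  String tokenFor(String symbol) => _tokens.putIfAbsent(symbol, () {
        final index = _tokens.length;
        if (index >= 26) {
          throw StateError('sketch only covers ASSET_A..ASSET_Z');
        }
        return 'ASSET_${String.fromCharCode(65 + index)}';
      });
}

/// Fills the three allowed placeholders; there is no `{{symbol}}` to fill.
String renderQuestion(
  String template, {
  required String token,
  required num refPrice,
  required int refDaysAgo,
}) =>
    template
        .replaceAll('{{token}}', token)
        .replaceAll('{{ref_price}}', '$refPrice')
        .replaceAll('{{ref_days_ago}}', '$refDaysAgo');

/// Signed move: positive for up, negative for down.
num correctAnswer({required num refPrice, required num currentPrice}) =>
    currentPrice - refPrice;

/// +1 when the guess points the same way as the move, otherwise -1.
int scoreDelta(num guess, num correct) =>
    guess.sign == correct.sign ? 1 : -1;

/// True if any known ticker appears as a whole word in [text].
bool leaksSymbol(String text, Iterable<String> tickers) => tickers.any(
    (t) => RegExp(r'\b' + RegExp.escape(t) + r'\b').hasMatch(text));
```

For the SPY case in §6.2 (`ref=500`, `current=510`), `correctAnswer` returns 10, and a rendered question built from `tokenFor('SPY')` passes `leaksSymbol(text, ['SPY']) == false`.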
### 6.6 Confirm
- [x] `dart test test/integration/market_history_query_test.dart
test/trading/rule_engine_guess_weekly_move_test.dart
test/integration/trading_pipeline_guess_weekly_move_test.dart`
— green.
---
## 7. Env additions (`server/.env.example`)
```bash
# Rolling history feature gate
MARKET_HISTORY_SYNC_ENABLED=false
MARKET_HISTORY_WINDOW_DAYS=7
MARKET_HISTORY_RETENTION_DAYS=7
MARKET_HISTORY_ARCHIVE_ENABLED=false
# Cadence (hours)
MARKET_UNIVERSE_REFRESH_HOURS=24
MARKET_HISTORY_SYNC_HOURS=24
MARKET_HISTORY_CLEANUP_HOURS=24
MARKET_HISTORY_SYNC_HOUR_UTC=10 # optional alignment hour
# Batching / safety
HISTORY_SYNC_BATCH_SIZE=50
HISTORY_SYNC_MAX_SYMBOLS=2000 # hard cap; Alpaca Basic-friendly
MIN_BARS_FOR_GUESS=5
GUESS_COOLDOWN_HOURS=24
```
### 7.1 Red
- [x] Add `server/test/env/market_history_env_test.dart`:
- [x] Test: defaults parsed when env empty (`enabled=false`,
`windowDays=7`, etc.).
- [x] Test: `MARKET_HISTORY_SYNC_ENABLED=true` while
`TRADING_ENABLED=false` → `Env.assertConsistent()` throws.
- [x] Test: `MARKET_HISTORY_WINDOW_DAYS=0` or negative → throws.
- [x] Test: `MARKET_HISTORY_SYNC_HOUR_UTC=24` → throws (valid range
`0..23`).
### 7.2 Green
- [x] Extend `server/lib/env.dart` to load and validate these vars.
- [x] Append the block above to `server/.env.example`.
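
The validation rules from §7.1 can be sketched as a small typed value object. `MarketHistoryEnvSketch` is a hypothetical name covering only three of the variables. The real loader in `server/lib/env.dart` may split parsing and consistency checks differently.

```dart
/// Typed view over a subset of the market-history variables.
class MarketHistoryEnvSketch {
  const MarketHistoryEnvSketch({
    required this.enabled,
    required this.windowDays,
    this.syncHourUtc,
  });

  /// Parses [env], applying defaults and failing fast on invalid values.
  factory MarketHistoryEnvSketch.fromMap(Map<String, String> env) {
    final enabled = env['MARKET_HISTORY_SYNC_ENABLED'] == 'true';
    final tradingEnabled = env['TRADING_ENABLED'] == 'true';
    if (enabled && !tradingEnabled) {
      throw StateError(
          'MARKET_HISTORY_SYNC_ENABLED=true requires TRADING_ENABLED=true');
    }

    final windowDays = int.parse(env['MARKET_HISTORY_WINDOW_DAYS'] ?? '7');
    if (windowDays <= 0) {
      throw ArgumentError.value(
          windowDays, 'MARKET_HISTORY_WINDOW_DAYS', 'must be positive');
    }

    final hourRaw = env['MARKET_HISTORY_SYNC_HOUR_UTC'];
    final hour = hourRaw == null ? null : int.parse(hourRaw);
    if (hour != null && (hour < 0 || hour > 23)) {
      throw ArgumentError.value(
          hour, 'MARKET_HISTORY_SYNC_HOUR_UTC', 'must be in 0..23');
    }

    return MarketHistoryEnvSketch(
      enabled: enabled,
      windowDays: windowDays,
      syncHourUtc: hour,
    );
  }

  final bool enabled;
  final int windowDays;
  final int? syncHourUtc;
}
```

Parsing from a plain `Map<String, String>` lets the env tests pass literal maps instead of mutating the process environment.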
### 7.3 Refactor
- [x] If `Env` has grown unwieldy, split market-history vars into
`MarketHistoryEnv` (typed value object) and have `Env` expose it.
### 7.4 Confirm
- [x] `dart test test/env/market_history_env_test.dart` — green.
- [x] Document each var in `server/README.md` under a new
**"Market history window"** subsection.
---
## 8. Operational tooling
### 8.1 Red
- [ ] Add `server/test/bin/sync_market_history_smoke_test.dart`:
- [ ] Test: imports `bin/sync_market_history.dart` `main` function and
runs it with `QUESTION_PIPELINE_TEST_MODE=true` + a fake DB →
exits 0 and emits the expected one-line log.
- [ ] Add equivalent smoke test for `bin/cleanup_market_history.dart`.
### 8.2 Green
- [ ] Add CLI `server/bin/sync_market_history.dart` with
`--window=<days>` flag (default 7); honors test mode.
- [ ] Add CLI `server/bin/cleanup_market_history.dart` with
`--window=<days>` and `--archive` flags.
- [ ] Add structured one-line log:
`kind=… symbols=… rows_written=… rows_removed=… duration_ms=… error=…`.
### 8.3 Refactor
- [ ] Share argument parsing between the two CLIs if duplicated.
### 8.4 Confirm
- [ ] `dart test test/bin/` — green.
- [ ] Manual: `cd server && dart run bin/sync_market_history.dart --window=7`
against `cyberhybridhub_test` works end-to-end.
### 8.5 Optional admin endpoint (defer until needed)
- [ ] Behind Firebase admin auth, `POST /v1/admin/market-data/resync?window=7`
enqueues a sync run; **not exposed to Flutter**.
---
## 9. Test pyramid for this milestone
| Layer | Test files |
|------|------------|
| Unit | `test/alpaca/alpaca_assets_client_test.dart` |
| Unit | `test/alpaca/alpaca_market_data_client_test.dart` (extended) |
| Unit | `test/trading/rule_engine_guess_weekly_move_test.dart` |
| Unit | `test/env/market_history_env_test.dart` |
| DB integration | `test/integration/market_history_schema_test.dart` |
| DB integration | `test/integration/tradable_assets_db_test.dart` |
| DB integration | `test/integration/tradable_assets_sync_test.dart` |
| DB integration | `test/integration/market_data_db_test.dart` (extended) |
| DB integration | `test/integration/market_data_history_sync_test.dart` |
| DB integration | `test/integration/market_data_retention_test.dart` |
| DB integration | `test/integration/market_history_query_test.dart` |
| DB integration | `test/integration/trading_pipeline_guess_weekly_move_test.dart` |
| Worker integration | `test/integration/market_history_scheduler_test.dart` |
| Worker integration | `test/integration/market_history_worker_wireup_test.dart` |
| Bin smoke | `test/bin/sync_market_history_smoke_test.dart` |
| Bin smoke | `test/bin/cleanup_market_history_smoke_test.dart` |
| Tagged (`alpaca`) | `test/alpaca/alpaca_assets_live_test.dart` |
| Tagged (`alpaca`) | `test/alpaca/alpaca_market_data_history_live_test.dart` |

**CI gating:**
```bash
# default job — no Alpaca keys, must pass on every PR
cd server && dart test
# nightly / manual — requires ALPACA_API_KEY_ID / ALPACA_API_SECRET_KEY
cd server && dart test --tags=alpaca
```
---
## 10. Acceptance criteria (Gate H — Rolling history)
- [ ] `market_data_snapshots` contains rows for every active tradable with
`as_of` within the last 7 days, and no rows older.
- [ ] Re-running backfill is a no-op (zero duplicate rows; deterministic
`rows_written` count when nothing changed upstream).
- [ ] Cleanup removes only rows older than the window and never touches
newer rows.
- [ ] Worker performs one full cycle (universe → backfill → cleanup) per
day with stage isolation; failure in one stage does not block the
others.
- [ ] A `guess_weekly_move` question can be generated end-to-end from
pure DB data — no live Alpaca call at evaluation time.
- [ ] `dart test` is green; `dart test --tags=alpaca` is green when keys
are present.
- [ ] `MARKET_HISTORY_SYNC_ENABLED=false` is the default; nothing runs
unless explicitly enabled.
- [ ] Safety: `MARKET_HISTORY_SYNC_ENABLED=true` without
`TRADING_ENABLED=true` fails fast at server boot.
---
## 11. Risks & mitigations
| Risk | Mitigation |
|------|------------|
| Alpaca rate limits on full-universe pull | Batched `bars` calls (`HISTORY_SYNC_BATCH_SIZE`); per-batch error isolation; 429 → exception logged in sync run, retry next day. |
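| Migration deadlocks on large `market_data_snapshots` | Cleanup deletes in batches (CTE with `LIMIT`, looped). Postgres does not accept `NOT VALID` on a `UNIQUE` constraint, so if the existing dataset is huge, build the index with `CREATE UNIQUE INDEX CONCURRENTLY` and attach it via `ADD CONSTRAINT … UNIQUE USING INDEX` (document in migration). |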
| Duplicate Alpaca asset entries between runs | `upsertAll` PK-on-symbol; we mark missing symbols inactive instead of deleting. |
| Guessing game leaks the real symbol | Question text uses tokens only; real symbol lives in `questions.metadata` (server side); add a regex test that scans `question_text` for any known ticker. |
| Backfill blowing past disk budget | Hard caps via `HISTORY_SYNC_MAX_SYMBOLS` and `MARKET_HISTORY_WINDOW_DAYS`; retention deletes daily so steady-state size is bounded. |
---
## 12. Progress log
<!-- Newest entries on top. One line per completed step. -->
| Date | Step | Result |
|------|------|--------|
| 2026-05-26 | §7 Env additions | Green: 6/6 env tests; `dart test` 133/133. `MarketHistoryEnv.fromMap` + `assertConsistent`; `ServerEnv.marketHistory`; wired scheduler/sync/retention/guess; `server/.env.example`; README **Market history window**. |
| 2026-05-26 | §6 Guess-the-move rule | Green: 12 new tests; `dart test` 127/127. `MarketHistoryQuery.weeklyMovers`; `RuleEngine.evaluateGuessWeeklyMove`; `SymbolObfuscator`; `TradingPipeline` scoring + per-symbol cooldown; `questions.metadata` migration `007`; no pending orders on guess answers. |
| 2026-05-26 | §5 Scheduler (worker cadence) | Green: 8/8 scheduler + wireup tests; `dart test` 115/115. `MarketHistoryScheduler.runIfDue` (per-kind cadence + optional `syncHourUtc`); worker calls scheduler before pipeline/trading; `server.dart` wires universe→backfill→cleanup when `MARKET_HISTORY_SYNC_ENABLED` + real Alpaca; `ServerEnv.marketHistorySyncEnabled`; `SyncRunRecorder` uses injected `now` for `finished_at`. |
| 2026-05-26 | §4 Retention & cleanup | Green: 8/8 retention tests; `dart test` 107/107. `MarketDataRetention.runCleanup` (batched hard-delete via CTE+RETURNING); `runArchiveAndCleanup` (transactional archive-then-delete); unified `run(archive:)`; migration `006_market_data_archive.sql`; reuses `SyncRunRecorder` kind=`cleanup`. |
| 2026-05-26 | §3 Historical backfill (1Day × 7d) | Green: 17 new tests (6 client + 4 db + 8 sync); `dart test` 99/99; live `alpaca_market_data_history_live_test` ≥3 SPY bars. `getBarsRange` + pagination; `upsertSnapshot`/`barsForSymbol`/`latestSyncedAsOf`; `MarketDataHistorySync` with incremental catch-up + partial batch errors via `SyncRunCounts.error`. Defaults in `MarketHistoryConfig` (batch=100). |
| 2026-05-26 | §2 Tradable-asset universe sync | Green: 11/11 §2 tests pass (5 client + 3 db + 3 sync); `dart test` 82/82 green; tagged live `alpaca_assets_live_test` returned >100 active us_equity assets. Refactor 2.1.3 lifted auth headers to `AlpacaEnv.authHeaders`; 2.2.4 lifted `SyncRunRecorder` for §3/§4 reuse. |
| 2026-05-26 | §1 Schema additions (migration `005_market_history.sql`) | Green: 5/5 schema tests pass; `dart test` 70/70 green; `\d market_data_snapshots` shows `timeframe` col + `market_data_snapshots_unique_obs` unique constraint. Archive stub deferred to §4.2 to keep `;`-split migration runner happy. |
---
## 13. References
- Existing snapshot writer: `server/lib/trading/market_data_ingest.dart`
- Existing snapshot DB: `server/lib/trading/market_data_db.dart`
- Existing migration to extend: `server/migrations/004_trading.sql`
- Orchestrator hook point: `server/lib/trading/trading_orchestrator.dart`
- Worker hook point: `server/lib/workers/question_background_worker.dart`
- Plans: [`TRADING_DEVELOPMENT_PLAN.md`](./TRADING_DEVELOPMENT_PLAN.md),
[`TRADING_TDD_PLAN.md`](./TRADING_TDD_PLAN.md)
- Alpaca docs: [Market Data](https://docs.alpaca.markets/docs/market-data-api),
[Trading / Assets](https://docs.alpaca.markets/docs/trading-api),
[Bars](https://docs.alpaca.markets/reference/stockbars)
---
*Document version: 1.0 — Rolling 7-day market data window, cleanup, and
guessing-game question integration.*