Why this role exists
We’re building two distinct AI-first businesses:
- a long/short equity hedge fund (agent-driven research at scale)
- a prop trading operation running higher-frequency crypto strategies
Across both, the edge comes from tight iteration loops—propose → test → select → ship → monitor → iterate—and we’re hiring someone to contribute materially to making our systems fast, correct, and production-grade.
What you’ll work on (three tracks across two businesses)
Track A — Agent-driven long/short equity research (Hedge Fund)
- Scale evaluation for thousands of agent proposals (deterministic runs, artifact tracking, lineage)
- Build point-in-time and leakage-resistant feature/data workflows (validation, staleness checks, idempotent jobs)
- Improve scoring, baselines, and ablations so “wins” are real and reproducible
- Package signals behind clean interfaces; add telemetry + drift/performance dashboards
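To make the point-in-time requirement concrete, here is a minimal sketch (schema and values are illustrative, not from our stack): a decision at time `t` may only see the latest feature published at or before `t`.

```python
# Minimal point-in-time lookup: no future data may leak into a decision.
from bisect import bisect_right

feature_times = [1, 5, 9]          # publication timestamps, sorted
feature_values = [0.1, 0.2, 0.3]

def as_of(t):
    """Return the most recent feature value published at or before t."""
    i = bisect_right(feature_times, t) - 1
    if i < 0:
        return None                # nothing published yet: refuse, don't leak
    return feature_values[i]

print([as_of(t) for t in (0, 4, 6, 10)])  # [None, 0.1, 0.2, 0.3]
```

In production this is the semantics of an as-of join (e.g. a backward-strategy `join_asof` in Polars), but the invariant is the same: lookups never read past the decision timestamp.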
Track B — Statistical trading R&D with Alpha-Evolve workflows (Prop Trading)
- Harden the backtesting/simulation stack for higher-frequency statistical strategies
- Implement realistic cost models (fees/spreads/slippage), turnover/capacity constraints, strict time alignment
- Build the system for LLM-powered algorithm evolution: variant generation → evaluation → selection, with fitness metrics and constraints that guard against overfitting
- Build experiment tooling: comparisons, leaderboards, regression gates, promotion/rollback safety
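The generate → evaluate → select loop can be sketched as a toy deterministic hill-climb (all names and the objective are hypothetical; in practice "generate" is an LLM proposing code variants and "evaluate" is a backtest with overfitting penalties):

```python
# Toy evolution loop: propose variants, score them, keep the fittest.
def fitness(x):
    # Stand-in objective; a real fitness is a backtest score with
    # penalties for turnover, capacity, and overfitting risk.
    return -(x - 5) ** 2

candidate = 0
for _ in range(20):
    variants = [candidate - 1, candidate, candidate + 1]   # generate
    candidate = max(variants, key=fitness)                 # evaluate + select
print(candidate)  # 5
```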
Track C — Autonomous market making buildout (Prop Trading)
Market making is an unusually good fit for autonomous iteration because it naturally provides tight feedback: quote → get filled (or not) → observe P&L/inventory/adverse selection → update logic → repeat. The rapidly measurable rewards make it an ideal optimization target for long-running multi-agent coding systems.
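One step of that loop can be sketched as inventory-skewed quoting (all parameters are illustrative, not our production logic): the observed inventory feeds straight back into the next quote.

```python
# Toy quote update: skew quotes against inventory to mean-revert position.
def make_quotes(mid, inventory, half_spread=0.05, skew_per_unit=0.01):
    skew = -inventory * skew_per_unit   # long inventory -> quote lower to sell
    bid = mid + skew - half_spread
    ask = mid + skew + half_spread
    return round(bid, 4), round(ask, 4)

print(make_quotes(100.0, 0))   # (99.95, 100.05)
print(make_quotes(100.0, 3))   # (99.92, 100.02): skewed down to shed longs
```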
- Help build the multi-agent harness that can run for days or weeks: planners generate and refine tasks, workers implement, judges/CI gate merges and reset cycles (inspired by long-running agent coordination patterns)
- Wire changes into trading reality: commit → sim/paper → metrics → new tasks
- Strengthen coordination and safety: avoid duplicate work, prevent drift, ensure “done” means tested + observable
- Build guardrails: risk limits, kill-switches, monitoring, and post-trade diagnostics
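A guardrail like the kill-switch above reduces to a hard predicate checked on every cycle (thresholds here are illustrative placeholders, not real limits):

```python
# Kill-switch predicate: halt quoting on a drawdown or inventory breach.
def should_kill(pnl_drawdown, inventory, max_drawdown=10_000, max_inventory=50):
    """Return True if any hard risk limit is breached."""
    return pnl_drawdown >= max_drawdown or abs(inventory) > max_inventory

print(should_kill(2_000, 10))    # False: within limits
print(should_kill(12_000, 10))   # True: drawdown breach
```

The design point is that the check is dumb on purpose: no model, no averaging, so it cannot be argued with by the agents it supervises.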
Day-to-day
- Ship improvements that stay correct under scale (many assets, many runs, many variants)
- Profile and optimize Python hot paths (vectorization, IO/layout, Polars/NumPy where it counts)
- Write deterministic tests (unit + property-based) around point-in-time joins, feature lags, fills, and cost models
- Add guardrails that prevent leakage, stale data, and “too-good-to-be-true” results
- Partner with infra to move outputs to staging → live with metrics, alerts, and SLOs
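As a flavor of the deterministic tests above, here is a minimal check on a feature-lag helper (the `lag` function is a hypothetical stand-in): a 1-bar lag must never expose the current bar's value to the model.

```python
# Deterministic leakage test for feature lagging.
def lag(series, k=1):
    """Shift values forward by k steps, padding the head with None."""
    return [None] * k + series[:-k]

raw = [10, 11, 12, 13]
lagged = lag(raw)
assert lagged == [None, 10, 11, 12]
# At every index, the lagged value is either missing or strictly from the past.
assert all(v is None or v == raw[i - 1] for i, v in enumerate(lagged))
print("lag test passed")
```

Property-based variants (e.g. with Hypothesis) generalize the same invariant across randomly generated series.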
Must-haves
- Strong Python + practical SQL; you can ship robust systems, not just notebooks
- Experience contributing to end-to-end data/ML/quant pipelines (ingest → compute → test → deploy/operate)
- You understand evaluation correctness:
  - leakage prevention (point-in-time data, walk-forward, embargo/purged CV where applicable)
  - realistic transaction/borrow costs, turnover/capacity constraints
- Comfortable with Docker + CI; you treat reproducibility and auditability as product features
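To illustrate the walk-forward-with-embargo idea from the list above, here is a minimal split generator (window sizes are arbitrary for the example): each test window follows its train window, separated by an embargo gap so overlapping labels cannot leak across the boundary.

```python
# Walk-forward splits with an embargo gap between train and test.
def walk_forward(n, train=100, test=20, embargo=5):
    """Yield (train_indices, test_indices) pairs over n time steps."""
    splits = []
    start = 0
    while start + train + embargo + test <= n:
        tr = range(start, start + train)
        te = range(start + train + embargo, start + train + embargo + test)
        splits.append((tr, te))
        start += test              # roll forward by one test window
    return splits

splits = walk_forward(300)
print(len(splits))  # 9 folds over 300 bars
# Every test window starts strictly after its train window plus the embargo.
assert all(min(te) - max(tr) > 5 for tr, te in splits)
```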