Validraft

Independent validation desk

You have a trading hypothesis. We validate it.

A backtest that looks good isn’t one you can trust yet. Send us your hypothesis — we run it through an institutional, multi-gate validation gauntlet built to expose overfitting, then hand back an independent report. Typically within 48 hours for Tier A scope.

Validation report

False-breakout 20-day reversal · US equities · Swing

Live sample

Open full sample report →

Real PDF from the validation pipeline · sanitized for public release

What you can validate

Backtest almost any trading strategy

Backtesting is how a trading idea earns trust: replaying a strategy over decades of historical market data to measure how it would have behaved — before any capital is at risk. Validraft is built for rigorous, reproducible quantitative research. You don’t need a fully coded algo: discretionary, systematic, and hybrid hypotheses all start the same way — a brief in plain English. We translate your rules into a testable specification, then run it through the same institutional engine with walk-forward and out-of-sample validation by default.
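
As a rough illustration of the walk-forward idea mentioned above, here is a minimal Python sketch that builds rolling train/test windows over a series of observations. The function name, the window sizes, and the rule of stepping forward by one test window are assumptions made for this example; it is not a description of Validraft's engine.

```python
def walk_forward_splits(n_obs, train_size, test_size):
    """Return (train, test) index ranges for a rolling walk-forward scheme.

    Hypothetical helper: each test window sits immediately after its
    training window, and the whole pair slides forward by test_size.
    """
    splits = []
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        splits.append((train, test))
        start += test_size
    return splits


if __name__ == "__main__":
    for train, test in walk_forward_splits(n_obs=10, train_size=4, test_size=2):
        print(list(train), "->", list(test))
```

Every test window lies strictly after its training window, so parameters are only ever fitted on data that precedes the period they are scored on.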

The strategy families we routinely backtest span technical, fundamental, event-driven, and alternative-data approaches — on futures, US equities, FX, and crypto. Pick a family to see concrete example backtests, or describe your own idea and we’ll scope it.

  • Futures
  • US equities
  • FX
  • Crypto
  • Cross-asset
Browse all strategy families

Don’t see your exact approach? Discretionary playbooks and partial rules are exactly what the brief is for — we scope what can be tested honestly before any compute runs.

How it works

Inside the validation engine

Every brief travels the same eight-stage pipeline — from immutable, point-in-time data to a signed-off report with full lineage. Step through it, or let it walk you through itself.

Validation pipeline

STAGE 01 / 08

Brief & scope

  1. You describe the asset, timeframe, rules, and what success looks like — in plain English. We translate it into a precise, testable specification and confirm data coverage and feasibility, so the engagement is scoped to what the data lake can genuinely support.

    • Hypothesis intake
    • Feasibility review
    • Tier scoping

Validation framework

Evidence, limitations, and lineage in one audit-ready PDF

Each engagement ends with a structured validation report: what was tested, which controls ran, where the edge held up, where it failed, and what the result does not prove. No black-box score. No recommendation language.

Behind every report sits a scoped validation profile drawn from 66+ controls: data integrity, statistical validation, stress testing, risk measurement, benchmarking, and reproducibility.

Control stack

Ordered by what can break a backtest

Data

Point-in-time inputs

Survivorship, leakage, joins, gaps, and source lineage checked before the run.

Stats

Overfitting controls

OOS, walk-forward, DSR, PBO, haircut Sharpe, null tests, and purging.

Stress

Regime behavior

Crisis windows, liquidity stress, Monte Carlo path risk, and regime splits.

Risk

Performance evidence

Drawdown family, tail risk, attribution, benchmarks, sizing, and limits.

Audit

Reproducible report

Run manifest, versioned inputs, model card, methodology, and limitations.

  • Phase 0 · Formalized economic hypothesis · Economic rationale & pre-registration
  • Phase 0 · Edge classification · Economic rationale & pre-registration
  • Phase 0 · Ex-ante expected value · Economic rationale & pre-registration
  • Phase 0 · Publication bias & factor decay review · Economic rationale & pre-registration
  • Phase 0 · Causal robustness checklist · Economic rationale & pre-registration
  • Phase 1 · Survivorship-bias correction · Data integrity & point-in-time
  • Phase 1 · Point-in-time validation · Data integrity & point-in-time
  • Phase 1 · Corporate-actions adjustment · Data integrity & point-in-time
  • Phase 1 · Data-quality audit · Data integrity & point-in-time
  • Phase 1 · Timezone & holiday alignment · Data integrity & point-in-time
  • Phase 1 · Pipeline isolation · Data integrity & point-in-time
  • Phase 1 · Investable-universe construction · Data integrity & point-in-time
  • Phase 2 & 3 · Look-ahead bias detection · Signal & feature integrity
  • Phase 2 & 3 · Feature leakage detection · Signal & feature integrity
  • Phase 2 & 3 · Overfitting diagnosis · Signal & feature integrity
  • Phase 2 & 3 · Stationarity tests (ADF / KPSS) · Signal & feature integrity
  • Phase 2 & 3 · Information Coefficient (IC / ICIR) · Signal & feature integrity
  • Phase 2 & 3 · Feature stability & redundancy · Signal & feature integrity
  • Phase 2 & 3 · Parameter sensitivity analysis · Signal & feature integrity
  • Phase 3.5 · Bid-ask spread simulation · Execution & cost realism
  • Phase 3.5 · Commission & financing costs · Execution & cost realism
  • Phase 3.5 · Market-impact modeling · Execution & cost realism
  • Phase 3.5 · Turnover analysis · Execution & cost realism
  • Phase 3.5 · Capacity analysis · Execution & cost realism
  • Phase 3.5 · Short-borrow cost · Execution & cost realism
  • Phase 4 · Out-of-sample testing · Statistical validation
  • Phase 4 · Walk-forward analysis · Statistical validation
  • Phase 4 · Combinatorial Purged CV (CPCV) · Statistical validation
  • Phase 4 · Deflated Sharpe Ratio (DSR) · Statistical validation
  • Phase 4 · Probability of Backtest Overfitting (PBO) · Statistical validation
  • Phase 4 · Haircut Sharpe (Harvey–Liu) · Statistical validation
  • Phase 4 · Permutation / null-edge tests · Statistical validation
  • Phase 4 · Walk-forward permutation test · Statistical validation
  • Phase 4 · Multiple-testing correction · Statistical validation
  • Phase 4 · Autocorrelation (Ljung–Box) · Statistical validation
  • Phase 4 · Purging & embargo · Statistical validation
  • Phase 4 · Minimum backtest length (MinBTL) · Statistical validation
  • Phase 4.5 · Historical crisis scenarios · Stress testing & regime analysis
  • Phase 4.5 · Regime-conditional analysis · Stress testing & regime analysis
  • Phase 4.5 · Regime contract (ex-ante vs ex-post) · Stress testing & regime analysis
  • Phase 4.5 · Hidden Markov & change-point · Stress testing & regime analysis
  • Phase 4.5 · Monte Carlo path risk · Stress testing & regime analysis
  • Phase 4.5 · Perturbation & synthetic scenarios · Stress testing & regime analysis
  • Phase 4.5 · Liquidity stress · Stress testing & regime analysis
  • Phase 5 · Risk-adjusted returns · Performance metrics
  • Phase 5 · Drawdown family · Performance metrics
  • Phase 5 · Tail risk (CVaR / CDaR) · Performance metrics
  • Phase 5 · Distribution shape · Performance metrics
  • Phase 5 · Equity-curve stability · Performance metrics
  • Phase 5.5 · Buy-and-hold benchmark · Benchmarking & attribution
  • Phase 5.5 · Alpha / beta decomposition · Benchmarking & attribution
  • Phase 5.5 · Multi-factor attribution (FF5 + UMD) · Benchmarking & attribution
  • Phase 5.5 · Information ratio · Benchmarking & attribution
  • Phase 5.5 · Tail-risk correlation · Benchmarking & attribution
  • Phase 5.5 · Market-neutral residual test · Benchmarking & attribution
  • Phase 6 · Ruin & drawdown analysis · Risk, sizing & portfolio
  • Phase 6 · Position sizing (Kelly / vol target) · Risk, sizing & portfolio
  • Phase 6 · Risk-parity & constrained allocation · Risk, sizing & portfolio
  • Phase 6 · Correlation & crowding (internal) · Risk, sizing & portfolio
  • Phase 6 · Drawdown limits & kill-switch · Risk, sizing & portfolio
  • Phase 8–10 · Pre-registration & sign-off · Governance, lineage & reproducibility
  • Phase 8–10 · Versioned data & code · Governance, lineage & reproducibility
  • Phase 8–10 · End-to-end data lineage · Governance, lineage & reproducibility
  • Phase 8–10 · Audit trail of decisions · Governance, lineage & reproducibility
  • Phase 8–10 · Model card · Governance, lineage & reproducibility
  • Phase 8–10 · Reproducibility · Governance, lineage & reproducibility

Post-validation monitoring

Validation continues after delivery

A strategy that passed validation yesterday can drift tomorrow. For ongoing engagements, we keep watching rolling Sharpe, drawdown, signal decay, and regime behaviour — benchmarked against the same gates that produced your report.

Alerts and history live in your client workspace, with a direct line back to the desk when something needs a human read.

Client workspace · preview

Opening-range breakout — NQ

Healthy
  • Sharpe (full): 1.42
  • Sharpe (30d): 1.31
  • Max DD: −8.4%

Illustrative metrics — your live workspace reflects each delivered engagement.

Rolling Sharpe

30-day Sharpe tracked against the validation baseline. Watch and critical bands flag when performance drifts.
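
To make the rolling-Sharpe idea concrete, here is a minimal Python sketch. It assumes daily returns, a zero risk-free rate, and 252 periods per year. The `drift_band` helper and its 75% / 50% thresholds are purely hypothetical stand-ins for watch and critical bands, not the desk's actual settings.

```python
import math
import statistics


def rolling_sharpe(returns, window=30, periods_per_year=252):
    """Annualized Sharpe ratio over each trailing window (window >= 2).

    Assumes a zero risk-free rate. Windows with zero volatility yield None.
    """
    out = []
    for end in range(window, len(returns) + 1):
        chunk = returns[end - window:end]
        sd = statistics.stdev(chunk)
        if sd == 0:
            out.append(None)
        else:
            out.append(statistics.mean(chunk) / sd * math.sqrt(periods_per_year))
    return out


def drift_band(current, baseline, watch=0.75, critical=0.50):
    """Grade the current Sharpe against a baseline (hypothetical thresholds)."""
    if current is None or baseline <= 0:
        return "unknown"
    ratio = current / baseline
    if ratio < critical:
        return "critical"
    if ratio < watch:
        return "watch"
    return "healthy"
```

With the illustrative figures from the preview card, `drift_band(1.31, 1.42)` returns `"healthy"` under these assumed thresholds, since 1.31 is roughly 92% of the baseline.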

Drawdown tracking

Graded alerts as peak-to-trough loss approaches stress-test envelopes from your report.
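
A small Python sketch of how peak-to-trough drawdown can be measured and graded against a stress-test envelope. The function names and the 50% / 80% "share of envelope used" thresholds are assumptions chosen for illustration only.

```python
def drawdown_series(equity):
    """Fractional drawdown from the running peak at each point (0.0 at new highs)."""
    peak = equity[0]
    out = []
    for value in equity:
        peak = max(peak, value)
        out.append(value / peak - 1.0)
    return out


def drawdown_alert(current_dd, envelope_dd, watch=0.5, critical=0.8):
    """Grade a drawdown by how much of the envelope it has consumed.

    Both arguments are negative fractions (e.g. -0.084 for -8.4%).
    Thresholds are hypothetical.
    """
    if envelope_dd == 0:
        raise ValueError("envelope_dd must be non-zero")
    used = abs(current_dd) / abs(envelope_dd)
    if used >= critical:
        return "critical"
    if used >= watch:
        return "watch"
    return "healthy"
```

For example, an equity curve of `[100, 120, 90, 110]` has a maximum drawdown of 25%, measured from the peak of 120 to the trough of 90.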

Signal decay

Second-half vs first-half performance comparison — early warning when the edge starts to fade.
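
One simple way to express a first-half versus second-half comparison in Python is sketched below. It compares per-period Sharpe ratios of the two halves; the choice of Sharpe as the metric and the `tolerance` value of 0.5 are assumptions for this example, and at least four observations are needed.

```python
import statistics


def half_split_sharpe(returns):
    """Per-period Sharpe (mean / stdev) of the first and second half of a series."""
    mid = len(returns) // 2

    def sharpe(xs):
        sd = statistics.stdev(xs)
        return statistics.mean(xs) / sd if sd > 0 else None

    return sharpe(returns[:mid]), sharpe(returns[mid:])


def edge_is_fading(returns, tolerance=0.5):
    """True if the second-half Sharpe fell below tolerance * first-half Sharpe.

    Returns None when the comparison is undefined (flat or non-positive first half).
    The tolerance is a hypothetical threshold.
    """
    first, second = half_split_sharpe(returns)
    if first is None or second is None or first <= 0:
        return None
    return second < tolerance * first
```

A split like this is deliberately crude: it is an early-warning heuristic, not a substitute for the statistical gates run during validation.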

Regime shifts

Conditional correlation and behaviour across market states — flagged when regime context changes.

Data

Three decades of institutional market data

Every brief is tested against the datasets your hypothesis actually needs — price, fundamentals, macro, sentiment and alternative signals, on the same immutable, point-in-time lake our research desk runs. Coverage windows reflect what’s on disk today.

Coverage timeline, 1990 to now:
  • Databento · Price · since 2012
  • Massive · Price · since 2003
  • FMP · Fundamentals · since 1990
  • FRED & ECB · Macro · since 2000
  • News + FinBERT · Sentiment · since 2015
  • Quiver · Alternative · since 2023
Databento · Price

CME futures

since 2012

Continuous CME futures, point-in-time across calendar and volume rolls — from minute bars to full L2/L3 order book.

ES · NQ · CL · GC · SI · NG · RTY · YM · ZB · ZN · ZT · MES · MNQ
  • OHLCV 1-minute bars · 13 continuous contracts · since 2012
  • Trades (tick) · full trade history · since 2025
  • Order book (MBP-10 & MBO) · ES, NQ · since 2026
  • Definitions, statistics & status · contract metadata · since 2026
Massive · Price

US equities

since 2003

The full US equity cross-section — trades and minute bars back to 2003, with fundamentals and corporate actions.

  • Trades (daily cross-section) · ~10K US tickers · since 2003
  • Minute aggregates · raw + corporate-action adjusted · since 2003
  • Corporate actions · splits, dividends, IPOs
  • Fundamentals · balance sheet, cash flow, income, ratios, float, short interest
  • Reference universe · tickers & classifications
Financial Modeling Prep · Fundamentals

Fundamentals & multi-asset markets

since 1990

The widest feed in the lake: company fundamentals, estimates, ownership, index membership, FX, crypto and macro.

  • Company fundamentals · 19 statement types · ~8,900 companies · since 2003
  • Earnings · calendar & surprises · since 2003
  • Corporate actions · dividend & split calendars · since 2003
  • Analyst coverage · estimates, grades & ratings · 5–8K symbols
  • Insider & institutional · Form 4 + 13F · ~8K / ~7.5K symbols
  • Index constituents · S&P 500, Nasdaq, Dow + historical
  • ETF holdings & weightings · 33+ ETFs
  • FX spot — EOD · 15 major pairs
  • Crypto — EOD · 10 assets
  • COT positioning · 64 futures markets · since 2003
  • Treasury rates · US yield curve · since 1990
  • Economic calendar · global releases · since 2015
FRED and ECB · Macro

Macroeconomic indicators

since 2000

Point-in-time macro: 230+ vintage series across the US, Europe, UK and Japan — no look-ahead on revisions.

  • Macro series (vintage) · 230+ point-in-time series · since 2000
  • Regions · US, Eurozone, UK, Japan, Global
  • Treasury & policy rates · via FRED / ALFRED
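
To show what "no look-ahead on revisions" means in practice, here is a minimal Python sketch of an as-of lookup over vintage data. The data layout (a list of release-date and value pairs for one observation period) and the function name are assumptions for illustration, and the numbers in the usage example are made up.

```python
from datetime import date


def value_as_of(vintages, as_of):
    """Return the latest value released on or before as_of, or None.

    vintages: list of (release_date, value) pairs for a single
    observation period, e.g. successive revisions of one quarterly figure.
    """
    known = [(released, value) for released, value in vintages if released <= as_of]
    if not known:
        return None
    return max(known, key=lambda rv: rv[0])[1]


if __name__ == "__main__":
    # Made-up figures: a first print, then a later revision.
    revisions = [(date(2020, 4, 29), 1.0), (date(2020, 5, 28), 1.5)]
    print(value_as_of(revisions, date(2020, 5, 1)))  # first print only
```

A backtest dated before the second release only ever sees the first print, which is the property a point-in-time store is meant to guarantee.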
FinBERT · Sentiment

News & sentiment

since 2015

News and press releases scored for sentiment with FinBERT — five streams across equities, FX, crypto and macro.

  • Stock news · since 2015
  • Press releases · since 2019
  • Forex news · since 2018
  • General macro news · since 2020
  • Crypto news · since 2021
  • FinBERT sentiment scoring · transformer NLP, full universe
Quiver Quantitative · Alternative

Congressional trading

since 2023

US congressional trading disclosures as an event ledger plus point-in-time holdings snapshots.

  • Congressional trades (event ledger) · since 2023
  • Holdings snapshots · point-in-time · since 2026

Don’t see your market?

Coverage reflects the current research lake. If your hypothesis needs a feed we don’t hold yet, we scope it as part of the brief before any work begins.

Submit a brief

Provider names and marks are shown for identification only. Validraft is independent and not affiliated with or endorsed by the listed providers. Coverage shown reflects the current research lake; some feeds refresh on demand for scoped engagements.

Positioning

Rigorous without the grind. Independent without the boutique markup.

Validraft

  • Human-scoped validation with feasibility review upfront
  • Typical 48-hour turnaround for standard Tier A hypotheses
  • Confidential — your idea never enters a public template library
  • Same infrastructure and gates as our internal research desk

DIY backtest tools

  • You own the full pipeline — and every lookahead bug
  • No deflated Sharpe, PBO, or crisis stress unless you build them
  • Easy to overfit; hard to know when you have

Generic quant consultants

  • Opaque methodology and inconsistent gate coverage
  • Prescriptive language that blurs research vs advice
  • Weeks of back-and-forth before a deliverable

Security & confidentiality

Your edge is yours. We treat it that way.

You are handing us a trading hypothesis — often the most valuable thing you own. Validraft is built around a simple rule: your strategies and client data are never shared, never pooled, and never repurposed.

From the first brief to the final PDF and any ongoing monitoring, every touchpoint is scoped to your engagement and accessible only to you and the desk team running the work.

Your hypothesis stays yours

Every brief, parameter set, and deliverable remains your intellectual property. We claim no rights to your idea — ever.

Per-engagement isolation

Each validation runs in a scoped workspace: your brief, runs, and reports are tied to your engagement — not mixed with other clients.

Never enters a public library

Client strategies are never published, templated, or reused for other engagements. What you send us stays between you and the desk.

Gated client workspace

Reports and monitoring live behind authenticated access in your private workspace — not on open links or shared folders.

Operational practices

  • Encrypted in transit (TLS) for brief intake, dashboard access, and report delivery
  • Least-privilege access — only the assigned desk team sees your engagement materials
  • No cross-client benchmarking, aggregation, or anonymised reuse of your logic
  • NDA available before material exchange on Tier B/C and custom scopes
  • Descriptive outputs only — research and simulation, not investment advice

Research notes

Learn how robust validation actually works.

Plain-English essays on overfitting, validation gates, data discipline, and the difference between an attractive simulation and a durable hypothesis.

View blog

First article

01 / Overfitting

The curve can be beautiful and still be wrong.

Backtest overfitting

How to Tell If Your Backtest Is Overfit

A practical checklist for spotting curve-fitting before a beautiful equity curve becomes an expensive mistake.

8 min read

Your hypothesis. Independently validated.

Send a brief. We confirm scope, run the validation, and deliver the report. Research and simulation only.