Evaluating an EA before you run it — the 8-point checklist that filters out 80% of bad purchases
Most EA purchases fail not because of bad strategy selection but because buyers don't have a systematic evaluation framework. This lesson gives you one — an 8-point checklist that works whether you built the EA yourself, bought it on MQL5.com, or received it from a signal provider.
The 90-second version
The vast majority of EA failures are predictable before the first live trade. The evidence is almost always there in the product page, the backtest report, or the vendor's disclosure (or lack of it). You just need a framework to read it systematically.
- Confirmation bias causes buyers to stop evaluating an EA the moment they find evidence it works — they stop looking for evidence it doesn't
- A backtest with fewer than 100 trades is statistically meaningless — you cannot distinguish edge from luck
- No live track record from the vendor is not automatically disqualifying — but it means you carry 100% of the verification burden
- Grid and martingale EAs can produce equity curves that look excellent for years and then lose everything in a single week of adverse conditions
- The 8-point checklist produces a documented rationale, not a prediction — but a documented NO on a HIGH-importance question is a hard stop
In August 2024 the yen carry-trade unwind exposed a broad class of EAs that had been marketed on beautiful 2022–2023 equity curves — without any disclosed out-of-sample test and without configurable drawdown limits.
On 5 August 2024, USDJPY fell 10 big figures in a single Asian session. EAs running without a drawdown kill-switch absorbed the entire move. Those with a daily loss cap of 3–5% exited in time and preserved capital.
EA Evaluator Checklist
Answer YES, NO, or UNCLEAR for each checkpoint. The verdict updates automatically as you work through the list.
Why systematic EA evaluation matters
Why most EA evaluations fail — the 3 cognitive biases at work
Most traders evaluate EAs the way they evaluate anything else they want to buy: they look for confirmation that it works. This is the first and most dangerous bias. Confirmation bias causes a buyer to read the backtest headline numbers (total profit, win rate), see something they like, and stop evaluating. They have found their yes. They stop looking for the no. The checklist forces you to look for the no first.
The second bias is authority bias — the tendency to weight vendor reputation or platform badge above evidence. An EA with 5 stars and 10,000 downloads on MQL5.com has exactly one thing proven: it is popular. Popularity and profitability are not the same variable. The crowd may be confirming each other's confirmation bias. What matters is whether the specific evidence you need — a live track record, an out-of-sample report, a stated strategy type — is present in this product's documentation. If it isn't, the star count is irrelevant.
The third bias is recency bias — the tendency to over-weight the most recent performance data. An EA that made 40% last year on EURUSD during a high-trending regime looks excellent on a chart shown to you today. But if the product page doesn't disclose that the strategy is a trend-following system, you may not realise that you're buying exposure to a specific market regime at the exact moment that regime may be ending. Recency bias makes the past look permanent. The checklist asks: what does this EA actually do, and is the current environment compatible with that approach?
Addressing all three biases requires a structured process, not willpower. The 8-point checklist provides that structure. It forces you to find the evidence for each criterion independently, rather than letting one piece of good evidence colour your reading of everything else. The process takes 20–40 minutes per EA. That is not a long time to spend before committing real capital.
The 8 evaluation checkpoints — what to look for and why
Checkpoint 1: Trade count above 100 (HIGH). A backtest with fewer than 100 completed trades does not contain enough data to separate strategy edge from random variance. At 30 trades, a strategy with zero edge has a substantial probability of producing an impressive-looking backtest (a 60% win rate, say) through chance alone. At 100+ trades, the probability of that kind of false positive drops sharply. The minimum should be 100; 300+ is better. Count them in the Strategy Tester report; do not rely on the vendor's summary.
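The sample-size argument can be checked with a few lines of arithmetic. The sketch below is an illustration only, and it assumes a simplified zero-edge strategy: every trade is an independent coin flip with equal-sized wins and losses. The function name `prob_at_least` is invented for this example.

```python
from math import comb

def prob_at_least(wins: int, trades: int, p_win: float = 0.5) -> float:
    """Probability of seeing `wins` or more winners in `trades` independent trades."""
    return sum(
        comb(trades, k) * p_win**k * (1 - p_win) ** (trades - k)
        for k in range(wins, trades + 1)
    )

# Chance that a coin-flip strategy shows a 60%+ win rate purely by luck.
p_30 = prob_at_least(18, 30)      # 18 winners out of 30 trades
p_100 = prob_at_least(60, 100)    # 60 winners out of 100 trades
p_300 = prob_at_least(180, 300)   # 180 winners out of 300 trades

print(f"30 trades: {p_30:.3f}   100 trades: {p_100:.3f}   300 trades: {p_300:.5f}")
```

Under these assumptions the lucky 60% win rate turns up in roughly 18% of 30-trade backtests, but in under 3% of 100-trade backtests, and it is rarer still at 300 trades. Real strategies have unequal win and loss sizes, so treat the numbers as a guide to scale rather than a threshold.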
Checkpoint 2: Out-of-sample (OOS) period included (HIGH). A backtest optimised on the same data it was tested on is not a test; it is curve-fitting. An out-of-sample period means the EA's parameters were set on one slice of historical data, and the results were then measured on a separate slice that was not used for optimisation. Look for an OOS result in the product documentation or backtest report. If the vendor used the entire available history for optimisation and shows no OOS period, treat this as a major red flag. MT5's Strategy Tester includes a Forward testing setting that holds back the final part of the history as an out-of-sample period; ask whether the vendor used it.
Checkpoint 3: Live track record available (HIGH). A backtest proves that the EA's code would have performed well on historical data. A live track record proves it performs on real accounts with real fills, real spreads, and real broker conditions. These are different things. The vendor does not need to provide their own track record — a publicly verified MyFXBook or FX Blue link from any real-money account running the EA is sufficient evidence. If no live track record is available anywhere, the EA has not been verified in live conditions. That is not automatically disqualifying for a demo deployment, but it means your demo period becomes the first live test — plan accordingly.
Checkpoint 4: Drawdown reasonable for your risk tolerance (MEDIUM). Read the maximum drawdown figure in the backtest — not the average drawdown, the maximum. Then ask: if this EA repeated its worst historical drawdown on your live account, would that be acceptable? A strategy showing a 40% maximum drawdown is not inherently bad — but it requires a capital allocation where losing 40% is survivable. Most retail traders should not deploy an EA with a backtest max DD above 20% unless they have explicitly accepted that risk in writing.
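To make the drawdown question concrete, here is a minimal sketch of the two calculations involved: the maximum peak-to-trough drawdown of an equity curve, and the allocation at which a repeat of that drawdown stays within a loss you can tolerate. The equity values and function names are made up for illustration.

```python
def max_drawdown(equity: list[float]) -> float:
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst

def capital_for_tolerable_loss(max_loss: float, backtest_dd: float) -> float:
    """Allocation at which a repeat of the backtest drawdown costs `max_loss`."""
    return max_loss / backtest_dd

# Hypothetical equity curve: the peak of 10,800 is followed by a trough of 8,640.
equity_curve = [10_000, 10_800, 9_900, 10_500, 8_640, 9_700, 11_200]
dd = max_drawdown(equity_curve)

# If losing 2,000 is the most you can accept and the backtest max DD is 40%:
allocation = capital_for_tolerable_loss(max_loss=2_000, backtest_dd=0.40)

print(f"max drawdown: {dd:.1%}, allocation: {allocation:,.0f}")
```

The second function is the sizing logic from the paragraph above turned around: rather than asking whether 40% is acceptable on your whole account, it asks how much capital you can allocate so that 40% of it is a loss you have already accepted.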
Checkpoint 5: Strategy type stated and compatible with current market regime (MEDIUM). The product description should clearly state what the EA does: trend-following, mean-reversion, breakout, scalping, news-based, carry-based, or something else. This matters because different strategy types perform differently in different market regimes. A trend-following EA bought at the end of a multi-month trending regime, just as the market enters a ranging consolidation, will likely underperform — not because the EA is bad, but because the market is doing the opposite of what the EA is designed for. If the vendor does not disclose the strategy type, ask. If they won't answer, that tells you something.
Checkpoint 6: Risk parameters are user-configurable (MEDIUM). At minimum, the EA should expose a daily loss limit, a maximum drawdown parameter, and a position size control. If the risk architecture is fixed and hidden from the user, you have no ability to adapt the EA's behaviour to your account size, risk tolerance, or broker conditions. This is not optional. A fixed-lot EA with no risk inputs is appropriate only for users who have read and verified the source code.
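As a rough picture of what configurable risk inputs do, the sketch below models a daily loss limit and a maximum drawdown limit as a single halt check. It is written in Python for illustration only; it is not MQL5 and does not reflect any particular EA's inputs, and the names `RiskLimits` and `should_halt` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class RiskLimits:
    daily_loss_pct: float     # e.g. 0.03 halts trading after a 3% loss in one day
    max_drawdown_pct: float   # e.g. 0.20 halts trading 20% below the equity peak

def should_halt(limits: RiskLimits, day_start_equity: float,
                peak_equity: float, current_equity: float) -> bool:
    """True when either user-configured limit has been reached."""
    daily_loss = (day_start_equity - current_equity) / day_start_equity
    drawdown = (peak_equity - current_equity) / peak_equity
    return (daily_loss >= limits.daily_loss_pct
            or drawdown >= limits.max_drawdown_pct)

limits = RiskLimits(daily_loss_pct=0.03, max_drawdown_pct=0.20)

# Day opened at 10,000 and equity is now 9,650: a 3.5% daily loss trips the limit.
halt_daily = should_halt(limits, 10_000, 10_500, 9_650)
# Day opened at 10,000 and equity is now 9,900: neither limit is reached.
halt_none = should_halt(limits, 10_000, 10_500, 9_900)

print(halt_daily, halt_none)
```

When you open an EA's Inputs panel, you are looking for parameters that play the role of the two fields in `RiskLimits`, plus a position size control. If nothing comparable is exposed, the checkpoint answer is NO.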
Checkpoint 7: Backtest quality score above 90% (LOW). In MetaTrader 5's Strategy Tester, the 'Quality' figure measures how closely the test data resembles real tick data. A quality below 90% means the backtest used interpolated or modelled tick data, which can introduce unrealistic fills and spread values. This does not automatically disqualify an EA, but it means the historical performance numbers are less reliable. Real-tick quality (99%) is the gold standard. A quality of 60% or below is a material concern for any strategy whose profitability depends on precise entry timing.
Checkpoint 8: Grid or martingale mechanics fully understood (LOW, but contextual). Grid EAs and martingale EAs are not inherently bad — they are strategies with a specific risk profile: they tend to produce many small wins and occasional catastrophic losses. The catastrophic losses are not bugs; they are the mechanism through which the strategy eventually pays for its win-rate advantage. If you deploy a martingale EA without understanding this, you are not making an informed decision — you are benefiting from the upside of a trade you don't understand and absorbing the downside when it arrives. The question is not whether you run the EA. The question is whether you can honestly answer yes to: 'I understand this strategy can lose my entire account balance in a single adverse sequence, and I have sized my capital allocation accordingly.'
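The arithmetic behind "many small wins and occasional catastrophic losses" is easier to see with numbers. The sketch below assumes a simplified martingale: each losing trade loses a fixed amount per lot (`loss_per_lot`, a hypothetical figure) and the lot size is multiplied after every loss. Real grid and martingale EAs vary in their details.

```python
def martingale_exposure(base_lot: float, multiplier: float, losses: int,
                        loss_per_lot: float) -> tuple[float, float]:
    """Return (next lot size, cumulative loss) after `losses` consecutive losers."""
    cumulative = 0.0
    lot = base_lot
    for _ in range(losses):
        cumulative += lot * loss_per_lot   # loss booked on the current position
        lot *= multiplier                  # size of the next recovery trade
    return lot, cumulative

# 0.01 starting lot, doubling after each loss, 1,000 lost per full lot per trade.
next_lot, total_loss = martingale_exposure(0.01, 2.0, 8, loss_per_lot=1_000)

print(f"next lot: {next_lot:.2f}, cumulative loss: {total_loss:,.0f}")
```

In this example, eight straight losses turn a 0.01 lot into a 2.56 lot recovery trade, with 2,550 already lost. The win rate looks excellent right up to the sequence that the account cannot fund.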
Working with the verdict — what deploy, investigate, and skip mean in practice
DEPLOY does not mean go live tomorrow. It means the EA has cleared all 8 checkpoints with satisfactory answers, and you have a documented rationale for why each one passed. The correct next step after a DEPLOY verdict is to deploy on demo for a minimum of 20–30 trades, confirm the live execution behaviour matches the backtest assumptions, and then move to a live account with a conservative initial lot size. DEPLOY is a green light to advance to the next phase of due diligence — not a green light to skip due diligence.
INVESTIGATE means there are one or more UNCLEAR answers on HIGH or MEDIUM checkpoints, or that a MEDIUM answer is NO. You have specific open questions that require additional research. Write down exactly what you need to find out: 'I need to verify whether a live track record exists anywhere — I'll search MyFXBook and FX Blue for this EA's name.' 'I need to clarify whether the stated OOS period used a separate optimisation window or was run in-sample.' The investigation has a deadline: if you cannot resolve the UNCLEAR items within 48 hours of targeted research, downgrade the verdict to SKIP. An EA that can't be verified in 48 hours of focused effort is not worth your capital.
SKIP means move on without guilt. The EA market contains thousands of products. There is no shortage of candidates. A SKIP verdict based on a systematic checklist is not a loss — it is the checklist working correctly. Experienced EA buyers skip 80–90% of products they evaluate. The checklist makes that easy by producing a clear rationale for each rejection. Write the reason in your trading journal: 'Skipped EA-X: no live track record, no OOS period, grid mechanics not disclosed.' This record prevents you from returning to the same EA six months later when you've forgotten why you passed on it.
August 2024 — when regime change exposed evaluation gaps
In August 2024, USDJPY dropped 10 big figures in a single Asian session as the yen carry-trade unwound. The move was not a black-swan event — it was a regime change that had been building for months, and it exposed the fragility of a specific category of EA: backtest-optimised systems sold during the 2022–2023 trending regime, with no disclosed out-of-sample test and no configurable drawdown limit.
The retrospective analysis told a consistent story. EAs that had received DEPLOY verdicts under a systematic evaluation framework — because they had documented OOS tests, stated strategy types, and configurable risk parameters — either survived the August move by hitting their drawdown limits and halting cleanly, or had already been identified as trend-following systems whose operators understood they were exposed to regime change. EAs that would have received SKIP verdicts under the same framework — because they had no OOS disclosure, no live track record, and no configurable risk inputs — were the ones producing the most alarming equity curve reversals. The evaluation gap was not visible until the regime changed. This is the defining characteristic of an underevaluated EA: it looks fine until conditions change, at which point the absence of proper pre-purchase evaluation becomes impossible to ignore. The August 2024 yen move was not a unique event. GBPUSD in 2016, COVID in 2020, SVB in 2023 — every major regime change produces the same postmortem. The solution is the same every time: evaluate before deployment, using a framework that specifically tests for regime sensitivity.
Practice
Apply the 8-point checklist to one EA on MQL5.com right now
This practice session has a single deliverable: a completed evaluation scorecard for one real EA. Choose any EA from mql5.com/en/market that you find interesting. You have 30–40 minutes. At the end you will have a documented deploy/investigate/skip verdict with written rationale.
1. Go to mql5.com/en/market and browse to the Expert Advisors section. Choose an EA you're genuinely curious about, something you might actually consider using. Open the product page in full. Do not look at the price or the star rating yet. Open a blank document or your trading journal.
2. Work through checkpoints 1–3 (the HIGH-importance items). For checkpoint 1: find or download the backtest report and count the total trades. For checkpoint 2: look for any mention of an out-of-sample test, walk-forward analysis, or OOS period in the product description, comments, or screenshots. For checkpoint 3: search MyFXBook and FX Blue for the EA name. Record YES, NO, or UNCLEAR for each, with the specific evidence you found (or failed to find).
3. Work through checkpoints 4–6 (the MEDIUM-importance items). For checkpoint 4: find the maximum drawdown in the backtest report (not average, maximum). For checkpoint 5: identify the strategy type from the product description. If it's not stated, post a question in the comments section. For checkpoint 6: download the demo version, attach it to a demo chart, and open the Inputs panel. Count the risk-related parameters visible. Record YES, NO, or UNCLEAR for each.
4. Work through checkpoints 7–8 (the LOW-importance items). For checkpoint 7: in the backtest report screenshots, look for the Quality percentage. If it isn't shown, note UNCLEAR. For checkpoint 8: read the strategy description carefully for words like 'grid', 'averaging', 'recovery', 'lot multiplier', or 'martingale'. If you find any, answer NO and write a one-sentence explanation of what you understand the mechanics to mean.
5. Apply the verdict logic: if any HIGH item is NO, your verdict is SKIP. If all HIGH items are YES and at least 2 of 3 MEDIUM items are YES, your verdict is DEPLOY (subject to a 30-day demo test). Otherwise your verdict is INVESTIGATE. Write the full verdict and rationale in your journal. Then ask yourself: would you have reached this verdict without the checklist, or would you have bought the EA based on the equity curve alone?
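The verdict logic in the final step can be written down as a small function, which is a useful check that you are applying it the same way every time. This is a sketch of the rule exactly as stated in that step; the `IMPORTANCE` table and the function name are illustrative.

```python
# Importance level of each checkpoint, as given in the lesson.
IMPORTANCE = {1: "HIGH", 2: "HIGH", 3: "HIGH",
              4: "MEDIUM", 5: "MEDIUM", 6: "MEDIUM",
              7: "LOW", 8: "LOW"}

def verdict(answers: dict[int, str]) -> str:
    """answers maps checkpoint number (1-8) to 'YES', 'NO', or 'UNCLEAR'."""
    high = [answers[c] for c, level in IMPORTANCE.items() if level == "HIGH"]
    medium = [answers[c] for c, level in IMPORTANCE.items() if level == "MEDIUM"]
    if "NO" in high:
        return "SKIP"            # a NO on any HIGH item is a hard stop
    if all(a == "YES" for a in high) and medium.count("YES") >= 2:
        return "DEPLOY"          # still subject to the demo test
    return "INVESTIGATE"

# Example scorecard: no live track record could be confirmed either way.
scorecard = {1: "YES", 2: "YES", 3: "UNCLEAR",
             4: "YES", 5: "YES", 6: "NO", 7: "UNCLEAR", 8: "YES"}
print(verdict(scorecard))
```

Note that the LOW-importance checkpoints do not change the verdict under this rule; they belong in the written rationale.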
Mastery check
Four questions. Pass at 75% (3/4). Focus on understanding the purpose of each checkpoint and the verdict logic.
Mastery check — Lesson 11
Reflect
Write down your honest answers; they are saved only on this device. Use them next week to spot patterns in how you think when you trade.
Pro deep dive
The 8-point checklist addresses individual EA evaluation. Professional quantitative desks extend this framework with two additional layers: statistical robustness testing beyond OOS, and portfolio-level regime sensitivity analysis.
Monte Carlo analysis — the test after the backtest
An out-of-sample test verifies that the EA's parameters weren't overfit to a specific data sample. Monte Carlo analysis tests how much of the OOS result could be down to a lucky ordering of trades. The method: take the EA's list of completed trades (wins, losses, and their magnitudes), randomly shuffle the trade order thousands of times, and measure the distribution of path-dependent outcomes such as maximum drawdown and longest losing streak (the final net profit is the same for every ordering, so it tells you nothing here). If the actual OOS equity curve sits comfortably within the middle of the Monte Carlo distribution, the performance is consistent with what would be expected from a strategy with this edge profile. If the actual OOS result sits in the most favourable 5% of the Monte Carlo distribution, 95% of the reshuffled sequences looked worse: the historical path was flattered by an unusually favourable trade order, and live drawdowns should be expected to be deeper. Monte Carlo tools are available in third-party platforms like Market Simulation from FX Blue, and the shuffle can also be run on an exported trade list. The result adds a confidence interval to your evaluation: not a yes/no, but a 'this strategy is performing within expected parameters' or 'this result is suspiciously good.'
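A trade-order shuffle is short enough to run yourself on an exported trade list. The sketch below is one possible version, not the method of any particular tool. Because shuffling leaves the total profit unchanged, it looks at maximum drawdown instead. The trade values are invented, and the function names are illustrative.

```python
import random

def max_drawdown(trade_pnls, start_equity):
    """Largest peak-to-trough decline of the equity path, as a fraction of the peak."""
    equity = peak = start_equity
    worst = 0.0
    for pnl in trade_pnls:
        equity += pnl
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst

def shuffled_drawdowns(trade_pnls, start_equity, runs=2000, seed=42):
    """Max drawdown for `runs` random re-orderings of the same trades, sorted."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    results = []
    for _ in range(runs):
        order = list(trade_pnls)
        rng.shuffle(order)
        results.append(max_drawdown(order, start_equity))
    return sorted(results)

def percentile_rank(sorted_values, observed):
    """Share of simulated values that are at or below the observed one."""
    return sum(1 for v in sorted_values if v <= observed) / len(sorted_values)

# Hypothetical per-trade profit and loss, in account currency.
trades = [120, -80, 150, -60, 90, -200, 110, 70, -50, 130, -90, 160]
observed = max_drawdown(trades, 10_000)
simulated = shuffled_drawdowns(trades, 10_000)
rank = percentile_rank(simulated, observed)

print(f"observed max DD: {observed:.2%}, percentile rank: {rank:.0%}")
```

A rank near 0 means almost every reshuffle produced a deeper drawdown than the one in the report, which is the "suspiciously good" case. A rank around the middle means the reported drawdown is typical for this set of trades.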
Walk-forward optimisation — a stronger OOS standard
A single in-sample/out-of-sample split tests the EA's parameters on two data periods. Walk-forward optimisation runs the process repeatedly, stepping forward through time: optimise on period 1, test on period 2, re-optimise on periods 1–2, test on period 3, and so on. The result is a walk-forward efficiency ratio: the ratio of out-of-sample performance to in-sample performance across all windows. An efficiency ratio above 0.7 suggests the EA's parameters generalise reasonably well across different market conditions. An efficiency ratio below 0.5 suggests the EA is highly sensitive to the specific data window used for optimisation, which is a warning signal for regime sensitivity. MT5's built-in Strategy Tester has a 'Forward' setting that reserves the last part of the test range for a single out-of-sample pass; a rolling walk-forward has to be assembled by repeating that run over successive windows, manually or with third-party tooling. Ask whether any EA you're evaluating has published walk-forward results alongside its backtest.
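Given per-window results, the efficiency ratio itself is simple arithmetic. The sketch below assumes each window's in-sample and out-of-sample performance has already been expressed on the same basis (for example, annualised return), and it aggregates by summing across windows, which is one of several reasonable conventions. The 0.7 and 0.5 thresholds come from the text above; the "inconclusive" label for the range between them is an assumption of this example.

```python
def walk_forward_efficiency(windows: list[tuple[float, float]]) -> float:
    """windows: (in_sample_return, out_of_sample_return) for each step."""
    total_is = sum(is_ret for is_ret, _ in windows)
    total_oos = sum(oos_ret for _, oos_ret in windows)
    if total_is <= 0:
        raise ValueError("in-sample performance must be positive to form a ratio")
    return total_oos / total_is

def classify(ratio: float) -> str:
    if ratio > 0.7:
        return "generalises reasonably well"
    if ratio < 0.5:
        return "window-sensitive"
    return "inconclusive"   # assumed label for the range the lesson leaves open

# Hypothetical annualised returns for three walk-forward steps.
windows = [(0.30, 0.24), (0.25, 0.15), (0.20, 0.16)]
ratio = walk_forward_efficiency(windows)

print(f"efficiency ratio: {ratio:.2f} ({classify(ratio)})")
```

If a vendor publishes walk-forward results, the per-window table is more informative than the single ratio: one excellent window can mask several poor ones.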
Regime sensitivity — the dimension most evaluation frameworks miss
Every trading strategy performs differently across different market regimes: low-volatility ranging markets, high-volatility trending markets, mean-reverting markets around fundamental announcements, and liquidity-thin markets during holidays or crisis events. A complete EA evaluation should include a regime breakdown: how did the EA perform during each of these conditions in the historical record? Vendors with genuine confidence in their product publish this breakdown. The calculation requires splitting the backtest into labelled regime periods (using VIX levels, ATR as a volatility proxy, or a manual classification based on visual chart review) and measuring the EA's performance metric (profit factor, win rate, average trade) within each regime. If the EA underperforms consistently in a specific regime, and current market conditions resemble that regime, that is the single most important piece of information in your evaluation. This analysis is not available from most retail EA vendors — but you can run it yourself by exporting the trade list from the backtest report and labelling each trade's period manually. The exercise typically takes 2–3 hours but provides a level of insight into the EA's conditions of failure that no backtest summary number can replicate.
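The regime breakdown described above can be sketched as a grouping exercise on the exported trade list. This example assumes you have already attached an ATR reading to each trade at entry and chosen two cut-off values by hand; the cut-offs, the regime labels, and the trade data are all hypothetical.

```python
from collections import defaultdict

def profit_factor(pnls):
    """Gross profit divided by gross loss; infinite when there are no losers."""
    gains = sum(p for p in pnls if p > 0)
    losses = -sum(p for p in pnls if p < 0)
    return float("inf") if losses == 0 else gains / losses

def label_regime(atr, low_cut, high_cut):
    if atr < low_cut:
        return "low_vol"
    if atr > high_cut:
        return "high_vol"
    return "normal"

def regime_breakdown(trades, low_cut, high_cut):
    """trades: iterable of (atr_at_entry, pnl). Returns per-regime statistics."""
    buckets = defaultdict(list)
    for atr, pnl in trades:
        buckets[label_regime(atr, low_cut, high_cut)].append(pnl)
    return {
        regime: {
            "trades": len(pnls),
            "profit_factor": profit_factor(pnls),
            "win_rate": sum(p > 0 for p in pnls) / len(pnls),
        }
        for regime, pnls in buckets.items()
    }

# Hypothetical (ATR at entry, profit or loss) pairs from a backtest export.
trades = [(0.0006, 40), (0.0007, 35), (0.0005, -20), (0.0012, 25),
          (0.0011, -30), (0.0024, -90), (0.0026, 30), (0.0022, -70)]
result = regime_breakdown(trades, low_cut=0.0008, high_cut=0.0020)

for regime, stats in result.items():
    print(regime, stats)
```

With this made-up data the EA is profitable in quiet markets and loses heavily when volatility is high. A real breakdown needs far more trades per regime than this before the per-bucket numbers mean anything, for the same sample-size reasons as checkpoint 1.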
The legal and disclosure standard for EA vendors
Retail EA vendors in most jurisdictions are not currently required to provide standardised performance disclosure — unlike regulated investment advisors, who are subject to specific rules about how historical performance must be presented. This creates an information asymmetry that the 8-point checklist is designed to compensate for. The FCA's 2024 guidance on algorithmic trading products for retail clients signals a direction of travel: regulators are increasingly concerned about marketing materials that present historical backtest results without disclosure of the testing methodology, the use of OOS validation, or the distinction between backtest and live performance. The practical implication: the evaluation framework you apply today, as a discipline, aligns with what regulators will eventually require. Vendors who already provide the information the checklist requires are likely preparing for, or already operating within, a more demanding disclosure environment. Vendors who cannot or will not provide this information should be evaluated accordingly.
Trade count above 100 (sufficient statistical sample to distinguish edge from luck), out-of-sample period documented (confirms the backtest parameters weren't fit to the test data), and live track record available (confirms the EA has been verified under real market conditions with real fills and spreads).
Educational material only; not investment advice. Trading carries a risk of capital loss. Always practise on demo and use a stop-loss.