Simulation for high-frequency strategies is, at best, a decent way to put a strategy through its paces and perhaps get a rough view of its profitability. It cannot present an accurate view of profitability; at best it shows how the strategy performs under various synthetic assumptions.
Naturally our simulator must reflect:
- typical order event latency (perhaps with some jitter)
- typical market event latency
- accurate orderbook
- view on trade fills
- view on impact (icing on the cake, perhaps not necessary)
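As a rough illustration of the first two requirements, latency with jitter can be modelled as a fixed base delay plus a bounded random component. This is a minimal sketch, not a calibrated model: the class name `LatencyModel`, the uniform jitter distribution, and the microsecond values are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class LatencyModel:
    """Hypothetical latency model: base delay plus uniform jitter."""
    base_us: float        # typical one-way latency in microseconds (assumed)
    jitter_us: float      # maximum extra random delay in microseconds (assumed)
    rng: random.Random    # seeded generator so that runs are repeatable

    def delay(self) -> float:
        # Uniform jitter is an assumption; a real venue may have a fatter tail.
        return self.base_us + self.rng.uniform(0.0, self.jitter_us)


def arrival_time(sent_us: float, model: LatencyModel) -> float:
    """Time at which an event sent at `sent_us` becomes visible on the other end."""
    return sent_us + model.delay()


# Separate models for our own order events and for inbound market events.
order_latency = LatencyModel(base_us=250.0, jitter_us=50.0, rng=random.Random(42))
market_latency = LatencyModel(base_us=120.0, jitter_us=20.0, rng=random.Random(43))
```

Keeping one model per direction lets order latency and market-data latency be varied independently between simulation runs.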
The Data
There is simply not enough information in market data to model the market with complete accuracy, nor can we predict with certainty the effects of our own trades on subsequent market events. With access to an HF feed, we receive events such as:

We are able to see most individual order activity. Primarily, the operations are New, Update, and Delete order. The exceptions are:
- hidden orders
- orders that immediately cross
- IOC orders (immediate or cancel)
What is shown or not shown may vary from venue to venue. With the above we can reconstruct the orderbook as a set of price levels on the bid and another set on the ask. Each price level represents a queue of orders against which executions occur on a FIFO basis.
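The reconstruction described above can be sketched as a pair of price-level maps, each level holding an insertion-ordered queue. This is only an illustration under stated assumptions: the `OrderBook` class and its method names are hypothetical, prices are held as integer ticks (for example 82.709 stored as 82709), and the rule that a size reduction keeps queue position while any other update loses it is an assumption that varies by venue.

```python
from collections import OrderedDict
from dataclasses import dataclass, field


@dataclass
class OrderBook:
    # side -> price (integer ticks) -> OrderedDict(order_id -> size).
    # Insertion order of each OrderedDict is the FIFO queue order.
    levels: dict = field(default_factory=lambda: {"bid": {}, "ask": {}})
    index: dict = field(default_factory=dict)  # order_id -> (side, price)

    def new(self, order_id, side, price, size):
        self.levels[side].setdefault(price, OrderedDict())[order_id] = size
        self.index[order_id] = (side, price)

    def delete(self, order_id):
        side, price = self.index.pop(order_id)
        queue = self.levels[side][price]
        del queue[order_id]
        if not queue:                      # drop the level once it is empty
            del self.levels[side][price]

    def update(self, order_id, price, size):
        side, old_price = self.index[order_id]
        queue = self.levels[side][old_price]
        if price == old_price and size <= queue[order_id]:
            queue[order_id] = size         # assumed: reduction keeps priority
        else:
            self.delete(order_id)          # assumed: otherwise back of queue
            self.new(order_id, side, price, size)

    def best(self, side):
        prices = self.levels[side]
        if not prices:
            return None
        return max(prices) if side == "bid" else min(prices)
```

A synthetic strategy order can be inserted with `new` like any historical order, so it takes its place at the back of the queue on its level.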
The orderbook (abbreviated) might look something like this:

A simulator using historical data needs to reconstruct the orderbook, maintaining proper FIFO behavior. The two biggest problems in simulation are:
- determining when / if a trade happened
- determining market impact of strategy trades on historical data
On the various FX venues, trades are shown selectively and, even then, with significant lag relative to the actual event. Order events (New or Update) that would have crossed are also not shown. This leaves us to guess when a trade is likely to have happened, so we execute our strategy under various trade-fill scenarios.
Possible Trade Events
Let’s look at some possible signals that a trade may have occurred:
1. all orders on our level are removed
   - due to trade or cancellation?
2. inside order level(s) removed
   - due to trade(s) or cancellation?
3. one of the above + our level has moved within epsilon of crossing
4. our level is now crossing
None of these except (4) indicates a trade with certainty. However, without some assumptions about the first three scenarios, we would be left with only aggressive crossing or waiting for price levels to collide.
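The four signals can be sketched as a small classifier that returns the strongest signal present. The names `Signal` and `classify`, the tick-based inputs, and the default epsilon of 2 ticks are hypothetical choices for illustration; how each input is derived from the reconstructed book is left to the simulator.

```python
from enum import Enum


class Signal(Enum):
    NONE = 0
    LEVEL_CLEARED = 1    # all other orders on our level removed
    INSIDE_REMOVED = 2   # inside level(s) removed
    NEAR_CROSS = 3       # one of the above, and within epsilon of crossing
    CROSSING = 4         # our level is now crossing: the only certain case


def classify(level_cleared, inside_levels_removed, ticks_to_cross, epsilon_ticks=2):
    """Return the strongest trade signal for our resting order.

    ticks_to_cross is the distance from our price to the opposite best price,
    in ticks; zero or negative means our order is crossing.
    """
    if ticks_to_cross <= 0:
        return Signal.CROSSING
    removal_seen = level_cleared or inside_levels_removed > 0
    if removal_seen and ticks_to_cross <= epsilon_ticks:
        return Signal.NEAR_CROSS
    if inside_levels_removed > 0:
        return Signal.INSIDE_REMOVED
    if level_cleared:
        return Signal.LEVEL_CLEARED
    return Signal.NONE
```

Each fill scenario can then decide which signals it treats as a trade, from pessimistic (only `CROSSING`) to optimistic (any signal).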
All orders on our level removed
Consider the orderbook above and its transformed state below (removed orders are shown in red):

All orders on our order level of 82.709 on the ask have been removed in the historical events (except the synthetic order from the strategy we are testing). The likelihood that a trade happened is higher if the number of orders on the level and/or the maximum historical size over the life of the level is high. If only one order on the level was removed, it may well have been a cancellation.
In this particular case we see the order on our level and the first order on the next level removed. This strengthens a view that a trade may have occurred.
We can assign a probability of trade based on evidence like this (though with a random / probabilistic approach, one's results will differ with every run).
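One way to turn this evidence into a probability is a simple weighted score, sketched below. The function names, the weights (0.5, 0.3, 0.2) and the `size_scale` of 5,000,000 are invented for illustration and are not calibrated to any venue. Passing in a seeded `random.Random` is one way to make a probabilistic run repeatable.

```python
import random


def fill_probability(orders_removed, peak_size, next_level_hit,
                     size_scale=5_000_000):
    """Heuristic probability that a cleared level was traded, not cancelled.

    orders_removed : number of historical orders removed from our level
    peak_size      : maximum total size seen over the life of the level
    next_level_hit : True if the first order on the next level also vanished
    All weights below are illustrative assumptions.
    """
    count_term = 1.0 - 0.5 ** orders_removed       # more orders, more evidence
    size_term = min(peak_size / size_scale, 1.0)   # larger level, more evidence
    p = 0.5 * count_term + 0.3 * size_term
    if next_level_hit:
        p += 0.2                                   # removal spilled to next level
    return min(p, 1.0)


def simulate_fill(p, rng):
    """Draw a fill decision; a seeded rng gives the same draws on every run."""
    return rng.random() < p


rng = random.Random(2024)  # fixed seed, chosen arbitrarily
p = fill_probability(orders_removed=3, peak_size=4_000_000, next_level_hit=True)
filled = simulate_fill(p, rng)
```

Running the same strategy across several seeds gives a spread of outcomes rather than a single number, which fits the idea of testing under multiple fill scenarios.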
Deeper Inside Level removed
Consider the orderbook below:

Our order is on the inside level and the level(s) above it are either cancelled or taken out. This is an even stronger indication of a trade if more than one level is taken out. Levels can disappear quickly for a number of possible reasons:
1. traded: a large buying algorithm buys in size (this happens quite often during the day)
2. moved: the level is primarily led by market makers, and activity elsewhere demands changing the offering
3. risk aversion: a pending event or volatility causes a pullback
#2 is harder to gauge perhaps, but #1 and #3 are relatively easy to see in retrospect.
Order Moves Close to Crossing
We may have a passive order that drifts towards crossing (or is placed very close to crossing):

We see that our order is 2/10ths of a pip away from the cross. On some venues there may be inside-of-inside hidden orders (for venues that don’t require partial exposure of size). In those cases, an order that tight is likely to get executed right away. Market makers with inventory may find a cross of 2/10ths of a pip appealing as well.
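The distance to the cross can be computed exactly with decimal arithmetic, avoiding binary floating-point rounding on prices such as 82.709. In this sketch the function name `pips_to_cross`, the pip size of 0.01 (typical for a pair quoted to three decimals), and the opposite-side price of 82.707 in the usage line are assumptions for illustration.

```python
from decimal import Decimal


def pips_to_cross(side, our_price, opposite_best, pip_size="0.01"):
    """Distance in pips between our resting order and the opposite best price.

    side          : "ask" if we are offering, "bid" if we are bidding
    our_price     : our order price, as a string such as "82.709"
    opposite_best : best price on the other side of the book, as a string
    pip_size      : size of one pip (assumed 0.01 here)
    A result of zero or less means our order is crossing.
    """
    ours = Decimal(our_price)
    best = Decimal(opposite_best)
    gap = ours - best if side == "ask" else best - ours
    return gap / Decimal(pip_size)


def near_cross(side, our_price, opposite_best, epsilon_pips="0.2"):
    """True if our order sits within epsilon_pips of crossing (or crosses)."""
    return pips_to_cross(side, our_price, opposite_best) <= Decimal(epsilon_pips)


# Our ask at 82.709 against a hypothetical best bid of 82.707.
distance = pips_to_cross("ask", "82.709", "82.707")
```

Prices are passed as strings so that `Decimal` sees the quoted value exactly as it appears in the feed.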
Other Trade Signals?
We did not look at what was happening on the other side of the book, but could possibly correlate activity on the other side and activity in the above scenarios to determine a stronger likelihood of a trade. I’d appreciate ideas.