For very low-level order analysis we try to determine continuity from one order to the next, treating a trader's orders as a stream across a period. This is applicable for orders generated by a given trader (algo) that is more or less continually in the market.

In most cases an order near the top of book has a short lifetime, but an algo will maintain continuity through resubmission into the order book at a different price level. This stream of orders is often masked in various ways, but often has discernible patterns, depending on how much sophistication or consideration the algo puts into masking it.

Going from order deletions to new order insertions, we want to match up the old and new orders (where applicable) to determine the continuity of an order stream. We therefore want to map from an old set to a new set:

However, to determine the best mapping, we have to explore all possible weighted edges between the two sets:

The optimal mapping will be the set of one-to-one edges with maximal total weight. This is a problem in graph theory and optimisation theory. It can be expressed rather neatly as a linear-constraints problem where we have:

Here Xij is a matrix of 0s and 1s, with 1 indicating an edge and 0 no edge, and Wij is the corresponding matrix of edge weights. Integer (or integral) linear programming is NP-hard in general and has no known efficient algorithm. We will be evaluating these on a relatively small number of orders each time, however, so the cost of a brute-force solution is not bad.
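Written out (a reconstruction from the Xij / Wij description above, since the original formula is not shown here), the problem is the standard assignment ILP:

```latex
\max_{X} \sum_{i}\sum_{j} W_{ij} X_{ij}
\quad \text{subject to} \quad
\sum_{j} X_{ij} \le 1 \;\forall i, \qquad
\sum_{i} X_{ij} \le 1 \;\forall j, \qquad
X_{ij} \in \{0, 1\}
```

The inequality constraints allow an old order to match no new order at all, which is what we want, since not every deletion is a resubmission.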

The real trick is in determining the weights, which represent the degree of similarity between orders. Going further, one may generate a tree of these matchings (which spans out exponentially, but is trimmed regularly). Some decisions on prior matches will be wrong and should be re-evaluated holistically.
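As an illustration of how such weights might be built, here is a minimal sketch in Python. The field names (`price`, `size`, `ts`) and the decay scales are entirely hypothetical, not the actual scheme used:

```python
import math

def similarity_weight(old, new, price_scale=2.0, size_scale=0.5, time_scale=0.001):
    """Toy similarity weight between a deleted order (`old`) and a newly
    inserted order (`new`). Each term decays towards 0 as the orders
    differ more; identical orders score 1.0. All scales are illustrative."""
    # Price proximity: resubmission at a nearby level scores higher.
    price_term = math.exp(-abs(new["price"] - old["price"]) / price_scale)
    # Size match, relative to the old order's size.
    size_term = math.exp(-abs(new["size"] - old["size"]) / (max(old["size"], 1) * size_scale))
    # Time gap between the delete and the insert, in seconds.
    time_term = math.exp(-abs(new["ts"] - old["ts"]) / time_scale)
    return price_term * size_term * time_term
```

These weights would then populate the Wij matrix, one row per deleted order and one column per new insertion.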

Is it worth the effort? That depends on what information you are interested in and what your game is. We do a mixture, ranging from ultra-high frequency to medium frequency. This is one sub-problem of a bigger picture.

### Addendum

I should mention that the worst-case brute-force approach involves examining the combined weights for each permutation of edges. This is an O(n!) algorithm that quickly becomes uncomputable. For the specific bipartite matching problem, as opposed to the general integral linear programming problem, there are much more efficient solutions. One of the simplest (though not the most efficient) is the “Hungarian algorithm”. A reader pointed out a paper on an “online” approach to matching as well.
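To make the contrast concrete, here is a small sketch comparing the O(n!) brute force with SciPy's polynomial-time assignment solver (a Hungarian-family algorithm). The weight matrix is made up for illustration:

```python
from itertools import permutations

import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical similarity weights: rows are deleted orders, columns new ones.
W = np.array([[0.9, 0.1, 0.2],
              [0.3, 0.8, 0.1],
              [0.2, 0.4, 0.7]])

def brute_force_match(W):
    """Try every permutation of columns -- O(n!) in the number of orders."""
    n = W.shape[0]
    best = max(permutations(range(n)),
               key=lambda p: sum(W[i, p[i]] for i in range(n)))
    return list(best)

# Polynomial-time assignment via SciPy (maximise total weight).
rows, cols = linear_sum_assignment(W, maximize=True)

print(brute_force_match(W))
print(list(cols))  # both approaches agree on the optimal matching
```

For a handful of orders per event the brute force is fine, as the post says; the polynomial-time solver only matters as n grows.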

Great example of micro-regime discovery, with nice potential for tail-running algos (connecting the dots from the previous two posts). Untangling non-MM liquidity providers from smart streams is interesting indeed, given the latter are presumably not trading continuously.

Why bother with FX, given order books are not available?

Hi, yes, this can be used for a variety of things including understanding where the market is going.

As for FX, not only does one have order books, but also very detailed order deltas (i.e. the raw market data is in the form of New, Update, Delete messages on orders). That said, not all orders will be shown in market data.

Naturally the order book (minus hidden orders) can be reconstructed with this order information. Equities and futures provide a bit more information, however.
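A minimal sketch of that reconstruction, assuming a simplified delta format of `(action, order_id, price, size)` tuples (real feeds also carry side, venue, flags and more):

```python
from collections import defaultdict

def rebuild_book(deltas):
    """Rebuild visible resting-order state from a stream of
    (action, order_id, price, size) deltas, then aggregate by price level.
    A deliberately simplified sketch; the actual feed formats differ."""
    orders = {}
    for action, oid, price, size in deltas:
        if action in ("New", "Update"):
            orders[oid] = (price, size)   # Update replaces the prior state
        elif action == "Delete":
            orders.pop(oid, None)
    # Collapse individual orders into per-price-level size.
    levels = defaultdict(float)
    for price, size in orders.values():
        levels[price] += size
    return dict(levels)
```

Hidden orders never appear in the delta stream, so the rebuilt book is the visible book only, as noted above.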

Interesting; beyond retail FX books (which seem like too small a percentage of the market to be useful, plus lacking informed traders), how are you pulling real-time big-bank order streams?

I am connected directly to venues in a co-located setup. Banks make up a big percentage of participation, but there are many hedge funds on them as well.

Nice! I think you should take a look at the “online computing” approach to find the optimum…

http://portal.acm.org/citation.cfm?id=100216.100262

Thanks. I glossed over the algorithm in the post. A brute-force approach involves exploring all permutations at a cost of O(n!). I’m using a slight variant of the Hungarian algorithm for the simpler cases, which does better, running in polynomial time.