Hawkes Process & Strategies

Call me unread, but I had not encountered the Hawkes process before today. The Hawkes process is a self-exciting “point process”: it models event intensity as a function of the observed history of event occurrences.

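The conditional intensity of the process, in its standard form, is:

$$\lambda(t) = \mu(t) + \sum_{t_i < t} \phi(t - t_i)$$

where $\mu(t)$ is the baseline intensity and $t_i$ is the time of the $i$th occurrence, with $t_i < t$ for some $t$. The kernel $\phi$ is typically an exponential, but it can be any function that models the decaying influence of past events. With an exponential kernel the intensity is commonly written as:

$$\lambda(t) = \mu + \sum_{t_i < t} \alpha e^{-\beta (t - t_i)}$$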
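To make the definition concrete, here is a minimal sketch of how the intensity could be evaluated at a given time, assuming an exponential decay function. The parameter names `mu` (baseline rate), `alpha` (jump per event) and `beta` (decay rate), their values, and the event timestamps are all illustrative assumptions, not fitted quantities.

```python
import math

def hawkes_intensity(t, event_times, mu=0.5, alpha=0.8, beta=1.2):
    """Exponential-kernel Hawkes intensity at time t.

    lambda(t) = mu + sum over t_i < t of alpha * exp(-beta * (t - t_i))
    mu, alpha and beta are illustrative values, not fitted parameters.
    """
    return mu + sum(alpha * math.exp(-beta * (t - ti))
                    for ti in event_times if ti < t)

events = [1.0, 1.3, 1.4, 4.0]  # hypothetical event timestamps
for t in (0.5, 1.5, 3.0, 4.1):
    print(f"t={t:.1f}  intensity={hawkes_intensity(t, events):.4f}")
```

Before the first event the intensity sits at the baseline; it jumps after a cluster of events and then decays back toward the baseline until the next event arrives.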

Ok, that’s great but what are the applications in strategies research?

Intra-day Stochastic Volatility Prediction
A recent theme in the literature has been to replace the quadratic-variation approach with a duration-based approach. Measuring how far the price moves within a fixed interval of time is equivalent to measuring how long the price takes to move a fixed amount, and the two can be interchanged easily, as Andersen, Dobrev, and Schaumburg have shown in “Duration-Based Volatility Estimation”.
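The interchange between "movement per unit time" and "time per unit movement" can be illustrated with a naive sketch. It assumes a driftless Brownian log-price, for which the expected time to first move by `delta` in either direction is `delta**2 / sigma**2`, so `sigma` can be backed out from the average exit time. This is a simplification for illustration only, not the estimators developed by Andersen, Dobrev, and Schaumburg; the function names and the simulated path are hypothetical.

```python
import math
import random

def exit_durations(times, log_prices, delta):
    """Times the path needs to move by at least `delta` from a reference level.

    The reference level is reset to the current value after each exit.
    """
    durations = []
    ref_t, ref_p = times[0], log_prices[0]
    for t, p in zip(times, log_prices):
        if abs(p - ref_p) >= delta:
            durations.append(t - ref_t)
            ref_t, ref_p = t, p
    return durations

def duration_volatility(times, log_prices, delta):
    """Naive estimate: sigma = delta / sqrt(mean exit time)."""
    d = exit_durations(times, log_prices, delta)
    return delta / math.sqrt(sum(d) / len(d))

# Demo on a simulated driftless random walk with known volatility.
random.seed(7)
step, dt = 0.125, 1.0          # move per tick, so true sigma = step / sqrt(dt)
path, p = [], 0.0
for _ in range(200_000):
    path.append(p)
    p += step if random.random() < 0.5 else -step
times = [i * dt for i in range(len(path))]

sigma_hat = duration_volatility(times, path, delta=20 * step)
print(f"true sigma = {step / math.sqrt(dt):.4f}, duration-based estimate = {sigma_hat:.4f}")
```

On real tick data the threshold `delta` would need to be large relative to the bid-ask bounce, otherwise microstructure noise shortens the measured durations and inflates the estimate.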

Cai, Kim, and Leduc in “A model for intraday volatility” approached the problem by combining an Autoregressive Conditional Duration process and a Hawkes process to model decay, showing that:

and then equivalently expressed in terms of intensity (where N represents the number of events of size dY):

relating back to volatility measure as:

The intensity process is comprised of an ACD part and a Hawkes part:

Picture 1

They claim to model the intra-day volatility closely and propose a long/short straddle strategy to take advantage of the predictive ability.

High Frequency Order Prediction Strategy
The literature suggests the use of Hawkes processes to model the buying and selling processes of market participants.

John Carlsson, in “Modeling Stock Orders Using Hawkes’s Self-Exciting Process”, suggests the following strategy: if the Hawkes-predicted ratio of buy to sell intensity exceeds a threshold (say 5), buy (or sell, when the ratio is skewed the other way), and exit the position within N seconds (he used 10).

This plays on the significant autocorrelation of the intensity (i.e. the non-zero time it takes to decay back to the mean). A skewed ratio of buy vs. sell orders will surely influence the market in the direction of the order skew.

The strategy can be enhanced to include information about volume, trade size, etc. We can also look at the buy/sell intensity of highly correlated assets and use it to enhance the signal.
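The threshold rule described in this section can be sketched as follows. This is only an illustration under stated assumptions: both sides use an exponential-kernel intensity with hand-picked parameters (`mu`, `alpha`, `beta`) rather than parameters fitted by maximum likelihood, the symmetric sell rule (ratio below `1 / threshold`) is my reading of "buy (sell)", and the function names and timestamps are hypothetical.

```python
import math

def exp_hawkes_intensity(t, event_times, mu, alpha, beta):
    """Baseline mu plus exponentially decaying contributions of past events."""
    return mu + sum(alpha * math.exp(-beta * (t - ti))
                    for ti in event_times if ti < t)

def order_flow_signal(t, buy_times, sell_times,
                      threshold=5.0, hold_seconds=10.0,
                      mu=0.2, alpha=1.0, beta=0.5):
    """Return (side, exit_by), where side is 'buy', 'sell' or 'flat'."""
    lam_buy = exp_hawkes_intensity(t, buy_times, mu, alpha, beta)
    lam_sell = exp_hawkes_intensity(t, sell_times, mu, alpha, beta)
    ratio = lam_buy / lam_sell  # mu > 0 keeps the denominator positive
    if ratio > threshold:
        return "buy", t + hold_seconds
    if ratio < 1.0 / threshold:
        return "sell", t + hold_seconds
    return "flat", None

# Hypothetical order arrival times (seconds): a burst of buys, one old sell.
buys = [9.0, 9.2, 9.5, 9.6, 9.8, 9.9]
sells = [3.0]
print(order_flow_signal(10.0, buys, sells))   # buy-side burst
print(order_flow_signal(10.0, sells, buys))   # roles swapped
print(order_flow_signal(10.0, buys, buys))    # balanced flow
```

Enhancements mentioned above, such as weighting events by trade size or adding the intensities of correlated assets, would enter as extra terms in the intensity or as additional inputs to the decision rule.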


Filed under point processes, statistics, strategies, volatility

21 responses to “Hawkes Process & Strategies”

  1. Derek

    I was wondering if you had come across any R/Matlab code that implemented a MLE of a bi-variate Hawkes process. There are several R point process packages but they seem to be uni-variate. I’d like to replicate one of the papers before I recode it in C#.

  2. Dan

    Take a look at the lecture notes on this page:
    http://www.stat.columbia.edu/~liam/teaching/neurostat-spr11/

    They include methods for rapidly fitting point-process GLMs that you can code. In general, neuroscientists are way ahead of finance when it comes to this kind of model. All of the methods on that page are applicable to understanding order book behavior if you change “spike” to “order book event”, “price spike”, etc.

  3. tr8dr

    Indeed, the most useful things that come into finance originate elsewhere. I take it your background is in neuroscience? Most practitioners, such as myself, have come from other fields, often without formal study in finance.

  4. Scott Locklin

    I ran across this idea again today, and of course, you thought of it first. Damn it. Anyway, this review article lists several useful R packages:
    http://www.biostat.jhsph.edu/~rpeng/papers/archive/ptprocR102.pdf

    • I use self-exciting processes quite a bit now in modeling & smoothing orderbook and trade data. These sorts of processes are modeled quite nicely with the approach.

      • Not presently modeling the order book, but I can see where it could be used for this. It’s a very handy looking gizmo in lots of places, I think. Thanks for writing this up: it is the clearest explanation of its utility on the html internets at present.
        Weird thought for the day: they based the original “AI” ideas on what they knew about the architectures of brains in those days. Seems most of it was wrong: brains work on rate encoded spikes. Wonder if you could build neat things based on that idea. VN architectures would suck at it, but at least you could simulate whether or not interesting things might happen.

  5. SIBI David

    Hi tr8dr,

    Thanks first of all for the good recipes on your blog.
    I would be interested to get your advice on a strategy that I am running.
    It's stat arb, market neutral, on the market open.
    The question is about the exit during the day.

    When I exit the basket, I would like to know which kind of model could help me achieve the best exit in terms of price:

    – either estimating intraday vol and setting a more interesting exit level for each stock,

    or

    – estimating order flow through point processes like Hawkes and taking advantage of this order flow?

    Currently when I exit, I place orders at the best limit for all names and re-peg them to the best limit every few seconds. On average, I pay the mid…

    The objective would not be to grab the bid-ask, but to take advantage of the current situation to gain a few basis points per stock when I exit the book.

    • tr8dr

      This is a very interesting question, with many answers. In general you can improve your execution by having a model that indicates forward price direction over short periods. I think this is what you are after.

      With regard to order flow, it can be used to detect momentum quite accurately. The actual order flow activity does fit fairly well into a Hawkes process. Fitting to the model allows some level of smoothing and prediction. Filtering order flow through a Hawkes model is not enough, IMO. You need to identify the imbalances in the flow and apply a ML classifier (or an equivalent model) to determine whether the flow represents a momentum-driven imbalance or is just “business as usual”.

      In practice, for the FX market, I implemented this without “smoothing” through Hawkes processes. With the right sampling and mapping into features, and then generating appropriate labels for a training and validation set, I found that fairly standard ML classifiers work well. The accuracy is very high.

      So for a basket of assets, if you have a combination of signals (say one is order flow / momentum based), you can devise a process for each asset that describes its probable path over a period. When there is no momentum, the expectation is just the current price; with momentum, it serves as a drift factor and the expectation will be upward or downward biased. One can make this more interesting by modeling the causal relationships amongst the assets instead of modeling each as a separate process.

      At the end of the day, you can determine for each asset the likelihood of the price moving adversely or in line with your position, and decide how / when to exit each asset accordingly. Instead of a system of related processes, you can first take the heuristic approach of just identifying assets that are in or starting momentum, either holding longer if it is in your favor or executing aggressively if not. And if there is no momentum, look for an exit near the mid.

  6. SIBI David

    Thanks very much for your reply.

    In fact, I realize that my feed is not good enough to count orders through a Hawkes process. It's the IB feed, and it is snapshotted every 250 ms!

    So I decided to use bulk volume classification in order to assess the probability of informed trading (PIN); this is another view of order imbalance.

    Concerning the choice of machine learning, I will try:

    either a neural network fed with order imbalance (PIN), the previous stock return and the previous index return in order to “predict” short-term price behaviour and arrange a better exit,

    or a Markov switching regression with two states (one state of informed traders, where adverse selection happens, and the other the opposite); the regression will be pretty much the same.

    Of course I don't like the neural net because of its hidden non-linearity, but it seems easier to set up given the existence of a lot of C# libraries…

    Thanks for your advice; I will keep you informed of the results if you are interested.

    A lot of programming now.

    David

    • Jonathan Shore

      Depending on how you set up the problem, you may do better with an SVM classifier, as opposed to ANNs. You will probably want to compute a directional PIN to make this useful, and hence will want to classify a bar as being buyer- or seller-dominated volume.

      How you decide whether a bar is buyer- or seller-dominated is key to making this work well. Also, hidden orders play a large role in momentum moves. Hopefully the trades resulting from these are reported in the IB feed.

      Let me know how it goes. cheers

  7. Hi Jonathan,

    It seems that, after a lot of work and testing, the best way to figure out what is coming in the next few minutes for a stock is the trade flow.

    Above all, the relation is linear and robust, so I won't dig into Markov switching or SVMs to learn a linear relation.

    The drawback is that exiting a basket of stocks using the order flow is very powerful only if the stocks you trade exhibit movement that is high relative to the spread. In other words, short-term volatility must be high and the shares must have high turnover.

    Do you have a better approach for stocks which are not really liquid?

    I still currently use this to exit my basket of stocks, each one individually, given that I haven't found anything better.

    Another drawback is that the trade flow becomes valid only at least ten minutes after the open, and sometimes I exit my basket a few minutes after the market opens; at that time the trade flow is not significant at all. That is why I am finally thinking of coming back to counting bid/ask flow through a Hawkes process.

    So I have a few questions about Hawkes.
    1- It does not seem easy to calibrate… would you know of a piece of code in .NET?

    2- I feel a little bit lost because I don't understand why one would put so much effort into calibrating a Hawkes “intensity” on order flow and getting the distribution. Would simply counting the number of bids and asks and comparing to an average be sufficient, or something like that? I cannot figure out what calibrating to that specific law brings over a classical normal law.

    David

    • tr8dr

      I would not use hawkes or any other “model” unless you see that it adds value in terms of denoising what you are looking at. In the US equities and FX markets I am able to get quality signal without requiring hawkes-style smoothing.

  8. S

    PIN is nothing more than aggregating Lee-Ready classified trades with other measurements on price, hence a lagging indicator. @tr8dr, have you had success applying Hawkes to a basket of products? Can you also elaborate on how to model trade size, volume, etc.?

  9. The best I have found up to now is that trade flow imbalance is a quite robust leading indicator of price variation, linearly speaking. The trade flow must be calculated in volume time (the volume that makes the stock move) using volume bucket classification, and the real-time feed must receive all of the trade feed (dark pools, FINRA ADF, …). The only difficulty is that the trade flow has a lot of noise, and if you denoise too much you lose the leading indication. Sometimes the leading behaviour is awesome. Now I am trying to add confirmation with order cancellation rates and other realistic measures… Hawkes is too complicated, I find, and too difficult to maintain day after day if you have a large pool of equities, but I am not a specialist.

    • tr8dr

      In practice I don’t really use hawkes for imbalance. I find that choosing the features carefully from the orderbook and/or trade data and classifying produces a very robust signal of momentum.

      Hawkes could be used as a denoising approach, but I did not find it necessary.

    • S

      @David, when you use volume as bucket, you will get a lagging indicator because you are very much relying on the normality of the distribution. Hawkes itself is not that complicated, but it won’t save you from the noise, i would guess.

      • hi S,

        Thanks for the advice, but for now I have dropped Hawkes. I finally calculate an imbalance from level 2 data, based on outstanding, executed, and cancelled order volume at different levels. It gives me quite a good view of what is going to happen in the next minute for liquid names.

      • S

        @ David

        That’s good news. However, I believe a lot of the outstanding/cancelled volume is pure noise due to HFT, so I would be very careful with that. You know, the speed of cancellation is nonlinear, and in this world a lot of actions are taken for different purposes.

  10. LinuXPoWa

    @David / @S / @Jonathan
    Are you staying with order book modeling and prediction, or are you changing direction? 🙂

    • tr8dr

      Really depends on mkt as to how much information you can get from the OB. For US equities trade information dominates in terms of usefulness. But there are some combined event measures on trade and OB that give a boost.

  11. LinuXPoWa

    John, you're alive 🙂 great!
    On FX, don't you have any information from trade data… just the price, with no size or anything else?

    The OB is different. For example, on a 10-second window you can calculate the cumulative size, the intensity of order updates and cancellations, the number of upticks / downticks, and the price movement over the window. Is that the right way? Any other ideas?

    After that I need to create good labels… of course, if I am looking for a big(?) HF momentum movement (a few ticks, 4 or 6, or more, say 15 ticks?), I set + + + before and during the movement.

    Lastly… SVM classification on training data? How many ticks: 50,000? 500,000?
    Easy task? 😀
