Scientific Computing on the Cloud

I’ve been keeping a close eye on the costs of the various clouds versus the cost of internal CPU farms. Amazon EC2 pricing for high-CPU map-reduce appears to be roughly even with my cost to host internally.

I calculated this by depreciating Core i7 920s over a 3-year period and accounting for $0.14 / kWh at 150 watts of continuous draw. I arrived at a lower cost than Amazon’s; however, when adjusting for CPU performance, performance per dollar ties out with or is bettered by the Amazon proposition.
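The back-of-envelope arithmetic can be sketched as follows. The post gives the 3-year depreciation period, $0.14 / kWh rate, and 150 W continuous draw; the ~$1,200 hardware price for a Core i7 920 box is my assumption, not a figure from the post.

```python
# Hourly cost estimate for a self-hosted Core i7 920 box.
# box_price is an assumed figure; the rest comes from the post.

box_price = 1200.0              # USD, assumed hardware cost
years = 3
hours = years * 365 * 24        # depreciation period in hours

depreciation = box_price / hours    # USD per hour of ownership
power_kw = 150 / 1000.0             # 150 W continuous draw
electricity = power_kw * 0.14       # USD per hour at $0.14/kWh

total = depreciation + electricity
print(f"depreciation: ${depreciation:.4f}/hr")
print(f"electricity:  ${electricity:.4f}/hr")
print(f"total:        ${total:.4f}/hr")
```

Under that assumed hardware price, the self-hosted cost lands around seven cents per hour, which is the sense in which it comes in below Amazon’s sticker price before adjusting for performance.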

Amazon’s rates for calculations using map-reduce are 1/5th the cost of a normal instance. I’m estimating that the high-CPU 8-core instance performs at a SPECfp_rate2006 of approximately 150, twice that of the Core i7 920. The cost is $0.12 / hr versus a non-map-reduce instance cost of $0.68 / hr.
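Putting the post’s numbers side by side as performance per dollar (the ~$1,200 box price for the self-hosted machine is again my assumption):

```python
# Performance per dollar, using the post's figures:
# Amazon high-CPU map-reduce instance: SPECfp_rate2006 ~150 at $0.12/hr;
# Core i7 920 box: ~75 (half of 150, per the post) at the self-hosted
# hourly cost derived from an assumed $1,200 box, 3-year depreciation,
# and $0.14/kWh at 150 W.

emr_spec, emr_cost = 150.0, 0.12
own_spec = 75.0
own_cost = 1200.0 / (3 * 365 * 24) + 0.150 * 0.14   # ~$0.067/hr

print(f"Amazon:      {emr_spec / emr_cost:7.1f} SPEC units per $/hr")
print(f"self-hosted: {own_spec / own_cost:7.1f} SPEC units per $/hr")
```

With these inputs Amazon comes out slightly ahead on a per-dollar basis, consistent with the "ties out or is bettered" conclusion above; the comparison is obviously sensitive to the assumed hardware price.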

This is great news for those of us doing transient large-scale scientific computations. I now need to look at mapping my machine learning and strategy evaluation algorithms onto map-reduce.
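As a rough sketch of what mapping a strategy evaluation onto map-reduce might look like: emit a (key, score) pair per parameterisation in the map step, then reduce by key to keep the best. Everything here is hypothetical, the parameter names and toy scoring function included; a real job would run the same mapper/reducer shape under Hadoop.

```python
# Toy local map-reduce over strategy parameterisations.
# Each parameter pair (fast, slow) is scored by the mapper; the
# reducer keeps the best score per "slow" key. The scoring function
# is a stand-in, not a real strategy evaluation.
from itertools import groupby

def mapper(params):
    fast, slow = params
    score = -abs(fast - slow)          # toy objective: closest periods win
    return (slow, (score, params))

def reducer(key, values):
    return key, max(values)            # keep the top-scoring params

param_sets = [(5, 20), (10, 20), (15, 20), (10, 50), (40, 50)]
pairs = sorted((mapper(p) for p in param_sets), key=lambda kv: kv[0])
results = [reducer(k, [v for _, v in group])
           for k, group in groupby(pairs, key=lambda kv: kv[0])]
print(results)
```

The point of the shape is that the map step is embarrassingly parallel across parameter sets, which is exactly what the cheap map-reduce instances are priced for.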


Filed under strategies

6 responses to “Scientific Computing on the Cloud”

  1. Why would you need so much brute force? Personally I cannot think of a strategy that couldn’t be calculated within several minutes on an entry-level desktop PC. And I’ve been using Matlab; native code would probably run in seconds.

    • tr8dr

      Really depends on what you are doing. Most strategies that people consider are very straightforward computation-wise.

      There is another class of strategies that may involve complex machine learning.

      Liked your ETF analysis BTW. Looks like you have a number of good strategies implied there …

      • ‘Looks like you have a number of good strategies implied there …’ What can I say? I cannot complain ;-).
        By the way: it looks like I am using an approach opposite to yours: my strategies are of the ‘KISS’ variety, but the trick lies in the data selection. This is how I do it: the strategy is deeply analytical and is always heavily tested for robustness, while the portfolio it operates on is chosen by brute force (still not much more than minutes of CPU work).
        From your posts I reckon that you are using analytical asset selection and GA-type strategies. It’s the other way around! ;-).

      • tr8dr

        Actually, most of my strategies are statistical in nature, more often than not stochastic state systems.

        I am working on a new strategy which is both statistical and also dependent on machine learning.

        I don’t consider GA to be machine learning; rather, it is just an optimisation strategy.

  2. andrew

    An alternative to EC2 you might have already considered is using VPSes.
    Have a look at which has random cheap VPS offers.
    I have rented a box for 60 USD p.a. that is about the same speed as one core of my Core 2 Duo laptop (running Perl scripts).
    This is less than the cost of electricity in the UK.

    • tr8dr

      Thanks for the link. What becomes a bit tricky is finding a box with enough memory to be useful for scientific computing. Many of the schemes give one partial access to a box with a very small memory profile.

      At this point I have built some 16-core boxes, just because it was so much cheaper than the alternatives, both up front and from a TCO perspective. Amazon now has a much more economical rate for Hadoop jobs (if indeed your problem can be expressed in terms of Hadoop). I like the idea of renting better than owning. I suspect we’ll see a lot more computing power auctioned off in the future.
