Computing on the cheap with Amazon EC2

We tend to host our own servers, because we like it, and because we can. (We also speak in the third person when there’s just one of us, but there’s no accounting for some people.) Nothing fancy, mind you… for now, a handful of websites are running on an Ubuntu virtual machine through VirtualBox on a Windows 7 (or maybe Vista, I forget) box that otherwise serves as a Media Center. It’s actually simpler than it sounds.

Lately, though, what with the Heritage Health Prize and a lot of hours spent learning and playing with data mining techniques, the poor little server has been called upon to do much more intensive work. It’s routinely running simulations and calculations all night long and it’s really not built for that. The fan has started humming heroically (i.e. loudly), which isn’t always best for a media center.

No one wants their media center to hate them, or to catch fire.

Enter Amazon EC2. That stands for Elastic Compute Cloud. See how clever that is, what they did with that 2 there? Rather than go “ECC”, they just counted the C twice and made it like a math or a chemistry equation. These Amazon guys are some serious funny. I’m actually very impressed with the setup they have. There’s a wealth of options for configuring the virtual servers: public AMIs (Amazon Machine Images, i.e. preconfigured images) are available for most major software vendor platforms, from the expected Oracle, Microsoft, and Linux offerings to MicroStrategy, R, Elastic Bamboo, Citrix, and even Bitcoin-configured software. Public data sets are available should you need them, along with advanced storage, database, failover, clustering, networking, identity management, queuing, notification, and probably a million other things, all at pennies-per-hour prices.

At the moment, I’m running a simulation on a 20-CPU 1.6 Terabyte beast of a machine for $0.228 per hour. This is the sort of thing that infuses me with glee. It’s easily outperforming my media center by 30:1.
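
To put that hourly rate in perspective, here is a minimal Python sketch of the arithmetic. The $0.228 figure comes from above; the 8-hour night and the 30-night month are purely illustrative assumptions, and the sketch ignores storage, data transfer, and however Amazon rounds billed time.

```python
# Rough cost arithmetic for renting the machine by the hour.
# Only the hourly rate comes from the post; the run lengths are made up.

HOURLY_RATE_USD = 0.228  # price per hour quoted in the post


def run_cost(hourly_rate, hours):
    """Return the compute cost in dollars for a run of `hours` hours."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return hourly_rate * hours


# Hypothetical workloads: one overnight run, and a month of them.
one_night = run_cost(HOURLY_RATE_USD, 8)
thirty_nights = run_cost(HOURLY_RATE_USD, 8 * 30)

print(f"One 8-hour night: ${one_night:.2f}")
print(f"Thirty such nights: ${thirty_nights:.2f}")
```

Even a month of all-night simulations at that rate comes in well under the price of a replacement fan, never mind a replacement media center.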


The short point is you can rent versatile computer time for relatively cheap. There’s actually a “free” tier that gives you 24×7 run time with a minimal memory and storage footprint. I’m really not sure how they afford that, but that just makes it all the better for us!

There are a lot of guides out there to setting up an AWS instance, so I won’t duplicate info, but there are a few gotchas I ran into that I’ll mention:

  • The free instances are limited to using the Amazon Linux distribution (which always seems to be listed as a “beta” AMI). It’s currently a Red Hat-style distribution rather than a Debian one, so the default action is to use “yum” (the Yellowdog Updater, Modified) to install new components. If you’re coming from Debian or Ubuntu it may take a minute to notice this. For example, “sudo yum install svn” is one of the first things I had to do to get the svn client running.
  • The Amazon EC2 guide to using PuTTY is accurate, but it finishes early — when you connect with PuTTY it will still ask you for a user name. “root” won’t work (it rejects interactive logons as root), so you should use “ec2-user”, yes, with a dash (but no quotes).
  • Stopping and Terminating instances are very different things. Terminating an instance will, under default circumstances, permanently delete the files attached to it. This falls under the “things I wish I knew before” category. You can avoid this by detaching the storage before terminating, and there are some options for “Termination Protection” so that you just don’t terminate in the first place.
  • Spot prices seem to be a real bargain. If you can live with the threat of impermanence of your root drive (Spot instances are either running or terminated [see above]… backup frequently, use a custom AMI, and attach a secondary EBS volume to get around that), you can save quite a bit. I’ve bid half the standard price for the largest CPU allocation I can actually use, and the Spot price hasn’t exceeded my bid (or even come within 75% of it) since there was a bizarre pricing spike in May.
  • Windows servers are available, and the Amazon EC2 hourly costs include the Windows Server license costs. This is pretty cool. The same arrangement doesn’t seem to apply to other licensed software, which makes me wonder how the MicroStrategy AMI works (I could boot it and maybe find out, but I’m lazy). Still, if vendors can quickly create cloud pricing strategies that support this technology, it would be beneficial for all involved, including, most importantly, me.
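
The Spot bidding check described in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the on-demand price and the price history are made-up numbers, `check_bid` and `SpotCheck` are hypothetical names, and the only rule borrowed from the post is "bid half the standard price". It doesn't talk to AWS at all; you would feed it prices copied from the console's pricing history.

```python
from dataclasses import dataclass


@dataclass
class SpotCheck:
    bid: float             # the bid, in dollars per hour
    peak: float            # highest Spot price seen in the history
    peak_ratio: float      # peak as a fraction of the bid
    would_terminate: bool  # True if any price in the history beat the bid


def check_bid(on_demand_price, spot_history, bid_fraction=0.5):
    """Compare a bid (a fraction of the standard price) to past Spot prices."""
    if not spot_history:
        raise ValueError("spot_history must contain at least one price")
    bid = on_demand_price * bid_fraction
    peak = max(spot_history)
    return SpotCheck(
        bid=bid,
        peak=peak,
        peak_ratio=peak / bid,
        would_terminate=peak > bid,
    )


# Hypothetical numbers, not real AWS prices.
ON_DEMAND_USD = 0.40
history = [0.11, 0.12, 0.10, 0.14, 0.13]

result = check_bid(ON_DEMAND_USD, history)
print(f"Bid ${result.bid:.2f}/hr, peak was {result.peak_ratio:.0%} of the bid")
print("Would have been terminated" if result.would_terminate else "Safe so far")
```

If the peak stays comfortably below the bid, as it has for me, the instance keeps running. A single spike above the bid is enough to terminate it, which is why the backups and the secondary EBS volume still matter.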

Of course there must be a gazillion more useful tidbits, but I haven’t gotten to them yet. After just a couple of days playing with it, though, it’s clear the technology has seriously matured very quickly. It’s incredibly easy to set up, cheap, powerful, scalable… all the things that make us Happy.
