Dressipi Blog

How Dressipi Gets Deployed

Posted on: July 22, 2011

The Dressipi service is powered by a Rails front end, with various bits of data crunching happening in the background. Our service runs on top of Amazon’s EC2, which gives us great flexibility: we can create and dispose of virtual servers at will, only paying for the time we use. For me, the mental shift with EC2 was not virtualization itself, but coming to see servers as inherently transient, disposable entities. If a server suffers a serious problem, rather than trying to fix it, we simply replace it with a new one, leaving the stricken machine free for post-mortem analysis. This is also great for creating test environments that are only needed for a few hours here and there.

Storage is treated similarly. EC2’s equivalent of a giant pile of spare hard disks is EBS. You tell EC2 how big a disk you want and Amazon gives you a volume that you can use like a physical disk: you can partition volumes, RAID them together, and so on. You can also create snapshots of these volumes and use those snapshots to create fresh volumes with the same contents as the original. We use snapshots for backups and to provide a rollback plan when we run database migrations. Before migrations run we take a snapshot. If something goes wrong, we can use the snapshots to create fresh volumes containing our data as it was before the migration. This gets us back up and running quickly, and we get to keep the original volumes so that we can understand what went wrong.
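The snapshot-before-migrate flow boils down to a few steps, sketched here in plain Ruby. The `FakeEbs` class is a stand-in for whatever EBS API wrapper you use (in our case that would be fog); all the names here are illustrative, not our actual code.

```ruby
# A stand-in EBS client; a real one would call the EC2 API (e.g. via fog).
class FakeEbs
  Snapshot = Struct.new(:id, :volume_id)

  def initialize
    @snapshots = []
  end

  def create_snapshot(volume_id)
    snap = Snapshot.new("snap-#{@snapshots.size}", volume_id)
    @snapshots << snap
    snap
  end

  # Creates a fresh volume from a snapshot, leaving the original untouched.
  def create_volume(snapshot)
    "vol-from-#{snapshot.id}"
  end
end

# Take a snapshot, run the migration, and on failure build a fresh volume
# from the snapshot so it can be swapped in to get back up quickly, while
# the original volume is kept around for diagnosis.
def migrate_with_rollback(ebs, volume_id)
  snapshot = ebs.create_snapshot(volume_id)
  begin
    yield # run the database migration
    { status: :migrated }
  rescue => e
    { status: :rolled_back, volume: ebs.create_volume(snapshot), error: e.message }
  end
end
```

For example, `migrate_with_rollback(FakeEbs.new, 'vol-123') { run_migrations }` returns a `:migrated` result on success, and on failure hands back a fresh volume built from the pre-migration snapshot.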

With great power comes great responsibility

The ability to provision new machines with a single API call is immensely powerful, but to make the most of it you need a deployment scheme to match. We need to be able, with high confidence, to create new server instances that are as close to identical as possible to the ones they are replacing or augmenting. While Amazon gives you the tools to create new servers on the fly, it’s up to us to turn these blank servers into ones running our software.

The first thing we did was build a set of AMIs for the different kinds of server instances we run (application servers, backend processing servers etc). For those of you not in the know, an AMI is essentially an image from which EC2 can create new servers. AMI creation should of course be an automated process, so that it is easy to create new AMIs when required. We went down the route of writing a set of capistrano tasks that launch a new EC2 instance from one of Amazon’s standard images, configure it to our liking and then store it as an AMI. Another set of capistrano tasks can then use these AMIs to create new instances.
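A rough sketch of what such a task might look like, using fog to talk to EC2 — the task names, variables and image details here are illustrative, not our actual code, and the exact fog calls may differ between versions:

```ruby
# Illustrative sketch only -- names and details are hypothetical.
namespace :ami do
  desc "Launch a stock instance, configure it, and register it as an AMI"
  task :build do
    compute = Fog::Compute.new(:provider => 'AWS')

    # Start from one of Amazon's standard images.
    server = compute.servers.create(:image_id => base_ami, :flavor_id => 'm1.small')
    server.wait_for { ready? }

    # Configure the instance to our liking (packages, users, config files)...
    # ...then store the result as a reusable AMI.
    compute.create_image(server.id, "app-server-#{Time.now.to_i}", "App server image")
  end
end
```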

Virtual or not, servers occasionally fail, and when that happens we want them replaced as quickly as possible, so as not to inconvenience our lovely users. Happily, one of EC2’s features is autoscaling. Subject to policies you define, autoscaling will create or destroy servers for you. For example, you might say that you want a pool of at least 3 web servers. If one of those instances becomes unresponsive, EC2 will destroy it and create a new one from the template you provide, bringing you back up to the desired count of 3 healthy instances. You can also have a variable number of servers, scheduling the addition of servers at times you know will be busy, or when the metrics Amazon (or you) collect tell you extra servers are required. If you are using Elastic Load Balancing, autoscaling will even take care of updating the set of instances registered with the load balancer.
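What autoscaling does on our behalf is essentially a reconciliation loop over the pool. A toy version in Ruby makes the behaviour concrete (this mimics the logic, it is not Amazon’s API; all names are made up):

```ruby
# Toy reconciliation: given the current instances and a desired pool size,
# decide which instances to terminate and how many fresh ones to launch.
# This mimics what EC2 autoscaling does for you; names are illustrative.
def reconcile(instances, desired)
  unhealthy = instances.reject { |i| i[:healthy] }
  healthy   = instances.select { |i| i[:healthy] }
  {
    terminate: unhealthy.map { |i| i[:id] },    # replace failed instances
    launch:    [desired - healthy.size, 0].max  # top back up to the desired count
  }
end
```

With a desired pool of 3 and one unresponsive instance, the plan is to terminate that instance and launch one replacement from the template.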

Autoscaling happily

Autoscaling initially presented us with a bit of a problem, because the template it uses to create machines consists solely of an AMI and a string of user data passed to the instance. Although our AMIs contain almost everything needed to produce a functioning instance, they don’t contain the actual Dressipi software. At Dressipi we usually deploy a new version of our software every two weeks, with occasional minor deploys in between if an urgent problem arises. On the other hand, the base server configuration changes very rarely, so it felt cumbersome to have to create new AMIs as part of every deploy. One alternative would have been to embed in the AMI the capistrano scripts responsible for deploying the app and have a startup script on the instance call them, perhaps using the user data parameter to indicate which git tag capistrano should deploy. This also felt wrong: we wanted the process of autoscaling an instance to be fast and to have the smallest possible number of moving parts.

We solved this by using EBS snapshots. As is the custom with passenger, our software is deployed to /var/www/dressipi. We’re using bundler, so all of the application gems live in /var/www/dressipi/shared. Instead of having a single EBS volume, our application servers have an additional one, mounted at /var/www/dressipi, that contains our application. After each deploy we create a snapshot of that volume, containing all of our application code and its gem dependencies. We then configure autoscaling so that, as well as using our AMI, it also uses this snapshot to create a new volume which it mounts at /var/www/dressipi. No boot time shenanigans needed.
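The launch template thus pairs our AMI with a block-device mapping pointing at the latest code snapshot. A small helper to build such a mapping might look like this — the device name and key names are illustrative, and the exact request format depends on which API you call:

```ruby
# Build the extra block-device entry that tells autoscaling to create a
# volume from the latest code snapshot and attach it to each new instance.
# Key names are illustrative; the real format depends on the API used.
def code_volume_mapping(snapshot_id, device = '/dev/sdf')
  {
    'DeviceName'     => device,
    'Ebs.SnapshotId' => snapshot_id
    # The instance's fstab then mounts this device at /var/www/dressipi.
  }
end
```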

We use the excellent fog library to manage our Amazon interactions. The autoscaling parts are not in the fog mainline yet, but are available on my GitHub fork.
