Provisioning machines with AWS - custom bootstrapper
In the previous post I wrote a little about our transition to AWS.
Now I will go into more detail about our instance bootstrap process.
At the end of the previous post we discussed three possible options for automated machine startup:
- Create a different AMI for each server role.
- Install all binaries into one AMI and provide a way to load dynamic config parts through a custom bootstrap script.
- Use an infrastructure automation framework like Chef or Puppet, which can handle installs and configuration for you.
We tried and quickly abandoned the first option (many different AMIs) - maintaining all of them turned out to be a really big headache.
Next we tried option 2 - install everything we need into one machine image. We actually use this option to this day, which means it’s not that bad and can work if used with care (even though I’m not a big fan of it, and I’ll tell you why later).
Anyways, here is how we did it:
- built all binaries and configs into one AMI
- created a custom bootstrap script and added it to the end of rc.local (that way it is executed every time the instance boots)
I can’t paste the whole contents of the custom bootstrapper, but I will describe some essential parts:
1. The first step in the bootstrapper is fetching the userdata field. We use it to specify three things at instance launch: environment, role and context.
To extract that data we make a call to http://169.254.169.254/latest/user-data, a special endpoint in AWS where you can get data about the running instance. A typical userdata value looks like this:
environment=prod role=memcached context=b
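This fetch-and-parse step can be sketched as follows. The metadata URL is AWS’s real user-data endpoint, but the parsing logic and variable names are my assumptions about how such a script might look:

```shell
#!/bin/sh
# Sketch of the user-data fetch step.
# On a real instance we would fetch from the metadata service:
#   userdata=$(curl -s http://169.254.169.254/latest/user-data)
# Hardcoded here so the snippet runs outside AWS:
userdata="environment=prod role=memcached context=b"

# Split the space-separated key=value pairs into shell variables.
for pair in $userdata; do
    key=${pair%%=*}          # text before the first '='
    value=${pair#*=}         # text after the first '='
    eval "$key=\$value"      # sets $environment, $role, $context
done

echo "env=$environment role=$role context=$context"
```

Later sections of the bootstrapper can then branch on `$environment`, `$role` and `$context`.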
We also use userdata to pass IAM keys to the instance, but that is a whole new discussion and I won’t go into more detail about it here. I’ll just say that with the release of IAM roles you don’t need to pass keys to the machine any more.
2. The next section is responsible for code deploys.
A couple of words about how our release system works. When developers are done committing their changes to git and are ready to release their code, they merge the changes to master and push to GitHub.
Then a release script runs remotely on a different server. Its purpose is to prepare a build (an archive with all necessary files) and upload it to S3. We also keep a reference file in S3 which points to the most recent build tag.
When we launch a new instance, our custom bootstrapper runs; when it reaches the deploy section, it reads the pointer file and fetches that specific build from S3. Clever!
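The deploy step can be sketched like this. The bucket name, file names and target path are all hypothetical, and the pointer file is simulated locally so the sketch runs anywhere (in production it would be downloaded from S3 first):

```shell
#!/bin/sh
# Sketch of the pointer-file deploy step (names are illustrative).
BUCKET="s3://example-builds"       # hypothetical build bucket
POINTER="latest-build.txt"         # pointer file naming the newest build tag

# Simulate the pointer file locally; in production this would be
# fetched from S3, e.g. with s3cmd or the AWS CLI.
echo "build-2042" > "$POINTER"

# Read which build tag to deploy, then fetch that archive.
build_tag=$(cat "$POINTER")
archive="$build_tag.tar.gz"
echo "would fetch $BUCKET/$archive and unpack it into the app directory"
```

The nice property of this scheme is that newly launched instances always pick up the latest release without the AMI ever changing.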
3. Custom init section.
This is the most interesting part, I think. The idea is that you can have separate init files, which are searched and executed in order from most specific to most generic. The first found match wins - only one init file gets executed:
- $context/$environment/$machine_id.sh
- $context/$environment/bootstrapper.sh
- $context/bootstrapper.sh
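The lookup above can be sketched as a simple ordered loop. The directory layout, root path and variable values are my assumptions based on the search order described in the post:

```shell
#!/bin/sh
# Sketch of the "most specific wins" init lookup (paths are illustrative).
context="b"; environment="prod"; machine_id="i-1234"
init_root="./init.d"    # hypothetical root directory holding init scripts

# Create one candidate so the sketch has something to find.
mkdir -p "$init_root/$context/$environment"
echo 'echo "running env-level init"' > "$init_root/$context/$environment/bootstrapper.sh"

# Candidates ordered from most specific to most generic.
for candidate in \
    "$init_root/$context/$environment/$machine_id.sh" \
    "$init_root/$context/$environment/bootstrapper.sh" \
    "$init_root/$context/bootstrapper.sh"
do
    if [ -f "$candidate" ]; then
        sh "$candidate"     # first match wins: run it and stop looking
        found="$candidate"
        break
    fi
done
```

Because the per-machine script is missing here, the loop falls through to the environment-level bootstrapper and stops there.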
The idea behind custom init scripts was to give you a place to start or stop particular services (since we have one AMI with all the binaries and services, and we don’t want all of them to always run on the instance). You could also use them to copy a particular config file for a service.
That was the intention anyway; in practice it wasn’t so beautiful - at least for me.
Even though this “build all into AMI” approach may seem good at first, I found it pretty annoying to deal with. Here are some of the problems with this method:
- every change to a config, or every new service install, requires burning a new AMI
- there is a really high chance you will bake sensitive or plain garbage data into an image that shouldn’t go into a prod AMI
- after a while you lose track of the services available on the AMI and have to dig through it to refresh your memory - unless you keep very thorough documentation of every binary you install and every config change you make. Theoretically possible; practically, not so much
- you can’t leverage the community like you can with infrastructure automation frameworks
- when you have many servers running, deploying every config change becomes a pretty annoying experience
Those are just some points off the top of my head; I’m sure you can come up with your own list, or start arguing with me about specifics. The point is - you can use this approach, and we still do, but I think there are better tools out there created specifically to address the problems stated above.
One such tool is Chef, and we decided to give it a try. In the next posts I will be writing about our transition to Chef. Read on if you are interested.