Friday, September 12, 2014

Adventures in Golang - Building a Celery Task API

I've been playing around a lot with RabbitMQ and Celery lately. They are really helpful tools for building asynchronous job processing systems, but interfacing directly with the RabbitMQ queues that power Celery can be annoying. This is especially true if you are dealing with SSL-encrypted AMQP sockets and third parties.

For secure communication between clients and RabbitMQ servers, every client needs access to the client-side certs and keys as well as the certificate authority cert. Additionally, I've found that a lot of the open source Celery client libraries don't support SSL connections at all. In many cases, in order to support a third-party client, you end up having to fork the client library, add SSL support, hope the maintainers are active and accept your pull request quickly, provide the third party with the SSL certs, and then help the third party connect to your servers.

The problem is that a lot of developers are not familiar with socket connections or AMQP in general, so you end up investing more time to get them up and running. RESTful services, however, have been popular for a while now, and a lot of developers are comfortable consuming them. So if all you need is to give people the ability to create tasks that will run asynchronously in the background, the answer is simple: build an API for them to consume.

I've been interested in golang for a while now but never had the opportunity to build anything with it. I decided this was the perfect opportunity, so I spent a day or two and built the go-celery-api. It's my first attempt at something written in Go, so I learned a lot in the process.

Go is an odd language when you have a background in more traditional object-oriented languages. The structs and the way that functions bind to them feel odd, the receive operator was a new concept for me, and having functions that return multiple results is powerful but different. I'm used to handling exceptions when an error occurs, but the Go convention seems to be that functions return a result value and an error value. If the error is nil then you're good to go.
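To make the result-plus-error convention concrete, here is a minimal sketch. The parsePriority function and its 0 to 9 range are hypothetical and are not taken from go-celery-api; they only illustrate the shape of the pattern.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// parsePriority is a hypothetical helper used only for illustration.
// It returns both a result and an error; callers check the error first.
func parsePriority(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		// Wrap the underlying error so the caller keeps the context.
		return 0, fmt.Errorf("invalid priority %q: %w", s, err)
	}
	if n < 0 || n > 9 {
		return 0, errors.New("priority must be between 0 and 9")
	}
	return n, nil
}

func main() {
	for _, in := range []string{"5", "high", "12"} {
		p, err := parsePriority(in)
		if err != nil {
			// No exceptions: the error is just a value to inspect.
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("priority:", p)
	}
}
```

Instead of a try/catch block, every call site decides right away what to do with a non-nil error.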

While most of these oddities are easy enough to get over and are to be expected when learning a new language, the one thing that was somewhat shocking is that there is no officially accepted package management solution for Go. PHP has composer, Node has npm, Ruby has gems, and Javascript has bower, but Go doesn't have any consensus. As I was looking for a package management solution for Go, it seemed that most people recommend copying the code directly into a vendor folder in your repo. That seems like insanity to me, but maybe I'm just spoiled.

There are some tools out there that provide package management, and of the ones I looked at I liked gom the most. It provides the basic functionality of composer, and I was pretty happy with that until I noticed that the packages I wanted to use weren't tagged. Apparently the golang standard is to maintain a clean master branch and not use tags. I guess that means that backwards-incompatible changes are either not possible or break everything using the package. At least gom allows me to lock dependencies to specific commit hashes.

Overall I think Go is an interesting language, and I'm looking forward to learning more about it. But for now I have a working API to create tasks for Celery to execute. Check it out and let me know what you think.

Wednesday, May 7, 2014

Using Chef to Setup a RabbitMQ Cluster

RabbitMQ allows for reliable inter-process communication via message queues. The installation process is pretty straightforward, but what if you want to automate it? What if you need it to be highly available and support queue failover if a server goes offline? Hopefully this post will answer your questions.

What you'll need 

Setting up vagrant


Now that you have Vagrant and VirtualBox installed you can make a directory for your rabbit cluster work.
mkdir rabbit
cd rabbit
Initialize vagrant and setup the hostmanager plugin for easy host file maintenance:
vagrant init
vagrant plugin install vagrant-hostmanager
Now you need to download the vagrant box:
vagrant box add ubuntu1204-chef https://opscode-vm-bento.s3.amazonaws.com/vagrant/opscode_ubuntu-12.04_chef-11.4.4.box

Download the Required Cookbooks

mkdir cookbooks
cd cookbooks
git clone https://github.com/opscode-cookbooks/erlang.git
git clone https://github.com/opscode-cookbooks/apt.git
git clone https://github.com/opscode-cookbooks/yum.git
git clone https://github.com/opscode-cookbooks/build-essential.git
git clone https://github.com/Youscribe/cpu-cookbook.git cpu
git clone https://github.com/divideandconquer/rabbitmq.git
git clone https://github.com/divideandconquer/haproxy.git
cd ..
Note that the rabbitmq cookbook is a fork of https://github.com/opscode-cookbooks/rabbitmq that fixes an issue with setting rabbitmq up as a cluster. The haproxy cookbook is also a fork, because the upstream haproxy cookbook did not allow the mode to be set. Both of these fixes have pull requests that are pending approval.

The Vagrantfile

Open the Vagrantfile created by the init command with your favorite editor and replace its contents with the following:
Vagrant.require_plugin('vagrant-hostmanager')

# Define the cluster
nodes = [
  { :hostname => 'rabbit1', :ip => '192.168.0.11', :box => 'ubuntu1204-chef'},
  { :hostname => 'rabbit2', :ip => '192.168.0.12', :box => 'ubuntu1204-chef'},
  { :hostname => 'rabbit3', :ip => '192.168.0.13', :box => 'ubuntu1204-chef'}
]
proxy = { :hostname => 'rabbitproxy', :ip => '192.168.0.10', :box => 'ubuntu1204-chef'}

Vagrant.configure("2") do |config|
  #Setup hostmanager config to update the host files
  config.hostmanager.enabled = true
  config.hostmanager.manage_host = true
  config.hostmanager.ignore_private_ip = false
  config.hostmanager.include_offline = true
  config.vm.provision :hostmanager

  #setup haproxy
  config.vm.define proxy[:hostname] do |node_config|
    # configure the box, hostname and networking
    node_config.vm.box = proxy[:box]
    node_config.vm.hostname = proxy[:hostname]
    node_config.vm.network :private_network, ip: proxy[:ip]

    # configure hostmanager
    node_config.hostmanager.aliases = proxy[:hostname]

    # use the Chef provisioner to install haproxy
    node_config.vm.provision :chef_solo do |chef|
      #override default chef config
      chef.json = {
        'haproxy' => {
          'mode' => 'tcp',
          'enable_default_http' => false,
          'install_method' => 'source',
          'listeners' => {
            'listen' => {
              'rabbitcluster 0.0.0.0:5672' => [
                'mode tcp',
                'balance roundrobin',
                'server rabbit1 192.168.0.11:5672 check inter 5000 downinter 500',
                'server rabbit2 192.168.0.12:5672 check inter 5000 backup',
                'server rabbit3 192.168.0.13:5672 check inter 5000 backup'
              ]
            }
          }
        }
      }
      chef.add_recipe "haproxy"
    end
  end

  nodes.each do |node|
    config.vm.define node[:hostname] do |node_config|
      # configure the box, hostname and networking
      node_config.vm.box = node[:box]
      node_config.vm.hostname = node[:hostname]
      node_config.vm.network :private_network, ip: node[:ip]

      # configure hostmanager
      node_config.hostmanager.aliases = node[:hostname]

      # use the Chef provisioner to install RabbitMQ
      node_config.vm.provision :chef_solo do |chef|
        #override default chef config
        chef.json = {
          'rabbitmq' => {
            'cluster' => true,
            'cluster_disk_nodes' => [
              'rabbit@rabbit1',
              'rabbit@rabbit2',
              'rabbit@rabbit3'
            ],
            'erlang_cookie' => 'fjhbsdflhgbfghdfgkdf',
            'policies' => {
              'mirror' => {
                'pattern' => '.*',
                'params' => {'ha-mode' => 'all'},
                'priority' => 100
              }
            },
            'disabled_policies' => ['ha-all', 'ha-two'],
            'enabled_plugins' => ['rabbitmq_federation','rabbitmq_federation_management']
          }
        }
        chef.add_recipe "rabbitmq"
        chef.add_recipe "rabbitmq::mgmt_console"
        chef.add_recipe "rabbitmq::plugin_management"
        chef.add_recipe "rabbitmq::policy_management"
      end
    end
  end
end

This Vagrantfile will boot up three rabbit servers, cluster them together, set up a policy to mirror all queues across the cluster, and install the federation plugins in case you need to set up some cross-datacenter redundancy. It will also set up an haproxy server that will act as a load balancer for your rabbit cluster.

Running the Vagrant Cluster

Now you are ready to run the vagrant cluster. Simply run the following command in the rabbit folder we created earlier:
vagrant up
You should see vagrant boot up each node and then run the rabbitmq chef cookbook on each. When it's finished you'll be able to see the rabbitmq admin panel at: http://rabbit3:15672/ with the user name and password guest/guest.

Since we have a load balancer in place, all of your applications should connect to the rabbitproxy server instead of the individual nodes. Connecting to the proxy gives you automatic failover. If one of the rabbit nodes goes down, your application can reconnect to the proxy, which will automatically forward the app to one of the other working nodes. Since all the queues are being mirrored, your messages will be available on the new node as well. To ensure that you don't lose any messages in the failover process, you can follow the instructions in the rabbitmq getting started guide.