UPDATE 2: Clint looked into using preload for the Puma workers and reported his findings in the comments. Take a look; the gist is that preload does not seem to be necessary.
18 months ago I wrote a blog post about how to use Unicorn to optimize our Heroku performance. Since then we’ve been using Unicorn on Heroku.
Over the last couple of months our business grew a lot and Unicorn seemed to take more resources than necessary. We switched to 2X instances and still needed quite a lot of workers.
Although larger Heroku bills were part of our decision to optimize, we mostly felt the quality of our service was diminishing.
We started with Puma as it seemed to be one of the more widely used options. For the last month we have used Puma in production on the Codeship and are happy with it. There are a couple of potential pitfalls, but overall it made our performance better and Heroku bills smaller.
We are still using MRI 2.0.0-p195 on Heroku and haven’t switched to either JRuby or Rubinius. Although there is a lot of performance we could gain from this switch, we are happy with the performance we have, and switching would make our development more complicated.
We optimized different parts of our application at the same time, so we don’t have scientifically valid data for our change from Unicorn to Puma. For example, we switched to a database with more memory on Heroku and changed our auto-reload functionality. All of this combined resulted in much better performance, but the improvement can’t be attributed to Puma alone.
Setting Up Puma on Heroku
Getting Puma to run on Heroku is very easy. You should be able to get everything up and running in minutes.
Add Puma to Gemfile
First, add Puma to your Gemfile and remove any other web servers.
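A minimal Gemfile sketch for this step, assuming a standard Bundler setup. The commented-out Unicorn line is only an illustration of a server you might be removing; your Gemfile will contain other gems as well.

```ruby
# Gemfile (excerpt)
source "https://rubygems.org"

# Puma replaces the previous application server.
gem "puma"

# Remove (or comment out) any other server, for example:
# gem "unicorn"
```

Run bundle install afterwards so that Gemfile.lock picks up the change.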
Add Puma to Procfile
Add a web entry to your Procfile that starts Puma and takes its thread and worker counts from environment variables.
This allows you to set the number of threads and workers Puma uses through the environment.
It also uses bash syntax to define defaults if no environment values are set. Starting several Puma workers allows you to get even more out of your Heroku dyno.
Setting the values through the environment made it easy to find numbers of workers and threads that work well. This will be different for every application and you should experiment with it. Currently we use 10 threads and 2 workers.
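To make the default handling concrete, here is a small Ruby sketch that mirrors what bash does with ${VAR:-default} and assembles the resulting Puma command. It is an illustration, not the exact Procfile line: the variable names MIN_THREADS, MAX_THREADS and MAX_WORKERS are the ones used in this post, while the default values (8, 12, 2, port 3000, development) are placeholders chosen for the example. The flags -t, -w, -p and -e are Puma's standard command-line options for threads, workers, port and environment.

```ruby
# A Procfile entry along these lines would behave the same way (assumed values):
#   web: bundle exec puma -t ${MIN_THREADS:-8}:${MAX_THREADS:-12} -w ${MAX_WORKERS:-2} -p $PORT -e ${RACK_ENV:-development}

# Mirror bash's ${VAR:-default}: fall back when the variable is unset OR empty.
def env_or_default(env, name, default)
  value = env[name]
  (value.nil? || value.empty?) ? default : value
end

# Build the Puma command from an environment-like hash (ENV by default).
def puma_command(env = ENV)
  min_threads = env_or_default(env, "MIN_THREADS", "8")   # placeholder default
  max_threads = env_or_default(env, "MAX_THREADS", "12")  # placeholder default
  workers     = env_or_default(env, "MAX_WORKERS", "2")   # placeholder default
  port        = env_or_default(env, "PORT", "3000")       # Heroku sets PORT itself
  rack_env    = env_or_default(env, "RACK_ENV", "development")

  "bundle exec puma -t #{min_threads}:#{max_threads} -w #{workers} -p #{port} -e #{rack_env}"
end
```

Calling puma_command({ "MIN_THREADS" => "10", "MAX_THREADS" => "10", "MAX_WORKERS" => "2" }) corresponds to the 10 threads and 2 workers mentioned above. On Heroku you would change these values with heroku config:set and restart, without touching the Procfile.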
Set DB connections through Environment
As described in the Heroku documentation, set up a database_config initializer that lets you define your pool size and reaper frequency.
The ActiveRecord reaper will regularly check your open connections and remove them if they become stale. You can read more about it in the Heroku docs.
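A sketch of such an initializer, modeled on the pattern in the Heroku documentation of that time. Treat the details as assumptions: DB_POOL_SIZE is the variable named in this post, DB_REAP_FREQ is a hypothetical name for the reaper setting, the defaults of 5 connections and 10 seconds are placeholders, and the file path is only a suggestion. The helper pool_settings is introduced here purely so the merging logic can be read and tried outside of Rails.

```ruby
# config/initializers/database_config.rb (sketch)

# Merge pool size and reaper frequency from the environment into a
# database configuration hash. Variable names and defaults are assumptions.
def pool_settings(config, env = ENV)
  config.merge(
    "pool"              => Integer(env.fetch("DB_POOL_SIZE", "5")),
    "reaping_frequency" => Integer(env.fetch("DB_REAP_FREQ", "10")) # seconds
  )
end

# Only runs inside a Rails app; skipped when the file is loaded on its own.
if defined?(Rails)
  Rails.application.config.after_initialize do
    # Drop the connections opened with the default settings.
    ActiveRecord::Base.connection_pool.disconnect!

    ActiveSupport.on_load(:active_record) do
      config = Rails.application.config.database_configuration[Rails.env]
      # Reconnect with the pool size and reaper frequency from the environment.
      ActiveRecord::Base.establish_connection(pool_settings(config))
    end
  end
end
```

Newer Rails versions let you set pool and reaping_frequency directly in config/database.yml, so check the documentation for your version before copying this pattern.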
Monitor with NewRelic
We use NewRelic for all of our monitoring. Their Request Time and Instance views helped us see the improvements and make sure our memory consumption doesn’t get too high.
You should set the MIN_THREADS and MAX_THREADS environment variables to the same value. Your dyno doesn’t need to release the resources anyway, so a static thread count is perfectly fine.
Make sure your MAX_WORKERS is lower than the DB_POOL_SIZE; otherwise your application might not be able to connect to the database and will throw exceptions.
After a new deployment the memory climbs until it plateaus, currently at around 250 MB on average. Other people have mentioned that there might be memory leaks with Puma, but the leak is probably somewhere in our codebase. We are currently looking into this, but it hasn’t been a problem so far. Just something to keep in mind.
Our assumption is that Unicorn hid this problem before by restarting the workers regularly.
So far Puma is running great and we run far fewer dynos. This might be due to a number of optimizations, but Puma was definitely an important part of that.
It is very easy to get started, and if you feel you could gain better performance, definitely give it a try.
As Passenger now supports Heroku, we might look into it as well sometime in the future. Right now we are happy with Puma, though.
Tell us your experience with Puma or other performance optimizations on Heroku in the comments.