I have spent a considerable amount of time scaling Rails applications for everything from major product launches to live TV events like Shark Tank and Beyond the Tank, as well as national TV marketing campaigns.

From all that experience, the most effective way I have found to scale a web application is to do the following:

  1. Implement sufficient application performance monitoring (I use New Relic)
  2. Gather data (ideally, load test until you scale out)
  3. Identify what the bottleneck was
  4. Fix the bottleneck
  5. 🔁 Repeat (until you hit a goal or run out of time)

This is so effective because it ensures that you spend your time fixing the actual constraint of your system.

A great way to think about this comes from the Theory of Constraints:

"Any improvements made anywhere besides the bottleneck are an illusion." - Gene Kim

So, in my experience, when scaling a Rails app the bottleneck typically falls into one of two large areas:

  1. The Database (Heroku Postgres in our case)
  2. The Rails Application

Let's dive into each of those.

Heroku Postgres


  • 📉 Decrease time spent per database query
  • 📉 Decrease number of database queries
  • 📈 Increase database connection utilization
  • 📈 Increase number of application requests per database connection
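
To make the connection goals concrete, here is a minimal sketch of the arithmetic behind a connection budget. It assumes each Puma thread can hold one database connection at a time, so the pool size equals the thread count. The method names and all of the numbers, including the connection limit of 120, are hypothetical and only illustrate the calculation.

```ruby
# Hypothetical helper: worst-case number of Postgres connections the web tier
# can open, assuming one connection per thread.
def required_db_connections(dynos:, processes_per_dyno:, threads_per_process:)
  dynos * processes_per_dyno * threads_per_process
end

# Hypothetical helper: connections left over under the plan's limit.
# A negative result means the app can exhaust the database's connections.
def connection_headroom(connection_limit:, **app)
  connection_limit - required_db_connections(**app)
end

# Illustrative numbers only: 10 dynos, 2 Puma processes each, 5 threads each.
app = { dynos: 10, processes_per_dyno: 2, threads_per_process: 5 }

puts required_db_connections(**app)                     # prints 100
puts connection_headroom(connection_limit: 120, **app)  # prints 20
```

If the headroom goes negative while scaling out dynos, the database connection limit has become the constraint, and serving more requests per connection is what buys capacity.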


Heroku Rails Web Application


  • 📉 Decrease application throughput (requests that reach the app)
  • 📉 Decrease application response time (95th percentile)
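
As a rough illustration of why the response time goal is stated as a 95th percentile, the sketch below computes a nearest-rank percentile over a handful of made-up response times. The `percentile` helper and the sample data are hypothetical; in practice the APM tool reports this number for you.

```ruby
# Nearest-rank percentile: the smallest sample such that at least
# pct percent of all samples are less than or equal to it.
def percentile(values, pct)
  raise ArgumentError, "values must not be empty" if values.empty?

  sorted = values.sort
  rank = (pct * sorted.length / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

# Made-up response times in milliseconds, with one slow outlier.
response_times_ms = [120, 95, 110, 105, 2400, 130, 98, 101, 115, 99]

puts percentile(response_times_ms, 50)  # prints 105
puts percentile(response_times_ms, 95)  # prints 2400
```

The median hides the slow request entirely, while the 95th percentile surfaces it, which is why it is the more useful number to watch during a load test.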


  • Upgrade Ruby to the latest version
  • Reduce throughput/requests to your application
    • Serve static content (including assets/HTML/JSON) from a CDN (e.g., Cloudflare)
    • Cache content that can be stale (including HTML/JSON) in a CDN (e.g., Cloudflare)
    • Break off any high-throughput or independent parts of your app into independent applications
    • Optimize clients to make fewer requests to the application
    • Properly and aggressively set Cache-Control headers
  • Cache expensive calculations and/or queries
    • On the Rails server in memory, in Redis, on an Edge server, on the client, or any combination
  • Optimize your web server (Puma)
    • Test multiple Puma configurations including multi-threaded, multi-process, and a combination
  • Upgrade to Performance web dynos (dedicated)
  • Offload any long-running processes into background jobs (Sidekiq)
  • Have an aggressive request timeout solution (Rack::Timeout)
  • Ensure any external API request clients have aggressive timeouts configured
  • Configure the database to have a statement_timeout
  • Optimize database performance/queries/indexes (see above)
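
Two of the items above, caching expensive calculations in memory and setting aggressive timeouts on external API clients, can be sketched with only the Ruby standard library. This is a minimal illustration, not a production implementation: `TtlCache`, `build_api_client`, the URL, and the timeout values are all hypothetical names and numbers chosen for the example.

```ruby
require "net/http"
require "uri"

# Hypothetical in-process cache with a time-to-live per entry.
# The clock is injectable so expiry can be exercised without sleeping.
class TtlCache
  def initialize(ttl_seconds:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl_seconds
    @clock = clock
    @store = {}
    @mutex = Mutex.new
  end

  # Returns the cached value for key, or runs the block and stores its result.
  # The lock is held while the block runs, so concurrent callers wait instead
  # of recomputing the same expensive value.
  def fetch(key)
    @mutex.synchronize do
      entry = @store[key]
      return entry[:value] if entry && @clock.call < entry[:expires_at]

      value = yield
      @store[key] = { value: value, expires_at: @clock.call + @ttl }
      value
    end
  end
end

# Hypothetical factory for an HTTP client that gives up quickly rather than
# tying up a web thread. The defaults are illustrative, not recommendations.
def build_api_client(url, open_timeout: 2, read_timeout: 5)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == "https")
  http.open_timeout = open_timeout   # seconds to wait for the connection
  http.read_timeout = read_timeout   # seconds to wait for each read
  http.write_timeout = read_timeout  # seconds to wait for each write
  http
end

cache = TtlCache.new(ttl_seconds: 60)
total = cache.fetch(:expensive_total) { (1..1_000).sum }  # computed once
again = cache.fetch(:expensive_total) { raise "not called while cached" }
puts total == again  # prints true

client = build_api_client("https://api.example.com")  # no request is made here
puts client.read_timeout  # prints 5
```

In a Rails app, `Rails.cache.fetch` with `expires_in` plays the role of `TtlCache` and can be backed by memory or Redis. The timeout settings matter because a slow third-party API otherwise holds a Puma thread for the full duration of the request.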