Scaling a Ruby on Rails app on Heroku

I have spent a considerable amount of time scaling Rails applications for major product launches, live TV events like Shark Tank and Beyond the Tank, and national TV marketing campaigns.

From all that experience, the most effective way I have found to scale a web application is to do the following:

Implement sufficient application performance monitoring (I use New Relic)
Gather data (ideally, load test until you scale out)
Identify the bottleneck
Fix the bottleneck
🔁 Repeat (until you hit a goal or run out of time)

This is so effective because it ensures that you spend your time fixing the actual constraint of your system.

A great way to think about this comes from the Theory of Constraints:

"Any improvements made anywhere besides the bottleneck are an illusion." - Gene Kim

So, in my experience, when scaling a Rails app the bottleneck typically falls into one of two large areas:

The Database (Heroku Postgres in our case)
The Rails Application

Let’s dive into each of those.

Heroku Postgres

Goals

📉 Decrease time spent per database query
📉 Decrease number of database queries
📈 Increase database connection utilization
📈 Increase number of application requests per database connection

Solutions

Upgrade your Heroku Postgres database plan
Upgrade Postgres to the latest version
Implement caching in front of the database where appropriate
Optimize or remove any long-running queries
Rewrite the application to make fewer queries
Optimize / configure DB connection pooling
- Optimize ActiveRecord’s connection pool
- Implement/Optimize PgBouncer
- Ensure your application does not run out of DB connections
Configure a statement_timeout
Ensure your database is properly indexed (for both reads and writes)
Configure a read-only follower database and use it where appropriate
Ensure DB backups are configured to run during off-peak hours
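
To check that the application cannot run out of DB connections, it helps to add up the worst case across every process type. The sketch below is only an illustration: it assumes each Puma or Sidekiq thread may hold one ActiveRecord connection at a time, and the dyno counts, thread counts, and the plan limit of 120 are made-up numbers, not values from any real setup. Check the actual connection limit of your Heroku Postgres plan.

```ruby
# Worst-case number of Postgres connections the app can open at once.
# Assumption: every thread may check out one ActiveRecord connection.
ProcessType = Struct.new(:name, :dynos, :processes_per_dyno, :threads_per_process,
                         keyword_init: true) do
  def max_connections
    dynos * processes_per_dyno * threads_per_process
  end
end

# Compares the connections the app may need against what the plan allows.
# `reserved` leaves headroom for consoles, migrations, and one-off dynos.
def connection_budget(process_types, plan_limit:, reserved: 0)
  needed = process_types.sum(&:max_connections)
  available = plan_limit - reserved
  { needed: needed, available: available, fits: needed <= available }
end

# Hypothetical numbers for illustration only.
web    = ProcessType.new(name: "web",    dynos: 8, processes_per_dyno: 2, threads_per_process: 5)
worker = ProcessType.new(name: "worker", dynos: 2, processes_per_dyno: 1, threads_per_process: 10)

budget = connection_budget([web, worker], plan_limit: 120, reserved: 10)
puts budget
```

If `fits` comes back false, the usual options are fewer threads per process, a smaller ActiveRecord pool, a larger database plan, or putting PgBouncer in front of the database so that many application connections share fewer server connections.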

Heroku Rails Web Application

Goals

📉 Decrease the number of requests that reach the application (throughput)
📉 Decrease application response time (95th percentile)
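
The reason to watch the 95th percentile rather than the average is that a few very slow requests can hide behind a healthy-looking mean. Here is a minimal sketch using the nearest-rank method on a hypothetical set of response times; an APM tool such as New Relic may compute percentiles differently, so treat this as an illustration of the idea rather than a description of any tool.

```ruby
# Nearest-rank percentile: the smallest sample such that at least
# `pct` percent of all samples are less than or equal to it.
def percentile(samples, pct)
  raise ArgumentError, "samples must not be empty" if samples.empty?
  raise ArgumentError, "pct must be in (0, 100]" unless pct > 0 && pct <= 100

  sorted = samples.sort
  rank = (pct * sorted.length / 100.0).ceil
  sorted[rank - 1]
end

# Hypothetical response times in milliseconds; one request is very slow.
response_times_ms = [80, 85, 90, 95, 100, 105, 110, 120, 130, 2400]

mean = response_times_ms.sum / response_times_ms.length.to_f
p50  = percentile(response_times_ms, 50)
p95  = percentile(response_times_ms, 95)

puts "mean=#{mean}ms p50=#{p50}ms p95=#{p95}ms"
```

In this made-up sample the median stays at 100 ms while the 95th percentile jumps to the 2400 ms outlier, which is the kind of tail a launch-day traffic spike tends to expose.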

Solutions

Upgrade Ruby to the latest version
Reduce throughput/requests to your application
- Serve static content (including assets/HTML/JSON) from a CDN (e.g., Cloudflare)
- Cache content that can be stale (including HTML/JSON) in a CDN (e.g., Cloudflare)
- Break off any high throughput/independent parts of your app into independent applications
- Optimize clients to make fewer requests to application
- Properly and aggressively set Cache-Control headers
Cache expensive calculations and/or queries
- On the Rails server in memory, in Redis, on an Edge server, on the client, or any combination
Optimize your web server (Puma)
- Test multiple Puma configurations including multi-threaded, multi-process, and a combination
Upgrade to Performance web dynos (dedicated)
Offload any long-running processes into background jobs (Sidekiq)
Have an aggressive request timeout solution (rack-timeout)
Ensure any external API request clients have aggressive timeouts configured
Configure database to have a statement_timeout
Optimize database performance/queries/indexes (see above)
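
As one example of the timeout items above, here is a rough sketch of an external API client with aggressive timeouts, built on Ruby's standard `Net::HTTP`. The helper names `build_api_client` and `fetch_status`, the `https://api.example.com` URL, and the specific timeout values are all hypothetical; the right numbers depend on the API you call and on how long you can afford to block a Puma thread.

```ruby
require "net/http"
require "uri"

# Build a Net::HTTP client that gives up quickly instead of tying up a
# web thread while a slow third-party API hangs.
def build_api_client(base_url, open_timeout: 1, read_timeout: 2, write_timeout: 2)
  uri = URI.parse(base_url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == "https")
  http.open_timeout = open_timeout   # seconds to wait for the connection to open
  http.read_timeout = read_timeout   # seconds to wait for each read
  http.write_timeout = write_timeout # seconds to wait for each write (Ruby 2.6+)
  http.max_retries = 0               # do not automatically retry idempotent requests
  http
end

# Returns the HTTP status code as a string, or nil if the call timed out.
def fetch_status(http, path)
  http.get(path).code
rescue Net::OpenTimeout, Net::ReadTimeout, Net::WriteTimeout => e
  # Fail fast and let the caller decide on a fallback.
  warn "API call timed out: #{e.class}"
  nil
end

# Placeholder host; constructing the client does not open a connection.
client = build_api_client("https://api.example.com")
```

A caller would then use something like `fetch_status(client, "/health")` (a hypothetical path) and treat `nil` as "the dependency is slow right now", for example by serving a cached value. Keeping these timeouts well below the request timeout means a slow dependency degrades one feature instead of exhausting every web thread.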