Scaling a Ruby on Rails app on Heroku
I have spent a considerable amount of time scaling Rails applications for everything from major product launches to live TV events like Shark Tank and Beyond The Tank to national TV marketing campaigns.
From all that experience, the most effective way I have found to scale a web application is to do the following:
- Implement sufficient application performance monitoring (I use NewRelic)
- Gather data (ideally, load test until you scale out)
- Identify what the bottleneck is
- Fix the bottleneck
- Repeat (until you hit a goal or run out of time)
This is so effective because it ensures that you spend your time fixing the actual constraint of your system.
A great way to think about this comes from the Theory of Constraints:
Any improvements made anywhere besides the bottleneck are an illusion. -Gene Kim
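To make the "identify the bottleneck" step concrete, here is a minimal sketch. It assumes you have copied a per-request time breakdown out of your monitoring tool; the component names and numbers below are made up for illustration and the `find_bottleneck` helper is hypothetical, not part of any library.

```ruby
# Hypothetical per-request timing breakdown in milliseconds, e.g. copied
# from an APM dashboard. The component names and numbers are made up.
SAMPLE_BREAKDOWN_MS = {
  "database"       => 180.0,
  "ruby"           => 45.0,
  "external_api"   => 30.0,
  "view_rendering" => 25.0
}.freeze

# Returns the component with the largest share of request time, plus that
# share as a percentage of the total.
def find_bottleneck(breakdown_ms)
  total = breakdown_ms.values.sum
  name, time = breakdown_ms.max_by { |_component, ms| ms }
  { component: name, share_percent: (time / total * 100).round(1) }
end

result = find_bottleneck(SAMPLE_BREAKDOWN_MS)
puts "Bottleneck: #{result[:component]} (#{result[:share_percent]}% of request time)"
```

With numbers like these, shaving time off view rendering would barely move the total, which is the point of the quote: work on the largest slice first, then measure again.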
So, in my experience, when scaling a Rails app the bottleneck typically falls into one of two large areas:
- The Database (Heroku Postgres in our case)
- The Rails Application
Let's dive into each of those.
Heroku Postgres
Goals
- Decrease time spent per database query
- Decrease number of database queries
- Increase database connection utilization
- Increase number of application requests per database connection
Solutions
- Update your Heroku Postgres database plan
- Upgrade Postgres to the latest version
- Implement caching in front of the Database where appropriate
- Optimize or remove any long-running and/or time-consuming queries
- Rewrite the application to make fewer queries
- Optimize / configure DB connection pooling
  - Optimize ActiveRecord's connection pool
  - Implement/Optimize PgBouncer
  - Ensure your application does not run out of DB connections
- Configure a statement_timeout
- Ensure your database is properly indexed (for both reads and writes)
- Configure a read-only follower database and use it where appropriate
- Ensure DB backups are configured to run during off-peak hours
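For the connection pooling items above, a rough connection budget helps before touching any settings. The sketch below assumes each web server thread can hold one ActiveRecord connection (so pool size per process equals thread count). The dyno counts, the plan limit of 120, and the reserve of 10 are hypothetical numbers, so check your own plan's limit.

```ruby
# Estimate how many Postgres connections an app can open at peak.
# Assumes each thread may hold one ActiveRecord connection, so the pool
# size per process equals the thread count.
def required_connections(dynos:, processes_per_dyno:, threads_per_process:)
  dynos * processes_per_dyno * threads_per_process
end

# True when the app fits within the plan's limit while leaving some
# connections free for consoles, migrations, and background workers.
def fits_connection_limit?(required:, plan_limit:, reserved: 10)
  required <= plan_limit - reserved
end

# Hypothetical numbers: 8 web dynos, 2 processes each, 5 threads per process,
# against a plan that allows 120 connections.
needed = required_connections(dynos: 8, processes_per_dyno: 2, threads_per_process: 5)
puts "Connections needed at peak: #{needed}"
puts "Fits plan limit: #{fits_connection_limit?(required: needed, plan_limit: 120)}"
```

If the budget does not fit, the options from the list apply: shrink the pool, put PgBouncer in front of the database, or move to a plan with a higher connection limit.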
Heroku Rails Web Application
Goals
- Decrease application throughput (the number of requests your app has to serve)
- Decrease application response time (95th percentile)
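The 95th percentile is the target here rather than the average because a few very slow requests can hide behind a healthy-looking mean, or distort it. As a small sketch of the difference, using the nearest-rank definition of a percentile (monitoring tools may interpolate differently) and made-up response times:

```ruby
# Nearest-rank percentile: the smallest sample such that at least `pct`
# percent of the samples are less than or equal to it.
def percentile(values, pct)
  raise ArgumentError, "values must not be empty" if values.empty?

  sorted = values.sort
  rank = (pct * sorted.length / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

# Hypothetical response times in milliseconds, with one slow outlier.
response_times_ms = [80, 85, 90, 95, 100, 105, 110, 120, 130, 2000]

mean = response_times_ms.sum / response_times_ms.length.to_f
puts "mean: #{mean} ms"
puts "p50:  #{percentile(response_times_ms, 50)} ms"
puts "p95:  #{percentile(response_times_ms, 95)} ms"
```

In this sample the median request is fast, yet the 95th percentile exposes the slow tail that some of your users actually experience.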
Solutions
- Upgrade Ruby to the latest version
- Reduce throughput/requests to your application
  - Serve static content (including assets/HTML/JSON) from a CDN (e.g. Cloudflare)
  - Cache content that can be stale (including HTML/JSON) in a CDN (e.g. Cloudflare)
  - Break off any high-throughput, independent parts of your app into separate applications
  - Optimize clients to make fewer requests to the application
  - Properly and aggressively set Cache-Control headers
- Cache expensive calculations and/or queries
  - On the Rails server in memory, in Redis, on an edge server, on the client, or any combination
- Optimize your web server (Puma)
  - Test multiple Puma configurations including multi-threaded, multi-process, and a combination
- Upgrade to Performance web dynos (dedicated)
- Offload any long-running processes into background jobs (Sidekiq)
- Have an aggressive request timeout solution (RackTimeout)
- Ensure any external API request clients have aggressive timeouts configured
- Configure database to have a statement_timeout
- Optimize database performance/queries/indexes (see above)
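For the external API timeouts mentioned above, here is a minimal sketch using Ruby's standard `Net::HTTP`. The `build_api_client` helper, the 2 and 5 second values, and the `api.example.com` host are assumptions for illustration; pick timeouts based on what the API you call actually needs.

```ruby
require "net/http"
require "uri"

# Builds an HTTP client with short timeouts. Nothing is sent over the network
# until a request is actually made on the returned object.
def build_api_client(base_url, open_timeout: 2, read_timeout: 5)
  uri = URI.parse(base_url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == "https")
  http.open_timeout = open_timeout # seconds to wait for the connection to open
  http.read_timeout = read_timeout # seconds to wait for each read from the socket
  http
end

# "api.example.com" is a placeholder host.
client = build_api_client("https://api.example.com")
puts "open_timeout=#{client.open_timeout}s read_timeout=#{client.read_timeout}s"
```

Without explicit values, a slow third-party API can tie up a web thread for much longer than your own request timeout allows, so it is worth keeping client timeouts comfortably below it.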