Performance tune the SQLite3 adapter connection configuration#49349
Conversation
|
I see that people are using sqlite in production more and more (for extra small projects, probably, where there are a few users and performance ± does not matter), even though it is better suited for development and testing only. If people care about performance, then they probably should use other DBMS? 🤔 I do not have a strong opinion on this, but think, that this performance tuning should be done by users themselves, like this is done for PostreSQL/MySQL. |
2948543 to
28d59b3
Compare
|
@fatkodima Performance is a primary benefit, but this is fundamentally just a better configuration for SQLite databases, regardless of what Rails environment they are being used in. SQLite's noble commitment to backwards-compatibility means that users are required to tweak configuration (and even compilation, see sparklemotion/sqlite3-ruby#408) to get the recommended setup for green-field, modern usage of SQLite. I also plan on introducing a PR to allow for users to tune connection configuration, but right now that is actually quite cumbersome for end-users as they need to somehow extend the So, while I agree that we ought not to configure the default SQLite experience only for performance gains, this PR isn't aimed at that. This PR provides a standard, modern configuration for any usage of SQLite which provides a notably better out-of-the-box experience for users. And, as you say, the users of SQLite in Rails is growing. Those users, myself among them, deserve a better out-of-the-box experience that doesn't require study of the history and constraints of SQLite, standard configuration recommendations, and knowledge of Rails internals to then update the configuration. |
|
This is a good update. We don't see any other parts of the ecosystem (puma, rack etc) holding back performance initiatives. There are no downsides to moving default performance forward in this way. We need to keep in mind that the default Rails database experience remains SQLite (when no other switches are used to start a new Rails application). Also relevant: https://x.com/dhh/status/1705271515997143168?s=20 |
|
These are a few common misconceptions about Sqlite that I'm hoping get cleared up over the next few years.
If you accept that web applications follow a long-trail distribution, there are more instances of small projects, hobby apps, etc running in the world than large Rails installations. Today the Rails production story for these apps is something like "deploy your tiny app to a server, then install a database like MySQL or Postgres, but make sure you tune it properly or pay somebody else to do that!", which is bonkers if you think about all the extra "stuff" and RDBMS comes with that these types of apps just don't need. Also, Sqlite can be very performant. In fact, for smaller single-node apps you can't get much faster than your app reading and writing to a Sqlite database directly to an NVMe disk. https://github.com/oldmoe/litestack/blob/master/BENCHMARKS.md is a good read if you want to see some relative benchmarks (as usual, take with a grain of salt)
The only reason we think this is true today is that we've been told to by The Cloud Lords that a "real" web app must run on multiple nodes and we should leave state management up to The Cloud Lords because a mere mortal is incapable of backing up and restoring stateful database data. The reality is that backing up a Sqlite database is as simple as downloading a file or replicating it to an object store for people who want to get fancy. Are there constraints to running Sqlite in production? Yes, and it is slightly different than other RDBMS's, but that doesn't make it fundamentally bad in a way that's punishable by banishment from production. I've been doing work at Fly.io to make Sqlite Rails deployments out-of-the-box something that's safe, easy, and inexpensive (https://fly.io/ruby-dispatch/sqlite-and-rails-in-production/). I hope for the sake of the community that other hosting providers follow suit. Here's all the reasons why I want to see Sqlite break through these myths: Lower costsIt will be insanely cheap to run a Rails Sqlite app in a lot of different production environments since no external services are needed with Sqlite stacks like https://github.com/oldmoe/litestack. Lower operational complexityWith Litestack, people will be able to deploy a full Rails stack without having to run additional servers like Redis, Postgres, or MySQL. Easier for beginnersThe choice of Sqlite for dev and test was smart for Rails, it makes workstation setup much easier. The problem is when a person new to web development is asked to deploy to production, they suddenly have to change their RDBMS to Postgres or MySQL. In practice, it's a bad idea to dev on one database engine like Sqlite, then deploy to a different engine like Postgres. I see lots of hosting providers warning Rails developers that it's a bad idea to dev in Sqlite, then deploy to their production RDBMS. Taking Sqlite seriously for production deployments stands to make Rails even easier for people just starting out and more accessible since hosting costs will be incredibly low. |
|
Looks like DHH read this and decided to go all-in on Sqlite for ONCE: https://twitter.com/dhh/status/1705271515997143168 😆 |
skipkayhil
left a comment
There was a problem hiding this comment.
Looks great! Just one question and one test suggestion
There was a problem hiding this comment.
For future readers: this feature is available in 3.7.0 and Active Record only supports 3.8.0+:
activerecord/test/cases/adapters/sqlite3/sqlite3_adapter_test.rb
Outdated
Show resolved
Hide resolved
activerecord/lib/active_record/connection_adapters/sqlite3_adapter.rb
Outdated
Show resolved
Hide resolved
28d59b3 to
a463b49
Compare
|
I can see a significant improvement in the running time for the sqlite3 test suite before: after: |
For Rails applications, the Write-Ahead-Log in normal syncing mode with a capped journal size, a healthy shared memory buffer and a shared cache will perform, on average, 2× better.
a463b49 to
c6d7ffc
Compare
* rails-7.1: Rails 7.1 no longer populates redirect body (rails/rails#44554). Calling silence on ActiveSupport::Deprecation is deprecated and will be removed from Rails (use Rails.application.deprecators.silence instead) Deprecator setting has been deprecated. run db:drop, db:create and db:migrate in a separate commands (probably due to rails/rails#49349) Override and revert rails/rails#46699 for now, move test database from /storage back to /db Rails 7.1 replaces config.cache_classes with config.enable_reloading in template environment/test.rb Add Rails 7.1 test gem file. to_default_s is deprecated and will be removed from Rails 7.2 (use to_s instead) ActionView::OutputBuffer refactored by rails/rails#45614 (Rails 7.1) See rails/rails#36020
* Ignore "test.sqlite3-shm", "test.sqlite3-wal" files Generated by rails/rails#49349 * Fix warning: `*' interpreted as argument prefix ``` app/models/concerns/turbo/broadcastable.rb:297: warning: `*' interpreted as argument prefix ``` * Fix warning: assigned but unused variable - board ``` test/streams/broadcastable_test.rb:304: warning: assigned but unused variable - board ```
* Ignore "test.sqlite3-shm", "test.sqlite3-wal" files Generated by rails/rails#49349 * Fix warning: `*' interpreted as argument prefix ``` app/models/concerns/turbo/broadcastable.rb:297: warning: `*' interpreted as argument prefix ``` * Fix warning: assigned but unused variable - board ``` test/streams/broadcastable_test.rb:304: warning: assigned but unused variable - board ```
…formance (#5150) # Description Add some additional pragma statement that should improve performance with SQLite. This configuration is the same one used by Rails and added in rails/rails#49349 **Type of change** - Improvement (change adding some improvement to an existing functionality) **How Has This Been Tested** - [x] Manually tested and added some new tests checking that pragma statements are correctly configured in the database. **Checklist** - I added relevant documentation - follows the style guidelines of this project - I did a self-review of my code - I made corresponding changes to the documentation - I confirm My changes generate no new warnings - I have added tests that prove my fix is effective or that my feature works - I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)
…hancedsqlite3-adapter Most of the enhancements provided by activerecord-enhancedsqlite3-adapter have been upstreamed into Rails, including: * Add support for generated columns in SQLite3 adapter rails/rails#49346 * Add SQLite3 support for supports_insert_returning? rails/rails#49290 * Allow overriding SQLite defaults from database.yml rails/rails#50460 * Performance tune the SQLite3 adapter connection configuration rails/rails#49349 * Allow SQLite3 busy_handler to be configured with simple max number of retries rails/rails#49352 One enhancement Joy still needs is a way to load SQLite3 extensions. There is a sqlpkg gem that would provide this, which includes generating an initializer that overrides the Rails SQLite3 adapter. Instead, I‘ve just adapted the enhancedsqlite3-adapter code and droppped into the lib/ directory. The code here also knows how to load an extension by gem name which is makes it useful for the sqlite_ulid gem currently in use.
…hancedsqlite3-adapter Most of the enhancements provided by activerecord-enhancedsqlite3-adapter have been upstreamed into Rails, including: * Add support for generated columns in SQLite3 adapter rails/rails#49346 * Add SQLite3 support for supports_insert_returning? rails/rails#49290 * Allow overriding SQLite defaults from database.yml rails/rails#50460 * Performance tune the SQLite3 adapter connection configuration rails/rails#49349 * Allow SQLite3 busy_handler to be configured with simple max number of retries rails/rails#49352 One enhancement Joy still needs is a way to load SQLite3 extensions. There is a sqlpkg gem that would provide this, which includes generating an initializer that overrides the Rails SQLite3 adapter. Instead, I‘ve just adapted the enhancedsqlite3-adapter code and droppped into the lib/ directory. The code here also knows how to load an extension by gem name which is makes it useful for the sqlite_ulid gem currently in use.
…hancedsqlite3-adapter Most of the enhancements provided by activerecord-enhancedsqlite3-adapter have been upstreamed into Rails, including: * Add support for generated columns in SQLite3 adapter rails/rails#49346 * Add SQLite3 support for supports_insert_returning? rails/rails#49290 * Allow overriding SQLite defaults from database.yml rails/rails#50460 * Performance tune the SQLite3 adapter connection configuration rails/rails#49349 * Allow SQLite3 busy_handler to be configured with simple max number of retries rails/rails#49352 One enhancement Joy still needs is a way to load SQLite3 extensions. There is a sqlpkg gem that would provide this, which includes generating an initializer that overrides the Rails SQLite3 adapter. Instead, I‘ve just adapted the enhancedsqlite3-adapter code and droppped into the lib/ directory. The code here also knows how to load an extension by gem name which is makes it useful for the sqlite_ulid gem currently in use.
…formance (#5150) # Description Add some additional pragma statement that should improve performance with SQLite. This configuration is the same one used by Rails and added in rails/rails#49349 **Type of change** - Improvement (change adding some improvement to an existing functionality) **How Has This Been Tested** - [x] Manually tested and added some new tests checking that pragma statements are correctly configured in the database. **Checklist** - I added relevant documentation - follows the style guidelines of this project - I did a self-review of my code - I made corresponding changes to the documentation - I confirm My changes generate no new warnings - I have added tests that prove my fix is effective or that my feature works - I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)
For Rails applications, the Write-Ahead-Log in normal syncing mode with a capped journal size, a healthy shared memory buffer and a shared cache will perform, on average, 2× better.
Motivation / Background
Rails applications using SQLite are poorly tuned for the web application context in which the database is being run. Because of its commitment to backwards-compatibility and its original use-case for embedded systems, the default configuration of SQLite today is not well-suited for web applications. Luckily, through its
PRAGMAstatements, SQLite is easy to configure.As has been detailed in academic literature, real-world developer experience, and my own writing, SQLite databases can be easily optimized to achieve at least a 2× performance increase.
This pull request updates the default configuration of Rails' SQLite connections to better tune them to the context of modern day Rails applications.
Detail
The first and most important pragma to understand and tune is the journal_mode pragma. Since version 3.7.0 (2010-07-21) SQLite has offered a Write-Ahead log to support the atomic transactions. This is the recommended journal mode for web applications, since, as the SQLite docs lay out:
Once an application is using the Write-Ahead log, it also makes sense to relax the synchronization rhythm:
The
NORMALsync mode means that SQLite will flush to disk less often than after every single write. SQLite has its own algorithm for determining the “most critical moments” to write to disk, where it syncs everywal_autocheckpointpages (which defaults to 1000). So, if/when thewal_autocheckpointpragma is changed,NORMALmode syncs would occur after that many pages are written. So, we trade an aggressive approach to durability for speed. However, SQLite does a lot to mitigate the reduction in durability, and it is honestly an extreme edge-case. In fact, SQLite ensures that any potential data loss could only happen with OS or filesystem failure; any process crash won’t affect data durability. So, we are optimizing for the 99% case and not the 1% case, which I think is appropriate for a Rails application.The
journal_size_limitpragma tells SQLite how much of the write-ahead log data (in our case) to keep in the on-disk file. The default of -1 means that there is no limit, so this disk file will grow in size indefinitely. This is not what we want. Anyone who has experienced app downtime because log files filled up your disk space no that unlimited file size is just a massive headache waiting to happen. We need to ensure that the file size is capped at an appropriate size.Next up, we have the abbreviated pragma
mmap_size. This setting controls the “the maximum number of bytes of the database file that will be accessed using memory-mapped I/O.” This is a mouth-full, but the gist is that when we enable memory-mapped I/O, we are allowing SQLite to share data among multiple processes. The memory map plays a similar role to Postgres’ buffer pool, so instead of disabling it, we should set it to the same safe as the default Postgres buffer pool—128MB.Finally, the
cache_sizepragma sets the “maximum number of database disk pages that SQLite will hold in memory at once per open database file.” The default value of -2000 is a negative number, which SQLite interprets as a byte limit. If we use a positive number, SQLite will interpret this as a page limit. The default limit is ~2MB (2,048,000 bytes) and is independent of the number of pages. We want to ensure that we have a large cache and that it doesn’t split across pages, so let’s use a positive number to set the cache limit to a page number. I recommend 2,000 pages as the cache_size, which, with the default page size of 4,096 bytes, means that the cache limit is ~8MB (8,192,000 bytes).Checklist
Before submitting the PR make sure the following are checked:
[Fix #issue-number]