Connecting to Multiple Databases

Part 7 of building a Rails 7 application

This application needs to retrieve and manipulate data from an existing application. There are a couple ways of going about this.

The first would be to consume an API provided by the existing application (such as JSONAPI or GraphQL). This has the advantage of completely separating the responsibilities of each application. The potential downside is performance. My application intends to read and update a large number of records (millions), this may not be fast enough via an API.

The second way is by having my application connect directly to the other application's database. This assumes that my application can be deployed on the same network. It has performance benefits but the downside is that any business logic that is being applied to data by the existing application will not be reflected in my application (unless I duplicate it).

As the subject of this blog indicates I will be using the second way as I feel I will need the best possible performance in order for the application to be useful. Later in this blog series I will do a refactor spike to test using the existing application's API instead for comparison purposes.

In order to connect to a second database I need to modify my database.yml as follows:

development:
  primary:
    <<: *default
    database: catalogue_cleanser_development
  external:
    <<: *default
    database: ***************
    database_tasks: false

The database for my application is now referred to as primary and the external application's database can be referred to as external. This is important in a moment. Specifying database_tasks as false tells Rails that I do not wish this database to be affected by any tasks such as db:migrate.

To create models in my application that can use this external database I need to create a new abstract base class:

app\models\external\application_record.rb

module External
  class ApplicationRecord < ActiveRecord::Base
    self.abstract_class = true

    connects_to database: { writing: :external, reading: :external }
  end
end

The {writing: :external, reading: :external} is where I refer to the external database configuration that is in database.yml.

This now allows me to create classes that can access tables in the external database, like this:

app\models\external\catalogued_product.rb

module External
  # Connects to external Goods::CataloguedProduct class
  class CataloguedProduct < External::ApplicationRecord
    self.table_name = :goods_catalogued_products
  end
end

Let me test whether that works as intended.

3.0.2 :001 > External::CataloguedProduct.count
  External::CataloguedProduct Count (4645.2ms)  SELECT COUNT(*) FROM "goods_catalogued_products"
 => 11363621                                                                       
3.0.2 :002 >

Excellent, that has connected and is returning data as expected.

I can now implement any other classes needed to retrieve or update data in the external database as needed.

I needed a couple more configuration settings in order to get multiple databases created by GitHub actions so the tests won't fail.

Add a new step to create an external_test database:

.github/workflows/rubyonrails.yml

      - name: Add external database
        run: psql -h localhost -d rails_test -U rails -c 'CREATE DATABASE external_test;'
        env:
          PGPASSWORD: password

Modify the database.yml to allow the database configuration for the external database to be passed in as an environment variable:

test:
  primary:
    <<: *default
    host: <%= ENV['DATABASE_HOST'] || 'localhost' %>
    database: <%= ENV['DATABASE_NAME'] || 'catalogue_cleanser_test' %>
    username: <%= ENV['DATABASE_USER'] || 'postgres' %>
    password: <%= ENV['DATABASE_PASSWORD'] %>
  external:
    <<: *default
    host: <%= ENV['EXTERNAL_DATABASE_HOST'] || 'localhost' %>
    database: <%= ENV['EXTERNAL_DATABASE_NAME'] || 'external_test' %>
    username: <%= ENV['EXTERNAL_DATABASE_USER'] || 'rails' %>
    password: <%= ENV['EXTERNAL_DATABASE_PASSWORD'] || 'password' %>
    database_tasks: false

And then use those new EXTERNAL_xxx environment variables when running the tests.

.github/workflows/rubyonrails.yml

      - name: Run Tests and Upload Code Coverage
        uses: paambaati/codeclimate-action@v3.0.0
        env:
          CC_TEST_REPORTER_ID: ${{ secrets.CC_TEST_REPORTER_ID }}
          EXTERNAL_DATABASE_HOST: localhost
          EXTERNAL_DATABASE_NAME: external_test
          EXTERNAL_DATABASE_USER: rails
          EXTERNAL_DATABASE_PASSWORD: password
        with:
          coverageCommand: bin/rails test:all
          debug: true