For my application I have some data display requirements that will necessitate querying some statistics about products. In this particular case I want to know how "popular" a product is in comparison to all other products. One way I can do this is to determine some usage quartiles, that is, the three values that split the data into four equal parts. I can then compare the usage count for a single product to see which quartile it fits into, giving them a ranking name from "none", "lowest", "low", "medium" to "high".
Here are some examples of the user interface to represent this:
Postgres has a function that can calculate these numbers for me called PERCENTILE_CONT
. An example of what this looks like is as follows:
SELECT
PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY catalogue_count) AS low,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY catalogue_count) AS medium
PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY catalogue_count) AS high
FROM
products
Turning this into a query in Ruby therefore requires using custom SQL, so I could add something like this to my Product
class:
app/models/product.rb
class Product < ApplicationRecord
...
def catalogue_usage_quartiles
select(
'PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY catalogue_count) AS low,' \
'PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY catalogue_count) AS medium,' \
'PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY catalogue_count) AS high'
)
.take
end
...
end
And then I could use those quartiles to generate my named rankings, something like the following:
app/models/product.rb
class Product < ApplicationRecord
...
def catalogue_usage_ranking
case catalogue_count
when 0
'none'
when 1..catalogue_usage_quartiles.low
'lowest'
when (catalogue_usage_quartiles.low + 1)..catalogue_usage_quartiles.medium
'low'
when (catalogue_usage_quartiles.medium + 1)..catalogue_usage_quartiles.high
'medium'
else
'high'
end
end
...
end
One downside to this approach is that now there is hard coded SQL in my Product
model that isn't really related to a Product's responsibilities. Determining the quartiles should be the responsibility of it's own class. One way to do this (and the subject of this blog) is the use of Query Objects. The folks over at Thoughtbot have an excellent article on Query Objects here. I encourage anyone to have a read of that.
So how can I use a Query Object here? Here is what it might look like:
app/models/queries/catalogue_usage_quartiles.rb
module Queries
class CatalogueUsageQuartiles
class << self
def call(scope: nil, options: {})
new(scope: scope).call(options: options)
end
end
def initialize(scope: nil)
super(scope: scope || Product.all)
end
def call(*)
@scope
.select(
"PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY #{sum_of_catalogues}) AS low," \
"PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY #{sum_of_catalogues}) AS medium," \
"PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY #{sum_of_catalogues}) AS high"
)
.find_by(
"(#{sum_of_catalogues}) > 0"
)
end
def sum_of_catalogues
'catalogue_count + recipes_count + inventory_stock_levels_count + inventory_derived_period_balances_count'
end
end
end
What is this doing then?
First of all there is a class method with a name of call
. I can pass in a custom scope to this, maybe I want only the products that have certain criteria met. I could potentially be passing in other options for my query too. I do not need to in this example however. The class method creates a new instance of my query object and executes the call
method on it.
The instance call
method is where the work is done. All my custom query logic lives here, and it should return either an instance of a record or another ActiveRecord relation so that these queries can be chained.
I can now remove the quartile calculations from the Product
class and do not even need to change my catalogue_usage_ranking
method.
app/models/product.rb
class Product < ApplicationRecord
...
def catalogue_usage_ranking
case catalogue_count
when 0
'none'
when 1..catalogue_usage_quartiles.low
'lowest'
when (catalogue_usage_quartiles.low + 1)..catalogue_usage_quartiles.medium
'low'
when (catalogue_usage_quartiles.medium + 1)..catalogue_usage_quartiles.high
'medium'
else
'high'
end
end
private
def catalogue_usage_quartiles
@catalogue_quartiles ||= Queries::CatalogueUsageQuartiles.call
end
...
end
So what have I gained? Well now I can hide away the complexity of these quartile calculations into a class that has executing this as its only responsibility. The Product
class is simplified as it has less code in it. I can also further DRY up these rankings by converting my catalogue_usage_ranking
method into a usage_ranking
method that takes the quartiles to use and a count so I can easily expand this to do my transaction and settings usage rankings I need. Also, if there is a need to calculate these quartile values anywhere else in my application (maybe I want to display them somewhere?) then I do not need to get the Product
class involved at all.