skip to main |
skip to sidebar
Ruby on Rails provides some handy methods for using SQL aggregate functions:- count
- average
- minimum
- maximum
- sum
While this is a good start, MySql has additional functions that are not included in this list. I will show an example using group_concat.
group_concat
If you recall from my post about on duplicate key update, we had a Post model that included a virtual attribute called tag_list:
This code retrieves all tags for the post, creates Tag objects, maps that to an array of strings, and joins them together. While this code looks nifty, it is performing a hefty amount of work for the task.
Thankfully, MySql has a group_concat function, and Rails exposes a calculate method to perform arbitrary aggregate functions by name. Without further ado, here is how we can use group_concat:
The separator defaults to ','. If you want to change it, sneak in a separator command:
group_concat will be significantly faster than the original method.
Another group of underused aggregate functions are bit_and, bit_or, and bit_xor. If you have a many-many association between a model and an enumerated set of 64 or less items, this information can be stored in a single BIGINT field on the model.
This wraps up my unglamorous MySql series. If I saved one clock cycle on one database server, then I claim success.
This is a follow up to last week's MySql Lovin Part 1.
You will eventually want to use MySql specific data types. While arbitrary SQL statements can be executed in migrations, an alternative is to extend the MySql adapter to support new types of columns.
Two examples of columns I use are:
- BIGINT. Rails 2.1 attempts to use the :limit option to predict you want a BIGINT, but there seems to be confusion in Rails source about what the N means when you write BIGINT(N). Nevertheless, BIGINT always has the same maximum and minimum limit, and if I want to use one, I like to be explicit.
- Primary Key without auto increment. Sometimes I have enums defined in Ruby code, and want to explicitly define their value when inserting into the database, rather than having their id value automatically incremented by the database.
Let's add support for these columns, so that they can be used with create_table. First, have a look at native_database_types, defined in ActiveRecord::ConnectionAdapters::MysqlAdapter. Rails provides no facility to register more column types with an adapter, but that does not prevent merging them into the column type hash:
With the new types added, they can be used in migrations:
Shorthand Columns
To make these migrations a bit more appealing, extend ActiveRecord::ConnectionAdapters::TableDefinition to support shorthanded column names:
Refactor the migration:
Adding Native Support to Rails
It would be nifty if Rails allowed you to perform the above in an initializer (very similar to how you register mime_types). For example:
I might follow this up with a Rails patch, but need to think about it more. Also, the weather is too nice for me to continue blogging.
By coding through the ActiveRecord abstraction, we lose close ties to SQL. This is a major selling point for Rails, but it should not be why you are sold on Rails.
Avoiding vendor specific database features is silly, and doing so does not make you a better Rails developer. Database neutrality has never helped me because I have yet to swap database vendors. If I did need to swap, it would be due to a vendor specific feature. ActiveRecord is great for the majority of operations, but you should not live exclusively within its constraints. And so I begin the MySql Lovin' series...
Perfect Rails Code
Start with a simple data model for a blog. A post has text, and can have zero or more tags. Here is what the migration looks like:
The tags table also keeps track of the total posts per tag.
The models for the blog might look like this:
Notice that the Post model has a virtual tag_list attribute. This automatically splits a comma delimited list of tags, creates new tags if necessary, and generates the associations between posts and tags.
Too Many Queries
Fill out the form:

Now check the query log to see what queries hit the database:
16 queries. Gross, let's try to do better.
On Duplicate Key Update
MySql's INSERT ... ON DUPLICATE KEY UPDATE is a perfect solution to reduce the number of queries. It can be used to:
- Insert non-existing tags
- Update the counter cache for posts per tag.
- Remove the need to instantiate a Tag object for each tag.
Let's give it a try:
That SQL code sure looks ugly. It probably belongs in the Tag class, and can be generated automatically from the model's attributes. What if I made a plugin that added create_or_update to ActiveRecord::Base?
More importantly, the number of queries have decreased:
This is a good start. The number of queries has decreased by a 2x factor, and Tag objects are no longer being created.
By alas, what is that query on line 4 for? It looks like Rails runs this query after assigning to tag_ids. This can be avoided by directly creating the entries for post_tags, but I can accept this annoyance for today.