Friday, July 4, 2008

MySql Lovin' Part 1: On Duplicate Key Update


By coding through the ActiveRecord abstraction, we lose close ties to SQL. This is a major selling point for Rails, but it should not be why you are sold on Rails.

Avoiding vendor specific database features is silly, and doing so does not make you a better Rails developer. Database neutrality has never helped me because I have yet to swap database vendors. If I did need to swap, it would be due to a vendor specific feature. ActiveRecord is great for the majority of operations, but you should not live exclusively within its constraints. And so I begin the MySql Lovin' series...


Perfect Rails Code
Start with a simple data model for a blog. A post has text, and can have zero or more tags. Here is what the migration looks like:


The tags table also keeps track of the total posts per tag.

The models for the blog might look like this:



Notice that the Post model has a virtual tag_list attribute. This automatically splits a comma delimited list of tags, creates new tags if necessary, and generates the associations between posts and tags.

Too Many Queries
Fill out the form:

Form

Now check the query log to see what queries hit the database:



16 queries. Gross, let's try to do better.

On Duplicate Key Update
MySql's INSERT ... ON DUPLICATE KEY UPDATE is a perfect solution to reduce the number of queries. It can be used to:
  • Insert non-existing tags
  • Update the counter cache for posts per tag.
  • Remove the need to instantiate a Tag object for each tag.

Let's give it a try:



That SQL code sure looks ugly. It probably belongs in the Tag class, and can be generated automatically from the model's attributes. What if I made a plugin that added create_or_update to ActiveRecord::Base?

More importantly, the number of queries have decreased:



This is a good start. The number of queries has decreased by a 2x factor, and Tag objects are no longer being created.

By alas, what is that query on line 4 for? It looks like Rails runs this query after assigning to tag_ids. This can be avoided by directly creating the entries for post_tags, but I can accept this annoyance for today.


4 comments:

pete said...

If you wanted to do something similar without writing all the code yourself, you could try using the import method of the ar-extensions gem. It might not do all you did here, but it has its charms.

Matthew Higgins said...

@pete - Thanks. I ended up finding ar-extensions from a post on igvita.com (Link). It has not been updated for about one year, which is a bit concerning given that it was written for rails 1.2.

pete said...

@matthew higgins:

The author of ar-extensions warns that some stuff is broken with rails 2.x, but strictly using import is fine. I've been using it in a few production apps with rails 2.0.2.

3ebsocomputer said...

i want use with ar-extention ON DUPLICATE KEY UPDATE frequency=frequency+1;

how can i use it