Friday, August 29, 2008

Obsessing over URLs


One of the most important components of a website is URL design. I have obsessed over URLs since the beginning of time, and REST caused me to focus on them even more. Here's a classic example of a clean URL:
www.strictlyuntyped.com/articles/1/comments/1
This seems quite ordinary by today's standards. But admit it, a few years ago we web developers were getting really excited about this stuff.


What was the original problem?

Let's start from the beginning. The 'dirty' version of the above URL:
www.strictlyuntyped.com/read?article_id=1&comment_id=1
One of the most notable problems with this type of URL is that the user sees it. The URL is an important piece of the user experience, since (1) it is displayed at the top of the browser window, (2) users determine where they are on a site by the URL, and (3) they type in and share URLs, so they must be 'readable'. A dirty URL tells the user essentially nothing about the structure or hierarchy of your site.

The second problem with dirty URLs is developer-facing. The application code pretty much consists of a function that takes in a hash. A Ruby analogy would be the following method:

def read(options = {})
The method has zero self-documentation, and a comment would be required to determine what options it expects. Our world here is essentially a one dimensional array of methods that take a hash.

As a comparison, here is what I believe the Rails analogy of the aforementioned clean URL would be:

Article.find(article_id).comments.find(comment_id)
The idea of introducing structure into a line of characters is a simple, yet profound, concept. It seems to have become a requirement for any 'Web 2.0' site.
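One way to picture that structure: the path alternates between a resource name and an id, so it reads as an ordered chain of lookups. Here is a minimal sketch in plain Ruby. The helper resource_chain is a hypothetical name for illustration and is not part of Rails or its router:

```ruby
# Hypothetical helper (not a Rails API): split a clean URL path into
# [resource, id] pairs, in the order they appear.
def resource_chain(path)
  segments = path.split('/').reject(&:empty?)
  segments.each_slice(2).map do |resource, id|
    # A trailing collection such as /articles/1/comments has no id.
    [resource, id && Integer(id)]
  end
end

chain = resource_chain('/articles/1/comments/1')
chain.each { |resource, id| puts "#{resource} -> #{id.inspect}" }
```

Each pair corresponds to one step in the Article.find(...).comments.find(...) chain above, which is exactly the information the query-string version hides.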


Don't go overboard with clean RESTful URLs

Clean URLs and REST work incredibly well for content management systems such as blogs. Not coincidentally, Rails works very well with these types of systems. In other cases, I see myself, and others, forcing a square peg into a round hole when it comes to URL design. Here are two mistakes that I, and probably others, have made.


Mistake #1: Overcomplicated Hierarchies

There exists an implicit hierarchy to clean URL design; an article having comments is a good example. On the other hand, there is no reason to create deeply nested hierarchies for the sake of representing the real world. The { buckblogs :here } blog says not to nest resources more than one level deep from a programmatic point of view. I completely agree, and would add a reason of my own from an information design perspective. If you were to make a website about animals, would the URLs represent biological classifications? Here is what the URL for a lion would look like:
/kingdom/animalia/phylum/chordata/class/mammalia/order/carnivora/family/felidae/genus/panthera/species/lion
The point is, flatten your hierarchy and make Tufte proud!


Mistake #2: Unordered Parameters

Oftentimes, the input required for a URL does not fit in a hierarchy. A great analogy is a Ruby method. Think of how many methods in Ruby on Rails use the following format:
def some_method(foo, bar, options = {})
This method makes the options optional, and they can be passed in without any specific order. Why not leverage these same benefits with a URL? With that in mind, there is nothing wrong with the following URL:
www.strictlyuntyped.com/articles/1/comments?order=date&rating=4
How else would Google Maps make locations sharable?
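To make the method analogy concrete, here is a small sketch using Ruby's standard uri library. The path plays the role of the positional arguments and the query string plays the role of the options hash. The helper split_url is a hypothetical name, and the http:// scheme is added only so that the example URL parses:

```ruby
require 'uri'

# Hypothetical helper: split a URL into its hierarchical part (the path)
# and its unordered options (the query string, as a Hash of strings).
def split_url(url)
  uri = URI.parse(url)
  options = URI.decode_www_form(uri.query || '').to_h
  [uri.path, options]
end

path, options = split_url('http://www.strictlyuntyped.com/articles/1/comments?order=date&rating=4')
_, reordered  = split_url('http://www.strictlyuntyped.com/articles/1/comments?rating=4&order=date')

puts path
puts options.inspect
puts options == reordered   # parameter order does not matter
```

Note that every value arrives as a string, so the receiving code still has to convert and validate options such as rating.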

To wrap up my ramblings, don't get stuck with your project obsessing over how perfect your URLs are. Keep the hierarchy simple, and use a hash of parameters when you need it.

Thursday, August 14, 2008

ActiveRecord: Can haz namespaces?


There are plenty of blog postings from has_many :through, err the blog and Pratik's blog describing how to completely avoid namespaced models. I unfortunately discovered these sites only after struggling with namespaces myself.

If namespaces were completely unsupported by ActiveRecord, I would leave it at that. Instead, namespaces are moderately supported, but they lead first-timers to the spooky edges of the Rails framework.

After giving it some thought, I determined that the best way to describe the state of namespacing in ActiveRecord is to play the novice and pretend to use namespaces for the first time!



The Requirements

I am an intern working for a company that has built their entire website with Ruby on Rails. My boss wants me to add statistics tracking similar to Google Analytics, and he wants it all to go into the Statistics:: namespace. The initial requirements include tracking the milliseconds that each HTTP request took to complete, how long users remained logged into their session, and which session each request belonged to.

This sounds like a really easy project, and I want to impress my boss by using all the best practices and conventions of Rails.



Table Name Surprise

I will start with a model that stores the length of each request. It makes sense to use a Rails script to generate Statistics::Request:
ruby script/generate model Statistics::Request milliseconds:integer
  create  app/models/statistics
  create  test/unit/statistics
  create  test/fixtures/statistics
  create  app/models/statistics/request.rb
  create  test/unit/statistics/request_test.rb
  create  test/fixtures/statistics_requests.yml
  create  db/migrate
  create  db/migrate/20080814034925_create_statistics_requests.rb
The generator conveniently created a statistics folder inside both app/models and test/unit. I am a little bit confused as to why the folder test/fixtures/statistics was created, when the actual fixture was placed in test/fixtures/statistics_requests.yml, but I will figure that out later.

The table name in the migration is statistics_requests, so I immediately assume that table names include the namespace. After migrating the database, I fire up the console to try out the model:
>> Statistics::Request.create(:milliseconds => 10)
ActiveRecord::StatementInvalid: Could not find table 'requests'
How strange, I thought the table's name is statistics_requests. Just to make sure:
>> Statistics::Request.table_name
=> "requests"
Is this a bug in Rails? Apparently not. I guess it's time to do some dirty monkey patching:


Good, now the model can find its table. Time to write some fixtures and tests.
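To see why the default lookup landed on requests while the generator wrote statistics_requests, here is a toy model of the two naming rules in plain Ruby. This is not ActiveRecord's real code: both method names are hypothetical, and pluralization is reduced to appending "s", which happens to be enough for this class name:

```ruby
# Toy illustration only; NOT how ActiveRecord is implemented.

# Rule the model appears to follow: drop the namespace, pluralize the rest.
def table_name_without_namespace(class_name)
  class_name.split('::').last.downcase + 's'
end

# Rule the generated migration appears to follow: keep the namespace as a prefix.
def table_name_with_namespace(class_name)
  class_name.split('::').map(&:downcase).join('_') + 's'
end

puts table_name_without_namespace('Statistics::Request')
puts table_name_with_namespace('Statistics::Request')
```

The two rules only disagree for namespaced classes, which is why the mismatch never shows up in a model that lives at the top level.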


Tests are looking good

Feeling confident after getting my namespaced model to work, I want to make sure the new test runs:
rake test:units
... "test/unit/statistics/request_test.rb"
1 tests, 1 assertions, 0 failures, 0 errors
Sweet - It looks like tests are discovered recursively in the test/unit folder.


Associations

I need to add the Statistics::Session model, and also update Statistics::Request to belong to a session.

The migration:



The model:


I use the console to test out the association:
>> session = Statistics::Session.create(:start => 4.hours.ago, :end => Time.now)
=> #<Statistics::Session id: 1, start: "2008-08-14 00:42:00", end: "2008-08-14 04:42:00">
>> request = Statistics::Request.create(:milliseconds => 10, :session => session)
ArgumentError: Statistics is not missing constant Session!
This is a bizarre error. It turns out that Rails can't find Statistics::Session from Statistics::Request. I finally figured out that I need to do this:


It would make more sense if I only had to specify :class_name if it existed in a different namespace.
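As a rough picture of what goes wrong, consider a resolver that turns an association name into a class without knowing which namespace the owning model lives in. The sketch below is plain Ruby with hypothetical method names. It is a deliberate simplification and not how Rails actually resolves or autoloads constants:

```ruby
module Statistics
  class Session; end
  class Request; end
end

# Hypothetical resolver: look the constant up from the top level,
# ignoring the namespace of the model that declared the association.
def resolve_from_top_level(association_name)
  Object.const_get(association_name.to_s.capitalize)
rescue NameError
  nil # there is no top-level ::Session
end

# Resolver that is handed an explicit, fully qualified class name,
# which is the role :class_name plays in the association.
def resolve_class_name(class_name)
  Object.const_get(class_name)
end

puts resolve_from_top_level(:session).inspect
puts resolve_class_name('Statistics::Session').inspect
```

Spelling out the fully qualified name removes the guesswork, at the cost of repeating the namespace in every association.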


Namespaced Fixtures = WTF?

Development and testing will be easier if I generate some fixture data. Remember when the model generator script created an unused folder?
create  test/fixtures/statistics
...
create  test/fixtures/statistics_requests.yml
Based on the other conventions, it seems like this file belongs in test/fixtures/statistics/requests.yml. I'll move it there, and load up some fake data:
rake db:fixtures:load
It will be fun to play around with all that fixture data:
>> Statistics::Request.count
=> 0
How bizarre, the data did not even get loaded. I soon find out that unlike unit tests, fixture files are not recursively loaded. The file can't be moved after all.
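The behaviour is consistent with a loader that only looks at the top level of test/fixtures. The following sketch assumes exactly that (it is a guess at the cause, not a quote from the Rails source) and shows the difference between a flat and a recursive glob in a throwaway directory:

```ruby
require 'tmpdir'
require 'fileutils'

# Build a temporary fixtures layout and return what each glob pattern finds,
# as paths relative to the fixtures directory.
def fixture_files_found
  Dir.mktmpdir do |root|
    fixtures = File.join(root, 'test', 'fixtures')
    FileUtils.mkdir_p(File.join(fixtures, 'statistics'))
    FileUtils.touch(File.join(fixtures, 'statistics_sessions.yml'))
    FileUtils.touch(File.join(fixtures, 'statistics', 'requests.yml'))

    relative = lambda do |paths|
      paths.map { |path| path.sub(fixtures + '/', '') }.sort
    end

    top_level = relative.call(Dir.glob(File.join(fixtures, '*.yml')))
    recursive = relative.call(Dir.glob(File.join(fixtures, '**', '*.yml')))
    [top_level, recursive]
  end
end

top_level, recursive = fixture_files_found
puts "flat glob:      #{top_level.inspect}"
puts "recursive glob: #{recursive.inspect}"
```

Only the recursive pattern sees statistics/requests.yml, which matches the empty table above.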

I need to get the request -> session association working in the fixtures too. Let me edit statistics_sessions.yml and statistics_requests.yml:


Unfortunately, this does not work:
rake db:fixtures:load
rake aborted!
SQLite3::SQLException: table statistics_requests has no column named session...
For some reason, the label referencing is not working with my namespaced models. I guess I need to hand hold Rails on this one:



Observers - OMG they work!

Feeling a little down after my battle with fixtures, I decide to make a request observer that logs whenever we get a request:
ruby script/generate observer Statistics::Request
  exists  app/models/statistics
  exists  test/unit/statistics
  create  app/models/statistics/request_observer.rb
  create  test/unit/statistics/request_observer_test.rb
Looking good so far. Let me implement the class:


I also won't forget to add the following line to environment.rb:


And voilà, it all works:
>> Statistics::Request.create(:milliseconds => 10)
We got a hit!!!

In Summary

Ok, I'm done playing stupid. My final word is that namespaced models can be made to work, but there are some rough edges. Some final tips for you:
  • Don't have conflicting namespaces and class names. For example, I should not introduce a class named Statistics in the above example. Otherwise, I will end up seeing the following message: "warning: toplevel constant Statistics referenced by Statistics::Statistics".
  • Even when referencing other classes in the same namespace, it is best to always include the namespace prefix. Otherwise, you will run into issues with conflicting class names across different namespaces and Rails' automagical class loading.


Some things that Rails should do:
  • All APIs in Rails that support defining classes by Symbols should also support Strings. I was fortunate that I could pass in 'statistics/request_observer' to active_record.observers. However, other methods such as ActiveRecord::Observer.observe do not support strings.
  • I believe that there is enough reason to support prefixing database table names with the namespace. This is as simple as introducing a new option to ActiveRecord called 'include_namespace_in_table_name'.
  • Fix the fixtures.

Saturday, August 9, 2008

Googtaculous, now with SSL support


This is just a short note. Google recently added SSL support to their AJAX Libraries API. Googtaculous (http://github.com/matthuhiggins/googtaculous/tree/master) will now automatically use https if the incoming request comes in over SSL.

Browsers show warnings if a website includes both http and https requests. A few readers asked for this addition.

Tuesday, August 5, 2008

In place Array methods for Prototype


Unlike Ruby, Prototype does not offer in place iterators for Array methods such as compact, map and reject. I started down the road of implementing these methods, and discovered significant improvements with large arrays. Since JavaScript does not support the "!" character in function names, I decided to name the in place versions with underscore prefixes.
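For reference, this is the Ruby behaviour being imitated. The bang methods modify the receiver instead of building a new array, and compact! and reject! return nil when they change nothing:

```ruby
numbers = [1, nil, 2, nil, 3]

# compact! removes nils from the receiver and returns that same object.
returned = numbers.compact!
same_object = returned.equal?(numbers)

numbers.map! { |n| n * 10 }      # numbers is rewritten in place
numbers.reject! { |n| n > 20 }   # elements are removed in place

# The non-bang version leaves the receiver alone and allocates a new array.
copy = numbers.map { |n| n + 1 }

# Nothing to remove, so compact! returns nil rather than the array.
unchanged = [1, 2].compact!

puts numbers.inspect
puts copy.inspect
```

The nil return value is a Ruby quirk that makes chaining bang methods risky, which is one reason to return the array unconditionally in a port.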

map

_map is simple to implement. We replace the value in each index with the result of the iterator function:


In this case, as in the remaining ones, the array itself (this) is returned so that the methods can be chained.

reject

For in place _reject, I scan the array once from left to right. Elements that are not rejected are placed in the array from left to right, and finally the length is truncated. I believe it is important to maintain the original order of the elements:
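The single-pass idea can be sketched as follows, rendered in Ruby rather than JavaScript purely for illustration. The name reject_in_place is hypothetical, and Ruby itself already ships reject!, so this only demonstrates the algorithm described above:

```ruby
# Illustrative sketch of the single-pass, order-preserving in-place reject.
def reject_in_place(array)
  write = 0
  array.each_index do |read|
    value = array[read]
    next if yield(value)   # rejected: skip it, leaving its slot to be overwritten
    array[write] = value   # kept: move it to the next free slot on the left
    write += 1
  end
  array.slice!(write..-1)  # truncate the leftover tail
  array                    # return the same array so calls can be chained
end

numbers = [1, 2, 3, 4, 5, 6]
result = reject_in_place(numbers) { |n| n.even? }
puts numbers.inspect
```

Because the write position never passes the read position, no element is overwritten before it has been examined, and no second array is needed.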


compact

_compact is very similar to _reject:


The above implementation can be shortened by using _reject, but it significantly reduces performance. For completeness, that might be written as:


Results

These are results in Firefox 3, using an array of 100,000 numbers and null values. Changing the size did not seem to affect the relative performance.
Method     Prototype Time   In Place Time
map        159ms            42ms
reject     158ms            45ms
compact    161ms            13ms

One should probably do a more comprehensive test, but these results are enough to show that performance gains can be made with my faster versions. In addition, less memory is being used.