chazmeyers.com/blog


Leaky Abstractions.

Posted in RubyOnRails, General Programming Topics by chazmeyers on February 6th, 2007

You ever have a programming “Doh!” moment? I had one the other day. You could say I fell face first into a leaky abstraction.

For those of you who are unfamiliar with the phrase “leaky abstraction”, Joel Spolsky coined it in his classic essay, “The Law of Leaky Abstractions”. Go read it. Just open a new tab. I’ll still be here.


This week at work, I was writing a prototype for an algorithm in Ruby. Once everything was working in memory, I had to see how well things would work when the algorithm iteratively stored its results in a database. I am a lazy man, so I just took my existing classes and made them into ActiveRecords.

Consider the following:

Code (ruby)
class Student < ActiveRecord::Base
  # Counts the days in the range on which the student has no vacation.
  def number_of_days_available_between(start_date, end_date)
    start_date = start_date.at_beginning_of_day
    end_date = end_date.to_end_of_day

    (start_date .. end_date).inject(0) do |sum, date|
      params = { :start => date.at_beginning_of_day, :endd => date.to_end_of_day }
      on_vacation = self.vacations.find( :first, :conditions => ["start_date >= :start AND end_date <= :endd", params ] )
      on_vacation ? sum : sum + 1
    end
  end

  # Remaining clinic days divided by the days the student is available.
  def probability_for_clinic(clinic, start_date, end_date)
    available_days = self.number_of_days_available_between(start_date, end_date)

    # if student has no days left, there’s no chance! return negative infinity
    return -1/0.0 if available_days == 0

    assignment_count = Assignment.count(:conditions => ["student_id = :me AND clinic_detail_id = :clinic", {:me => self.id, :clinic => clinic.id}] )

    clinic_days_left = ( clinic.duty_length - assignment_count ).to_f
    clinic_days_left / available_days
  end
end

students = Student.find(:all)
student_probabilities = students.inject({}) do |hsh, student|
  hsh[student] = student.probability_for_clinic( clinic, start_date, end_date)
  hsh
end

# Sort students from highest probability to lowest.
students = students.sort {|s1, s2| student_probabilities[s2] <=> student_probabilities[s1] }

I started it up and waited for what felt like an eternity. Much frustration and a ^C later, I realized I went wrong somewhere.

That’s when I remembered that ActiveRecord could be asked to write everything it was doing to STDOUT. Boy was I shocked!
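For anyone who hasn't done this before, the usual way is to hand ActiveRecord a logger that writes to STDOUT. This is a minimal sketch, not necessarily the exact line I typed: it assumes ActiveRecord is already loaded (as in a Rails script or console), and the `defined?` guard is only there so the snippet does nothing when it isn't.

```ruby
require 'logger'

# Assumption: ActiveRecord is already loaded by the surrounding script.
# The guard keeps this snippet harmless when run on its own.
if defined?(ActiveRecord::Base)
  # Every SQL statement ActiveRecord issues is now echoed to the terminal.
  ActiveRecord::Base.logger = Logger.new(STDOUT)
end
```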

I restarted the program and was greeted by a sea of “SELECT * FROM vacations” statements. As previously written, my script was generating about 90 selects against the vacations table for each student, one for every day in the date range. With just over 1000 students, that adds up to more than 90,000 queries, which took a ridiculous amount of time. My poor little script was in dire need of caching.

My new version isn’t as pretty to read and it could still use a bit more work, but it is much quicker:

Code (ruby)
students = Student.find(:all)

# Load everything for the date range up front instead of querying per student per day.
args = { :students => students.map { |s| s.id }, :clinic => clinic.id, :start_date => start_date, :end_date => end_date }

sql = "clinic_detail_id = :clinic AND student_id IN (:students) AND day >= :start_date AND day <= :end_date"
cached_assignments = Assignment.find( :all, :conditions => [ sql, args ] )

sql = "student_id IN (:students) AND start_date >= :start_date AND end_date <= :end_date"
cached_vacations = Vacation.find( :all, :conditions => [ sql, args ] )

student_probabilities = students.inject({}) do |hsh, student|
  # find the number of days student will be at school in semester.
  available_days = (start_date .. end_date).inject(0) do |sum, date|
    on_vacation = cached_vacations.detect { |v| v.student_id == student.id && ( v.start_date .. v.end_date ).include?( date ) }
    # always return the running total so inject never sees nil
    on_vacation ? sum : sum + 1
  end

  # if there are no days left, there is no chance.
  if available_days == 0
    hsh[student] = -1.0/0.0
  else
    assignments_served = cached_assignments.select { |a| a.student_id == student.id && ( start_date .. end_date ).include?( a.day ) }.size
    hsh[student] = ( clinic.duty_length - assignments_served ).to_f / available_days
  end
  hsh
end

# Sort students from highest probability to lowest.
s = students.sort { |s1, s2| student_probabilities[s2] <=> student_probabilities[s1] }
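As for the "bit more work": the cached version still scans the whole vacation list for every student and every day. One possible refinement, sketched here in plain Ruby, is to index the cached rows by student once and only look at that student's rows afterwards. The `Vacation` struct, the sample dates, and the `available_days` helper are hypothetical stand-ins for illustration, not the real models.

```ruby
require 'date'

# Hypothetical stand-in for the Vacation model; only the fields used here.
Vacation = Struct.new(:student_id, :start_date, :end_date)

cached_vacations = [
  Vacation.new(1, Date.new(2007, 2, 5), Date.new(2007, 2, 7)),
  Vacation.new(2, Date.new(2007, 2, 6), Date.new(2007, 2, 6))
]

# Index once: student_id => array of that student's vacations.
vacations_by_student = cached_vacations.group_by { |v| v.student_id }

# Counts the days in the range not covered by any of the student's vacations.
def available_days(student_id, start_date, end_date, vacations_by_student)
  mine = vacations_by_student.fetch(student_id, [])
  (start_date..end_date).count do |date|
    mine.none? { |v| (v.start_date..v.end_date).cover?(date) }
  end
end

first_day = Date.new(2007, 2, 1)
last_day  = Date.new(2007, 2, 10)

puts available_days(1, first_day, last_day, vacations_by_student)
puts available_days(3, first_day, last_day, vacations_by_student)
```

The query count stays the same as in the cached version; this only trims the in-memory scanning.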

Do note that if you squint hard enough, these two algorithms look about the same in terms of “big O”.

Sometimes the devil really is in the details. The devil in this case was hiding a constant factor of about 90 database queries per student under an abstraction. Jerk!
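To put a number on it, here is a toy sketch that counts "queries" for the two approaches above. Nothing in it touches a real database: `FakeVacationTable`, its methods, and the sample rows are all made up for illustration, with 1000 students and 90 days chosen to roughly match my situation.

```ruby
# Hypothetical stand-in for a database table that counts the "queries" it answers.
class FakeVacationTable
  attr_reader :query_count

  def initialize(rows)
    @rows = rows # each row: { :student_id => ..., :day => ... }
    @query_count = 0
  end

  # One round trip: is this student on vacation on this day?
  def on_vacation?(student_id, day)
    @query_count += 1
    @rows.any? { |r| r[:student_id] == student_id && r[:day] == day }
  end

  # One round trip that returns every row for the given students.
  def all_for(student_ids)
    @query_count += 1
    @rows.select { |r| student_ids.include?(r[:student_id]) }
  end
end

student_ids = (1..1000).to_a
days = (1..90).to_a
rows = [{ :student_id => 1, :day => 5 }, { :student_id => 2, :day => 7 }]

# Version 1: ask the table once per student per day.
per_day = FakeVacationTable.new(rows)
available_v1 = {}
student_ids.each do |id|
  available_v1[id] = days.count { |d| !per_day.on_vacation?(id, d) }
end

# Version 2: one bulk load, then filter in memory.
bulk = FakeVacationTable.new(rows)
cached = bulk.all_for(student_ids)
available_v2 = {}
student_ids.each do |id|
  my_days = cached.select { |r| r[:student_id] == id }.map { |r| r[:day] }
  available_v2[id] = days.count { |d| !my_days.include?(d) }
end

puts per_day.query_count # 90 days * 1000 students = 90000
puts bulk.query_count    # 1
```

Both versions walk the same students and the same days and arrive at the same answers. The only thing that changes is how many of those steps cross the wire.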

Abstractions lie. Be prepared to deal with it.
Or, in the oft-spoken words of Tim Gunn, “Make it work.”
