Leaky Abstractions.
You ever have a programming “Doh!” moment? I had one the other day. You could say I fell face first into a leaky abstraction.
For those of you who are unfamiliar with the phrase “leaky abstraction”, Joel Spolsky coined it in his classic essay titled “The Law of Leaky Abstractions”. Go read it. Just open a new tab. I’ll still be here.
This week at work, I was writing a prototype for an algorithm in Ruby. Once everything was working in memory, I had to see how well things would work when the algorithm iteratively stored its results into a database. I am a lazy man, so I just took my existing classes and made them into ActiveRecords.
Consider the following:
    class Student < ActiveRecord::Base
      # Counts the days in the range on which the student has no vacation booked.
      # Note: this runs one query against the vacations table for every day.
      def number_of_days_available_between(start_date, end_date)
        start_date = start_date.at_beginning_of_day
        end_date = end_date.end_of_day

        # Walk the range one calendar day at a time.
        count = (start_date.to_date .. end_date.to_date).inject(0) do |sum, date|
          params = { :start => date.at_beginning_of_day, :endd => date.end_of_day }
          sum += 1 unless self.vacations.find( :first, :conditions => ["start_date >= :start AND end_date <= :endd", params ] )
          sum
        end
      end

      def probability_for_clinic(clinic, start_date, end_date)
        available_days = self.number_of_days_available_between(start_date, end_date)

        # if student has no days left, there’s no chance! return negative infinity
        return -1/0.0 if available_days == 0

        assignment_count = Assignment.count(:conditions => ["student_id = :me AND clinic_detail_id = :clinic", {:me => self.id, :clinic => clinic.id}] )

        clinic_days_left = ( clinic.duty_length - assignment_count ).to_f
        clinic_days_left / available_days
      end
    end

    students = Student.find(:all)
    student_probabilities = students.inject({}) do |hsh, student|
      hsh[student] = student.probability_for_clinic( clinic, start_date, end_date)
      hsh
    end

    # Highest probability first.
    students = students.sort {|s1, s2| student_probabilities[s2] <=> student_probabilities[s1] }
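A quick aside on the negative infinity trick in that listing: because the sort is descending, a score of negative infinity always lands at the back of the list. Here is a minimal sketch, using made-up names and scores rather than anything from my real data:

```ruby
# Hypothetical scores keyed by student name (illustration only).
scores = {
  "ann" => 0.5,
  "bob" => -1 / 0.0, # negative infinity: no available days at all
  "cat" => 2.0
}

# Same comparison as in the listing: b before a gives a descending sort.
ranked = scores.keys.sort { |a, b| scores[b] <=> scores[a] }

puts ranked.inspect
```

Every finite score compares greater than negative infinity, so students with no free days never outrank anyone.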
I started it up and waited for what felt like an eternity. Much frustration and a ^C later, I realized I had gone wrong somewhere.
That’s when I remembered that ActiveRecord could be asked to write everything it was doing to STDOUT. Boy was I shocked!
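For anyone who hasn't done this before, the usual way is to hand ActiveRecord a logger that writes to standard output. A minimal sketch follows; the variable name `sql_logger` is my own, and the `defined?` guard is only there so the snippet does nothing harmful when run outside an ActiveRecord process:

```ruby
require 'logger'

# Log to standard output so each SQL statement shows up as it runs.
sql_logger = Logger.new($stdout)

# Only attach the logger when ActiveRecord is actually loaded.
ActiveRecord::Base.logger = sql_logger if defined?(ActiveRecord::Base)
```

Put this before the code you want to watch, and every query it triggers is printed as it happens.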
I restarted the program to be greeted by a sea of “SELECT * FROM vacations” statements. As previously written, my script was generating about 90 selects against the vacations table for each student, one for every day in the date range. With just over 1000 students, that adds up to more than 90,000 queries, which took a ridiculous amount of time. My poor little script was in dire need of caching.
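To put numbers on it, here is a rough simulation of the two access patterns. Nothing in it touches a real database: `CountingTable` is a hypothetical stand-in that simply counts how many times it is asked for rows, and the student and day counts are round figures based on the ones above.

```ruby
# Hypothetical stand-in for a table: counts every "SELECT" it serves.
class CountingTable
  attr_reader :queries

  def initialize(rows)
    @rows = rows
    @queries = 0
  end

  # One simulated query: returns the rows for which the block is true.
  def select_where
    @queries += 1
    @rows.select { |row| yield row }
  end
end

rows = [{ :student_id => 1, :day => 3 }, { :student_id => 2, :day => 10 }]
student_ids = (1..1000).to_a
days = (1..90).to_a

# Pattern 1: one query per student per day, like the original script.
per_day_table = CountingTable.new(rows)
student_ids.each do |id|
  days.each do |day|
    per_day_table.select_where { |v| v[:student_id] == id && v[:day] == day }
  end
end

# Pattern 2: one bulk query up front, then filter in memory.
bulk_table = CountingTable.new(rows)
cached = bulk_table.select_where { |v| days.include?(v[:day]) }
on_vacation = cached.select { |v| v[:student_id] == 1 }

puts "per-day queries: #{per_day_table.queries}"
puts "bulk queries:    #{bulk_table.queries}"
```

The in-memory work is the same either way; the difference is how many round trips go to the database.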
My new version isn’t as pretty to read and it could still use a bit more work, but it is much quicker:
    students = Student.find(:all)

    # Load every relevant assignment and vacation up front: two queries in total.
    sql = "clinic_detail_id = :clinic AND student_id IN (:students) AND day >= :start_date AND day <= :end_date"
    args = { :students => students.map { |s| s.id }, :clinic => clinic.id, :start_date => start_date, :end_date => end_date }
    cached_assignments = Assignment.find( :all, :conditions => [ sql, args ] )

    sql = "student_id IN (:students) AND start_date >= :start_date AND end_date <= :end_date"
    cached_vacations = Vacation.find( :all, :conditions => [ sql, args ] )

    student_probabilities = students.inject({}) do |hsh, student|
      # find the number of days student will be at school in semester.
      available_days = (start_date .. end_date).inject(0) do |sum, date|
        sum += 1 unless cached_vacations.detect { |v| v.student_id == student.id && ( v.start_date .. v.end_date ).include?( date ) }
        sum # the block must return the running total, even on vacation days
      end

      # if there are no days left, there is no chance.
      if available_days == 0
        hsh[student] = -1.0/0.0
      else
        assignments_served = cached_assignments.select { |a| a.student_id == student.id && ( start_date .. end_date ).include?( a.day ) }.size
        hsh[student] = ( clinic.duty_length - assignments_served ).to_f / available_days
      end

      hsh
    end

    # Highest probability first.
    s = students.sort { |s1, s2| student_probabilities[s2] <=> student_probabilities[s1] }
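One piece of the "bit more work" would be to stop scanning every cached vacation for every student on every day. A sketch of that idea, assuming the cached records expose `student_id`, `start_date` and `end_date`: group them by student once, then look only at that student's own vacations. `VacationRow` and `available_days` are hypothetical names standing in for the real model and logic, and the dates are invented.

```ruby
require 'date'

# Hypothetical stand-in for the Vacation records returned by the bulk query.
VacationRow = Struct.new(:student_id, :start_date, :end_date)

cached_vacations = [
  VacationRow.new(1, Date.new(2009, 1, 5), Date.new(2009, 1, 7)),
  VacationRow.new(2, Date.new(2009, 1, 6), Date.new(2009, 1, 6))
]

# Build the index once: student_id => that student's vacations.
vacations_by_student = cached_vacations.group_by { |v| v.student_id }

# Counts the days in the range not covered by any of the student's vacations.
def available_days(student_id, start_date, end_date, vacations_by_student)
  own = vacations_by_student.fetch(student_id, [])
  (start_date..end_date).count do |date|
    own.none? { |v| (v.start_date..v.end_date).cover?(date) }
  end
end

start_date = Date.new(2009, 1, 1)
end_date = Date.new(2009, 1, 10)

[1, 2, 3].each do |id|
  puts "student #{id}: #{available_days(id, start_date, end_date, vacations_by_student)}"
end
```

The database still sees only the two bulk queries; this just trims the in-memory scanning from all vacations down to one student's.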
Do note that if you squint hard enough, these two algorithms look about the same in terms of “big O”.
Sometimes the devil really is in the details. The devil in this case was hiding a constant factor of about 90 database queries per student under an abstraction. Jerk!
Abstractions lie. Be prepared to deal with it.
Or, in the oft-spoken words of Tim Gunn, “Make it work.”