It takes all sorts

Joshua Baker

March 30, 2017

In the first year of computer science, you learn about the different algorithms available for sorting a list of items. There’s the pedestrian bubble sort, which scans the list over and over, swapping adjacent pairs of items into the correct order until there are no pairs left to swap. There’s the methodical insertion sort, which is much like the way a person would sort a stack of papers. There’s the sublime heapsort, which sorts the list at first by partially sorting it in the wrong direction. There’s the chaotic quicksort, reputedly impossible to implement correctly on the first try.

‍

Various sorting algorithms caught in the act

See these sorting algorithms running live here.

Rarely do professional programmers find themselves having to implement a sorting algorithm. These are the guts of library functions, written and tested once and used many thousands of times. But the variety of methods available for sorting a list provides an excellent illustration of the flexibility and creativity available to the programmer. We can prioritise rapid implementation and use a bubble sort. We can prioritise elegance and go with heapsort. Or we can prioritise masochism and write a quicksort.

In agile software development, we’re always making educated guesses about the kind of load that any piece of code will end up bearing as the product takes shape. Functionality that we expect to be customised in the future will be written with a view to extensibility; code that’s expected to be run many times a second will be treated with extra care.

The benefit of learning to make these decisions confidently is best appreciated when things don’t quite turn out as you planned. A few years ago I was starting a project that involved importing a large amount of data to a third party API from CSV files. It was anticipated that it would be important to be able to pause an import in progress and resume it later. To support this I wrote a custom CSV reader that could return the line and byte offset in the file at any point. And accept them as a starting point when the file was opened.

But that important feature never eventuated. As the product became more clever, the overhead of the logic dealing with the API quickly dwarfed the overhead that would have been required to buffer the whole file into memory each time it was read. I had written a quicksort when a bubble sort would have worked.

Of course, it’s just as easy to misjudge the other way. The core interface of our product Timeslice is a calendar widget in which staff can view and add their timesheets. When the code to render the timesheets on the calendar was first written, it was assumed that people would (in general) be working on one task at a time. The case that multiple timesheets are present for the same point in time was handled simply by grouping all timesheets that would overlap — and drawing each one in its own column.

‍

‍

When the facility was developed for a manager to view timesheets for multiple staff members, this approach was shown to be far too simplistic. On the days multiple staff members filled in their timesheets they were drawn in columns so thin the calendar became unusable. With a bit more insight into how the calendar could be used in the future, the original rendering code could have been written with more care.

In an environment where we strive to move fast and create things, a sensitivity to where effort is best spent is essential. You should never compromise correctness. And straight up sloppy coding leads to security issues and makes your colleagues sad. But under real world pressures, sometimes the best solution is a bubble sort.

‍

JavaScript

Software

Algorithms

Sorting Algorithms

It takes all sorts

Recent Posts

Wellington, New Zealand

Get in touch