I hope my confusion at the difference between the memory allocation behavior of the following codes can be forgiven.

julia> using BenchmarkTools
julia> @btime sort!(v) setup=(v=randn(100));
  268.868 ns (0 allocations: 0 bytes)
julia> @btime sort!(v) setup=(v=tuple.(randn(100)));
  714.639 ns (2 allocations: 528 bytes)

Julia often uses the ! symbol on a function to indicate “in-place” but this also is usually a hint that it’s “non-allocating” – which is great if I want to sort things in place without extra memory allocation. (See my previous post.)

(Strictly speaking, this ! simply means ‘mutating’ which means that the function changes the input arguments.)

Neither of these is a contract. (e.g. something in-place might be allocating/non-allocating depending on the types of array element.)

The documentation of sort! simple mentions in-place Sort the vector v in place.

However, in this case, tuple elements can be moved around without allocating. So I, for one, would expect the codes to share the same amount of memory traffic and allocations.

After much debugging, what I found is that julia defaults to QuickSort for numeric types, which is strictly non-allocating whereas it uses MergeSort for non-numeric types. (Yes, the change in algorithms is in the documentation for sort!, yay! What isn’t there is the implication on memory allocations.)

julia> @btime sort!(v;alg=QuickSort) setup=(v=tuple.(randn(100))); 947.789 ns (0 allocations: 0 bytes)

Moreover, this behavior depends only on the element type of the vector, not on the actual values sorted.

julia> @btime sort!(v,by=first) setup=(v=tuple.(randn(100)));
  790.624 ns (2 allocations: 528 bytes)
julia> @btime sort!(v;alg=QuickSort,by=first) setup=(v=tuple.(randn(100)));
  947.789 ns (0 allocations: 0 bytes)

(In this case, by=first means we should use the first element, which means Julia is conceptually sorting a numeric type.)

I think this particular example breaks an implicit user contract. By calling sort! I’m asking Julia not to allocate extra memory in its algorithm. To have a default algorithm choice that will allocate is wrong from a working mental model point of view. Moreover, if this isn’t the case, then the docs for sort! ought to be very strongly worded that sort! can only be recommended for numeric types unless you change the default algorithm. (But then why is that the default choice for sort.)

Finally, to the extend that the algorithm choice is influenced by the element type, it should apply to the actual element type used, not merely the element-type in memory!

I realize these are tricky issues. e.g. MergeSort is tough to have in0place

Many would be likely to say “premature optimization”, why am I fussed about this difference?

Here’s the issue. I need a mental model for how code is translated or I can’t possibly ever be concerned with “optimization” and I must give up. In this case, a simplified mental model of what the tool / language is doing breaks. So making sure the “compiler / language / runtime” engage my mental model for what his happening – when I’m trying to design it for that setting is not premature optimization, it’s simply part of the spec!

Moreover, it seems this issue has come up already more discussion