
Sort_by for custom sorting in Ruby - Deep Dive

Overview - Sort_by for custom sorting
What is it?
sort_by is a Ruby method that sorts a list of items in a custom way. Instead of comparing items directly, you tell Ruby how to convert each item into a value (a key) to sort by. This makes sorting flexible, especially when you want to sort by something inside each item, like a name or a number.
Why it matters
Without sort_by, sorting complex data means writing a comparison block for sort, which is more verbose and recomputes the sort value on every comparison. sort_by lets you state only which part of the item matters for sorting. This saves time and reduces mistakes, making programs easier to write and understand.
Where it fits
Before learning sort_by, you should know basic Ruby arrays and how to use simple sorting methods like sort. After sort_by, you can learn about more advanced sorting techniques, like sort with custom comparison blocks or sorting with multiple criteria.
Mental Model
Core Idea
sort_by works by first turning each item into a simple key, then sorting by those keys, which keeps custom sorting short and avoids recomputing keys.
Think of it like...
Imagine you have a box of books and want to arrange them by the author's last name. Instead of comparing whole books, you write each author's last name on a sticky note, line up the notes alphabetically, and then arrange the books in that order.
Original list: [item1, item2, item3]
Step 1: Map items to keys → [key1, key2, key3]
Step 2: Sort keys → [sorted_key1, sorted_key2, sorted_key3]
Step 3: Rearrange items based on sorted keys → [sorted_item1, sorted_item2, sorted_item3]
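The book analogy above can be sketched in a few lines. The book data here is hypothetical and exists only for illustration; the block plays the role of the sticky note by extracting each author's last name.

```ruby
# Hypothetical data: each book is a hash with a title and an author.
books = [
  { title: "Dune", author: "Frank Herbert" },
  { title: "Emma", author: "Jane Austen" },
  { title: "Ulysses", author: "James Joyce" }
]

# The block is the "sticky note": it maps each book to the author's last name.
by_last_name = books.sort_by { |book| book[:author].split.last }

titles = by_last_name.map { |book| book[:title] }
puts titles.inspect
# Prints: ["Emma", "Dune", "Ulysses"]
```

The books themselves are never compared; only the extracted last names are.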
Build-Up - 7 Steps
1. Foundation: Understanding basic sorting in Ruby
Concept: Learn how Ruby sorts simple lists using the sort method.
In Ruby, you can sort an array of numbers or strings using the sort method:

    numbers = [5, 2, 9, 1]
    sorted_numbers = numbers.sort
    puts sorted_numbers.inspect

This will print: [1, 2, 5, 9]

Ruby sorts items in their natural order, like numbers from smallest to largest or strings alphabetically.
Result
[1, 2, 5, 9]
Knowing how Ruby sorts simple lists helps you understand why you might need a custom way to sort more complex data.
2. Foundation: Sorting with a custom block
Concept: Ruby's sort method can take a block to decide how to compare two items.
You can tell Ruby how to compare items by giving sort a block:

    words = ["apple", "banana", "pear"]
    sorted_words = words.sort { |a, b| b.length <=> a.length }
    puts sorted_words.inspect

This sorts words by length from longest to shortest.

Output: ["banana", "apple", "pear"]
Result
["banana", "apple", "pear"]
Custom comparison blocks give you full control over sorting, but the block runs on every comparison, so any expensive work inside it is repeated many times on big lists.
3. Intermediate: Introducing sort_by for simpler custom sorting
Before reading on: do you think sort_by sorts items directly or sorts based on a transformed value? Commit to your answer.
Concept: sort_by sorts items by first mapping them to a key, then sorting by those keys, which is often faster and usually easier to read.
Instead of comparing items directly, sort_by creates a list of keys from each item, sorts those keys, and then arranges the original items accordingly. Example:

    people = [{name: "Bob", age: 30}, {name: "Alice", age: 25}]
    sorted_people = people.sort_by { |person| person[:age] }
    puts sorted_people.inspect

Output: [{name: "Alice", age: 25}, {name: "Bob", age: 30}]
Result
[{:name=>"Alice", :age=>25}, {:name=>"Bob", :age=>30}]
Understanding that sort_by sorts based on keys explains why it is often faster and simpler than sort with a block.
4. Intermediate: Sorting by multiple criteria with sort_by
Before reading on: can sort_by handle sorting by more than one attribute at once? Commit to yes or no.
Concept: sort_by can sort by multiple values by returning an array of keys for each item.
You can sort by several things by returning an array:

    people = [
      {name: "Bob", age: 30},
      {name: "Alice", age: 30},
      {name: "Charlie", age: 25}
    ]
    sorted_people = people.sort_by { |p| [p[:age], p[:name]] }
    puts sorted_people.inspect

This sorts first by age, then by name alphabetically.

Output: [{name: "Charlie", age: 25}, {name: "Alice", age: 30}, {name: "Bob", age: 30}]
Result
[{:name=>"Charlie", :age=>25}, {:name=>"Alice", :age=>30}, {:name=>"Bob", :age=>30}]
Knowing you can sort by multiple keys with sort_by helps handle complex sorting needs cleanly.
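Array keys also allow mixing directions. As a small sketch using the same sample data, negating a numeric key reverses the order for that criterion only, so the list below is sorted oldest first while names still break ties alphabetically. This trick assumes the key is numeric; it does not apply to strings.

```ruby
people = [
  { name: "Bob", age: 30 },
  { name: "Alice", age: 30 },
  { name: "Charlie", age: 25 }
]

# -age sorts ages in descending order; name stays ascending as the tie-breaker.
oldest_first = people.sort_by { |p| [-p[:age], p[:name]] }

names = oldest_first.map { |p| p[:name] }
puts names.inspect
# Prints: ["Alice", "Bob", "Charlie"]
```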
5. Intermediate: Performance benefits of sort_by
Concept: sort_by is often faster than sort with a block because it calculates each key once per item, not repeatedly during comparisons.
When sorting with sort and a block, Ruby compares items many times, calling the block each time. sort_by calls the block once per item to get keys, then sorts by those keys. Example timing:

    require 'benchmark'

    arr = (1..10000).map { rand(10000) }

    # The block parameter is named bm so it does not clash with the inner block variables.
    Benchmark.bm(8) do |bm|
      bm.report("sort") { arr.sort { |a, b| a <=> b } }
      bm.report("sort_by") { arr.sort_by { |n| n } }
    end

sort_by is usually faster for large lists, and the gap grows as the key becomes more expensive to compute. Exact timings depend on your machine and Ruby version.
Result
sort_by typically runs faster than sort with a block on large arrays
Understanding the performance difference helps choose the right method for speed and clarity.
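Timings vary, but the number of key computations can be counted directly. The sketch below uses a hypothetical slow_key lambda as a stand-in for an expensive calculation and counts how often it runs under each method.

```ruby
words = %w[pear banana apple fig cherry kiwi mango grape]

# Hypothetical stand-in for an expensive key computation; it counts its own calls.
calls = 0
slow_key = lambda do |word|
  calls += 1
  word.length
end

# sort with a block: the key is computed twice for every comparison.
words.sort { |a, b| slow_key.call(a) <=> slow_key.call(b) }
sort_calls = calls

# sort_by: the key is computed exactly once per item.
calls = 0
words.sort_by { |word| slow_key.call(word) }
sort_by_calls = calls

puts "sort:    #{sort_calls} key computations"
puts "sort_by: #{sort_by_calls} key computations"
```

For sort_by the count always equals the number of items; for sort with a block it depends on how many comparisons the sort needs, which grows faster than the list itself.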
6. Advanced: Limitations and gotchas of sort_by
Before reading on: do you think sort_by always works correctly with keys that can be equal? Commit yes or no.
Concept: sort_by is not guaranteed to be stable, so items with equal keys can lose their original relative order.
If two items have the same key, sort_by might reorder them unexpectedly. Example:

    arr = ["a", "b", "A", "B"]
    sorted = arr.sort_by { |x| x.downcase }
    puts sorted.inspect

The output may place "A" before "a" or after it, because both have the key "a". Ruby's documentation states that the result of sort_by is not guaranteed to be stable, and this applies to every Ruby version, so do not rely on the order of equal keys.
Result
Possible unexpected order for items with equal keys
Knowing sort_by stability issues prevents bugs when order matters for equal keys.
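When the order of equal keys matters, one common workaround is to add each item's original index to the key so that no two keys are ever equal. A minimal sketch of that idea:

```ruby
words = ["b", "a", "B", "A"]

# Pair each item with its index; the index breaks ties between equal keys.
stable = words.each_with_index
              .sort_by { |word, index| [word.downcase, index] }
              .map(&:first)

puts stable.inspect
# Prints: ["a", "A", "b", "B"]
```

Because every [key, index] pair is unique, the result is deterministic: items with the same downcased key stay in their original relative order.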
7. Expert: Using Schwartzian transform manually
Before reading on: do you think manually implementing sort_by's technique can help in languages without it? Commit yes or no.
Concept: The Schwartzian transform is the manual way to do what sort_by does: decorate, sort, then undecorate.
In Ruby, sort_by does this automatically:

    # Manual Schwartzian transform
    arr = ["apple", "banana", "pear"]
    decorated = arr.map { |x| [x.length, x] }          # decorate: pair each item with its key
    sorted = decorated.sort { |a, b| a[0] <=> b[0] }   # sort: compare keys only
    result = sorted.map { |x| x[1] }                   # undecorate: drop the keys
    puts result.inspect

Output: ["pear", "apple", "banana"]

This technique is useful in other languages without sort_by.
Result
["pear", "apple", "banana"]
Understanding the underlying transform helps apply custom sorting ideas beyond Ruby.
Under the Hood
sort_by works by first creating a new array where each original item is paired with a key generated by the block. Then Ruby sorts this new array by the keys using a fast, simple comparison. Finally, it extracts the original items in the new order. This reduces the number of times the block runs and speeds up sorting.
Why designed this way?
The design comes from the Schwartzian transform, a known technique to optimize sorting by avoiding repeated expensive calculations. Ruby adopted sort_by to make this pattern easy and efficient, improving performance and code clarity compared to custom sort blocks.
Original array
   │
   ▼
Map each item to [key, item]
   │
   ▼
Sort array by key
   │
   ▼
Extract sorted items
   │
   ▼
Sorted array
Myth Busters - 4 Common Misconceptions
Quick: Does sort_by sort items directly or by keys? Commit to your answer.
Common Belief: sort_by sorts the original items directly like sort does.
Reality: sort_by sorts by keys generated from items, not the items themselves.
Why it matters: Thinking sort_by sorts items directly can lead to confusion about performance and behavior.
Quick: Is sort_by guaranteed to be stable? Commit yes or no.
Common Belief: sort_by always keeps the original order of equal keys (stable sort).
Reality: Ruby does not guarantee a stable sort_by in any version; when two keys are equal, the order of the corresponding items is unpredictable.
Why it matters: Assuming stability can cause bugs when order matters for equal keys.
Quick: Can sort_by handle sorting by multiple criteria? Commit yes or no.
Common Belief: sort_by can only sort by one attribute at a time.
Reality: sort_by can sort by multiple criteria by returning an array of keys.
Why it matters: Not knowing this limits how flexibly you can sort complex data.
Quick: Does using sort with a block always perform better than sort_by? Commit yes or no.
Common Belief: sort with a block is always as fast or faster than sort_by.
Reality: sort_by is usually faster when keys are costly, because it computes each key once, while sort with a block recomputes them on every comparison.
Why it matters: Choosing sort with a block for large data with expensive keys can cause slow performance.
Expert Zone
1. sort_by's performance gain is most noticeable when the key calculation is expensive or the list is large.
2. sort_by is not guaranteed to be stable in any Ruby version; when the order of equal keys matters, add the original index to the key as a tie-breaker.
3. Using arrays as keys in sort_by allows complex multi-level sorting, but the order of keys matters: the first element decides, and later elements only break ties.
When NOT to use
Avoid sort_by when the key extraction is trivial and the list is very small, as the overhead of building the keys might not be worth it. Also, neither sort nor sort_by guarantees a stable sort, so if equal keys must keep their original order, include each item's original index in the key.
Production Patterns
In real-world Ruby applications, sort_by is commonly used to sort arrays of hashes or objects by attributes like timestamps, names, or scores. It also works on ActiveRecord collections in Rails, but it sorts records that are already loaded into memory; for large tables, ordering in the database with order is usually the better choice. Developers often chain sort_by with other enumerable methods for clean, readable data transformations.
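As an illustration of that chaining style, the sketch below sorts plain objects by an attribute and picks the top entries. The Player struct and its data are hypothetical, chosen only to keep the example self-contained.

```ruby
# Hypothetical record type for illustration.
Player = Struct.new(:name, :score)

players = [
  Player.new("Ana", 72),
  Player.new("Ben", 95),
  Player.new("Cy", 88)
]

# sort_by(&:score) is shorthand for sort_by { |player| player.score }.
top_two = players.sort_by(&:score).reverse.first(2).map(&:name)

puts top_two.inspect
# Prints: ["Ben", "Cy"]
```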
Connections
Schwartzian transform
sort_by is a built-in implementation of the Schwartzian transform pattern.
Knowing the Schwartzian transform explains why sort_by is efficient and how to implement similar patterns in other languages.
Database ORDER BY clause
Both sort_by and ORDER BY arrange data based on specified keys or columns.
Understanding sort_by helps grasp how databases sort query results by columns, improving data retrieval.
MapReduce programming model
sort_by's map-then-sort-then-extract process resembles the MapReduce steps of mapping data, sorting/shuffling, and reducing.
Recognizing this connection shows how sorting strategies scale from small programs to big data processing.
Common Pitfalls
#1 Using sort with a block for expensive key calculations repeatedly.
Wrong approach: arr.sort { |a, b| expensive_calc(a) <=> expensive_calc(b) }
Correct approach: arr.sort_by { |item| expensive_calc(item) }
Root cause: Not realizing sort calls the block many times, causing repeated expensive calculations.
#2 Assuming sort_by keeps the original order for equal keys.
Wrong approach: arr.sort_by { |x| x.key } # expecting a stable sort
Correct approach: arr.each_with_index.sort_by { |x, i| [x.key, i] }.map(&:first) # the original index breaks ties
Root cause: Unaware that sort_by is not guaranteed to be stable, leading to unexpected reorderings.
#3 Trying to sort by multiple criteria with separate sort_by calls.
Wrong approach: arr.sort_by { |x| x[:age] }.sort_by { |x| x[:name] }
Correct approach: arr.sort_by { |x| [x[:age], x[:name]] }
Root cause: Not knowing sort_by can take arrays as keys for multi-level sorting.
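The difference between the two approaches in pitfall #3 is easy to see on a small, made-up data set. In this sketch the second chained sort_by re-sorts everything by name alone, discarding the age ordering.

```ruby
people = [
  { name: "Bob", age: 30 },
  { name: "Alice", age: 40 },
  { name: "Charlie", age: 25 }
]

# Chained calls: the second sort_by re-sorts the whole list by name alone.
chained = people.sort_by { |x| x[:age] }.sort_by { |x| x[:name] }

# Array key: age decides first, name only breaks ties.
combined = people.sort_by { |x| [x[:age], x[:name]] }

chained_names = chained.map { |x| x[:name] }
combined_names = combined.map { |x| x[:name] }

puts chained_names.inspect   # Prints: ["Alice", "Bob", "Charlie"]
puts combined_names.inspect  # Prints: ["Charlie", "Bob", "Alice"]
```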
Key Takeaways
sort_by sorts items by first mapping them to keys, then sorting those keys, making custom sorting simple and efficient.
It is often faster than sort with a block because it computes each key once per item instead of on every comparison.
sort_by can sort by multiple criteria by returning an array of keys for each item.
sort_by is not guaranteed to be stable: items with equal keys can come out in any order unless you add a tie-breaker such as the original index.
Understanding sort_by's underlying Schwartzian transform helps apply custom sorting techniques in other languages.