Useless benchmarks. Outperforming Unix sort by more than 7 times with 30 lines of Rust

Mikhail Panfilov
3 min readFeb 26, 2020

Hey, Medium! Yesterday I read an interesting article about writing faster analog wc in Haskell. So… After reading I thought about writing some tools in Rust but it so boring I think. And I decided to write a simple sort tool in Rust and benchmark it.

Disclaimer

I know what Unix sort do much more work than rust solution and all of the benchmarks there did just for fun.

Conditions

Testing data

I downloaded that CSV file for testing:

du -sh part.csv                                                              
179M part.csv

Specifications

All tests running on MacBook Pro (16-inch, 2019) 8-Core Intel Core i9 with 16 GB ram.

Testing tool

All tests with provided with hyperfine

hyperfine --version                                                                                                            
hyperfine 1.9.0

Rust built with 1.41 version

rustc -V                                                                                                                   
rustc 1.41.0 (5e1a79984 2020-01-27)

Cargo.toml optimizations

[profile.release]
opt-level = 3

Unix sort version

sort --version                                                                                                                 
2.3-Apple (101.40.1)

Let’s start

Naive implementation

The first implementation that came to my mind looked like this:

use std::collections::BinaryHeap;
use std::env;
use std::fs::File;
use std::io::{BufRead, BufReader, Write};
fn main() -> std::io::Result<()> {
let args: Vec<String> = env::args().collect();
let path = &args[1];
let file = File::open(path).unwrap();
let reader = BufReader::new(file);
let stdout = std::io::stdout();
let mut writer = stdout.lock();
let heap: BinaryHeap<String> =
reader
.lines()
.map(|line| line.unwrap())
.fold(BinaryHeap::new(), |mut heap, line| {
heap.push(line + "\n");
heap
});
write!(
writer,
"{}",
heap.into_sorted_vec()
.iter()
.fold(String::new(), |data, line| data + line)
)
}

It was already almost 4 times faster than the original sort, but it could still be improved.

Difference between original sort and written in Rust

Moving heap to vector

Initially, I assumed that using a heap to store lines of a file would be as efficient as possible, but that was not the case. After using standard Rust mechanism to bring the iterator to a vector, I got 70% speed boost (2.317s versus 3.22s)

Replace heap to a vector
use std::env;
use std::fs::File;
use std::io::{BufRead, BufReader, Write};
fn main() -> std::io::Result<()> {
let args: Vec<String> = env::args().collect();
let path = &args[1];
let file = File::open(path).unwrap();
let reader = BufReader::new(file);
let stdout = std::io::stdout();
let mut writer = stdout.lock();
let mut heap: Vec<String> = reader.lines().map(|line| line.unwrap()).collect();
heap.sort();
write!(
writer,
"{}",
heap.iter().fold(String::new(), |data, line| data + line + "\n")
)
}

It would seem that one could stop here, but what if we use parallel sorting (with the MergeSort algorithm)?

Parallel Version

The parallel version differs from the previous one in just 2 lines of code

+use rayon::prelude::*;-heap.sort();
+heap.par_sort();

Yes, in Rast we just connected the library in Cargo.toml and now we can use parallel sorting! And now we have an increase of 68% compared to the previous version.

Parallel realization

Conclusion

Of course, this code does not even pretend to partially replace the standard sort utility, and everything was written only for fun and useless benchmarks. Thanks for reading!

Updates

1

As readers at the Reddit noted, a utility with the environment LC_ALL = C greatly accelerates the execution time. Therefore, I give a final comparison with the addition of this environment here

LC_ALL = C comparing

--

--