I've been thinking about attempting one of the r/badcode coding challenges for a very long time, but there was always one problem: I wanted to use Rust. That isn't a problem in itself, but I also try to follow best practices whenever I write Rust, so it's hard to bend my mind in the opposite direction.
I decided to try combining the two: best practices and bad code. Let's see how far I can push that idea.
Requirements
Quoting directly from Reddit:
Write a method that accepts an array of letters from the English language that are in alphabetical order, but with one letter missing, and determine which letter is missing.
```
find_missing(['a', 'b', 'd', 'e', 'f', 'g']) //c is missing
find_missing(['n', 'p']) //o is missing
```
The input should always be provided in order and with a single correct answer. How to handle invalid input is left undefined.
Well that certainly sounds doable!
After creating our project (since this is a really useful function, we'll do a library crate to make it reusable), we start doing some...
Test-driven development
This is something I was never told to do at my workplace, but I kinda got into it while learning Rust.
Let's write our basic structure and some tests:
```rust
/// Find a missing character in an otherwise valid sequence of the alphabet.
pub fn find_missing(input: &[char]) -> &'static str {
    ""
}

#[cfg(test)]
mod tests {
    use crate::find_missing;

    #[test]
    fn abdefg_works() {
        assert_eq!(
            find_missing(&['a', 'b', 'd', 'e', 'f', 'g']),
            "c is missing"
        );
    }

    #[test]
    fn np_works() {
        assert_eq!(find_missing(&['n', 'p']), "o is missing");
    }
}
```
We're closely following the specification here, and of course, the tests fail for now.
Let's make it pass:
```rust
pub fn find_missing(input: &[char]) -> &'static str {
    match input {
        ['a', 'b', 'd', 'e', 'f', 'g'] => "c is missing",
        ['n', 'p'] => "o is missing",
        _ => panic!(),
    }
}
```
Since how to handle invalid input is undefined, we just panic; and yay, the tests pass now!
Let's add some more simple tests to check that we correctly handle invalid input:
```rust
#[test]
#[should_panic]
fn abc_panics() {
    find_missing(&['a', 'b', 'c']);
}

#[test]
#[should_panic]
fn cba_panics() {
    find_missing(&['c', 'b', 'a']);
}
```
And of course, those tests pass without any failures, so I can be pretty proud of myself.
Let's go on by adding a test which isn't in the requirements:
```rust
#[test]
fn ac_works() {
    assert_eq!(find_missing(&['a', 'c']), "b is missing");
}
```
And since this fails, let's add one more match arm to find_missing:
```rust
['a', 'c'] => "b is missing",
```
Now I'm starting to notice that this isn't getting me anywhere; I obviously can't hand-write any valid use-case.
However, Rust provides us with the necessary tools to get around that...
Meet procedural macros
Rust has a great macro system which is perfect for generating things you want to hardcode, but without writing them by hand.
The normal `macro_rules!` macros are pretty limited however, which is why we'll go for the second macro system we have: procedural macros (or proc-macros).
Before we get into it, I'd like to make a fundamental change to how our code works currently:
```rust
use std::collections::HashMap;

fn get_valid_inputs() -> HashMap<Vec<char>, &'static str> {
    [
        (vec!['a', 'b', 'd', 'e', 'f', 'g'], "c is missing"),
        (vec!['n', 'p'], "o is missing"),
        (vec!['a', 'c'], "b is missing"),
    ]
    .iter()
    .cloned()
    .collect()
}

pub fn find_missing(input: &[char]) -> &'static str {
    let valid_inputs = get_valid_inputs();
    valid_inputs[input]
}
```
All our tests still pass, this just makes it easier to grasp what we're gonna do next (and it helps us by neatly putting EVERYTHING on the heap).
After creating a new project in a subfolder and setting `proc-macro = true` in the `[lib]` section of its Cargo.toml, we can add this basic skeleton for a function-like proc macro:
```rust
extern crate proc_macro;

use proc_macro::TokenStream;

#[proc_macro]
pub fn generate_valid_inputs(_: TokenStream) -> TokenStream {
    r#""""#.parse().unwrap()
}
```
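For reference, the subcrate's Cargo.toml would then contain something like this (the version number and edition here are placeholders, not taken from the original project):

```toml
# badcodegen/Cargo.toml (hypothetical sketch)
[package]
name = "badcodegen"
version = "0.1.0"
edition = "2018"

[lib]
proc-macro = true
```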
After adding the subcrate (I called it badcodegen) to our dependencies, we have to add this to the head of the main lib's lib.rs:
```rust
#![feature(proc_macro_hygiene)]

use badcodegen::generate_valid_inputs;
```
To make stuff like this work:
```rust
let valid_inputs = generate_valid_inputs!();
```
We depend on nightly now since we want to assign the macro's output to a variable, but whatever.
We want to replace the contents of `get_valid_inputs()` with just this macro:
```rust
fn get_valid_inputs() -> HashMap<Vec<char>, &'static str> {
    generate_valid_inputs!()
}
```
But that won't work right now, because the macro is producing `""`, an `&'static str`, and not a HashMap.
Let's fix that by updating the proc macro's lib.rs as well (this adds a new dependency, `quote = "1"`):
```rust
extern crate proc_macro;

use proc_macro::TokenStream;
use quote::quote;

#[proc_macro]
pub fn generate_valid_inputs(_: TokenStream) -> TokenStream {
    TokenStream::from(quote! {
        [
            (vec!['a', 'b', 'd', 'e', 'f', 'g'], "c is missing"),
            (vec!['n', 'p'], "o is missing"),
            (vec!['a', 'c'], "b is missing"),
        ]
        .iter()
        .cloned()
        .collect()
    })
}
```
Now everything compiles and the tests pass as well; that's because, you know, we haven't changed any code, we only moved it so far (and made it a fair bit more complicated).
Let's get deep into the weeds
If it wasn't already obvious, I want to achieve the following: have every possible valid input stored as a key in a gigantic HashMap, with the corresponding result as the value, so lookups are totally fast and usable.
I'll start by defining the alphabet (in badcodegen's lib.rs):
```rust
fn gimme_alphabet() -> Vec<char> {
    // Because more `as` is always better!
    ('a' as u8..='z' as u8).map(|v| v as char).collect()
}
```
And let's throw in a test as well, just to be sure:
```rust
#[test]
fn correct_alphabet() {
    assert_eq!(
        gimme_alphabet(),
        vec![
            'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
            'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'
        ]
    )
}
```
Since I had to think hard here, I'll just dump the code:
```rust
#[proc_macro]
pub fn generate_valid_inputs(_: TokenStream) -> TokenStream {
    let inputs: Vec<_> = (3..=26)
        .map(|n| {
            let pi = (0..=(26 - n)).map(|i| {
                let abc = gimme_alphabet();
                let wr: Vec<_> = (&abc[i..n + i]).iter().copied().collect();
                let o: Vec<_> = (1..n - 1)
                    .map(|n| {
                        let mut v = wr.clone();
                        let f = format!("{} is missing", v.remove(n));
                        quote! { (vec![#(#v),*], #f) }
                    })
                    .collect();
                quote! { #(#o),* }
            });
            quote! { #(#pi),* }
        })
        .collect();

    TokenStream::from(quote! {
        [
            #(#inputs),*
        ]
        .iter()
        .cloned()
        .collect()
    })
}
```
In short: Create every possible valid input and return a statement creating a Hashmap (or whatever) mapping them directly to each desired output.
Results
For comparison, I've created a version without the proc macro, which you can see at the GitHub repo; I've also added a test binary so I can see how the size is affected there.
The nice tool cargo-expand can show us the real difference between the two, by running `cargo expand --lib` in each crate's root.
The macro-less version with the hardcoded possible inputs expands to just 23 lines, while the proc-macro powered one yields a whopping 17811 lines.
However, the actual size difference of the binaries is just about 500KB; not that much, really.
Conclusion
In the end, I realized that this could be made much worse by doing everything at runtime instead of at compile time; as it stands, my abuse of heap-allocated memory only shows up at compile time (where the proc-macro approach already takes away a lot of speed). So it's not really bad code or bad performance, just a totally overengineered way of hardcoding every possible value.
While probably not a valid entry for the competition, I hope I could at least amuse someone with it. Maybe you could file a PR to make the code worse (I'm sure that could be done).