Parallel letter frequency
Finally, some concurrency/parallelism code. That also means lifetime management is going to bite really hard if you are not careful (and not using crates that improve quality of life).
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

pub fn frequency(input: &[&str], worker_count: usize) -> HashMap<char, usize> {
    if input.is_empty() {
        return HashMap::new();
    }

    let (tx, rx) = mpsc::channel();

    input
        .chunks((input.len() as f64 / worker_count as f64).ceil() as usize)
        .for_each(move |chunk| {
            let ttx = mpsc::Sender::clone(&tx);
            let chunk = chunk.iter().map(|&x| String::from(x)).collect::<Vec<_>>();

            thread::spawn(move || {
                chunk.iter().for_each(|sentence| {
                    ttx.send(
                        sentence
                            .to_lowercase()
                            .chars()
                            .filter(|x| x.is_alphabetic())
                            .fold(HashMap::new(), |mut current, incoming| {
                                *current.entry(incoming).or_insert(0) += 1;
                                current
                            }),
                    ).unwrap()
                });
            });
        });

    rx.iter().fold(HashMap::new(), |result, incoming| {
        incoming
            .into_iter()
            .fold(result, |mut current, (key, count)| {
                *current.entry(key).or_insert(0) += count;
                current
            })
    })
}
Strangely, I don’t see much of a performance improvement on my computer, perhaps because Docker and other processes are constantly running and taking up a lot of CPU time. It seems to work at my mentor’s end though, so I guess this is it: my first piece of concurrent code in Rust. The way it works, at least in this example, reminds me of Go channels and a little bit of goroutines. I am not too interested in digging further to figure out how it really works; I am picking up this language for fun (:
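For anyone who does want a peek at the moving parts, here is a minimal sketch of the same channel pattern on a smaller problem. `sum_in_threads` is a hypothetical helper, not part of the solution. It only shows the two things the solution relies on: each thread gets its own clone of the sender, and `rx.iter()` stops once every sender has been dropped.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical helper for illustration: sums each part on its own thread
// and adds up the partial sums received over a channel.
fn sum_in_threads(parts: Vec<Vec<u32>>) -> u32 {
    let (tx, rx) = mpsc::channel::<u32>();

    for part in parts {
        // Each worker owns a clone of the sender, much like `ttx` in the solution.
        let tx = tx.clone();
        thread::spawn(move || {
            let partial: u32 = part.iter().sum();
            tx.send(partial).unwrap();
        });
    }

    // Drop the original sender. If it stayed alive, `rx.iter()` would wait forever.
    drop(tx);

    // The iterator ends once all sender clones have been dropped by their threads.
    rx.iter().sum()
}
```

Calling `sum_in_threads(vec![vec![1, 2], vec![3], vec![4, 5, 6]])` should give 21. In the solution the explicit `drop` is not needed, because `tx` is moved into the `for_each` closure and goes away when that closure is done.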
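My mentor told me that crossbeam and rayon should be able to ease my struggle with lifetimes. `thread::spawn` requires everything moved into the thread to be `'static`, so the borrowed `&str` slices cannot cross over directly, and I end up writing code that practically does nothing but clone: the `let chunk = ...` line (line 16), which copies every `&str` into an owned `String`.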
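To make that suggestion a bit more concrete, here is a minimal sketch using scoped threads. It assumes Rust 1.63 or newer, where the standard library offers `std::thread::scope`, the same idea as crossbeam's scoped threads, so no extra crate is needed. `frequency_scoped` is a hypothetical name, and this is one possible shape rather than the crossbeam or rayon version my mentor had in mind. Because the scope guarantees every thread finishes before it returns, the threads can borrow `input` directly and the cloning disappears.

```rust
use std::collections::HashMap;
use std::thread;

// Sketch only: `frequency_scoped` is a hypothetical name, not the solution above.
pub fn frequency_scoped(input: &[&str], worker_count: usize) -> HashMap<char, usize> {
    if input.is_empty() {
        return HashMap::new();
    }

    // Ceiling division; `max(1)` guards against a worker count of zero.
    let workers = worker_count.max(1);
    let chunk_size = (input.len() + workers - 1) / workers;

    thread::scope(|scope| {
        // One scoped thread per chunk; each thread borrows its chunk, no cloning.
        let handles: Vec<_> = input
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    let mut counts: HashMap<char, usize> = HashMap::new();
                    for sentence in chunk {
                        for c in sentence.to_lowercase().chars().filter(|c| c.is_alphabetic()) {
                            *counts.entry(c).or_insert(0) += 1;
                        }
                    }
                    counts
                })
            })
            .collect();

        // Join every worker and merge its partial counts into the total.
        let mut total: HashMap<char, usize> = HashMap::new();
        for handle in handles {
            for (c, n) in handle.join().unwrap() {
                *total.entry(c).or_insert(0) += n;
            }
        }
        total
    })
}
```

This version returns each partial map through the join handle instead of a channel, which is a design choice of the sketch and not something scoped threads require.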
async/await is going to be introduced in the future, which I suppose is similar to asyncio coroutines in Python 3. I suppose that will be closer to how goroutines work.