Topics:

Discord Bots via @mysteriouspants

Repo: https://github.com/mysteriouspants/mysteriousbot

Chris shared his Discord bot which discourages people from talking politics outside the #politics channel (indeed a noble cause). He used Serenity which is an opinionated Rust library for the Discord API. Chris didn't like some of the opinions (specifically that bot commands should start with a character like ! or ~) but luckily the framework had an escape method that runs on any message, so he used that to make his own handler.

One advantage of writing the project in Rust was that it made deploying the bot really simple. As Chris writes:

"to deploy you just take the built binary, scp it somewhere, and run it. There's no grand dependency chain from the OS to worry about, even the TLS layer is handled internally thanks to RusTLS."

Chris posted an in-depth article about it on his blog, check it out!

Digital Signal Processing via @jacobrosenthal

Repo: https://www.arm.com/resources/education/books/dsptextbook

I've has been working through the Digital Signal Processing using Arm Cortex-M based Microcontrollers book. The goal was to better understand translating C code to Rust, attempting to utilize functional paradigms, and exlore and fill in the math shortcomings in no_std embedded Rust.

Ive got a nice setup https://github.com/jacobrosenthal/dsp-discoveryf4-rust/ where I'm building up the examples from the book in Rust and some nice patterns around graphing on command line, or in browser when necessary, as well as running them on an actual device and doing timing calculations to see how Rust fares.

DSP

Digital Signal Processing is about trying to find the signal in the noise. THis is a very short crash course on signals and systems, mostly from the point of view of translating c code to a functional style.

Signals

There are five fundamental discrete time signals, unit pulse, unit step, unit ramp, exponential, and sinusoidal. We can use these to make simulated streams of data to practice against as well as to build up filters we'll run against that data.

A unit pulse is 1 at 0, and 0 everywhere else. Writing things the C way, you have to be careful not to overflow the array.

let mut unit_pulse = [0f32; N];
for n in 0..N {
    if n == 0 {
        unit_pulse[n] = 1.0;
    } else {
        unit_pulse[n] = 0.0;
    }
}

In cases when we dont need random access to any previous elements we can do a litte more idiomatic Rust by using an iter_mut over the existing buffer to keep us from accessing outside of our array.

let mut unit_pulse = [0f32; N];
unit_pulse.iter_mut().enumerate().for_each(|(n, val)| {
    if n == 0 {
        *val = 1.0;
    } else {
        *val = 0.0;
    }
});

An even more fully functional approach is available though. Well use a range, and map each of those values.

type N = heapless::consts::U10;
let unit_pulse_iter = (0..(N::to_usize())).map(|n| if n == 0 { 1.0 } else { 0.0 });

Theres obviously quite a bit of visual 'noise' now with the length encoded as a type, and the turbofish in the collect. Luckily we can go back to arrays and remove the turbofish shortly when const generics are merged.

Note iterators are lazy and have to be consumed. Our iterator is thus is a rather small data strucutre thats very cheap to clone, which means iterators are very memory efficient on top of their safety. In no_std Rust we dont have Vec, but there is a library called Heapless that has a vec type. When it comes time to fetch all those values, well need to use that instead of array using collect

let unit_pulse = unit_pulse_iter.collect::<heapless::Vec<f32>>();

Another fundamental is a unit step. Its just 1 everywhere.

let unit_step = core::iter::repeat(1.0).take(N::to_usize());

We can combine fundamental signals to make discrete time systems. In this case we subtract two unit steps to create a window filter.

let window = core::iter::repeat(0.0)
    .take(4)
    .chain(unit_step.clone())
    .take(N::to_usize())
    .zip(unit_step.clone())
    .map(|(us_delay, us)| us - us_delay);

We saw a bunch of new stuff.

  • A core function called repeat lets us get an infinity iterator of a primitive, in this case 0.
  • Then we can take 4 of those zeros.
  • We can then chain our unit step at the end of those 4 zeros, effectively delaying our unit step by 4.
  • Then we can zip that delayed unit step signal with another unit step signal so we get one of each of them at a time.
  • Finally we again use map to subtract the delayed unit step from the naked unit step.

The result is four 1s and the rest zeros.

"window": 1.0000, 1.0000, 1.0000, 1.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000
⡉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉                                            1.0
⠄
⠂
⡁
⠄
⠂
⡁
⠄
⠂
⡁
⠄
⠂
⡁
⠄
⠂
⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠉⠁⠈ ⠁⠈ ⠁ 0.0
0.0                    4.0                               10.0

A digression. An exponential signal looks like

let exponential = (0..(N::to_usize())).map(|val| A.powf(val as f32));
"exponential": 1.0000, 0.8000, 0.6400, 0.5120, 0.4096, 0.3277, 0.2621, 0.2097, 0.1678, 0.1342
                                                             1.0
⠄                                                            
⠂                                                            
⡁⠄                                                            
⠂                                                            
⡁⠄                                                            
⠂⡁                                                            
⠄                                                     ⠁       0.1
0.0                                                      10.0

Now we can filter our exponential signal with our window signal.

let exp_window = exponential.zip(window).map(|(ex, x5)| ex * x5);
"exp_window": 1.0000, 0.8000, 0.6400, 0.5120, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000
⡁                                                             1.0
⠄                                                            
⠂                                                            
⡁     ⠁                                                      
⠄                                                            
⠂           ⠄                                                
⡁                                                            
⠄                 ⠂                                          
⠂                                                            
⡁                                                            
⠄                                                            
⠂                                                            
⡁                                                            
⠄                                                            
⠂                                                            
⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁⠈ ⠁ 0.0
0.0                                                      10.0

Rust math performance and cmsis-dsp-sys

For these basic uses rust is as or more performant than C. Not to mention I've found at least three out of bounds accesses in the C code when porting it to Rust. However for bigger algorithms a completely C or Rust approach eventually falls behind highly tuned assembly language and specialized instructions. The Rust assembly language implementation is working through stabilization now, and intrinsics are beings slowly added over time, but the fact is for larger algorithms the most performant code is going to be in C for the time being.

For Arm microcontrollers these algorithms are all shipped from the instruction set proider in a library called CMSIS. I've created cmsis-dsp-sys which provides Rust bindings for the prebuilt cmsis-dsp libraries so we can stay productive and benchmark our pure Rust to see how were doing.

We can measure the time it takes to run a function on an arm microcontroller using the dwt peripheral.

let cycles: ClockDuration = dwt.measure(|| unsafe {
    //Your function here
});

Youd want to be running these algorithms possible many times per second and have that still not affect your power too badly. This measurement is in cycles where, for my microcontroller at 168mhz, 1 cycle is 0.000000006 seconds.

Lets see how some implementations perform.

The gold standard, calling into the static cmsis implementation of an fft.

cargo embed --release --example 4_5_fft_calculations.rs
16:18:20.478 cycles: 36710

My pure Rust unoptimized naive dft implementation is pretty bad.

cargo embed --release --example 4_1_dft_calculations.rs
16:17:17.106 cycles: 12066122

And finally a pure Rust no_std fft library called microfft I found on crates.io is actually very competitive but albeit with far less functionality than the cmsis libraries.

cargo embed --release --example 4_5_fft_calculations_microfft.rs
16:18:44.091 cycles: 40512

Graphing Libraries, Terminal User Interaces, and Cargo Embed

The native graphs above were generated using textplots: A library for plotting graphs in terminal.

But occasionally for more resolution graphs I reach for plotly: A more sophisticated plotting libary which is powered by Plotly.js (which is built on d3.js).

For on device data, I have some forks going of cargo-embed to capture remote data to .dat files for graphing. cargo-embed is a general purpose tool for doing cargo builds, uploading to deivce, and receiving data back.

Better yet I'm also working on doing live graphs of on device data. cargo-embed uses tui-rs a library for building rich terminal user interfaces and dashboards under the hood and I have some forks of that going.

Magic Wand TensorFlow Lite Example for the PyGamer

As an alternative to writing your own algorithms to process signals, you can train a machine learning (ML) model to come up with the relevant rules for you. TensorFlow is a library that represents computations as a dataflow graph, which is widely used in ML. There is a scaled-down version of TensorFlow, also from Google, called Tensorflow Micro Lite, which can run on constrained systems like microcontrollers. A "magic wand" is a common demonstration which uses a model trained on the accelerometer data to detect 'spells' (e.g. wave the device in a circular motion) and change the color of an LED based on the inferred motion.

The tfmicro crate was recently announced which provides no-std Rust bindings for the Tensorflow Micro Lite C code. I've got the magic wand example for the AdaFruit PyGamer mostly working though its not merged yet.

More on this in the future for sure.

Managing Game State via @DanielPBank

I started working on a simple game to get some practical experience working with structs, impl blocks, and the ownership model. Some interesting topics came up around making games with Rust.

ECS

The first recommendation was that I could look at common patterns that game developers use to inform my research. Entity–component–system (ECS) is an architectural pattern that is mostly used in game development. There is a game engine written in Rust called Amethyst which comes with an ECS crate called specs. Chris also shared a RustConf keynote by Catherine West where she shares good (and bad) practices for developing game engines in general, and how Rust helps guide you to those patterns.

As always, a good resource for any topic in Rust is the Are We ____ site, in this case: https://www.arewegameyet.rs/

Crates You Should Know

  • serenity: Rust library for the Discord API
  • specs: ECS system written in Rust
  • cmsis-dsp-sys: Rust bindings for the prebuilt cmsis-dsp math
  • tfmicro: no-std Rust Bindings for TFMicro