Is this community still alive? If yes I've got a Rust-question.

@[email protected] · 2 years ago

Is this community still alive? If yes I've got a Rust-question.

@[email protected] · edit-2 2 years ago

Hey! Thank you very much! This is an incredibly well made, probably labor-intensive and (nice!) comment! (And yeah, a few code pieces seem to have disappeared, but I think I understand the original meaning.)

That cleared a lot up, to be honest. I have been using Rust for a while now, but I think all the more advanced features that I didn’t really have to deep-dive into before are now used all at once in the embedded context. It’s all very dense to read when only looking at the source code (or the docs). But your explanations helped tremendously (I will read them again tomorrow, though).

It’s really fascinating what Rust makes possible here. I haven’t really programmed much in C++ in the embedded context, but I guess I would have to basically rewrite a lot of software if I wanted to use it on a different device, right?

Regarding the DS value of 8 or 16, I am not quite sure myself. I’ve found two examples where a Max7219 chip is used together with a Raspberry Pi Pico with Rust. One implemented the max7219 struct itself instead of using the max7219 crate, and used the value 16 for DS… This example works on my setup.

The other example uses the max7219 crate and needs DS=8, otherwise it doesn’t compile. It kinda works, but there seem to be some errors when I use it: if I use write_raw to set all the pixels on the display, certain values seem to change the display’s state. At a certain point it changes its intensity and suddenly switches into all-pixels-on mode. This shouldn’t happen if I only use write_raw.

But with your explanations I might understand a little more of the stuff that I used in the code. Thank you very much!

@orclev · edit-2 2 years ago

Honestly I’m suspecting that the driver crate is just broken, and that it is supposed to be using a value of 16 for the DS parameter. The trait constraint the from_spi function should have applied should be Write with a u16 generic, not u8 which would then allow you to use 16 as the DS parameter when initializing the Spi instance. If I had a Max7219 chip at hand I would try modifying the driver crate to verify if that’s the case, but I don’t unfortunately. Maybe open an issue on the driver repo describing the behavior you’re seeing (and maybe link him back to this thread) to see what he thinks?

As for the case with C++ code it is often more device specific, but it can also cheat a certain amount. Rust is all about safety, it doesn’t let you make a bunch of mistakes that are possible in C++. The upshot of that is that when you get a piece of Rust code to compile, it’s more often than not correct. That’s somewhat on the skill of the person writing the libraries though, you can certainly write code that can be used wrong, but a good author can often define their APIs in such a way that it’s impossible to use it incorrectly. As in the example above, the Spi instance is being constrained to a DS of 8 due to the way the Max7219 crate is defined, it’s impossible to accidentally use a DS of 16 with it, it just happens that it seems like that constraint is wrong in this case.
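
To make the point about trait constraints concrete, here is a rough sketch of how a word-size bound keeps the wrong bus configuration from compiling. The Write trait below is a local stand-in with the same shape as embedded-hal 0.2's blocking SPI write trait, and Bus16 and Driver are hypothetical names made up for the example. None of this is the Max7219 crate's actual code.

```rust
/// Local stand-in with the same shape as embedded-hal 0.2's blocking SPI
/// `Write<W>` trait, so this sketch compiles without any crates.
trait Write<W> {
    type Error;
    fn write(&mut self, words: &[W]) -> Result<(), Self::Error>;
}

/// Hypothetical bus that only accepts 16-bit words and records them.
struct Bus16 {
    sent: Vec<u16>,
}

impl Write<u16> for Bus16 {
    type Error = ();
    fn write(&mut self, words: &[u16]) -> Result<(), Self::Error> {
        self.sent.extend_from_slice(words);
        Ok(())
    }
}

/// Hypothetical driver: the `Write<u16>` bound means a bus that only
/// implements `Write<u8>` is rejected at compile time.
struct Driver<SPI> {
    spi: SPI,
}

impl<SPI: Write<u16>> Driver<SPI> {
    fn from_spi(spi: SPI) -> Self {
        Driver { spi }
    }

    fn write_raw(&mut self, header: u8, data: u8) -> Result<(), SPI::Error> {
        // Assumption: header goes in the high byte, data in the low byte.
        self.spi.write(&[u16::from_be_bytes([header, data])])
    }
}

fn main() {
    let mut driver = Driver::from_spi(Bus16 { sent: Vec::new() });
    driver.write_raw(0x0A, 0x0F).unwrap();
    assert_eq!(driver.spi.sent, vec![0x0A0F]);
}
```

Passing a type that only implements Write for u8 to from_spi would fail to compile, which is the kind of guard rail described above.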

C++ in contrast lets you take shortcuts. For instance you can define a bunch of constants and use ifdefs to conditionally set them at compile time. For example, this random driver I found using a Google search defines the Max7219 class as taking a PinName class/struct/enum (not sure which honestly), which I’m sure is defined elsewhere as the raw pin identifier constant exposed by the underlying hardware. That driver does not enforce that the pin has been configured into the proper PushPull mode prior to it being passed to the driver; it’s on you as the user of the library to make sure everything has been properly set up beforehand. It’s “easier” in that everything is basic, but it’s also error prone as it doesn’t double check your work, you’ll just get a crash at runtime.

C/C++ is very low level, barely higher than assembly. If you’re armed with the datasheets for everything you can probably make it work, but you need to be very sure you’re getting all the details right. Rust on the other hand tries to force you to use things correctly. Ideally you should have just been able to grab the Max7219 crate, and just use it and everything would work. The fact it isn’t suggests there’s a possible bug in the crate, rather than that you’re just using it wrong, as it really should be impossible to use it wrong.

@[email protected] · 2 years ago

Hey thank you! it might actually be that the driver has an error. For me somebody pointing that out is actually very helpful, as I always suspect that I’m doing something wrong. But playing around with the other example that kind of implemented the max7219 interface from scratch (using the u16 for the data send) was pretty fun!

I guess I will try changing the original max7219-crate from u8 to u16 tomorrow and see what happens. I also posted an issue about that on the GitHub.

@orclev · edit-2 2 years ago

I decided to crack open the source of the Max7219 crate to get a better idea of what’s going on.

Reading the chip’s datasheet, it looks like it’s expecting 16 bit packets sent most significant bit first on the wire, so the high byte goes out before the low byte. The high byte consists of a 4 bit segment address (or command) and 4 bits of padding. The low byte is interpreted depending on the address or command in the high byte as well as what the currently set decoding mode is.

Looking at the code for the crate, I see that the Spi struct declares a buffer like so: buffer: [u8; MAX_DISPLAYS * 2]. I believe a more correct version of that declaration would be buffer: [u16; MAX_DISPLAYS]. Then looking at the actual implementation of the write_raw method I see this:

    fn write_raw(&mut self, addr: usize, header: u8, data: u8) -> Result<(), DataError> {
        let offset = addr * 2;
        let max_bytes = self.devices * 2;
        self.buffer = [0; MAX_DISPLAYS * 2];

        self.buffer[offset] = header;
        self.buffer[offset + 1] = data;

        self.spi
            .write(&self.buffer[0..max_bytes])
            .map_err(|_| DataError::Spi)?;

        Ok(())
    }

where once again a bunch of double counting of u8s is being done. I think a more accurate version of that would be:

    fn write_raw(&mut self, addr: usize, header: u8, data: u8) -> Result<(), DataError> {
        self.buffer = [0; MAX_DISPLAYS];

        self.buffer[addr] = u16::from_ne_bytes([header, data]);
 
        self.spi
            .write(&self.buffer[0..self.devices])
            .map_err(|_| DataError::Spi)?;

        Ok(())
    }

This skips messing around with packing the u8 bytes into pairs via address calculations and instead uses the from_ne_bytes function to directly pack the address/header byte and the data byte into a single u16 suitable for serialization across the SPI bus. I’m not 100% sure that from_ne_bytes is correct in this case, as I’m not entirely clear how that would interact with the native endianness of the CPU and the SPI controller, but I’m hoping that, by explicitly putting the header in the high byte, it would respect that. Some experimentation would be necessary there I think to make sure it was actually portable.
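
For anyone who wants to run that experiment on a desktop first, here is a small host-side sketch of how the three standard u16 constructors differ. The pack helper is a hypothetical name, and the example values (0x0A as the intensity register address, 0x0F as the data) are only there for illustration. The assumption is that the header should end up in the high byte of the word.

```rust
/// Pack a register address and a data byte into one 16-bit packet with the
/// address in the high byte, independent of the CPU's native byte order.
fn pack(header: u8, data: u8) -> u16 {
    u16::from_be_bytes([header, data])
}

fn main() {
    let (header, data) = (0x0A_u8, 0x0F_u8);

    // Big endian: the first array element becomes the high byte.
    assert_eq!(pack(header, data), 0x0A0F);
    // Little endian: the first array element becomes the low byte.
    assert_eq!(u16::from_le_bytes([header, data]), 0x0F0A);

    // Native endian matches whichever of the two the target CPU uses.
    let native = u16::from_ne_bytes([header, data]);
    if cfg!(target_endian = "little") {
        assert_eq!(native, 0x0F0A);
    } else {
        assert_eq!(native, 0x0A0F);
    }
}
```

On a little endian CPU from_ne_bytes behaves like from_le_bytes, so the header would land in the low byte there. Only from_be_bytes gives the same word on every target.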

@[email protected] · 2 years ago

Hi, Thank you. It took me a while, but I experimented around a little bit. I have not yet tried to fix the max7219-library though. I think it is from_be_bytes (the other one didn’t work).

But one thing that I am not understanding (I think this is a “can’t tell the forest from the trees”-situation) is how exactly multiple 8x8-matrices are connected i.e. how the data-stream looks exactly.

In your example (from the max7219-library) it seems like if I use 4 devices I send 4 times a u16 out and the 4 connected Max7219’s figure out themselves which one is meant?

@orclev · 2 years ago

So it took me a little while to figure out between reading the datasheet for the Max7219 and looking at the source code. Basically it’s taking advantage of a feature of the Max7219 that allows daisy chaining multiple chips off the same SPI connection. In order to take advantage of this feature you would take N Max7219 chips and wire all their CS and CLK pins together with your controller, and then run the connection from the controller to the first chip’s DIN port, and then the DOUT port from the first chip to the DIN port of the next chip. Keep chaining DOUT to DIN to daisy chain all the chips together.

In the datasheet for the Max7219 there’s this section:

For the MAX7219, serial data at DIN, sent in 16-bit packets, is shifted into the internal 16-bit shift register with each rising edge of CLK regardless of the state of LOAD. For the MAX7221, CS must be low to clock data in or out. The data is then latched into either the digit or control registers on the rising edge of LOAD/CS. LOAD/CS must go high concurrently with or after the 16th rising clock edge, but before the next rising clock edge or data will be lost. Data at DIN is propagated through the shift register and appears at DOUT 16.5 clock cycles later

Essentially what that all boils down to is that each Max7219 maintains a 16 bit internal shift register, so as each bit is received on DIN it’s pushed onto the register, and the highest bit of the register gets pushed out to DOUT. When you daisy chain multiple chips together it’s effectively like concatenating all their shift registers together. So if you have 4 chips, that’s 64 bits of register. If you write 64 bits out to MOSI, the first 16 bits will end up on the farthest out chip, the next 16 on the next chip in from that, etc. Switching the CS pin from low to high is the trigger for the Max7219 to actually lock in and read the contents of those shift registers. In the driver crate’s code, that’s the purpose of the buffer field in the various Connector structs. So if you have, say, 4 chips, you need 4 x u16 storage, and each write cycle you write all 4 u16 values out, one to each daisy chained device. Technically the driver is less efficient than it could be, in that it takes advantage of the fact that writing 0 to a chip is a no-op, so in practice while it does write to every device each time, when you call write_raw it actually 0s the buffer for all but the selected chip.
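
The zero-the-buffer trick can be sketched roughly like this. build_frame and slot are hypothetical names, and the sketch assumes a u16 buffer with the header in the high byte, which is not how the crate currently stores it.

```rust
/// Register address 0x0 is the MAX7219 No-Op, so an all-zero word is ignored.
const NO_OP: u16 = 0x0000;

/// Build one frame for a chain of `N` chips: a real command in position
/// `slot` and no-ops everywhere else. Returns `None` if `slot` is out of
/// range. Which chip a given slot reaches depends on the send order.
fn build_frame<const N: usize>(slot: usize, header: u8, data: u8) -> Option<[u16; N]> {
    if slot >= N {
        return None;
    }
    let mut frame = [NO_OP; N];
    frame[slot] = u16::from_be_bytes([header, data]);
    Some(frame)
}

fn main() {
    let frame: [u16; 4] = build_frame(2, 0x0A, 0x0F).unwrap();
    assert_eq!(frame, [0x0000, 0x0000, 0x0A0F, 0x0000]);
    assert!(build_frame::<4>(4, 0x0A, 0x0F).is_none());
}
```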

If you think about a sequence of chips, let’s say once again 4 of them labeled A to D. They would be connected like so:

RP-Pico-MOSI----DIN-A-DOUT----DIN-B-DOUT----DIN-C-DOUT----DIN-D-DOUT
       -CS----------CS------------CS------------CS------------CS
       -CLK---------CLK-----------CLK-----------CLK-----------CLK

Then you write to all four chips like so:

Set CS low
Write u16 for D
Write u16 for C
Write u16 for B
Write u16 for A
Set CS high
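
If it helps to see the shifting in action, here is a tiny host-side simulation of that write sequence. Chip, clock_bit and send_word are hypothetical names, and the model only tracks each chip's 16 bit shift register. It ignores the latch on CS and the half cycle delay on DOUT mentioned in the datasheet.

```rust
/// Hypothetical model of one chip: only its 16-bit shift register.
#[derive(Clone, Copy, Default)]
struct Chip {
    shift: u16,
}

/// Clock one bit into the chain. `chain[0]` is the chip wired to MOSI.
fn clock_bit(chain: &mut [Chip], mut bit: bool) {
    for chip in chain.iter_mut() {
        // The top bit falls out on DOUT and feeds the next chip's DIN.
        let out = (chip.shift & 0x8000) != 0;
        chip.shift = (chip.shift << 1) | (bit as u16);
        bit = out;
    }
}

/// Send one 16-bit packet, most significant bit first.
fn send_word(chain: &mut [Chip], word: u16) {
    for i in (0..16).rev() {
        clock_bit(chain, ((word >> i) & 1) == 1);
    }
}

fn main() {
    // Chips A, B, C, D in wiring order; A is closest to the controller.
    let mut chain = [Chip::default(); 4];

    // Write the word for D first and the word for A last.
    for word in [0xDDDD, 0xCCCC, 0xBBBB, 0xAAAA] {
        send_word(&mut chain, word);
    }

    // This is what each chip would latch when CS goes high.
    let latched: Vec<u16> = chain.iter().map(|c| c.shift).collect();
    assert_eq!(latched, vec![0xAAAA, 0xBBBB, 0xCCCC, 0xDDDD]);
}
```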

@[email protected] · 2 years ago

Thank you.

I’ve actually read the section you quoted a few times and my brain just couldn’t parse it. But I finally understand how the max7219 does this. I’d thought about it completely wrong. It’s just shifting all the bits through from chip to chip. So obvious now.

I think I will go a step back and not use SPI for a while, and just do the bit-banging thingy first to get more familiar with it.

I’ve read somewhere that SPI is faster, and I guess it’s cheaper for the CPU to use, as the CPU doesn’t have to set the pins high or low with each cycle. Instead (I guess) the CPU can simply call an SPI-out function once and the SPI peripheral does its thing for a while, while the CPU can do other things.

But right now I don’t do much yet on the rest of the CPU, so I can afford to do it manually.

Just one other question regarding multiple displays: as e.g. 4 displays requires 4x16bits does this mean that there would have to be a Write-trait implemented somewhere (or Write<[u16;4]>)?

Could it be that the max7219-crate is incomplete here? The write-function you corrected seems like it was copied 1:1 from the cpp-lib (LedControl).

@orclev · 2 years ago

Just one other question regarding multiple displays: as e.g. 4 displays requires 4x16bits does this mean that there would have to be a Write-trait implemented somewhere (or Write<[u16;4]>)?

Nope. The Write trait is indicating the size of the “packet” that’s written on the SPI bus; it’s the equivalent of the DS generic off the Spi struct. The way SPI works is, when you toggle CS low, the device is notified that it needs to start listening on MOSI, at which point you’re free to start sending it packets. There’s no requirement that you only send a single packet, you can send as many as you want, however many devices will have special rules about processing with respect to the state of the CS pin. E.g. just like with the Max7219, it’s common for devices to buffer commands and not actually process them until CS is set high.

The only reason why the Write and the Spi generic are important is because it defines the minimum number of bits that will be written to the bus (or more concretely it’s the stride size the SPI controller uses when reading and writing from its buffers). That’s why using u8/8 as the parameter mostly works except for occasionally demonstrating strange behavior. Using u16 guarantees that it always writes a number of bits that’s a multiple of 16, while using u8 can allow for essentially a half packet to be written.
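
If you do end up on a bus that writes u8 words, one way to avoid half packets is to always build whole u16 packets and only then split them into bytes. A rough host-side sketch, where frame_to_bytes is a hypothetical helper and Vec is used only because this is meant to run on a desktop:

```rust
/// Serialise 16-bit packets into bytes, high byte first, for a bus that
/// writes 8-bit words. Because every packet contributes exactly two bytes,
/// the output can never end in a half packet.
fn frame_to_bytes(frame: &[u16]) -> Vec<u8> {
    frame.iter().flat_map(|word| word.to_be_bytes()).collect()
}

fn main() {
    let bytes = frame_to_bytes(&[0x0A0F, 0x0000]);
    assert_eq!(bytes, vec![0x0A, 0x0F, 0x00, 0x00]);
    assert_eq!(bytes.len() % 2, 0);
}
```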

As for bit banging vs. SPI controller, it’s essentially the same thing as DMA if you’re familiar with that concept. Using bit banging the CPU is spending time toggling the various pins off and on, which although fast, is still relatively slow by communication standards and puts an upper limit on the speed data is transmitted on the SPI bus that’s directly tied to the frequency of the CPU and the number of cycles it takes to toggle a pin (minimum two pin toggles, maybe one for MOSI, two for CLK). Using the SPI controller on the other hand, the CPU writes bytes into memory and then passes essentially a couple of pointers to the SPI controller then flips some bits in a register. The CPU does need to pause occasionally to refill the buffers, but that’s a relatively fast operation and is mostly decoupled from the actual bus speed of SPI.
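
For the bit banging side, the core loop is roughly the following. OutPin and RecordingPin are hypothetical stand-ins so the sketch runs on a desktop. On real hardware you would use your HAL's output pin type instead and probably need a short delay between clock edges. The clock is assumed to idle low, with data set up before each rising edge.

```rust
/// Hypothetical pin abstraction for this sketch (not a real HAL trait).
trait OutPin {
    fn set(&mut self, high: bool);
}

/// Test double that records every level written to it.
#[derive(Default)]
struct RecordingPin {
    levels: Vec<bool>,
}

impl OutPin for RecordingPin {
    fn set(&mut self, high: bool) {
        self.levels.push(high);
    }
}

/// Bit-bang a sequence of 16-bit packets, most significant bit first.
fn bitbang_write<P: OutPin>(cs: &mut P, clk: &mut P, mosi: &mut P, words: &[u16]) {
    cs.set(false); // start of transfer
    for &word in words {
        for i in (0..16).rev() {
            mosi.set(((word >> i) & 1) == 1); // present the data bit
            clk.set(true); // rising edge: the chip shifts the bit in
            clk.set(false);
        }
    }
    cs.set(true); // rising edge on CS latches the data
}

fn main() {
    let mut cs = RecordingPin::default();
    let mut clk = RecordingPin::default();
    let mut mosi = RecordingPin::default();

    bitbang_write(&mut cs, &mut clk, &mut mosi, &[0x0A0F]);

    assert_eq!(cs.levels, vec![false, true]);
    assert_eq!(clk.levels.len(), 32); // two clock writes per bit
    assert_eq!(mosi.levels.len(), 16);
}
```

Every bit costs the CPU three pin writes here, which is the overhead the SPI controller takes off your hands.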

Manually implementing SPI with bit banging is probably a good learning exercise, but understanding how to properly use the SPI controller is also good to know. For an extra challenge you can usually also set up the SPI buffer to be managed using DMA for the most optimal way to handle things. I would suggest that configuring a u16 buffer sized based on the number of devices, and then using DMA to write its contents out through the SPI buffer, would be a very educational exercise.