Is this community still alive? If yes I've got a Rust-question.

@[email protected] · 2 years ago

@orclev · edit-2 2 years ago

I decided to crack open the source of the Max7219 crate to get a better idea of what’s going on.

Reading the chip’s datasheet, it looks like it’s expecting 16-bit packets sent in little-endian format on the wire. The high byte consists of a 4-bit segment address (or command) and then 4 bits of padding. The low byte is interpreted depending on the address or command in the high byte, as well as on the currently set decoding mode.
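As a rough illustration of that layout, the 16-bit value can be assembled with a mask and a shift. This is only a sketch: the function name packet and the constant names are made up for this example, while the register addresses themselves come from the register address map in the MAX7219 datasheet. It says nothing yet about the byte order on the wire.

```rust
// Register addresses from the MAX7219 datasheet (the constant names are invented here).
const NO_OP: u8 = 0x00;
const DIGIT_0: u8 = 0x01;
const INTENSITY: u8 = 0x0A;
const SHUTDOWN: u8 = 0x0C;

/// Build one 16-bit packet: bits 15..12 are padding (sent as zero here),
/// bits 11..8 hold the register address, bits 7..0 hold the data byte.
fn packet(address: u8, data: u8) -> u16 {
    // Mask the address to 4 bits so stray upper bits never reach the padding.
    (((address & 0x0F) as u16) << 8) | data as u16
}
```

For example, packet(SHUTDOWN, 0x01) gives 0x0C01, which is the command that takes the chip out of shutdown mode.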

Looking at the code for the crate, I see that the Spi struct declares a buffer like so: buffer: [u8; MAX_DISPLAYS * 2]. I believe a more correct version of that declaration would be buffer: [u16; MAX_DISPLAYS]. Then, looking at the actual implementation of the write_raw method, I see this:

    fn write_raw(&mut self, addr: usize, header: u8, data: u8) -> Result<(), DataError> {
        let offset = addr * 2;
        let max_bytes = self.devices * 2;
        self.buffer = [0; MAX_DISPLAYS * 2];

        self.buffer[offset] = header;
        self.buffer[offset + 1] = data;

        self.spi
            .write(&self.buffer[0..max_bytes])
            .map_err(|_| DataError::Spi)?;

        Ok(())
    }

where once again a bunch of double counting of u8s is being done. I think a more accurate version of that would be:

    fn write_raw(&mut self, addr: usize, header: u8, data: u8) -> Result<(), DataError> {
        self.buffer = [0; MAX_DISPLAYS];

        self.buffer[addr] = u16::from_ne_bytes([header, data]);
 
        self.spi
            .write(&self.buffer[0..self.devices])
            .map_err(|_| DataError::Spi)?;

        Ok(())
    }

This skips messing around with packing the u8 bytes into pairs via address calculations and instead uses the from_ne_bytes function to directly pack the address/header byte and the data byte into a little endian u16 suitable for serialization across the SPI bus. I’m not 100% sure that from_ne_bytes is correct in this case, as I’m not entirely clear how that would interact with the native endianness of the CPU and the SPI controller, but I’m hoping that by explicitly putting the header in the high byte that it would respect that. Some experimentation would be necessary there I think to make sure it was actually portable.
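To make the endianness question concrete, here is a small sketch that can be run on a desktop machine. The helper names pack_be, pack_le and pack_ne are invented for this example; the from_*_bytes functions are the standard library ones.

```rust
/// `header` always ends up as the HIGH byte of the u16, on every CPU.
fn pack_be(header: u8, data: u8) -> u16 {
    u16::from_be_bytes([header, data])
}

/// Same input bytes, but the first byte becomes the LOW byte.
fn pack_le(header: u8, data: u8) -> u16 {
    u16::from_le_bytes([header, data])
}

/// Native endianness: behaves like `pack_le` on little-endian targets
/// and like `pack_be` on big-endian ones, so the result is not portable.
fn pack_ne(header: u8, data: u8) -> u16 {
    u16::from_ne_bytes([header, data])
}
```

With header 0x0C and data 0x01, pack_be gives 0x0C01 and pack_le gives 0x010C. On a little-endian core such as the Cortex-M0+ in the RP2040, pack_ne matches pack_le, so the header lands in the low byte of the u16 rather than the high byte.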

@[email protected] · 2 years ago

Hi, thank you. It took me a while, but I experimented a little bit. I have not yet tried to fix the max7219 library, though. I think it is from_be_bytes (the other one didn’t work).

But one thing that I am not understanding (I think this is a “can’t tell the forest from the trees”-situation) is how exactly multiple 8x8-matrices are connected i.e. how the data-stream looks exactly.

In your example (from the max7219-library) it seems like if I use 4 devices I send 4 times a u16 out and the 4 connected Max7219’s figure out themselves which one is meant?

@orclev · 2 years ago

It took me a little while to figure this out between reading the datasheet for the Max7219 and looking at the source code. Basically, it’s taking advantage of a feature of the Max7219 that allows daisy chaining multiple chips off the same SPI connection. To use this feature, you take N Max7219 chips and wire all their CS and CLK pins together with your controller, then run the connection from the controller to the first chip’s DIN port, and then the DOUT port of the first chip to the DIN port of the next chip. Keep connecting DOUT to DIN to daisy chain all the chips together.

In the datasheet for the Max7219 there’s this section:

For the MAX7219, serial data at DIN, sent in 16-bit packets, is shifted into the internal 16-bit shift register with each rising edge of CLK regardless of the state of LOAD. For the MAX7221, CS must be low to clock data in or out. The data is then latched into either the digit or control registers on the rising edge of LOAD/CS. LOAD/CS must go high concurrently with or after the 16th rising clock edge, but before the next rising clock edge or data will be lost. Data at DIN is propagated through the shift register and appears at DOUT 16.5 clock cycles later

Essentially, what that all boils down to is that each Max7219 maintains a 16-bit internal shift register, so as each bit is received on DIN it’s pushed onto the register, and the highest bit of the register gets pushed out to DOUT. When you daisy chain multiple chips together, it’s effectively like concatenating all their shift registers together. So if you have 4 chips, that’s 64 bits of register. If you write 64 bits out to MOSI, the first 16 bits will end up on the farthest chip, the next 16 on the next closest, etc. Switching the CS pin from low to high is the trigger for the Max7219 to actually lock in and read the contents of those shift registers. The way the driver crate’s code is structured, that’s the purpose of the buffer field in the various Connector structs. So if you have, say, 4 chips, you need 4 x u16 of storage, and each write cycle you write all 4 u16 values out, one to each daisy-chained device. Technically the driver is less efficient than it could be, in that it takes advantage of the fact that writing 0 to a chip is a no-op, so in practice, while it does write to every device each time, when you call write_raw it actually zeroes the buffer for all but the selected chip.
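The concatenated-shift-register idea can be checked with a tiny simulation. This is a simplified model and not the chip itself: it ignores the half-clock delay on DOUT mentioned in the datasheet, and the function name shift_in is made up for this example. Index 0 of the slice stands for the chip wired directly to MOSI.

```rust
/// Clock `words` (most significant bit first) into a chain of 16-bit
/// shift registers. `chain[0]` is the chip closest to the controller.
fn shift_in(chain: &mut [u16], words: &[u16]) {
    for &word in words {
        for bit in (0..16).rev() {
            // Bit currently presented on MOSI / DIN of the first chip.
            let mut carry = (word >> bit) & 1;
            for reg in chain.iter_mut() {
                let out = *reg >> 15;       // bit leaving this chip on DOUT
                *reg = (*reg << 1) | carry; // shift in the incoming bit
                carry = out;                // DOUT feeds the next chip's DIN
            }
        }
    }
}
```

Starting from four zeroed registers and shifting in the words 0x000D, 0x000C, 0x000B, 0x000A in that order leaves 0x000A in chain[0] and 0x000D in chain[3]: the first word sent ends up on the farthest chip.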

If you think about a sequence of chips, let’s say once again 4 of them labeled A to D, they would be connected like so:

RP-Pico-MOSI----DIN-A-DOUT----DIN-B-DOUT----DIN-C-DOUT----DIN-D-DOUT
       -CS----------CS------------CS------------CS------------CS
       -CLK---------CLK-----------CLK-----------CLK-----------CLK

Then you write to all four chips like so:

Set CS low
Write u16 for D
Write u16 for C
Write u16 for B
Write u16 for A
Set CS high
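A sketch of how the words for one such CS cycle could be laid out when only a single chip should receive a real command. The function name frame_for and the indexing convention (target 0 is chip A, the one wired to MOSI) are assumptions made for this example and not necessarily how the crate numbers its devices. All other chips receive 0x0000, which is the no-op register.

```rust
/// Build the words to send in one CS-low/CS-high cycle so that only one
/// chip gets a real command. `target` counts from the controller:
/// 0 is the chip wired to MOSI, `chain_len - 1` is the farthest chip.
fn frame_for(chain_len: usize, target: usize, packet: u16) -> Vec<u16> {
    assert!(target < chain_len, "target chip is outside the chain");
    // 0x0000 addresses the no-op register, so untouched chips ignore it.
    let mut frame = vec![0u16; chain_len];
    // The first word sent travels farthest, so the closest chip's word goes last.
    frame[chain_len - 1 - target] = packet;
    frame
}
```

With four chips, frame_for(4, 3, 0x01FF) puts the packet first (it has to travel to chip D), while frame_for(4, 0, 0x01FF) puts it last (it stays in chip A).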

@[email protected] · 2 years ago

Thank you.

I’ve actually read the section you quoted a few times and my brain just couldn’t parse it. But I finally understand how the Max7219 does this. I had thought about it completely wrong. It’s just shifting all the bits through from chip to chip; so obvious now.

I think I will take a step back and not use SPI for a while, and just do the bit-banging thingy first to get more familiar with it.

I’ve read somewhere that SPI is faster, and I guess it’s cheaper for the CPU, as the CPU doesn’t have to set the pins high or low on each cycle. Instead (I guess) the CPU can simply call an SPI-out function once, and the SPI peripheral does its thing for a while so the CPU can do other things.

But right now I don’t do much on the rest of the CPU yet, so I can afford to do it manually.

Just one other question regarding multiple displays: as e.g. 4 displays requires 4x16bits does this mean that there would have to be a Write-trait implemented somewhere (or Write<[u16;4]>)?

Could it be that the max7219 crate is incomplete here? The write function you corrected seems like it was copied 1:1 from the C++ lib (LedControl).

@orclev · 2 years ago

Just one other question regarding multiple displays: as e.g. 4 displays requires 4x16bits does this mean that there would have to be a Write-trait implemented somewhere (or Write<[u16;4]>)?

Nope. The Write trait indicates the size of the “packet” that’s written on the SPI bus; it’s the equivalent of the DS generic on the Spi struct. The way SPI works is that when you pull CS low, the device is notified that it needs to start listening on MOSI, at which point you’re free to start sending it packets. There’s no requirement that you send only a single packet; you can send as many as you want. However, many devices have special rules about processing with respect to the state of the CS pin. E.g., just like with the Max7219, it’s common for devices to buffer commands and not actually process them until CS goes high.

The only reason the Write trait and the Spi generic matter is that they define the minimum number of bits that will be written to the bus (or, more concretely, the stride the SPI controller uses when reading and writing from its buffers). That’s why using u8/8 as the parameter mostly works, apart from occasionally showing strange behavior. Using u16 guarantees that it always writes a number of bits that’s a multiple of 16, while using u8 can allow essentially half a packet to be written.
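If only a byte-wise write is available, one way to avoid ever producing half a packet is to build the byte buffer from whole u16 words. A minimal sketch, with the helper name to_wire_bytes invented for this example, and assuming the high byte (the one carrying the address) has to go out first:

```rust
/// Turn whole 16-bit packets into the byte sequence for a u8-based write.
/// Each word contributes exactly two bytes, high byte first, so the
/// resulting length is always a multiple of two.
fn to_wire_bytes(words: &[u16]) -> Vec<u8> {
    words.iter().flat_map(|w| w.to_be_bytes()).collect()
}
```

For the two words 0x0C01 and 0x0000 this yields the bytes 0x0C, 0x01, 0x00, 0x00.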

As for bit banging vs. the SPI controller, it’s essentially the same idea as DMA, if you’re familiar with that concept. With bit banging, the CPU spends time toggling the various pins on and off, which, although fast, is still relatively slow by communication standards. It puts an upper limit on the speed at which data is transmitted on the SPI bus, and that limit is directly tied to the frequency of the CPU and the number of cycles it takes to toggle a pin (per bit, at minimum two toggles of CLK plus possibly one of MOSI). With the SPI controller, on the other hand, the CPU writes bytes into memory, passes essentially a couple of pointers to the SPI controller, and then flips some bits in a register. The CPU does need to pause occasionally to refill the buffers, but that’s a relatively fast operation and is mostly decoupled from the actual bus speed of SPI.
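For the bit-banging route, here is a rough sketch of the pin sequence. The Pins trait and the FakeChip type are hypothetical stand-ins invented for this example, not part of any HAL; on real hardware the trait methods would drive GPIO pins, and small delays may be needed to stay within the chip’s timing limits. FakeChip behaves like a single 16-bit shift register so the logic can be tried on a desktop.

```rust
/// Hypothetical pin abstraction; a real implementation would drive GPIOs.
trait Pins {
    fn set_mosi(&mut self, high: bool);
    fn set_clk(&mut self, high: bool);
    fn set_cs(&mut self, high: bool);
}

/// Shift out all words, most significant bit first, inside one CS cycle.
fn bitbang_write(pins: &mut impl Pins, words: &[u16]) {
    pins.set_cs(false);
    for &word in words {
        for bit in (0..16).rev() {
            pins.set_clk(false);
            pins.set_mosi((word >> bit) & 1 == 1);
            pins.set_clk(true); // rising edge: the chip samples DIN
        }
    }
    pins.set_clk(false);
    pins.set_cs(true); // rising edge on CS latches the shifted data
}

/// Test double: one 16-bit shift register that latches on a CS rising edge.
#[derive(Default)]
struct FakeChip {
    mosi: bool,
    clk: bool,
    cs: bool,
    shift: u16,
    latched: Option<u16>,
}

impl Pins for FakeChip {
    fn set_mosi(&mut self, high: bool) {
        self.mosi = high;
    }
    fn set_clk(&mut self, high: bool) {
        if high && !self.clk {
            // Rising clock edge: shift the current MOSI level in.
            self.shift = (self.shift << 1) | self.mosi as u16;
        }
        self.clk = high;
    }
    fn set_cs(&mut self, high: bool) {
        if high && !self.cs {
            self.latched = Some(self.shift);
        }
        self.cs = high;
    }
}
```

Calling bitbang_write with a FakeChip and the single word 0x0A55 leaves Some(0x0A55) in the latched field. Each bit costs the CPU three pin writes here, which is the overhead the SPI controller takes off its hands.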

Manually implementing SPI with bit banging is probably a good learning exercise, but understanding how to properly use the SPI controller is also good to know. For an extra challenge, you can usually also set up the SPI buffer to be managed using DMA, which is the most efficient way to handle things. I would suggest that configuring a u16 buffer sized to the number of devices and then using DMA to write its contents out through the SPI controller would be a very educational exercise.