`Spi` implementation relies on fragile timing #657
Comments
@Felix-El, I skimmed your post, but I will need some additional time to go through it in detail. I suspect the root cause is a problem I noticed before but haven't fixed yet. Maybe you can confirm? I realized that I've been enabling the receiver ( Right now, I'm working to get #450 ported to D11 and D21 chips. Then I want to go through and incorporate it into the |
@bradleyharden, I have only used the |
@Felix-El, I finally took the time to read your first post. I'm confused though. You talk about being 0, 1 or 2 reads behind (0, -1, -2). But unless I'm mistaken, that has nothing to do with overflow, at least not directly. There's a separate TX buffer, and you can fill that with words, so that it always has the next byte ready to transmit. And I believe the TX buffer is at least 2 deep, so you should be able to reach -2 pretty much immediately. If overflow is happening, it's not because you're filling the TX buffer too quickly, it's because the RX buffer isn't being read fast enough.

Your solution does indeed prevent overflow, but it does it in a brute-force manner, by simply never transmitting the next word until you receive the current word. As you say, that has a number of downsides.

Let's take a step back for a second. I believe the implementation you showed is for a

My first thought is that you should re-evaluate your architecture. At the very least, it seems like you should be using high priority interrupts for this. But with only 48 clock cycles, interrupt latency is likely problematic. A better approach is probably to move to DMA. If my assumptions above are correct, I'm honestly surprised it works as well as it does without DMA. I had a similar problem, where I needed to service a SPI bus quickly, and I ran into problems immediately. I switched to DMA and haven't had a problem since.

One last question. Are you familiar with RTIC? It's really useful when you need hard real-time guarantees. I would highly recommend it.
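For reference, the lock-step behaviour described above (never transmitting the next word until the current word has been received) could be sketched roughly as follows. This is only an illustration and not the code from the original post: the `WordPort` trait, `transfer_lockstep` and the `Loopback` mock are hypothetical names invented for this sketch and are not part of `atsamd-hal`.

```rust
/// Hypothetical minimal word-level interface (an assumption for this sketch,
/// not the atsamd-hal API).
trait WordPort {
    /// Try to queue one word for transmission.
    /// Returns `false` if the data register is still occupied.
    fn try_write(&mut self, word: u8) -> bool;
    /// Try to fetch one received word; `None` if nothing has arrived yet.
    fn try_read(&mut self) -> Option<u8>;
}

/// Exchange `words` in place, strictly one word at a time: the next word is
/// only written after the reply to the previous one has been read, so at most
/// one received word is ever outstanding.
fn transfer_lockstep<P: WordPort>(port: &mut P, words: &mut [u8]) {
    for word in words.iter_mut() {
        // Busy-wait until the word is accepted for transmission.
        while !port.try_write(*word) {}
        // Busy-wait until the matching received word is available.
        *word = loop {
            if let Some(rx) = port.try_read() {
                break rx;
            }
        };
    }
}

/// Mock used only to exercise the sketch: replies with the bitwise
/// inverse of every word written.
struct Loopback {
    pending: Option<u8>,
}

impl WordPort for Loopback {
    fn try_write(&mut self, word: u8) -> bool {
        if self.pending.is_some() {
            return false; // previous reply has not been read yet
        }
        self.pending = Some(!word);
        true
    }

    fn try_read(&mut self) -> Option<u8> {
        self.pending.take()
    }
}
```

The cost of this approach is the idle time between words, which is the downside mentioned above; the benefit is that correctness no longer depends on how quickly the loop comes back around.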
@bradleyharden, thanks for looking into this.
The TX buffer is, strictly speaking, just one word deep, but the shift register could be considered a second storage element.
Correct, that's exactly what I meant. If we send off 2 TX words, we risk not picking up the first RX word before the second is received, since reception happens asynchronously at the SPI controller's pace.
Absolutely.
Yes that is the case.
Yup.
Well, a 20 MHz clock is just the toggle rate of the SCK line. If at times a master does not manage to provide/read data at "real-time" speeds, there should only be a gap between words on the bus (idle time, SCK held). Or at least that is my expectation.
Yes exactly, without DMA we are at risk of triggering the reported behavior, i.e. if interrupt processing takes too long in between TX write and RX read.
Is there an example of using DMA with this crate? I'm generally open to using it as well. Still, this does not resolve the issue at hand.
I have not used RTIC so far and the solution I found is good enough for my use case. However, I'm worried about the implied real-time requirements of using SPI here. Being the master and having a HW SPI controller (not bit-bang) I don't expect to be forced to observe that many details. Let's call this hidden complexity. I understand the solution I chose is conservative but at least it works reliably. Should this behavior not be the default?
For what it's worth, I ran into this issue on one of my projects as well, talking to a stepper driver (TMC5130A). Your alterations are working for me.
I'm glad it was of use to someone else too.
I've not yet studied this, but think it might relate to point 2 in #751.
I'm working on a project where a Feather M4 board connects to a Wiznet W5500 Ethernet chip via SPI for reading and processing raw Ethernet frames. The ATSAMD is running at 120 MHz and the SPI Master via SERCOM1 is operated at 20 MHz.
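To give a rough sense of the timing budget implied by these clock rates, here is a small back-of-the-envelope calculation. It assumes 8-bit SPI words transferred back-to-back (the word size is an assumption); the function names are made up for this illustration.

```rust
/// CPU cycles that elapse while one SPI word is on the wire.
/// `bits_per_word` is an assumption (8-bit character size).
fn cycles_per_word(cpu_hz: u64, sck_hz: u64, bits_per_word: u64) -> u64 {
    // One word occupies `bits_per_word` SCK periods, and each SCK period
    // lasts `cpu_hz / sck_hz` CPU cycles.
    cpu_hz * bits_per_word / sck_hz
}

/// Duration of one word on the wire, in nanoseconds.
fn word_time_ns(sck_hz: u64, bits_per_word: u64) -> u64 {
    bits_per_word * 1_000_000_000 / sck_hz
}
```

With a 120 MHz core and a 20 MHz SCK, `cycles_per_word(120_000_000, 20_000_000, 8)` gives 48 cycles and `word_time_ns(20_000_000, 8)` gives 400 ns, so anything that delays the loop by more than a few dozen cycles already spans a whole word.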
While generally all parts reached a working state by themselves (SPI communication, USB ACM, GPIOs and a custom SysTick based timer) I soon started seeing issues which were hard to diagnose. The SPI SERCOM started reporting overflow errors after some time (10-60 s into operation), and only when building with the release profile; unoptimized builds seemed to work.
After days of digging I've realized that my SysTick exception (at only a 1 ms rate, doing nothing but incrementing a SW counter) could influence the timing enough for SPI `write`/`transfer` to break! I expect any other interrupt could do the same.

The issue seems to be the implementation of Duplex `write` and `transfer` within `src\sercom\spi\impl_ehal_thumbv7em.rs`.

As they are really similar, let's take `transfer` as the base for discussion:

The `while` loop is there to ensure that we read as many words as we write. This alone however is not sufficient. As the SERCOM only offers a buffer for one word, reading may not lag behind by more than one word. Let's call this situation `(-1)`.

To understand the issue I recorded the contents of `INTFLAGS` of the SERCOM in response to sending just one single word (by writing the `DATA` register). It turns out to be `[0, ..., 1, ..., 5, ..., 7, ...]`, where the flag bits are:

- `DRE = 1`
- `TXC = 2`
- `RXC = 4`
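The recorded values can be decoded with a few lines of Rust. This is a minimal sketch that only uses the three bit values given above; `decode_intflag` is a helper name made up for this illustration.

```rust
const DRE: u8 = 1; // Data Register Empty
const TXC: u8 = 2; // Transmit Complete
const RXC: u8 = 4; // Receive Complete

/// Return the names of the flags set in a raw INTFLAGS snapshot.
fn decode_intflag(value: u8) -> Vec<&'static str> {
    let mut names = Vec::new();
    if (value & DRE) != 0 {
        names.push("DRE");
    }
    if (value & TXC) != 0 {
        names.push("TXC");
    }
    if (value & RXC) != 0 {
        names.push("RXC");
    }
    names
}
```

Applied to the recorded sequence, `0` decodes to no flags, `1` to `DRE`, `5` to `DRE` and `RXC`, and `7` to all three.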
As you can see, the `DRE` bit is set first, so depending on execution times (optimization, interrupts, ...) it is unpredictable what happens. Assume we start in the first iteration of the `while` loop:

- `Iteration 1`: First `if` hits because `DRE` is set => transmit first word. Second `if` does not hit as `RXC` was clear at time of sampling. We're one read behind (-1).
- `Iteration N>1`: now it depends at what time we take the next snapshot of the `INTFLAGS`:
  - `DRE` and `RXC` are observed, and the first `if` hits and we transmit (-2) but shortly after (hopefully) the second `if` hits and we manage to read a word to get back to (-1) avoiding overflow.
  - `DRE` is observed -> same situation as `Iteration 1`. We're now (-2) and in the following iterations we'll have to catch up reading with even closer timing to avoid the overflow.

It appears the existing code is quite sensitive to timing variations - one could say it has real-time requirements of its own.
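The timing sensitivity can be illustrated with a deliberately simplified model. This is a toy simulation under stated assumptions, not a description of the real SERCOM: one tick equals one word time, the software loop runs once per tick unless it is stalled (for example by an interrupt), and the receiver can hold exactly one unread word. All names (`Sercom`, `Strategy`, `overflows`) are invented for this sketch.

```rust
/// Assumption taken from the description above: room for one unread RX word.
const RX_CAPACITY: usize = 1;

#[derive(Default)]
struct Sercom {
    data_full: bool, // a word waits in the TX data register (DRE clear)
    shifting: bool,  // a word is currently in the shift register
    rx_count: usize, // unread received words
    overflow: bool,
}

impl Sercom {
    /// Advance the hardware by one word time.
    fn tick(&mut self) {
        if self.shifting {
            // The word on the wire completes and produces one RX word.
            self.shifting = false;
            if self.rx_count == RX_CAPACITY {
                self.overflow = true;
            } else {
                self.rx_count += 1;
            }
        }
        if self.data_full {
            // The next word moves from the data register to the shifter.
            self.data_full = false;
            self.shifting = true;
        }
    }
}

#[derive(Clone, Copy, PartialEq)]
enum Strategy {
    Eager,    // write whenever DRE is set, read whenever RXC is set
    Lockstep, // write only when every word sent so far has been read back
}

/// Returns `true` if the modelled receiver overflows while the software is
/// stalled for `stall_ticks` ticks starting at tick `stall_at`.
fn overflows(strategy: Strategy, words: usize, stall_at: usize, stall_ticks: usize) -> bool {
    let mut hw = Sercom::default();
    let (mut sent, mut received) = (0, 0);
    let mut t = 0;
    while received < words && !hw.overflow && t < 1000 {
        let stalled = t >= stall_at && t < stall_at + stall_ticks;
        if !stalled {
            let may_write = match strategy {
                Strategy::Eager => true,
                Strategy::Lockstep => sent == received,
            };
            if may_write && !hw.data_full && sent < words {
                hw.data_full = true;
                sent += 1;
            }
            if hw.rx_count > 0 {
                hw.rx_count -= 1;
                received += 1;
            }
        }
        hw.tick();
        t += 1;
    }
    hw.overflow
}
```

In this model the eager loop completes four words without trouble when it is never stalled, but a stall of a single word time at the wrong moment (`overflows(Strategy::Eager, 4, 2, 1)`) is enough to overflow, because one word is unread while another is already being shifted. The lock-step variant never has more than one word in flight, so it tolerates stalls of any length at the price of idle time on the bus.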
I was able to write a workaround for my project using a thin newtype wrapper:
This solved the issue in my case so I'm sharing these findings. I believe a conservative implementation like this should be shipped with the crate, as a safer default, even though I see a few subtle issues:
I'm willing to help improve the implementation but I feel like I neither have enough experience nor the tools to analyze the outcome at the waveform level.
(Find attached the Rust program I used for recording the `INTFLAGS` register after sending a word. This is all assembly but SERCOM1 must be configured (in Rust) prior to running this.)

Crates used: