The Youscope demo is a hard act to follow up on, but I’ve had this scope screenshot for a while now and I thought it was so neat that I wanted to talk about it a bit on the blog.
Before I dive into the post, let me say that the best accessory I ever bought for my scope (a Tek TDS5104B–don’t let anyone ever tell you an Agilent is better than a Tek!) is the P6245 active probe. After suffering for many years with a Kikusui 100 MHz analog scope with second-hand passive probes, the quality of measurements I get with the active probe and the TDS5104B brings a tear to my eye. For the first time ever I can see traces that actually look like the darn SPICE simulations.
Here’s the background behind what we’re looking at in the screenshot. The top trace is the data line of a memory bus; the bottom line is the clock. The trigger is set on the data line–not the clock line as you’d expect. The test pattern being run is a repeated 0xFFFFFFFF-0x00000000 transition using four different bus drive strengths in sequence on the microprocessor. You can see each of the four drive strengths on the bottom clock trace, for example, at point “C”.
The particular bug this trace is supposed to capture is a ground bounce problem, as shown at point “B”. The rising transition “F” is what causes the ground bounce at “B” (note that the dip at “B” does not happen anywhere else). “F” trails the ground bounce event because the cable of the passive probe used for the Data line has a fairly long delay–another measurement was used to calibrate out this fixed delay. The time difference of F – A approximates the delay differential between the active and the passive probes. You can also clearly see the rise time contrast between the two probes in this screen shot.
I think it is so cool that one can so clearly capture a problem as difficult to characterize as on-chip ground bounce using only conventional probes on external signals.
The other really neat thing about this set of scope traces is that it shows the timing of both the microprocessor->memory and the memory->microprocessor in a single shot. This is because I’m triggering off of the data line, not the clock line. On the data trace, you can clearly see the Processor-driven data trajectory (“D”) and the Memory-driven data trajectory (“E”). On the clock trace, you can see two phase-shifted versions of the clock. One of them is the clock timing relative to when the CPU drives the bus–this is G-F. The other one is the clock timing relative to when the Memory drives the bus–this is C-F (minus a clock period). So in a single picture, you can divine the available setup/hold margin in both directions of the bus!
You can also see other “good things” in the scope trace, such as the auxiliary measurements for frequency, amplitude, duty cycle, and rise time. You can also see how the clock trace is pretty well formed overall, with minimal over/undershoot, and you can get an idea of how much a passive scope probe introduces overshoot artifacts by contrasting it against the top trace. This picture is truly worthy of a thousand words!
OK, so I’m a real geek for getting so excited about a screen shot like this, but really, after teaching this sort of stuff for several years at MIT and then running countless simulations of chips to validate scenarios like this, it’s somehow very satisfying to be able to go into lab and actually see that the real world does match up with theory so nicely (even if it is a bug). If I ever teach digital design again, this shot is going into my slides and my problem sets.
And, yes, this was a problem found on the chumby DVT prototypes some months ago and it has since been resolved. This ground bounce, under certain conditions, would upset the internal clock multiplier of the CPU. The fix involved multiple improvements to the board layout, but in the end nothing can compensate for the relatively high inductance designed into the chip package. Therefore, the most important fix was to use a much higher clock frequency reference so the multiplication factor was only on the order of 20x instead of 10,000x (a 16 MHz reference instead of a 32.768 kHz reference). Reducing the period of ground-bounce noise integration by a factor of roughly 500 resolved the stability problems of the internal VCO of the CPU’s clock multiplier PLL.
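For anyone who wants to check the arithmetic behind those numbers, here is a rough sketch. The two reference frequencies come from the post; the 320 MHz core clock is purely an assumed, illustrative value picked so that the 16 MHz case lands on 20x, and the helper names are made up for this example.

```python
# Back-of-the-envelope check of the PLL numbers quoted above.
# F_CORE_HZ is an ASSUMED illustrative core clock, not a figure from the post.

F_REF_SLOW_HZ = 32_768        # 32.768 kHz watch-crystal reference (from the post)
F_REF_FAST_HZ = 16_000_000    # 16 MHz reference (from the post)
F_CORE_HZ = 320_000_000       # hypothetical core clock, chosen for illustration


def pll_multiplier(f_out_hz: float, f_ref_hz: float) -> float:
    """Multiplication factor the PLL must provide to reach f_out from f_ref."""
    return f_out_hz / f_ref_hz


def reference_period_s(f_ref_hz: float) -> float:
    """Time between reference edges, i.e. how long the VCO free-runs
    (and integrates noise) before the loop gets its next correction."""
    return 1.0 / f_ref_hz


mult_slow = pll_multiplier(F_CORE_HZ, F_REF_SLOW_HZ)   # on the order of 10,000x
mult_fast = pll_multiplier(F_CORE_HZ, F_REF_FAST_HZ)   # 20x
period_ratio = reference_period_s(F_REF_SLOW_HZ) / reference_period_s(F_REF_FAST_HZ)

print(f"multiplier with 32.768 kHz reference: {mult_slow:.1f}x")
print(f"multiplier with 16 MHz reference:     {mult_fast:.1f}x")
print(f"noise integration window shrinks by:  {period_ratio:.1f}x")
```

The ratio of the two reference periods comes out a little under 500, which is where the "factor of 500" above comes from. It does not depend on the assumed core clock at all.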
What top-level problem(s) did this cause to the user that indicated there was an issue with the ground bounce?
I have a couple of active designs where the main CPU clock is a PLL-multiplied 32.768 kHz rock (Freescale MC68332). Never again. I’ll run a 4 MHz clock box through a divider chain if need be. (In my case, it’s not the PLL that bothers me, it’s that that little crystal is so sensitive and hard to start if the board isn’t perfect, or if it’s a little bit cold.)
Re: top-level problems–that was a tough one. Superficially, it looks like the system crashes, so it’s hard to tell the difference between this and a less severe kernel panic or application crash without attaching to the diagnostic port. The key thing that tipped me off toward the ground bounce direction is that units could run quite stably for days and then they would start crashing within hours or minutes when they started decoding video–the key being that video largely trashes the data cache, causing a lot of back-to-back, ground-bounce-inducing bus traffic.
The other diagnostic that was a telling indicator was that the serial port would occasionally send a bad character when the clock rate drifted sufficiently–since the PLL update rate was roughly 32 kHz, there was sufficient time between update edges to allow the serial port’s baud rate to drift out of spec. USB was also challenged by this but the result wasn’t as dramatic because the USB devices were fairly forgiving in tracking the drift and also USB has packet retry mechanisms that masked most errors.
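To put a rough number on "out of spec" for the serial port, here is a minimal sketch of the textbook idealized UART timing budget. It assumes an 8N1 frame, a receiver that resynchronizes on the start-bit edge and samples each bit at its center, and it ignores edge jitter and the receiver's oversampling granularity. None of the figures below were measured on the chumby, and the function names are hypothetical.

```python
# Idealized UART baud-rate mismatch budget (a sketch, not a measured result).
# Assumption: the receiver re-times itself on the falling edge of the start bit
# and samples every bit at its nominal center.

def max_baud_mismatch(data_bits: int = 8, parity_bits: int = 0) -> float:
    """Largest fractional baud-rate error before the stop bit is mis-sampled.

    The last sample point sits at the center of the first stop bit, which is
    (start bit + data bits + parity bits + 0.5) bit periods after the start
    edge. The accumulated timing error there must stay under half a bit.
    """
    last_sample_bits = 1 + data_bits + parity_bits + 0.5
    return 0.5 / last_sample_bits


def frame_survives(clock_drift_fraction: float) -> bool:
    """True if a frame sent with this fractional clock error still decodes
    under the idealized budget above."""
    return abs(clock_drift_fraction) < max_baud_mismatch()


budget = max_baud_mismatch()
print(f"idealized 8N1 mismatch budget: {budget * 100:.2f}%")
print("2% drift ok?", frame_survives(0.02))
print("6% drift ok?", frame_survives(0.06))
```

Real receivers get noticeably less than this ideal budget, so a clock that wanders by a few percent between PLL corrections is enough to produce the occasional bad character.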
Let me get this straight … you were using a $20 100 MHz probe with a $20K ‘scope that has 1 GHz analog bandwidth? WTF?
This gentleman has an excellent article describing how to build a high quality DC-1GHz probe: http://emcesd.com/1ghzprob.htm . He also has many articles on his site describing how to probe accurately.
hahah no…the 100 MHz probes were with the *old* analog scope. It is amazing how much mileage I got out of that thing though…
The new scope came with 4x P5050’s which are 500 MHz probes, but I also bought one 1 GHz active probe as well, and as noted, it was well worth the purchase. Let’s just say the TDS5104B was the first scope I’ve had which was “worthy” of an active probe.
Eric — cool link! My grad adviser at MIT (Tom Knight) taught me about those probes and I built a few of them. I ended up using those, plus a couple of pieces of long coax with taps cut into them to teach transmission line theory for 6.004 back when Gill Pratt was in charge of it.
Actually I did a run-off just recently between a coax probe like that and the active probe, and the active probe does have better AC performance, but only slightly. Darn, I have a hard copy of it here but I can’t find the screenshot of the result!
The big detractor of the coax is its stiffness–it gets unwieldy to put more than one on a board at a time, and it puts a lot of stress on the lead you are measuring. Another problem is the loss of gain, but then again the active probe does have its own issues with a slight gain offset. The coax does also present a greater load on the circuit than the active probe does, which can affect the circuit behavior that you are trying to capture.
I work at a test equipment company (operations side), and I have an abundance of .pdf test equipment manuals if anyone needs any. I also have links to other free manual download sites. This is not a solicitation, I swear!
This is classic poor plane capacitance coupled with package/trace/via inductance.
Did you go to a 4/6 layer PCB?
I highly recommend reading this:
www_speedingedge_com/PDF-Files/emcvccgndbounce.pdf
Lee Ritchey also has a book that has lots of real world examples.
It seems you have a good base on this topic, you seem to know what you’re talking about and I do appreciate and enjoyed this post. I shall return to your site! Thank you.