Thursday, August 28, 2014

DDR3 data routing for x16 parts

I tried to length match each data byte last night and immediately ran into trouble. Although I had left room for tuning wiggles, the mismatch turned out to be too great. Figure 1 shows the location of the lower and upper bytes in the route. My goal was to route both data bytes on the same layer.

Figure 1. Single Layer Data Route with Bytes Annotated.

Typically, six-layer DDR3 routes require four signal layers, one ground plane, and one power plane. So far, I've assumed that with only a single DDR3 part I could drop the number of signal layers from four to three, since there are no control/address signals to connect between DDR3 ICs.

Everything was going great until I started to tune the data bytes. The problem was with the lower byte (D0 to D7). Since the Lattice ECP5 requires that each DDR3 byte lane (data byte, strobe, and data mask) connect to a single DQS group on the ECP5, my route groups the bytes together. This is fine for the upper byte, but half of the lower byte (D1, D3, D5, D7, and DM0) has to swing wide to the left to escape the DDR3 IC without crossing the upper byte. This swing horribly mismatches the trace lengths within the byte:

  • D0 length: 877 mils
  • D1 length: 1371 mils
  • D2 length: 861 mils
  • D3 length: 1532 mils
  • D4 length: 956 mils
  • D5 length: 1619 mils
  • D6 length: 892 mils
  • D7 length: 1460 mils
The average mismatch between odd and even lower byte bits is 599 mils. Figures 2 and 3 show D0 before and after tuning. Since I could only remove 291 mils of mismatch, I'm doubtful that I will be able to tune the lower byte as routed.

Figure 2. Non-tuned D0 Route (Length = 877 mils)

Figure 3. Tuned D0 Route (Length = 1168 mils)
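
The arithmetic behind those numbers can be checked with a short Python sketch. The lengths are copied from the list above; the variable names are just illustrative, and using the longest trace (D5) as the matching target is my assumption here, not a rule from the ECP5 or DDR3 documentation.

```python
# Routed lengths of the lower data byte in mils, copied from the list above.
lengths = {
    "D0": 877, "D1": 1371, "D2": 861, "D3": 1532,
    "D4": 956, "D5": 1619, "D6": 892, "D7": 1460,
}

even = [lengths[f"D{i}"] for i in range(0, 8, 2)]  # short, direct routes
odd = [lengths[f"D{i}"] for i in range(1, 8, 2)]   # routes that swing wide

avg_even = sum(even) / len(even)   # 896.5 mils
avg_odd = sum(odd) / len(odd)      # 1495.5 mils
avg_mismatch = avg_odd - avg_even  # 599.0 mils

# Length added to D0 by the tuning wiggles (Figure 2 vs. Figure 3).
tuned_d0 = 1168
added = tuned_d0 - lengths["D0"]   # 291 mils

# Assumption: every bit is matched to the longest trace in the byte (D5).
target = max(lengths.values())     # 1619 mils
shortfall = target - tuned_d0      # 451 mils still missing on D0

print(f"average odd/even mismatch: {avg_mismatch} mils")
print(f"added by tuning D0: {added} mils, still short by {shortfall} mils")
```

Under that assumption, the tuned D0 is still well short of D5, which is consistent with my doubts about tuning the lower byte as routed.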

This leaves me with a few ideas about how to proceed:
  1. Use a fourth signal layer to route the lower byte. While this doesn't meet my goal for such a simple DDR3 design, it looks like a pretty standard solution.
  2. Try to more aggressively select ECP5 DQS groups to route the entire x16 data set. This may be possible because the DQS groups are stacked roughly in pairs from the top to the bottom of the ECP5 (per side). This approach might work, but it works heavily against my address/control routing method, where I bring all address/control nets in on the top layer.
  3. Switch to two x8 DDR3 ICs. With each byte coming from a different IC, I can add more vertical space between the parts to ease routing.
After talking to Kevin, I've decided to go with option 1 as long as the impact on the rest of the layout is minimal.

Creative Commons License
The PCB layout shown in this post by Custom Embedded Solutions, LLC. is licensed under a Creative Commons Attribution 4.0 International License.
