Friday, November 28, 2014

And the sparks fly...

I'm still trying to wrap my head around it, but unfortunately I destroyed our only working Titan prototype while testing it. The build went very well, and I got a lot of good data from testing the card. While Kevin and I consider our next steps, I'll review how we got here.

Building

The rev B board assembly went very smoothly. I simplified the build by not loading the USB programming circuit, and I was able to place and reflow the board myself (including the two BGA parts). The only items that required manual rework after the build were two twisted capacitors and the switching power supply IC in a QFN package at U10. U10 also skewed in the same direction on the rev A build so we are going to examine this issue more deeply later.

Figure 1. U10, Skewed by a Pad.

We decided to simple remove U10 and bypass the 12V to 3.3V power supply and use the PCI Express connector's 3.3V rail instead. With U10 removed, this just required a simple mod-wire.

Testing

Prior to powering the board I measured the impedance to ground on each voltage rail:

RailLocation MeasuredImpedance
3.3VC866 kOhms
2.5V_AC604.8 kOhms
1.1V_AC314.3 kOhms
1.1VC5979 Ohms
2.5VC674.1 kOhms
1.5VC272 kOhms
     Table 1. Impedance Measurements.

RailLocation MeasuredVoltage
3.3VC863.472 V
2.5V_AC602.49 V
1.1V_AC311.099 V
1.1VC591.096 V
2.5VC672.480 V
1.5VC271.499 V
     Table 2. Voltage Measurements.

Using a Lattice USB programmer, I used Diamond to validate that the ECP5 was identified with a JTAG scan. I then loaded my titan-wiggle test project and two LED expansion boards to validate That 64 I/O pins (32 to each expansion header) all work as my video shows.

Next I rebooted the Linux PC that was powering Titan and used the lspci command see if the PCI Express interface enumerated; It did not enumerate. I was able to validate the PERSTn (the PCI Express reset signal) and REFCLK were being received by the FPGA. Titan-wiggle uses PERSTn as its reset; I simply issued a warm reboot (linux reboot command) and saw that the LED counter reset during the PC's reboot. I then changed the LED counter from the 50MHz oscillator to the 125MHz output from the PCIe core and the LED counting pattern speed increased as I expected.

This is when the damage to Titan occurred. The mod-wire bypassing U10 was connected to a 3.3V via very close to the PCIe connector. With Titan installed into the PC and powered I accidentally nudged the board and the mod-wire popped off the via. The wire was under tension and swung across the card, most likely contacting the 12V rail on nearby caps. This would have shorted the ECP5's 3.3V rail directly to 12V. I saw a spark and a little smoke. As we all know, it is nearly impossible to return all of the smoke to an IC once it escapes :p  I powered off the unit and checked the voltage rails and found that the 3.3V, 2.5V, and 1.1V rails were all shorted to ground.

While this is disheartening, mistakes do happen, and hopefully we can recover quickly. For now I will be focusing on firmware development using the Lattice PCIe Development Kit and developing theories about why the PCI Express interface did not enumerate

No comments: