Circuits for the Deskew Scheme
This chapter describes the deskew circuits developed for the F-RISC project. They have been simulated using the 50 GHz baseline HBT models. Because differential logic and a power supply voltage of -5.2 V have been chosen for the rest of the F-RISC chips, the deskew circuits are also differential and are designed for the same supply voltage. Because low chip yields are expected from the experimental GaAs/AlGaAs HBT technology, it is paramount that the circuits be as simple as possible. There is more than one way to realize the deskew scheme presented in this thesis; the circuit designs described in this chapter are just one of many possible implementations.
Fig. 3.1. Block diagram of the PLL
3.1. Analysis of the CLOCK Loops
The overall block diagram of a clock loop and a PLL can be modelled by the standard closed-loop control system shown in Fig. 3.1. In the following, the equations modelling the general closed-loop behaviour of the system are derived. From these equations the parameters of the filter stage can be determined to meet the overall deskew specification. The phase detector and the Voltage Controlled Delay Element (VCDE) blocks are represented by the constants KPD and KVCDE, respectively. The filter block is represented by the transfer function F(s). A continuous-time linear system is assumed throughout the analysis.
A simple integrator circuit has been found sufficient as the filter block of the PLL for clock deskew. It produces the desired over-damped closed-loop response while providing enough open-loop gain to keep the steady-state phase detector error, e_ss, small. An over-damped system is desirable for two reasons. First, there is no overshoot or ringing on the PLL outputs, as there would be in an under-damped system, so the system is more stable during the acquisition mode of the PLLs. Second, once the system has locked, a very stable output is necessary to avoid jitter; an over-damped system is more immune to the ever-present noise. The block diagram of the differential integrator is shown in Fig. 3.2. The transfer function of the circuit, assuming a finite constant amplifier gain of A, can be derived as,
where τ1 = RC(1+A) (see Appendix A). Note from Eq. 3.1 that the DC gain of the transfer function is equal to A. The parameters R and C are chosen to be as large as possible so that the bandwidth of the filter is minimized. A resistor of 3 kΩ (240 µm²) and a capacitor of 6 pF (0.05 mm²) were deemed to give an acceptable filter bandwidth and chip area.
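As a hedged sketch, a first-order form consistent with the two properties stated here (a DC gain of A and a time constant τ1 = RC(1+A)) can be written out together with the numerical value of τ1 for the chosen R and C. The sign convention and the use of A ≈ 220 (the simulated gain quoted later in this chapter) are assumptions; Appendix A remains the authoritative derivation.

```latex
% Assumed first-order form of the integrator with finite amplifier gain A
\[
F(s) \approx \frac{A}{1 + s\,\tau_1}, \qquad \tau_1 = RC\,(1 + A)
\]
% Assumed values: R = 3 kOhm, C = 6 pF, A = 220
\[
\tau_1 = (3\times10^{3}\,\Omega)\,(6\times10^{-12}\,\mathrm{F})\,(1 + 220)
       \approx 3.98\times10^{-6}\,\mathrm{s} = 3.98\,\mu\mathrm{s}
\]
```

Under these assumptions the result agrees with the 3.98 µs entry listed with Table 3.1.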
In the actual implementation of the circuit, a small 0.4 pF capacitor connects the midpoints of the input resistors. It prevents the high-frequency noise generated by the UP and DOWN signals of the digital phase detector from reaching the inputs of the amplifier circuit. The amplifier needs a certain amount of time to respond to its input signals. If the input changes too fast for the slew rate of the amplifier, the output contains high-frequency noise, which results in jitter.
A more accurate filter transfer function than Eq. 3.1 is obtained if a first-order approximation of the amplifier is used, where A = A0/(1 + sτ0) (see Appendix A); the result is Eq. 3.2.
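The substitution can be sketched as follows, assuming that Eq. 3.1 has the first-order form F(s) = A/(1 + sRC(1+A)). This is a reconstruction from the surrounding definitions, not a copy of Appendix A, and the symbols are those already used in the text.

```latex
% Substitute A = A_0 / (1 + s*tau_0) into the assumed first-order F(s)
\[
F(s) = \frac{\dfrac{A_0}{1 + s\tau_0}}
            {1 + sRC\left(1 + \dfrac{A_0}{1 + s\tau_0}\right)}
     = \frac{A_0}{(1 + s\tau_0)(1 + sRC) + sRC\,A_0}
\]
% Collect powers of s, with tau_1 = RC(1 + A_0)
\[
F(s) = \frac{A_0}{1 + s\,(\tau_0 + \tau_1) + s^{2}\,RC\,\tau_0},
\qquad \tau_1 = RC\,(1 + A_0)
\]
```

In this form the DC gain is still A0, and the amplifier pole adds a second-order term whose coefficient is the product RC·τ0.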
The open loop gain, G(s), can be expressed as,
with K0 = KPD·KVCDE·A0.
The closed loop system equation can be expressed as,
Similarly, the expected steady state error in response to the worst case step input is,
After applying the final value theorem,
e_ss = R0/(1 + K0), where R0 is the initial skew. (Eq. 3.6)
The design requirement, in the ideal case of perfectly matched driver and receiver delays, is,
Which leads to,
For an initial skew of R0 = 110 ps, the requirement K0 ≥ 10 needs to be met.
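The arithmetic behind this bound can be illustrated with a short sketch. It assumes that the steady-state error of this loop has the form e_ss = R0/(1 + K0), and that the error target is 10 ps; the 10 ps figure is inferred from the stated numbers (R0 = 110 ps and K0 ≥ 10), not quoted from the text. The function names are hypothetical.

```python
def steady_state_error_ps(initial_skew_ps: float, k0: float) -> float:
    """Steady-state error for a step of initial_skew_ps, assuming e_ss = R0 / (1 + K0)."""
    return initial_skew_ps / (1.0 + k0)


def min_loop_gain(initial_skew_ps: float, max_error_ps: float) -> float:
    """Smallest K0 that satisfies R0 / (1 + K0) <= max_error_ps."""
    if initial_skew_ps <= 0 or max_error_ps <= 0:
        raise ValueError("skew and error target must be positive")
    return max(initial_skew_ps / max_error_ps - 1.0, 0.0)


R0_PS = 110.0      # initial skew from the text
TARGET_PS = 10.0   # ASSUMED error target, inferred from K0 >= 10

k0_min = min_loop_gain(R0_PS, TARGET_PS)
print(f"minimum K0 = {k0_min:.1f}")
print(f"e_ss at that gain = {steady_state_error_ps(R0_PS, k0_min):.1f} ps")
```

Any larger K0 reduces the residual skew further, which is why the filter gain is the natural parameter to adjust.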
In the actual circuit, the gain of the filter, A, is adjusted such that the above condition is met, because KPD and KVCDE are more difficult to vary than A. The design procedure that was followed treats the KPD and KVCDE parameters as more or less fixed.
With the KPD and KVCDE parameters shown in Table 3.1, the requirement on the gain of the filter is A ≥ 31.
The actual gain of the integrator circuit was determined with SPICE and was found to be A ≈ 220 (Fig. 3.14).
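A rough sketch of the margin this leaves is given below. It assumes that the bound A ≥ 31 corresponds exactly to K0 = 10, so that the product KPD·KVCDE is about 10/31, and that e_ss = R0/(1 + K0). Both are assumptions made for illustration, so the resulting numbers are indicative only.

```python
A_MIN = 31.0      # minimum filter gain from the text
K0_MIN = 10.0     # minimum loop gain from the text
A_SPICE = 220.0   # simulated amplifier gain from the text
R0_PS = 110.0     # initial skew from the text

# ASSUMPTION: A = 31 gives exactly K0 = 10, so KPD * KVCDE = K0_MIN / A_MIN.
kpd_kvcde = K0_MIN / A_MIN

k0_actual = kpd_kvcde * A_SPICE        # loop gain with the simulated amplifier
gain_margin = A_SPICE / A_MIN          # how far the gain exceeds the bound
e_ss_ps = R0_PS / (1.0 + k0_actual)    # ASSUMED form of the steady-state error

print(f"implied KPD*KVCDE  = {kpd_kvcde:.3f}")
print(f"loop gain K0       = {k0_actual:.1f}")
print(f"margin over A_min  = {gain_margin:.1f}x")
print(f"residual skew      = {e_ss_ps:.2f} ps")
```

The simulated gain exceeds the minimum by roughly a factor of seven under these assumptions.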
Another design specification requires that the closed-loop system (Eq. 3.4) exhibit over-damped behaviour. Inserting the parameters of the design described in this thesis (see Table 3.1) into the denominator of Eq. 3.4 yields,
D(s) = s² + (2.832 × 10⁹)s + 4.676 × 10¹⁶ = s² + 2ζω_n s + ω_n².
Solving for ζ results in ζ = 6.55 > 1, as desired.
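The damping calculation can be checked numerically. The sketch below takes only the two coefficients of D(s) from the text; the function names are hypothetical.

```python
import math


def second_order_params(coeff_s1: float, coeff_s0: float):
    """For D(s) = s^2 + coeff_s1*s + coeff_s0, return (zeta, omega_n)."""
    omega_n = math.sqrt(coeff_s0)          # natural frequency, rad/s
    zeta = coeff_s1 / (2.0 * omega_n)      # damping ratio
    return zeta, omega_n


def real_poles(coeff_s1: float, coeff_s0: float):
    """Return the two real roots of D(s); raise if the system is under-damped."""
    disc = coeff_s1 * coeff_s1 - 4.0 * coeff_s0
    if disc < 0:
        raise ValueError("complex poles: system is under-damped")
    root = math.sqrt(disc)
    return (-coeff_s1 + root) / 2.0, (-coeff_s1 - root) / 2.0


COEFF_S1 = 2.832e9    # coefficient of s in D(s), from the text
COEFF_S0 = 4.676e16   # constant term of D(s), from the text

zeta, omega_n = second_order_params(COEFF_S1, COEFF_S0)
slow_pole, fast_pole = real_poles(COEFF_S1, COEFF_S0)

print(f"zeta    = {zeta:.2f}")
print(f"omega_n = {omega_n:.3e} rad/s")
print(f"poles   = {slow_pole:.3e}, {fast_pole:.3e} rad/s")
```

Both poles are real and widely separated, so the slower one (about 1.7 × 10⁷ rad/s in magnitude) dominates the settling behaviour.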
Fig. 3.2. Simple differential integrator filter circuit
Table 3.1. The design parameters
τ0 = 0.0796 µs
τ1 = 3.98 µs
In the next section, the designs of the phase detector, the VCDEs, and the filter circuit (more specifically, the amplifier block within it) are described.
Fig. 3.3. VCDE circuit
Fig. 3.4. Delay vs. Control Voltage
3.2. PLL Design
The PLL shown in Fig. 3.1 can be partitioned into three functional blocks: the VCDEs, the phase detector, and the filter. The circuit design of each block is presented in the following.
The PLLs in the deskew scheme use VCDEs instead of the more conventional Voltage Controlled Oscillators (VCOs). Fig. 3.3 shows the VCDE circuit and Fig. 3.4 shows the delay vs. control voltage curve of four cascaded VCDEs. This circuit is based on the VCDE circuit shown in [GreubPat]. The KVCDE parameter can be obtained from this curve; since a clock loop contains two VCDEs, the actual KVCDE is twice the slope of the curve. The asymmetry in the VCDE characteristic means that separate limiter threshold voltages are required for positive and negative delays. With the delay characteristic shown in Fig. 3.4, a larger negative voltage is needed to achieve the same delay change as that obtained with a given positive voltage.
The control voltage from the filter is applied at CON2J1 and CON2J0. The clock signal is input to CLKI1J1 and CLKI1J0; its delayed output appears at CLKQ1J1 and CLKQ1J0. The DELAY2 subcircuit has a fixed delay associated with it. CON2J1 and CON2J0 control the relative proportion of the current from the resistive current source, R5, that flows in the Q1_1, Q1_2 transistor pair and in the Q1_3, Q1_4 transistor pair. Transistor pair Q1_1, Q1_2 forms a differential amplifier that amplifies the input clock signals CLKI1J1, CLKI1J0 to produce the differential output signal at their collectors, CLKQ1J1 and CLKQ1J0. The gain of the Q1_1, Q1_2 amplifier is determined by the current flowing in Q1_5 and the collector resistors R1, R2. Similarly, the transistor pair Q1_3, Q1_4 forms a differential amplifier that amplifies the DELAY2 output signal appearing at the bases of Q1_3, Q1_4. The gain of this transistor pair is determined by the current flowing in Q1_6 and the collector resistors R1, R2.
On the rising edge of CLKI1, transistor Q1_1 immediately pulls down the VCDE output, CLKQ1J0, and transistor Q1_2 pulls up CLKQ1J1. Some time later, the change in the DELAY2 output reaches the bases of Q1_3 and Q1_4, causing Q1_3 to pull CLKQ1J0 down further and Q1_4 to pull CLKQ1J1 up further. The current output of the resistive current source, R5, is shared by R1 and R2.
If the control voltage, CON2, is large and positive, all of the current is directed to the Q1_1 and Q1_2 transistors and Q1_3 and Q1_4 are cut off; any change in the input appears at the output with the minimum delay. If CON2 is large and negative, all of the current is directed to the Q1_3 and Q1_4 transistors and Q1_1 and Q1_2 are cut off. Any change in the input signal is then delayed by the DELAY2 subcircuit, resulting in the maximum delay of the output signal. For an intermediate control voltage, the output delay lies between these two extremes, with the amount of delay being proportional to the applied voltage.
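The current-steering behaviour described above can be illustrated with a simple behavioural sketch. It assumes an ideal differential pair whose current split follows a tanh law, and it approximates the output delay as a linear blend between the minimum delay and the minimum delay plus the DELAY2 delay. The delay values and the voltage scale are hypothetical placeholders, and the model is symmetric, unlike the measured characteristic of Fig. 3.4.

```python
import math


def steering_fraction(v_control: float, v_scale: float = 0.05) -> float:
    """Fraction of the R5 current routed to the direct pair (Q1_1, Q1_2).

    ASSUMPTION: ideal differential-pair split, 0.5 * (1 + tanh(v / (2 * v_scale))).
    v_scale (volts) is a hypothetical steering sensitivity.
    """
    return 0.5 * (1.0 + math.tanh(v_control / (2.0 * v_scale)))


def vcde_delay_ps(v_control: float,
                  d_min_ps: float = 20.0,       # hypothetical minimum delay
                  d_delay2_ps: float = 40.0,    # hypothetical DELAY2 delay
                  v_scale: float = 0.05) -> float:
    """First-order delay estimate: blend of the direct and delayed paths."""
    alpha = steering_fraction(v_control, v_scale)
    return d_min_ps + (1.0 - alpha) * d_delay2_ps


for v in (-0.3, -0.1, 0.0, 0.1, 0.3):
    print(f"CON2 = {v:+.1f} V -> delay ~ {vcde_delay_ps(v):.1f} ps")
```

A large positive control voltage gives the minimum delay and a large negative one gives the maximum, with a monotonic transition in between.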
Fig. 3.5a. Ideal Digital Phase Detector characteristics
Fig. 3.5b. Actual Digital Phase Detector characteristics
Fig. 3.6a. Expected UP and DOWN signals as the PLLs are locked
Fig. 3.6b. SPICE simulation of UP and DOWN signals
3.2.2. Phase Detector
The phase detector characteristic for the second deskew scheme should be such that its output vs. phase difference curve passes through the origin rather than through a point shifted by π/2, as it does for a four-quadrant multiplier. Fig. 3.5a shows the ideal digital phase detector characteristic. This characteristic ensures that when the PLLs are locked the two input signals are synchronous with each other instead of being offset by π/2. The ideal phase detector range is 4π. The actual characteristic is shown in Fig. 3.5b.
Fig. 3.8 shows the digital phase detector circuit. R3, R3B, V3, and V3B are the input signals. The master-slave latches generate UP and DOWN pulses of appropriate duration during every period; these are the output signals and are connected to the filter circuit through the Gate Keeper (GATEKPR) circuit. The pulse widths of the UP and DOWN signals depend on which of the two inputs first detects the rising clock edge and on the time between the rising edges of the inputs. As the PLLs lock, the UP and DOWN pulses tend toward narrow pulses of equal width (Fig. 3.6). Narrow pulses of equal width indicate that the VCDEs have been adjusted so that the two clock inputs of the phase detector, R3, R3B and V3, V3B (Fig. 3.8), are in phase. With an ideal phase detector, the narrow pulses would not exist; once the PLLs have locked, the UP and DOWN signals would instead remain flat at logic low. The narrow pulses are a common-mode signal that is eventually rejected by the differential filter circuit.
The phase detector circuit of Fig. 3.8 has two potential lock states, shown in Fig. 3.7 and labelled Case 1 and Case 2. The pulses of the same set of clock inputs, R and V, can be paired with each other in two different ways. The two cases correspond to the two positions of the separate curves indicated on the phase detector characteristics of Fig. 3.5. For proper deskew, all of the PLLs must lock to the same characteristic, i.e., either to Case 1 or to Case 2. Therefore, a phase detector limiting circuit that ensures that the PLLs lock to only one state is needed.
The HIGH and LOW signals are provided by the limiting circuit within the filter circuit to detect the cases in which the phase detector attempts to lock to the improper input clock pairs. These signals are generated within the filter whenever its output voltage to the VCDEs is at either extreme end of the range. When the phase detector initially tries to lock to the Case 2 state, the PLL attempts to lock to the position labelled 2π in Fig. 3.5, which is the improper state.
To reach this improper state the filter needs to output a voltage of large magnitude; this voltage cannot be larger than the maximum tolerable input voltage of the VCDEs. The HIGH and LOW signals are generated by exploiting this fact: they are produced by sensing whether the filter output voltage has reached a threshold voltage that has been set near the end of the filter output range.
The HIGH and LOW signals of the filter behave more like analog signals than digital ones, in that their edges have very slow rise and fall times. Hence, two Schmitt triggers (Fig. 3.12) are used to sharpen the signals for the digital circuit RS3 shown in Fig. 3.9. Upon receipt of either the HIGH or the LOW signal, the phase detector corrects itself and proceeds to lock to the proper pairs of input clock pulses.
The chain of events following the generation of the HIGH signal is as follows; similar complementary events occur for the LOW signal. The Schmitt trigger, upon receipt of the HIGH signal, asserts logic high the Force Down (FD3) input of the GATEKPR (Fig. 3.13). FD3 is asserted logic high, whereas FU3 remains logic low. The filter output then changes direction and moves downward. Moments later, the HIGH signal from the filter is deasserted. However, the filter output continues to move downward until the RS3 is cleared by the third Schmitt trigger, which senses the change in the timing relationship between the UP2 and DOWN2 pulses. This change indicates a change in the state of the phase detector: the state has jumped from Case 2 to Case 1. The phase detector is now set to lock onto the proper state, which results in the proper clock deskew.
Initially, the rising edges of the clock inputs arrive at different instants at the two PD1MSLCs, resulting in a long pulse on either the UP2 or the DOWN2 output. When both PD1MSLCs have detected a rising clock edge, the OR1 asserts the CLR1 signal logic high, which forces the UP2 and DOWN2 outputs to logic low. The circuit is then ready for another pair of clock input pulses to begin the cycle described above.
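The set-then-clear cycle can be sketched as a small behavioural model. It assumes that the latch clocked by R drives UP2, that the latch clocked by V drives DOWN2, and that CLR1 takes effect a fixed time after the later edge; this mapping and the clear delay value are assumptions made for illustration and are not taken from Fig. 3.8.

```python
def pfd_pulse_widths(t_r_ps: float, t_v_ps: float, t_clear_ps: float = 10.0):
    """Return (up_width_ps, down_width_ps) for one pair of rising edges.

    ASSUMPTIONS: an R edge sets UP2, a V edge sets DOWN2, and both latches are
    cleared t_clear_ps (hypothetical OR1 + clear delay) after the later edge.
    """
    t_clear = max(t_r_ps, t_v_ps) + t_clear_ps
    up_width = t_clear - t_r_ps
    down_width = t_clear - t_v_ps
    return up_width, down_width


# Acquisition: R leads V by the 110 ps initial skew -> long UP, minimum DOWN.
print(pfd_pulse_widths(0.0, 110.0))
# Locked: edges coincide -> both pulses shrink to the clear delay.
print(pfd_pulse_widths(0.0, 0.0))
```

In this model the difference between the two pulse widths equals the skew, and at lock only equal minimum-width pulses remain, which matches the common-mode pulses described for Fig. 3.6.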
The subcircuits of the phase detector circuit are shown in Fig. 3.9 through Fig. 3.13. The internal signals of the subcircuits, except those of the OR1, are differential. To compensate for the lower noise margin of the single-ended OR1 circuit, the noise margin of the PD1MSLC has been doubled relative to the normal margin of the other differential circuits.
The RS3 is a simple set-reset flip flop. The PD1MSLC is unique to the phase detector circuit. Its uniqueness is a result of the level one clear signal, CLR1J1. The proper operation of the phase detector demands that the PD1MSLC be able to clear itself with as little a delay as possible. The level one signal input has the minimum propagation delay among the three possible levels. The rest of the circuit comprises a simple master-slave latch with a level two output.
The OR1 of Fig. 3.11 receives two single-ended signals, A2J1 and B2J1, but outputs a differential signal, OR1J1 and OR1J0. The base of Q1_3, labelled ORREF, is held at the voltage that corresponds to the midpoint of the voltage range of the A2J1 and B2J1 inputs. Q1_6, Q1_7, R4, and R6 speed up the output transition by shifting the threshold voltage, ORREF. For example, if both inputs, A2J1 and B2J1, are asserted logic low, the voltage at ORREF rises due to Q1_6 and R4; the magnitude of the differential input voltage is increased, which speeds up the output transition. The right portion of the circuit, from Q1_8 to Q1_13, is used to track the output of the PD1MSLC. The Q1_8, Q1_9, and R7 of the OR1 track the Q1_18, Q1_20, and R9 of the PD1MSLC of Fig. 3.10. By connecting the base of Q1_8 to R8 and R9, the midpoint of the PD1MSLC output voltage level is obtained; R8 and R9 of the OR1 track R6 of the PD1MSLC, Q1_10 tracks Q1_10, Q1_11 tracks Q1_13, and Q1_12 and Q1_13 track Q1_16 and Q1_17.
For the Schmitt trigger to change its output, the input must change by a certain minimum amount (the threshold). Once the input has changed by more than this threshold, the output changes swiftly, which makes the circuit well suited to converting a slowly changing analog input into a sharply defined digital output. The Schmitt trigger shown in Fig. 3.12 uses positive feedback during the output transition. This circuit is based on the Schmitt trigger circuit in [GreubThe].
For the output to change from logic low to logic high, the input must be large enough to change the voltage between ML and MR. This is achieved when the voltage across R3 is equal in magnitude to the voltage difference between TL and TR. The voltage across R3 in turn depends on the input voltages, D1J1 and D1J0.
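The hysteresis behaviour can be sketched behaviourally as below. The two threshold values are hypothetical and are not derived from the component values of Fig. 3.12; the sketch only shows how a slow or noisy edge is converted into a single clean transition.

```python
class SchmittTrigger:
    """Behavioural hysteresis model (thresholds are hypothetical, in volts)."""

    def __init__(self, v_rise: float, v_fall: float, state: bool = False):
        if v_fall >= v_rise:
            raise ValueError("v_fall must be below v_rise to give hysteresis")
        self.v_rise = v_rise    # input level that switches the output high
        self.v_fall = v_fall    # input level that switches the output low
        self.state = state

    def step(self, v_in: float) -> bool:
        """Update the output for one input sample and return it."""
        if not self.state and v_in >= self.v_rise:
            self.state = True
        elif self.state and v_in <= self.v_fall:
            self.state = False
        return self.state


# A slow edge with some ripple around the thresholds.
slow_edge = [0.0, 0.2, 0.45, 0.55, 0.62, 0.58, 0.5, 0.41, 0.38, 0.1]
trigger = SchmittTrigger(v_rise=0.6, v_fall=0.4)
outputs = [trigger.step(v) for v in slow_edge]
print(outputs)
```

The output switches once on the way up and once on the way down, even though the input dips back below the upper threshold in between.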
The GATEKPR has three proper modes of operation. In the first mode, FU3 and FD3 are both asserted logic low and the GATEKPR is a transparent pathway for the UP2 and DOWN2 pulses to the filter. In the second mode, FU3 is asserted logic high, which forces the output to logic high. In the third mode, FD3 is asserted logic high, which forces the output to logic low.
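The three modes can be summarized as a small truth-table sketch. It models the GATEKPR output as an (UP, DOWN) pair, with "output high" represented as UP asserted and DOWN deasserted; this representation, and the rejection of FU3 and FD3 both high, are assumptions made for illustration.

```python
from typing import Tuple


def gatekeeper(up2: bool, down2: bool, fu3: bool, fd3: bool) -> Tuple[bool, bool]:
    """Return the (up, down) pair passed on to the filter.

    ASSUMPTION: a forced-high output is modelled as (True, False) and a
    forced-low output as (False, True).
    """
    if fu3 and fd3:
        raise ValueError("FU3 and FD3 both high is not one of the three proper modes")
    if fu3:                 # mode 2: force the filter output upward
        return True, False
    if fd3:                 # mode 3: force the filter output downward
        return False, True
    return up2, down2       # mode 1: transparent


print(gatekeeper(True, False, fu3=False, fd3=False))  # transparent
print(gatekeeper(True, False, fu3=False, fd3=True))   # forced down
```

The second call corresponds to the situation after the HIGH signal has been raised, when the phase detector pulses are overridden.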
Fig. 3.7. Two potential lock states for phase detector
Fig. 3.8. Phase Detector circuit
Fig. 3.9. RS3 circuit
Fig. 3.10. PD1MSLC circuit
Fig. 3.11. OR1 of Phase Detector circuit
Fig. 3.12. Schmitt Trigger (Level 3)
Fig. 3.13. Gate Keeper (GATEKPR)
The simple integrator filter circuit shown in Fig. 3.2 contains an amplifier circuit. An ideal integrator has an operational amplifier with infinite gain, or with a gain large enough to be approximated as infinite. However, with a supply voltage limited to -5.2 V and the need for a small device count to improve the yield, the amplifier circuit shown in Fig. 3.14 has been designed to have a gain of 220. This circuit can be partitioned into three stages: two amplifier stages and a part of the limiter stage, which is shared with the phase detector. In analyzing the circuit, it should be noted that the Vbe of the HBT is ≈ 1.4 V.
The gain of the first amplifier stage is expressed by,
Similarly, the gain of the second amplifier stage can be expressed by,
The gain of the two stages from these simple calculations turns out to be A1A2 ≈ 266. This is in reasonable agreement with the PSPICE simulation result (a gain of 220) shown in Fig. 3.15. Besides generating enough gain, the stability of the amplifier circuit was a design concern. The two capacitors, C1 and C2, place a dominant pole at low frequency; they are positioned to take advantage of the Miller effect through the high gain of the first amplifier stage. Fig. 3.15 shows the frequency and phase responses of the circuit. It can be readily seen that the phase margin is greater than 45°, ensuring that the amplifier remains stable at any frequency. Similar plots for the integrator filter are shown in Fig. 3.16.
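To put "reasonable agreement" in numbers, the two gain figures can be compared in decibels and as a relative difference. The sketch uses only the two values quoted in the text; the helper name is hypothetical.

```python
import math


def to_db(voltage_gain: float) -> float:
    """Convert a voltage gain to decibels."""
    return 20.0 * math.log10(voltage_gain)


A_HAND = 266.0    # hand-calculated two-stage gain from the text
A_SPICE = 220.0   # simulated gain from the text

diff_db = to_db(A_HAND) - to_db(A_SPICE)
rel_diff = (A_HAND - A_SPICE) / A_HAND

print(f"hand calculation: {to_db(A_HAND):.1f} dB")
print(f"simulation:       {to_db(A_SPICE):.1f} dB")
print(f"difference:       {diff_db:.2f} dB ({rel_diff:.0%} below the hand value)")
```

The simulated gain is lower than the hand calculation by under 2 dB, and both comfortably exceed the A ≥ 31 requirement.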
Two Schottky diodes, D1 and D2, have turn-on voltages of about 750 mV. These diodes are used to give some "head room" to the Q1_3 and Q1_4 of the first amplifier stage to prevent them from going into saturation.
The third stage, located at the right side of Fig. 3.14, constantly senses the filter output to determine whether it is within the phase detector limits of Fig. 3.5, and outputs either the HIGH or the LOW signal when the filter output moves outside the limits. Simple comparators were found to be sufficient for the task. The Q1_17 and Q1_18 transistor pair forms a comparator that generates the LOW signal; the Q1_12 and Q1_13 transistor pair forms a comparator that generates the HIGH signal. The bases of Q1_17 and Q1_13 are connected to LFREF and RFREF, respectively, which are the threshold reference voltages of the respective comparators. When the filter output is within the limits, the Q1_12 and Q1_18 transistors are turned on, so both the HIGH and LOW signals are at logic low. As the filter output becomes more negative, VOUT eventually falls below LFREF, turning off Q1_18 and turning on Q1_17; LOW is then asserted logic high. Similarly, as the filter output becomes more positive, VOUTB eventually falls below RFREF, turning off Q1_12 and turning on Q1_13; HIGH is then asserted logic high.
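The comparator logic of this stage reduces to two threshold tests, sketched below. The voltages are hypothetical values relative to the output common-mode level and are not read from Fig. 3.14; only the comparison directions follow the text.

```python
def limiter_flags(vout: float, voutb: float, lfref: float, rfref: float):
    """Return (high, low) limiter flags for a differential filter output.

    LOW is asserted when VOUT falls below LFREF (output too negative);
    HIGH is asserted when VOUTB falls below RFREF (output too positive).
    """
    low = vout < lfref
    high = voutb < rfref
    return high, low


LFREF = -1.0   # hypothetical threshold, volts relative to common mode
RFREF = -1.0   # hypothetical threshold, volts relative to common mode

print(limiter_flags(0.2, -0.2, LFREF, RFREF))    # within limits
print(limiter_flags(-1.3, 1.3, LFREF, RFREF))    # too negative -> LOW
print(limiter_flags(1.3, -1.3, LFREF, RFREF))    # too positive -> HIGH
```

Because LFREF and RFREF are separate references, the two limits can be set independently, which accommodates the asymmetric VCDE characteristic.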
Fig. 3.14. Amplifier circuit
Fig. 3.15. Amplifier frequency and phase response plots
Fig. 3.16. Filter frequency and phase response plots
Fig. 3.17a. Driver circuit
Fig. 3.17b. Receiver circuit
3.2.4. Drivers and Receivers
The differential driver and receiver circuits are shown in Fig. 3.17. The driver does not have collector pull-up resistors since it is designed to drive transmission lines with a 50 Ω characteristic impedance. Transistors have been clustered to increase the current drive capability. The Schottky diodes ensure that the VCBs of Q3_1 and Q3_2 remain below the collector-base breakdown voltage. The receiver inputs are terminated with 50 Ω resistors (R1 and R4).
3.2.5. Lock condition sensor
The lock condition sensor shown in Fig. 3.19 asserts the LOCK signal when it determines that the PLLs have locked onto the input clock signals. There are two SORAND gates in this circuit. The top one is used as an OR gate while the bottom one is used as an AND gate; CML gates have complementary functions.
On the top signal path, the UP and DOWN signals are ORed, inverted, delayed (by LOCKDELs), and sent to the data inputs of a master-slave latch (MSLC1). On the bottom signal path, the UP and DOWN signals are ANDed and sent to the write inputs of the MSLC1. As shown in Fig. 3.18, ZIN is delayed by a small amount relative to ZOUT. The delay does not affect the MSLC1 output while the PLLs are in the acquisition state, because the ZIN pulses are relatively wide. As the PLLs begin to lock, the ZIN pulses narrow and eventually become shorter than the fine delay inserted in the top path. At this point, ZIN presents the opposite data value to the data inputs of the MSLC1, which results in the assertion of the LOCK signal. A second MSLC1 gate has been added to clean up the LOCK signal.
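The lock criterion described above amounts to comparing the pulse width with the inserted delay. The sketch below captures only that comparison, not the latch-level timing of Fig. 3.19; the pulse widths and the delay value are hypothetical.

```python
from typing import Optional, Sequence


def lock_asserted(pulse_width_ps: float, lockdel_ps: float) -> bool:
    """LOCK condition: the pulse has become narrower than the inserted delay."""
    return pulse_width_ps < lockdel_ps


def first_locked_cycle(widths_ps: Sequence[float], lockdel_ps: float) -> Optional[int]:
    """Index of the first cycle whose pulse width satisfies the LOCK condition."""
    for cycle, width in enumerate(widths_ps):
        if lock_asserted(width, lockdel_ps):
            return cycle
    return None


# Hypothetical pulse widths shrinking as the loop acquires lock.
widths = [120.0, 80.0, 40.0, 15.0, 8.0, 6.0]
LOCKDEL_PS = 10.0   # hypothetical total delay of the LOCKDEL chain

print(first_locked_cycle(widths, LOCKDEL_PS))
```

The choice of the delay therefore sets how small the residual pulse width must be before lock is reported.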
Fig. 3.18a. LOCK generation timing diagram
Fig. 3.18b. SPICE simulation of LOCK generation
Fig. 3.19. Lock sensor
Fig. 3.20. Overview of the deskewing process
Fig. 3.21. Initial skew of 110 ps
Fig. 3.22. Final skew
Fig. 3.23. The filter output behaviour in response to the activation of the limiting circuit
3.3. Overall circuit simulation results
Fig. 3.20 shows the overall behaviour of two clock signals being deskewed from a substantial initial skew. The zoomed-in view of the simulation in Fig. 3.21 shows an initial skew of over 110 ps between the clock signals at two chips, caused by a large mismatch in the clock distribution lengths. After the deskew circuit has responded, the same clock signals show a skew of less than 5 ps (Fig. 3.22).
Fig. 3.23 shows the bouncing behaviour of the filter output caused by the phase detector limiting circuit. The initial change in the direction of the filter output is due to the improper lock forced by the simulation setup in order to test the limiting circuit.