Chapter Two


Voltage-Controlled Oscillator Design

 

During the course of the F-RISC/G processor design and implementation, discrepancies were found between the speeds predicted using the Rockwell-supplied models and experimental measurements. In order to explore the capabilities of the process and explain the difference between predicted and experimental performance, a high-speed, wide-bandwidth voltage-controlled oscillator (VCO) was developed as a "challenge" chip. This circuit would test the process and CAD tool performance as well as the design and layout skills of the F-RISC/G designers. The VCO was developed using digital CML circuits optimized for extreme high-speed operation but with the full-swing characteristics of digital logic. It was designed for a bandwidth of 0.25 to 20 GHz but operated about 32% slower. Using newer device and interconnection models, the performance of the VCO may now be accurately predicted. The VCO circuits were designed to be compatible with the F-RISC/G circuits and have been incorporated into the high-speed clock deskew circuit.

Overview

In 1994, several discrepancies had been uncovered between the circuit performance predicted by the Rockwell-supplied device and interconnection models and actual experimental measurements. We had designed relatively large circuits that were functional but operated at significantly lower speeds than expected. Furthermore, separate circuits on the same chip would often work but not at the same frequency or power supply level, severely inhibiting testing. The test chip yields were lower than expected but we believed that they would improve with time and the addition of better fabrication tools into the process (in particular, a stepper from Canon with higher resolution).

In part to improve the knowledge of the Rockwell process, the Advanced Research Projects Agency (ARPA) sponsored the High-Speed Circuit Design (HSCD) project. This project was intended to promote the design of very high-speed circuits by combining experimental data from fabricated test structures with CAD tools from Cadence Design Automation and OEA Associates. Four test chips were submitted for fabrication, namely a chip containing simple passive and active test structures, a standard cell test chip, the original RPI testchip, and a high-speed VCO. The passive/active test structures were intended to provide basic, low-level process information such as dielectric spacings and constants, wire and via resistances, and device characteristics via ring-oscillators and de-embedded transistors. The standard cell test chip was designed to test a representative sample of the F-RISC/G cell library ("real" cells) as well as the boundary-scan test circuitry. The original testchip was included both as a control for comparison to the first RPI fabrication run and in the hopes of obtaining more robust circuits that could provide more test data than before.

While the passive/active structures would provide basic device and process information and the original and standard-cell test chips would give hard data on how fast normal circuits would operate, we would still have no concrete evidence regarding the maximum speed that was possible with the Rockwell process. In order to push the upper end of the design spectrum, a voltage controlled oscillator (VCO) was created using fully-custom layouts and new or improved (but F-RISC/G compatible) circuit designs. Using the information gathered from the VCO and the other test structures, improved models were developed that predicted the observed VCO performance and indicated that the F-RISC/G register file would have to be redesigned to achieve the required speed. This chapter describes the VCO design including the architecture, circuits, layout and test results.

Voltage Controlled Oscillators

A voltage-controlled oscillator (VCO) is simply a circuit that generates an oscillating signal at a frequency proportional to an externally applied voltage. These types of circuits are useful for tracking and matching signal frequencies as they shift due to thermal variations, power supply fluctuations, and other sources of frequency phase-shifts. VCOs are found particularly often in phase-locked loops (PLLs) used for clock generation and synchronization. PLLs combine the variable frequency characteristics of the VCO with a phase detector circuit in order to track a signal as it changes frequency. The F-RISC/G clock deskew circuit uses a PLL to reduce the skew between different clock paths to less than 5 ps for a 2 GHz signal. There are basically three types of oscillators that may be voltage-controlled: push-pull oscillators, relaxation oscillators, and ring-oscillators.

Common-Base Oscillators

In the last few years there have been a number of VCOs in the literature based upon a common-base (CB) amplifier. These circuits have a relatively high efficiency and a decent but limited tuning range [WANG92, KHATI89, ADAR91]. One common drawback to this type of oscillator is its tendency to change frequency with any variation in the load, referred to as the frequency-pull effect. An example of a CB oscillator is shown below in Figure 2.1.

Figure 2.1 - CB voltage-controlled oscillator with common-base amplifier

Relaxation Oscillators (Multivibrators)

Relaxation oscillators (or multivibrators) are the most commonly used type of oscillator in IC designs. The circuit oscillates by continuously charging and discharging a capacitor between two voltage levels. A multivibrator may be controlled via a voltage-controlled current source that supplies the current for charging and discharging the capacitor (an example is shown in Figure 2.2 below). Although multivibrators are relatively simple and require few devices, the oscillation frequency is partially dependent upon the capacitor value and is rather sensitive to thermal effects due to their dependence upon the device VBE. These circuits are capable of high speeds (a peak value of 7.4 GHz has been reported by [SOYUE92]).

Figure 2.2 - Voltage controlled, emitter-coupled multivibrator (from [GRAY93])

Ring Oscillators

Ring oscillator VCOs are fundamentally different from multivibrators. The basic circuit block is a variable delay cell that is voltage-controlled. By connecting several delay elements as a ring oscillator, the circuit will oscillate with a period equal to the voltage-controlled cell delay multiplied by twice the number of delay stages (the signal must propagate through the inverting path twice to return to its original value). One example of a ring-based VCO is shown below in Figure 2.3.
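The period relation above can be sketched numerically. This is a minimal illustration, assuming an ideal ring with a single inversion; the function name and the per-stage delays are hypothetical values chosen for the example, not measurements from the VCO.

```python
def ring_oscillator_frequency(stage_delay_s, n_stages):
    """Return the oscillation frequency of a ring with one net inversion.

    The signal must traverse all stages twice per cycle, so the period is
    2 * n_stages * stage_delay_s.
    """
    if stage_delay_s <= 0 or n_stages < 1:
        raise ValueError("need a positive delay and at least one stage")
    return 1.0 / (2 * n_stages * stage_delay_s)


# Hypothetical example: four stages, with the cell delay swept by the control voltage.
for delay_ps in (62.5, 50.0, 25.0):
    f_hz = ring_oscillator_frequency(delay_ps * 1e-12, 4)
    print(f"{delay_ps:5.1f} ps/stage -> {f_hz / 1e9:.2f} GHz")
```

Halving the cell delay doubles the frequency, which is why the tuning range of the delay cell sets the tuning range of the oscillator.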

Figure 2.3 - Basic voltage-controlled ring oscillator

Quadrature Frequency Multiplication

Unlike multivibrators, voltage-controlled ring oscillators are not highly sensitive to temperature or capacitor values. In addition, ring oscillator circuits may be created with multiple quadrature outputs that may be used to double the frequency. Quadrature outputs are two signals that are 90° out of phase and consequently may be multiplied to obtain twice the input frequency. Quadrature signals are useful in many PLL applications and are possible using multivibrator circuits. Multiple quadrature outputs, however, are unique to voltage-controlled ring oscillators and make possible multiplication by factors of 4 and higher.

To describe quadrature frequency multiplication mathematically, consider a signal $A = \sin(\omega t)$ and another, $B = \sin(\omega t + \pi/2)$, that is 90° out of phase. Multiplying the signals we get

$$
\begin{aligned}
A \cdot B &= \sin(\omega t)\,\sin(\omega t + \pi/2) \\
&= \sin(\omega t)\,\cos(\omega t) \\
&= \tfrac{1}{2}\sin(2\omega t)
\end{aligned}
$$

A more graphical explanation is shown in Figure 2.4, where two signals have the same frequency but a phase shift of 90°. Because each signal has a high or low value for only half of a cycle, a 90° phase shift offsets one signal by 1/4 of the period and consequently the two signals have the same value for only 1/4 of the cycle before changing. By combining the signals using an exclusive-OR (XOR) logic function, a signal with frequency $2\omega_0$ may be generated.
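The XOR doubling described above can be checked with ideal sampled square waves. This is a sketch only: the sample resolution `PERIOD` and the helper names are arbitrary choices for the illustration, and real CML waveforms have finite edges.

```python
PERIOD = 1000  # samples per cycle of the input signal (arbitrary resolution)


def square_wave(n, shift=0, high=PERIOD // 2):
    """Sample n of a square wave that is high for `high` samples per period,
    delayed by `shift` samples."""
    return 1 if (n - shift) % PERIOD < high else 0


def rising_edges(samples):
    """Count 0 -> 1 transitions, treating the samples as periodic."""
    return sum(1 for i in range(len(samples))
               if samples[i - 1] == 0 and samples[i] == 1)


n_samples = 4 * PERIOD                                    # four input cycles
a = [square_wave(n) for n in range(n_samples)]
b = [square_wave(n, shift=PERIOD // 4) for n in range(n_samples)]  # 90 degree lag
doubled = [x ^ y for x, y in zip(a, b)]

print(rising_edges(a), rising_edges(doubled))  # prints: 4 8
```

Four input cycles produce eight output cycles, i.e. the XOR output runs at twice the input frequency.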

Figure 2.4 - Frequency doubling of signals via exclusive-OR

Note that the $2\omega_0$ signal duty cycle is dependent upon both the phase shift $\phi$ and the duty cycles of the two input signals. A 50% duty cycle is preferred for PLL applications because the phase detectors are sensitive to the difference and will usually respond by producing a constant DC offset that may or may not require compensation. In addition, any shift in the duty cycle will degrade the VCO output signal, reducing the $2\omega_0$ frequency in favor of a combination of lower frequencies. This is evident in Figure 2.5, where the first cycle in the $2\omega_0$ waveform has been shortened by $\delta$ while the next cycle has increased by the same amount, resulting in a waveform with an increasing component at frequency $\omega_0$. The problem becomes even worse when the two input signals have different duty cycles.
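The alternating short and long cycles can be reproduced with ideal square waves. In this sketch the 40% duty cycle and the sample resolution are assumed values for illustration only; the function name is hypothetical.

```python
PERIOD = 1000  # samples per cycle of the input signal (arbitrary resolution)


def xor_cycle_lengths(high_samples, shift=PERIOD // 4):
    """XOR two equal-duty square waves offset by `shift` samples and return the
    lengths (in samples) of successive output cycles, rising edge to rising edge."""
    a = [1 if n % PERIOD < high_samples else 0 for n in range(PERIOD)]
    b = [1 if (n - shift) % PERIOD < high_samples else 0 for n in range(PERIOD)]
    out = [x ^ y for x, y in zip(a, b)]
    edges = [n for n in range(PERIOD) if out[n - 1] == 0 and out[n] == 1]
    return [(edges[(k + 1) % len(edges)] - edges[k]) % PERIOD
            for k in range(len(edges))]


print(xor_cycle_lengths(500))  # 50% duty inputs -> [500, 500]
print(xor_cycle_lengths(400))  # 40% duty inputs -> [400, 600]
```

With 50% inputs every doubled cycle has the same length. With 40% inputs one cycle shrinks and the next grows by the same amount, so the pattern repeats only once per input period, which is the source of the component at the original frequency.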

Figure 2.5 - Frequency doubling with source duty cycles below 50%

In order to generate a signal at four times the ring oscillator frequency, the $2\omega_0$ signal must be doubled again. This is possible only when two signals are available in quadrature at a frequency of $2\omega_0$, such as in the four-element ring oscillator shown in Figure 2.6. Because the signal has to pass through the delay elements twice to complete one cycle, each stage in the ring represents a $\pi/4$, or 45°, phase shift in the signal. The core signals in quadrature (0° and 90°, 45° and 135°) are therefore taken from taps two stages apart, that is, with one delay stage between them. Furthermore, the two sets of quadrature outputs are themselves in quadrature with a phase shift of 90° relative to the doubled frequency (or, alternatively, 45° relative to the core frequency) and may now be combined to generate a signal at 4X the core frequency. The waveforms corresponding to the diagram in Figure 2.6 are shown in Figure 2.7.
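The two-level doubling can be sketched with four ideal taps at 0°, 45°, 90° and 135°. The tap phases follow the text; the helper names and sample resolution are assumptions made for the illustration.

```python
PERIOD = 1000  # samples per core cycle (arbitrary resolution)


def tap(n, phase_deg):
    """Ideal 50% duty square wave delayed by phase_deg of the core period."""
    shift = PERIOD * phase_deg // 360
    return 1 if (n - shift) % PERIOD < PERIOD // 2 else 0


def rising_edges(samples):
    """Count 0 -> 1 transitions, treating the samples as periodic."""
    return sum(1 for i in range(len(samples))
               if samples[i - 1] == 0 and samples[i] == 1)


core = [tap(n, 0) for n in range(PERIOD)]
x2_a = [tap(n, 0) ^ tap(n, 90) for n in range(PERIOD)]     # first 2X signal
x2_b = [tap(n, 45) ^ tap(n, 135) for n in range(PERIOD)]   # second 2X signal
x4 = [p ^ q for p, q in zip(x2_a, x2_b)]                   # 4X signal

print(rising_edges(core), rising_edges(x2_a), rising_edges(x2_b), rising_edges(x4))
# prints: 1 2 2 4
```

The two 2X signals are offset by 45° of the core period, which is 90° at the doubled frequency, so a second XOR doubles the frequency again.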

Figure 2.6 - Voltage controlled ring oscillator with 2X and 4X frequency generation

Figure 2.7 - Waveforms for voltage-controlled oscillator core, 2X and 4X signals

High-Speed VCO Architecture

In order to probe the upper limits of the Rockwell process, a high-speed wide-bandwidth voltage controlled oscillator was conceived as a "challenge" chip. The circuit was intended to test the high-end performance of the Rockwell HBT process, the capabilities of the F-RISC/G CAD tools, and finally our prowess as high-speed circuit designers. The VCO is a ring-oscillator using variable delay cells and two levels of frequency doublers. In order to achieve the best performance possible, the entire circuit was handcrafted and rigorously optimized using SPICE and the Compass CAD tools. Due to the frequency-doubling scheme employed, symmetry is essential and has been a primary goal of the physical design process.

Top Level Architecture

The VCO is composed of three distinct components: a variable-delay ring-oscillator core, a frequency multiplier, and a frequency divider [CAMP295, BUCHW92] (Figure 2.8). The VCO core utilizes Greub delay elements [GREUB89] arranged as a ring-oscillator. The core frequency is fed into the frequency multiplier that exploits the quadrature nature of the ring oscillator to generate signals at two and four times the core frequency. The frequency divider takes an input signal at either the core frequency or one of the multiples and divides it by factors of 2, 4 or 8. In addition to the VCO, a 24-stage ring oscillator was included in the circuit to provide an independent gauge of device speed across different sites on the wafer, different wafers and even different fabrication runs.

Figure 2.8 - Voltage-controlled oscillator (VCO) architecture

Because the cost of fabricating chips is so high, testable circuits are at a premium and thus redundancy is crucial. Three multiplexers are included to provide multiple signal paths in the event of a failure. An external oscillating signal may be fed into the chip if there is a problem with the core oscillator. Unused pad sites were connected to various points on the power rails to check for voltage droop.

To create the multiple signal paths, three 4:1 multiplexers are included in the architecture. The core-signal multiplexer (MUX1) selects from between the frequency-multiplied signals (core frequency, 2X, 4X) and the external high-speed clock input. The output-signal multiplexer selects the output signal from among the core, core-signal multiplexer, frequency-divided, and ring oscillator signals. The divider multiplexer (not shown in Figure 2.8) is contained within the frequency-divider block and selects the frequency divisor (1, 2, 4 or 8).

Signal Propagation and Feedthrough

Signal propagation and feedthrough are important aspects in the design of high-frequency circuits, especially for control circuits like the multiplexers that need to prevent interference between the selected signal and the other inputs. This problem is made worse when the input signals are subharmonics. At the upper frequency limits, the current switches cannot switch fully on or off and begin to operate in an analog mode that causes problems with DC offsets and feedthrough of subharmonic signals. Consider a signal s at frequency $\omega$ that has a DC offset $\Delta V$ and a small "parasitic" subharmonic component at $\omega/2$:

$$s(t) = \Delta V + A_1 \sin(\omega t) + A_2 \sin(\omega t/2) \qquad \text{Eq. 1}$$

where the amplitude $A_1$ of the desired frequency $\omega$ is much greater than the amplitude $A_2$ of the nearest subharmonic $\omega/2$. When this signal passes through a current switch operating in analog mode, the output signal becomes

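$$s_{out}(t) = G_{DC}\,\Delta V + G_{AC}(\omega)\,A_1 \sin(\omega t) + G_{AC}(\omega/2)\,A_2 \sin(\omega t/2) \qquad \text{Eq. 2}$$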

where the gain terms G decrease as the frequency increases, thus $G_{DC} > G_{AC}(\omega/2) > G_{AC}(\omega)$. As a result, the DC and parasitic subharmonic components increase (relative to the desired signal) after each stage as the signal propagates through the multiplexers, buffers, and the output driver. In the extreme, the output signal becomes the subharmonic with a large DC offset. Even at lower frequencies where the devices switch completely, the DC and subharmonic components must be suppressed as much as possible since they cause duty cycle shifts and phase noise.
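The stage-by-stage growth of the unwanted components can be illustrated with a simple per-stage gain model. All amplitudes and gains below are hypothetical, chosen only to satisfy the ordering of the gain terms stated above; they are not measured values.

```python
def propagate(components, gains, n_stages):
    """Scale each signal component by its per-stage gain, n_stages times."""
    return {name: amp * gains[name] ** n_stages
            for name, amp in components.items()}


# Hypothetical amplitudes: small DC offset and subharmonic, large desired signal.
signal = {"dc": 0.01, "sub": 0.02, "main": 1.0}
# Hypothetical per-stage gains with G_DC > G_AC(w/2) > G_AC(w).
gains = {"dc": 1.0, "sub": 0.9, "main": 0.7}

for n in (0, 2, 4, 6):
    out = propagate(signal, gains, n)
    print(n, round(out["sub"] / out["main"], 4), round(out["dc"] / out["main"], 4))
```

Both ratios grow geometrically with the number of stages, which is why every multiplexer and buffer on the path matters for subharmonic suppression.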

Voltage Controlled Oscillator Core and Frequency Multiplier

The oscillator core is composed of four voltage-controlled delay elements connected in series, forming a ring oscillator (see Figure 2.9). An inverted feedback path changes the input value, flipping the input signal constantly and creating an oscillation. Because there is only one inversion in the chain, the signal must pass through every delay element twice in order to complete one cycle. The core was designed to operate between 2 and 5 GHz, which would permit an output frequency range of between 0.25 GHz (divide-by-8) and 20 GHz (multiply-by-4).
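The quoted output range follows directly from the core range and the available multiplier and divider settings. A short sketch of that arithmetic, with illustrative names:

```python
CORE_RANGE_GHZ = (2.0, 5.0)   # designed core frequency range
MULTIPLIERS = (1, 2, 4)       # core, 2X, 4X
DIVISORS = (1, 2, 4, 8)       # frequency divider settings


def output_range_ghz(multiplier, divisor):
    """Output frequency range for one multiplier/divider selection."""
    lo, hi = CORE_RANGE_GHZ
    return lo * multiplier / divisor, hi * multiplier / divisor


ranges = {(m, d): output_range_ghz(m, d) for m in MULTIPLIERS for d in DIVISORS}
lowest = min(lo for lo, _ in ranges.values())
highest = max(hi for _, hi in ranges.values())
print(lowest, highest)  # prints: 0.25 20.0
```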

Figure 2.9 - VCO core signal generator and frequency multiplier

Because symmetry is essential in reducing phase error when using frequency doublers, the layout of the VCO core and frequency multipliers was crucial. The four delay elements were arranged in a square (Figure 2.10) to closely match the parasitics on all core interconnections. In order to match the XOR interconnection parasitics, the 2X frequency doublers were placed on the top and bottom of the square and the 4X circuit was placed on one of the remaining sides, halfway between the two 2X circuits. Based upon this arrangement and with the requirement of quadrature inputs to the XOR gates, the signal interconnections were placed and routed.

The final arrangement is not perfectly balanced due to an additional crossover on two of the four output signals. However, both inputs to XOR1 are matched as are the inputs to XOR2, hence there is no steady state phase error between the delay elements and the first level of frequency multiplication. Since the inputs to XOR1 are slightly more heavily loaded than the inputs of XOR2, there is some slight phase error entering the second-level frequency doubler XOR3.

Figure 2.10 - Closely-balanced layout of VCO core, 2X and 4X multipliers

One alternative physical arrangement for the core and the multipliers is shown in Figure 2.11 below. While the previous layout had a steady-state phase error between the 2X core signals, this new design has shifted the phase error back one level to the inputs of the 2X multipliers (XOR1 and XOR2). One benefit of this arrangement is shorter interconnection paths between the 2X and 4X circuits.

Figure 2.11 - Alternative VCO core and multiplier placement and routing

By inspection it is not obvious which arrangement is preferable in terms of steady-state phase error, but by modeling the XOR gates as ideal analog multipliers the effects may be roughly quantified. Consider the first arrangement, in which both input signals to one XOR have more capacitive loading than the other, resulting in a phase shift $\phi$ on both input signals. Consequently the outputs of the first set of multipliers are

$$
\begin{aligned}
\mathrm{Out1}\cdot\mathrm{Out3} &= \sin(\theta+\phi)\,\sin(\theta+\phi+\pi/2) \\
&= \sin(\theta+\phi)\,\cos(\theta+\phi) \\
&= \tfrac{1}{2}\sin(2\theta+2\phi) \qquad \text{Eq. 3}
\end{aligned}
$$

$$
\begin{aligned}
\mathrm{Out2}\cdot\mathrm{Out4} &= \sin(\theta+\pi/4)\,\sin(\theta+\pi/4+\pi/2) \\
&= \sin(\theta+\pi/4)\,\cos(\theta+\pi/4) \\
&= \tfrac{1}{2}\sin(2\theta+\pi/2) \\
&= \tfrac{1}{2}\cos(2\theta) \qquad \text{Eq. 4}
\end{aligned}
$$

where the $\pi/4$ terms are due to the 45° phase shift between consecutive stages. Note that the phase shift on the input signals for the first product has generated a phase shift on one of the 2X output signals. Multiplying the 2X signals together produces

$$
\begin{aligned}
\mathrm{Out1}\cdot\mathrm{Out2}\cdot\mathrm{Out3}\cdot\mathrm{Out4} &= \tfrac{1}{2}\sin(2\theta+2\phi)\cdot\tfrac{1}{2}\cos(2\theta) \\
&= \tfrac{1}{8}\left[\sin(4\theta+2\phi) + \sin(2\phi)\right] \qquad \text{Eq. 5}
\end{aligned}
$$

Consequently, the final output signal is at four times the input frequency, has a phase shift of $2\phi$ and a DC offset of $\tfrac{1}{8}\sin(2\phi)$. A plot of the 2X and 4X signals is shown below in Figure 2.12, where the DC offset can clearly be seen along with the reduced amplitude of the 4X signal.
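The closed form for the first arrangement can be checked numerically by multiplying the four tap signals directly, treating the XOR gates as ideal multipliers as the text does. The 10° phase shift and the function names are arbitrary choices for this sketch.

```python
import math


def four_x_output(theta, phi):
    """Product of the four taps when both inputs of one 2X multiplier lag by phi."""
    out1 = math.sin(theta + phi)
    out3 = math.sin(theta + phi + math.pi / 2)
    out2 = math.sin(theta + math.pi / 4)
    out4 = math.sin(theta + math.pi / 4 + math.pi / 2)
    return out1 * out3 * out2 * out4


def mean_over_cycle(phi, n=3600):
    """Average of the 4X output over one core cycle (its DC offset)."""
    return sum(four_x_output(2 * math.pi * k / n, phi) for k in range(n)) / n


phi = math.radians(10)  # assumed phase error, for illustration only
print(mean_over_cycle(phi), math.sin(2 * phi) / 8)
```

The two printed values agree to within floating-point error, so the average of the product matches the DC term obtained from the trigonometric expansion.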

Figure 2.12 - Waveforms for frequency multiplier circuits with different interpath loads

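For the second arrangement, where one input to each first-level multiplier is more heavily loaded, the equations become

$$
\begin{aligned}
\mathrm{Out1}\cdot\mathrm{Out3} &= \sin(\theta)\,\sin(\theta+\phi+\pi/2) \\
&= \sin(\theta)\,\cos(\theta+\phi) \\
&= \tfrac{1}{2}\left[\sin(2\theta+\phi) - \sin(\phi)\right] \qquad \text{Eq. 6}
\end{aligned}
$$

$$
\begin{aligned}
\mathrm{Out2}\cdot\mathrm{Out4} &= \sin(\theta+\pi/4)\,\sin(\theta+\phi+\pi/4+\pi/2) \\
&= \sin(\theta+\pi/4)\,\cos(\theta+\phi+\pi/4) \\
&= \tfrac{1}{2}\left[\sin(2\theta+\phi+\pi/2) - \sin(\phi)\right] \\
&= \tfrac{1}{2}\left[\cos(2\theta+\phi) - \sin(\phi)\right] \qquad \text{Eq. 7}
\end{aligned}
$$

$$
\begin{aligned}
\mathrm{Out1}\cdot\mathrm{Out2}\cdot\mathrm{Out3}\cdot\mathrm{Out4} &= \tfrac{1}{4}\left[\sin(2\theta+\phi)-\sin(\phi)\right]\left[\sin(2\theta+\phi+\pi/2)-\sin(\phi)\right] \\
&= \tfrac{1}{4}\left[\tfrac{1}{2}\sin(4\theta+2\phi) - \sin(\phi)\left(\sin(2\theta+\phi) + \cos(2\theta+\phi)\right) + \sin^2(\phi)\right] \qquad \text{Eq. 8}
\end{aligned}
$$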

This signal has a component at 4X the core frequency but it also has a harmonic at 2X the core that degrades the overall performance. This is clearly shown in Figure 2.13 below along with the DC offset component.

Figure 2.13 - Waveforms for frequency multiplier circuits with different intrapath loads

Because the equations are dependent upon the phase shift generated by the imbalanced parasitic capacitance, it is useful to determine how the equations react as the phase shift increases. Figure 2.14 below shows the 4X output as the phase shift on both inputs to one 2X multiplier varies. As the phase shift increases, the frequency and amplitude remain the same but the offset increases and shifts the waveform downward.

Figure 2.14 - Output signal with increasing phase shift for both inputs to one 2X multiplier

Figure 2.15 below shows for the same input how a phase shift on one input signal of both 2X multipliers affects the waveform. In contrast to Figure 2.14, the waveform rapidly degrades, losing the 4X core frequency component while the 2X frequency grows larger. Clearly if a phase shift must be present in the multiplier circuit, it is preferable to focus it upon the inputs to only one gate rather than spreading it between the two first-level multipliers.
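The difference between the two arrangements can also be examined by measuring the 2X and 4X Fourier components of the ideal-multiplier products. This sketch uses the same idealization as the equations above; the function names and the swept phase values are assumptions for illustration.

```python
import math


def harmonic_amplitude(f, k, n=3600):
    """Amplitude of the k-th harmonic of f(theta) over one core cycle."""
    thetas = [2 * math.pi * i / n for i in range(n)]
    a = 2 / n * sum(f(t) * math.cos(k * t) for t in thetas)
    b = 2 / n * sum(f(t) * math.sin(k * t) for t in thetas)
    return math.hypot(a, b)


def arrangement_one(theta, phi):
    """Both inputs of one 2X multiplier are shifted by phi."""
    return (math.sin(theta + phi) * math.sin(theta + phi + math.pi / 2)
            * math.sin(theta + math.pi / 4) * math.sin(theta + 3 * math.pi / 4))


def arrangement_two(theta, phi):
    """One input of each 2X multiplier is shifted by phi."""
    return (math.sin(theta) * math.sin(theta + phi + math.pi / 2)
            * math.sin(theta + math.pi / 4) * math.sin(theta + phi + 3 * math.pi / 4))


for deg in (0, 10, 20, 30):
    phi = math.radians(deg)
    for name, fn in (("one", arrangement_one), ("two", arrangement_two)):
        amp2 = harmonic_amplitude(lambda t: fn(t, phi), 2)
        amp4 = harmonic_amplitude(lambda t: fn(t, phi), 4)
        print(deg, name, round(amp2, 4), round(amp4, 4))
```

In this idealized model the first arrangement has no component at 2X the core frequency, while in the second arrangement the 2X component grows in proportion to $\sin(\phi)$ against a fixed 4X term.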

Figure 2.15 - Output signal with increasing phase shift for one input to both 2X multipliers

Frequency Divider

In contrast to the complexity of the VCO core and frequency multiplier circuit design and layout, the frequency divider chain is much simpler. Because the VCO signal closely resembles a digital waveform (or at least has sharp rise and fall times and an amplitude much greater than VT), high-speed digital circuits are used in the divider.

Figure 2.16 - High-speed frequency divider chain

The division of frequency is accomplished using high-performance digital toggle flip-flops (TFFs) that change state only when the input signal has performed two level transitions and returned to the original value. Three TFFs are used to provide divisors of 2, 4 and 8 (Figure 2.16). A divisor of 1 was also provided as a means for verifying the operation of the input. All four divider output signals (÷1, ÷2, ÷4, and ÷8) pass through a 4:1 multiplexer that forwards the selected signal to the output-signal multiplexer.
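The behavior of the divider chain can be sketched with an idealized toggle flip-flop that changes state on each rising edge of its input. This is a logic-level illustration only; the function names are hypothetical and no circuit delays or buffers are modeled.

```python
def toggle_divider(samples):
    """Ideal TFF: toggle the output on every rising edge of the input."""
    out, state, prev = [], 0, samples[0]
    for s in samples:
        if prev == 0 and s == 1:
            state ^= 1
        out.append(state)
        prev = s
    return out


def rising_edges(samples):
    """Count 0 -> 1 transitions in a finite sample list."""
    return sum(1 for prev, cur in zip(samples, samples[1:])
               if prev == 0 and cur == 1)


clock = [0, 0, 0, 0, 1, 1, 1, 1] * 16   # 16 input cycles
div2 = toggle_divider(clock)
div4 = toggle_divider(div2)
div8 = toggle_divider(div4)

print([rising_edges(s) for s in (clock, div2, div4, div8)])  # prints: [16, 8, 4, 2]
```

Each stage halves the number of cycles, so the later flip-flops see progressively lower frequencies.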

Because the signal frequency drops by a factor of 2 after every TFF stage, the operating requirements for each flip-flop decrease at every stage. The first TFF in the chain is a high-power cell (collector current $I_C$ of 2 mA) while the remaining circuits have medium power levels ($I_C$ of approximately 1.2 mA). Two high-speed buffer/driver circuits are included in the chain for isolation as well as increased signal gain. The input driver circuit is a multistage, high-gain amplifier that conditions the signal for the first TFF. Because the input signal has a maximum frequency of 20 GHz, the driver circuit must be capable of high-speed and wide-bandwidth operation; consequently, many gain stages are required. Because the output of the first divider has a maximum frequency of 10 GHz, the buffer circuit between the first and second stages does not have such severe operating requirements as the input driver. A high-speed buffer that is used elsewhere in the VCO was selected as the interstage buffer.

Circuit Design

The circuits in the VCO are based upon standard current-mode logic (CML) but with high-bandwidth operation in mind. Some of the considerations for wide bandwidth and high-speed operation include symmetric circuit topology, improved signal isolation at high frequencies and cancellation of signal feedthrough. This section describes the major components of the VCO and their respective design considerations.

Voltage-Controlled Delay Elements

Clearly one of the most important components in a voltage-controlled ring-oscillator is the delay element. These circuits establish the characteristics of the output waveforms as well as the overall frequency range for the circuit. In order to provide a 50% duty cycle, the delay elements should have fairly equal rise and fall times. A delay element with a rather high maximum frequency of operation has been described in the literature; however, it is not directly compatible with the F-RISC/G circuits. A different style of circuit, developed and patented by Professor Hans Greub [GREUB89], was selected instead. Both circuits are discussed in detail below.

Buchwald Delay Element

One example of a delay element is shown below in Figure 2.17. The circuit is fully differential, removing any concerns about duty-cycle shifts. The input signal Y1:  switches the QAT:QAF device pair that determines the Y2: output values. There are two frequency adjustments available, namely STEER and VCNTR. The STEER signal provides coarse adjustment of the frequency by roughly varying the amount of current flowing through the circuit. The VCNTR signal provides a method for fine-tuning through the reverse-biased base-emitter junction capacitances. The authors found through simulation that the cell delay including loading effects was approximately 1.5/fmax to 2/fmax where fmax is the maximum device frequency of operation.

Figure 2.17 - Ring oscillator delay cell [BUCHW92]

Despite the apparent simplicity, there are a few drawbacks to this circuit. Although it is quite fast, the circuit does have a limited tuning range due in part to the resistive pull-ups, the maximum current limits and the amount of capacitive loading possible from the reverse-biased junctions. At the lower end of the operating range, the lower current levels limit the output voltage swing. Emitter followers may be placed on the output nodes but this would only help isolate the collector nodes from the loading of the next delay stage and not fix the underlying problem. The limitation of the lower range is a fundamental problem with this type of circuit.

The authors state that the STEER inputs provided a tuning range of 400 MHz with a center frequency of 6.25 GHz and at bias currents between 1 and 5 mA. The VCNTR path provided an additional 200 MHz tuning capability (over a range of 0 to -6 V), yielding a maximum VCO frequency of 6.8 GHz. It should be noted that the paper does not explicitly state that the measured results were from the 4X frequency multiplier but this can be inferred from their relation of the multiplier output to the fmax device characteristic.

Greub Delay Element

One of the drawbacks with the Buchwald delay element is the relationship between the output voltage swing and the frequency of oscillation. An alternative design (shown below in Figure 2.18) avoids this problem by maintaining a constant current but varying the contribution of slow and fast signal paths.

Figure 2.18 - Delay element [GREUB89]

There are two paths through the circuit, with one path (QSX) intentionally slower than the other (QFX). The contributions from both paths are combined based upon the current balance through the differential pair mixer at the bottom of the current tree. Because the delays through the paths are different, the slow path may be favored by reducing the current through the fast path devices and vice versa. The relationship between the slow and fast path currents is shown in Figure 2.19(a). As the control voltage is increased, the ratio shifts from the slow path to the fast path until the current is almost completely shifted. The effect upon the combined path delay (the circuit delay without the emitter followers) is shown in Figure 2.19(b) from which it is clear that the delay is reduced as the fast path contributes more current. The total delay is also plotted and shows that the emitter followers add a relatively constant amount to the total path delay.
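The current-steering idea can be sketched with a first-order model. The model below assumes an ideal, undegenerated bipolar differential pair for the mixer and assumes that the combined delay is the current-weighted average of the two path delays; neither assumption is taken from the actual circuit, and the path delays are hypothetical.

```python
import math

V_T = 0.02585  # thermal voltage near 300 K, in volts


def path_currents(v_control, i_total=1.0):
    """Ideal differential-pair split of the tail current: returns (slow, fast)."""
    i_fast = i_total / (1.0 + math.exp(-v_control / V_T))
    return i_total - i_fast, i_fast


def mixed_delay(v_control, t_slow, t_fast, i_total=1.0):
    """Assumed model: delay is the current-weighted average of the two paths."""
    i_slow, i_fast = path_currents(v_control, i_total)
    return (i_slow * t_slow + i_fast * t_fast) / i_total


# Hypothetical path delays of 60 ps (slow) and 20 ps (fast).
for mv in (-100, -50, 0, 50, 100):
    delay = mixed_delay(mv / 1000.0, 60e-12, 20e-12)
    print(f"{mv:5d} mV -> {delay * 1e12:.1f} ps")
```

The delay falls monotonically as the control voltage steers current toward the fast path, and it is most nearly linear around the midpoint where the currents are balanced.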

(a) Slow and fast path device currents

(b) Signal delay for slow, combined (slow + fast), and total paths

Figure 2.19 - Delay element signal path characteristics

The delay element demonstrates reasonably good linear performance in the middle of its range and nearly linear performance outside it, making it a satisfactory candidate for phase-locked loops. Furthermore, the frequency range is rather broad, which allows for a wider PLL tracking range and consequently more robust operation. One downside is that the crossover frequency, $f(V_{CONTROL}=0)$, is not centered within the frequency range and is offset towards the high end. However, this is simply a side effect of the parameter choices and can be easily offset.

Frequency Multiplication

As discussed earlier, combining two signal taps that are in quadrature may double the frequency of a signal. In order to maintain the 50% duty cycle characteristic of the input signals, the mixing circuit must have equal delay paths through the circuit, hence a fully symmetric circuit is required along with balanced layout.

Analog multipliers and modulators are generally used to convert quadrature inputs into a double-frequency output. Analog multipliers generate an output signal that is (ideally) the linear product of two input signals, i.e. $V_{OUT} = K\,V_{IN1}\,V_{IN2}$, where K is a gain constant. The most common type of analog multiplier, shown in Figure 2.20, is called a Gilbert multiplier [GILBE74]. The Gilbert multiplier is a four-quadrant multiplier, meaning that it operates regardless of the signal polarities.

There are basically three modes of operation for the circuit depending upon the amplitude of the input signals relative to the thermal voltage VT. If both input signals are small with respect to VT, the devices remain in the linear region and the circuit operates as a linear multiplier. If one signal becomes relatively large, one set of devices begins to operate as switches and the circuit becomes an analog modulator (described below). Finally, if both signals are large, the circuit has an exclusive-OR (XOR) logic transfer characteristic.
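The three operating modes can be illustrated with the standard ideal transfer function of a bipolar Gilbert cell, in which the differential output current is the tail current times the product of two hyperbolic tangents. The tail current and input amplitudes below are assumed values for illustration.

```python
import math

V_T = 0.02585  # thermal voltage near 300 K, in volts


def gilbert_output(v1, v2, i_ee=1.0):
    """Ideal Gilbert-cell differential output current (tanh-product model)."""
    return i_ee * math.tanh(v1 / (2 * V_T)) * math.tanh(v2 / (2 * V_T))


# Both inputs small: output approximates the linear product v1 * v2 / (4 * V_T^2).
small = gilbert_output(1e-3, 1e-3)
linear = 1e-3 * 1e-3 / (4 * V_T ** 2)
print(small, linear)

# One input large: the large input only sets the sign (modulator behavior).
print(gilbert_output(0.4, 1e-3), gilbert_output(-0.4, 1e-3))

# Both inputs large: the output saturates at +/- i_ee, an XOR-like characteristic.
for v1 in (0.4, -0.4):
    for v2 in (0.4, -0.4):
        print(v1, v2, round(gilbert_output(v1, v2), 3))
```

When both inputs are large, equal polarities give one output level and opposite polarities give the other, which is the logic behavior exploited for frequency doubling.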

Analog modulators (or mixers) provide a linear response for only one of the input signals, commonly referred to as the modulating input. The other signal (the carrier input) is driven by a large amplitude signal (relative to VT) at a higher frequency. Because the carrier input signal magnitude is large, the devices behave like switches and the slower modulating input is effectively multiplied by a square wave at the carrier frequency [GREBN84]. The Gilbert multiplier is often referred to as a balanced modulator due to the symmetry of each signal path. Note, however, that the symmetry lies only within each differential path and that the two paths are not symmetric with respect to each other. In other words, the delay for each half of either differential pair is identical but the delays between the differential pairs are not. Consequently, when the circuit is operated as a phase detector, the differences between the two paths can lead to unwanted steady-state phase errors. To compensate, [SCHMI90] has combined two Gilbert cells in order to cancel out the phase error.

Figure 2.20 - Gilbert analog multiplier with differential current output [GRAY93]

Buchwald Fully Balanced Mixer

In [BUCHW92], a Gilbert multiplier was used to generate a signal at twice the input frequency. Due to the asymmetry between the two input signal paths, two multipliers were used in order to cancel out the phase error introduced by the asymmetry (Figure 2.21). By switching the connections for each multiplier, the phase errors from each circuit half will be equal in magnitude but opposite in sign. Consequently, the phase error is eliminated by summing the output of the multipliers (at least to the degree of matching between the two circuits).

Figure 2.21 - Fully-balanced mixer using parallel Gilbert multipliers [SCHMI90]

Novel Fully Balanced XOR

Despite the symmetry of the Buchwald mixer, a different circuit (Figure 2.22) was developed that had better digital performance at higher frequencies. Unlike the previous circuit, this circuit cannot function as an analog multiplier or modulator because there are no mechanisms for multiplication.

The circuit operates by generating product pairs corresponding to the XOR function, namely a0b0, a1b1, a0b1, and a1b0 where

Eq. 9

At any point, one of the product pairs will be logically high and will turn on its corresponding device in Q1-Q4. Because the four devices are coupled at the emitter nodes, only one will be active at any one time (this also ensures quick switching as one product pair rises high and the previous high pair drops low). Because the input paths are perfectly symmetrical both within each path and between the paths, the circuit is fully balanced. Emitter followers (not shown) are provided on the output nodes to isolate the circuit from capacitive loading.

Figure 2.22 - Novel fully balanced XOR [CAMP295]

One benefit/drawback of a fully balanced circuit is that it may be sensitive to phase shifts on the input signals. SPICE simulations have shown that the XOR is indeed very sensitive to phase shifts and consequently is an excellent phase detector. As such, it has been used within a phase-locked loop as part of the clock deskew circuit designed to synchronize a 2 GHz clock across the F-RISC/G multi-chip module to within 5 ps [NAH93].

A comparison between the two modulator circuits has been performed using SPICE and the output waveforms are shown below in Figure 2.23. At lower frequencies, the XOR circuit has sharper rise times than the dual-Gilbert multiplier circuit, mainly because the XOR is essentially a digital circuit.

Figure 2.23 - Comparison between balanced Gilbert multiplier and fully-balanced XOR

High-Speed Multiplexer

In order to improve the functionality and flexibility of the VCO, several multiplexers were included to increase the number of signal paths. The multiplexer circuit (Figure 2.24) is based upon standard current-mode logic but with special modifications to improve performance at high frequencies. The 4:1 multiplexing function is implemented using two levels of 2:1 multiplexers. The first level selects two of the four inputs (IN0 or IN1, IN2 or IN3) and the second selects one of the signals from the first level. As with other circuits on the critical path, the mux was optimized for matched-capacitance in order to improve high-frequency operation.
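The two-level selection can be written out at the logic level. This sketch shows only the selection function; the select-line names `s0` and `s1` are hypothetical, and nothing about the CML implementation or its gain is modeled.

```python
def mux2(a, b, sel):
    """2:1 multiplexer: pass a when sel is 0, b when sel is 1."""
    return b if sel else a


def mux4(in0, in1, in2, in3, s0, s1):
    """4:1 multiplexer built from two levels of 2:1 multiplexers."""
    low_pair = mux2(in0, in1, s0)    # first level: IN0 or IN1
    high_pair = mux2(in2, in3, s0)   # first level: IN2 or IN3
    return mux2(low_pair, high_pair, s1)  # second level picks between the pairs


inputs = ("IN0", "IN1", "IN2", "IN3")
for s1 in (0, 1):
    for s0 in (0, 1):
        print(s1, s0, mux4(*inputs, s0, s1))
```

Every input passes through exactly two selection levels, which is one reason the path capacitances can be matched.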

Figure 2.24 - High-speed 4:1 multiplexer
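The two-level selection can be sketched behaviorally as follows. This is a logical illustration only, not a model of the CML circuit: the select names `s0` and `s1`, and the assumption that both first-level multiplexers share the same select signal, are hypothetical and are not confirmed by the text.

```python
def mux2(a, b, select):
    """2:1 multiplexer: returns a when select is 0, b when select is 1."""
    return b if select else a


def mux4(inputs, s0, s1):
    """4:1 multiplexer built from two levels of 2:1 multiplexers.

    Level 1 picks one of (IN0, IN1) and one of (IN2, IN3) using s0.
    Level 2 picks between the two level-1 results using s1.
    """
    in0, in1, in2, in3 = inputs
    low_pair = mux2(in0, in1, s0)    # IN0 or IN1
    high_pair = mux2(in2, in3, s0)   # IN2 or IN3
    return mux2(low_pair, high_pair, s1)


if __name__ == "__main__":
    signals = ("IN0", "IN1", "IN2", "IN3")
    for s1 in (0, 1):
        for s0 in (0, 1):
            print(f"s1={s1} s0={s0} ->", mux4(signals, s0, s1))
```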

Due to the redundant nature of the VCO, the multiplexer input signals are often harmonics of the core frequency. As a result, when the 4X signal is selected there may be significant feedthrough from the subharmonic signals that can add lower-frequency oscillations to the output signal. Because the signal attenuation increases with frequency, the gain of the low-speed path (IN0 / IN1 in the figure) was decreased by reducing the pull-up value from 250 Ω to 175 Ω (or conversely, the subharmonic gain was decreased). Inserting emitter-followers between the first and second level multiplexers also increased the high-speed path gain. While this did reduce the problem somewhat, further reductions were necessary and prompted the design of a compensating buffer with enable.
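As a rough indication of how much the pull-up change reduces the low-speed path gain, the sketch below assumes a first-order model in which the voltage swing of a resistively loaded current switch is proportional to the pull-up resistance (swing = switched current x resistance). That proportionality is an assumption made for illustration; the actual gain also depends on device and parasitic effects not modeled here.

```python
import math

# Pull-up values of the low-speed path before and after the change.
R_ORIGINAL_OHM = 250.0
R_REDUCED_OHM = 175.0


def relative_gain(r_new_ohm, r_old_ohm):
    """Gain ratio under the assumption that swing scales with pull-up resistance."""
    return r_new_ohm / r_old_ohm


def to_db(ratio):
    """Convert a voltage ratio to decibels."""
    return 20.0 * math.log10(ratio)


ratio = relative_gain(R_REDUCED_OHM, R_ORIGINAL_OHM)

if __name__ == "__main__":
    print(f"relative gain: {ratio:.2f} ({to_db(ratio):.1f} dB)")
```

Under this simple model the low-speed path swing drops to 0.7 of its original value, or roughly 3 dB.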

Compensating Buffer-with-Enable

Although the multiplexer circuit does not allow current to flow through deselected branches of the circuit tree, lower-speed signals were still leaking through to the output nodes and affecting the signal of interest. Based upon this observation, a compensating buffer-with-enable (Figure 2.25) was designed that used a "dummy" current switch (CS2 in Figure 2.25) to cancel out the feedthrough. Because no current flows through the dummy current switch, it operates much like the deselected current switches in the multiplexer and allows some amount of feedthrough. By inverting the base connections, when the circuit is disabled the feedthrough of the dummy current switch cancels out feedthrough from the other current switch. In order to reduce feedthrough as much as possible, these buffers have been placed on all the inputs to every multiplexer with the exception of the 4X signal and the divider mux inputs.

Figure 2.25 - Compensating buffer-with-enable incorporating a "dummy" current switch for feedthrough compensation
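The cancellation idea can be expressed with a toy linear model. This is a sketch under strong assumptions: feedthrough is treated as a fixed coupling coefficient times the input amplitude, and the coefficient values and function names below are hypothetical, chosen only to show why matching the dummy switch to the main switch matters.

```python
def feedthrough(v_in, coupling):
    """Signal leaking through a current switch that carries no current."""
    return coupling * v_in


def disabled_buffer_output(v_in, coupling_main, coupling_dummy):
    """Residual output of the disabled buffer.

    The dummy switch has inverted base connections, so its leakage
    appears with the opposite sign and subtracts from the main leakage.
    """
    return feedthrough(v_in, coupling_main) - feedthrough(v_in, coupling_dummy)


if __name__ == "__main__":
    v_in = 0.4  # hypothetical input amplitude in volts
    print("uncompensated:", feedthrough(v_in, 0.05))
    print("matched dummy:", disabled_buffer_output(v_in, 0.05, 0.05))
    print("10% mismatch: ", disabled_buffer_output(v_in, 0.05, 0.045))
```

In this model a perfectly matched dummy switch removes the leakage entirely, while a mismatch leaves a residual proportional to the difference in coupling, which is consistent with the emphasis on matched-capacitance layout elsewhere in the chapter.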

High-Gain Differential Amplifier

One of the more difficult aspects of the design was the amplifier that drives the output pads. Due to the high frequency and large swing / large current (400 mV, 8 mA) requirements, this aspect of the design consumed a great deal of the design time. Simulations with SPICE indicate that the amplifier is still the limiting factor in the design (the multiplexers operate above 20 GHz while the XORs operate up to 25 GHz). The differential amplifier circuit is shown below in Figure 2.26.

Figure 2.26 - High-gain, wide-bandwidth differential amplifier

Ring Oscillator

In order to maintain a simple gauge of device speed across multiple wafers and different fabrication runs, a 16-stage unloaded ring oscillator was included in the VCO. The ring oscillator consists of 16 inverters with minimal wire lengths between the stages in order to obtain the maximum device performance. The output of the ring oscillator passes through a compensating buffer-with-enable and connects to the output-signal multiplexer.
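A ring oscillator frequency is commonly converted to an average per-stage delay with the relation f = 1 / (2 N t_d), since a transition must travel around the ring twice per period. The sketch below applies that relation, assuming it holds for this 16-stage ring and assuming the ring oscillator frequencies listed in Table 2.2 refer to this circuit; the function name is hypothetical.

```python
def stage_delay_ps(frequency_ghz, stages):
    """Average delay per stage in picoseconds, from f = 1 / (2 * N * t_d)."""
    period_ns = 1.0 / frequency_ghz
    return 1000.0 * period_ns / (2 * stages)


if __name__ == "__main__":
    stages = 16
    # Frequencies taken from the ring oscillator column of Table 2.2.
    for label, f_ghz in (("experimental", 2.04), ("50 GHz model, no capacitance", 2.86)):
        print(f"{label}: {stage_delay_ps(f_ghz, stages):.2f} ps per stage")
```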

Physical Design and Layout

To produce robust multi-GHz circuits, the physical layout must be considered to be nearly as important as the circuit design itself. As described earlier, mismatched parasitic loading can have disastrous effects upon the circuit performance. For this reason, the VCO was implemented entirely in custom cells using handcrafted layouts. The design process was highly iterative, shifting between SPICE simulation, physical layout, and capacitance extraction. The entire layout and optimization process required more than 6 months. During this time, special layout techniques (described in Chapter 7, section 6) were developed for producing layouts with closely matched capacitance in order to reduce skew both between and within the differential signals [CAMP195].

Physical Layout Process

The design process began with the core oscillator and progressed down the critical path. Due to the relatively few high-speed I/O signals, pad placement was not a significant constraint. A preliminary system floorplan was developed based upon the critical path in order to determine locations of high-speed signal connectors for each cell. The VCO required the layout of seven types of cells, but many had multiple instantiations with subtle differences. The seven types of cells were the delay element, XOR, toggle flip-flop (TFF), 4:1 multiplexer, compensating buffer-with-enable, high-gain buffer, and one stage of the ring oscillator.

The core oscillator is composed of four delay elements that are arranged in a square to equalize the interconnection and parasitics between each element. Based upon the floorplan, each delay element has input signal connectors on one side and output connectors on an adjacent side. The location of the static control voltage input is opposite the output connector and is unimportant. As described previously, the multipliers were arranged symmetrically about the core. The output signals were then routed to the core-signal multiplexer, placed just below the multiplier section.

The next item on the critical path is the output-signal multiplexer that was placed below the core-signal multiplexer. The selected signal is then fed into the high-gain differential amplifier and travels to the output pads on the left side of the chip. Consequently, the entire high-speed path is contained within the left half of the chip.

The divider circuit was laid out to the right of the core-signal multiplexer mainly because the output signals of the divider do not require as much attention during placement and routing. This is primarily a result of their reduced-frequency and the lack of any quadrature requirements. The divider multiplexer is placed below the toggle flip-flops and the output signal is routed left to the output-signal multiplexer.

VCO Physical Design

The VCO occupies 2.1 mm x 1.8 mm, uses 412 devices and consumes 2.5 W. Full-custom layout techniques were utilized throughout the design in order to obtain the maximum bandwidth. Techniques for matched-capacitance layout were developed during the design and applied throughout in order to reduce steady-state phase error.

A plot of the finished chip is shown below in Figure 2.27. There are four separate probe sites (one per chip edge) but only two are actually needed for performing most tests. Because only a limited number of multichannel microwave probes were available, the chip was arranged so that only the main probe site and the high-speed output site are required, while the divider and external clock probe sites assume a default value when not in use. A test plan for the VCO is contained in Appendix A.

Figure 2.27 - VCO chip plot

The probe sites and individual pad locations are shown in Figure 2.28. All input signals are single-ended with the exception of the high-speed external clock. The low-speed oscilloscope trigger (VCO CORE) is a single-ended output while the high-speed output is differential.

The bottom set of pads (Probe Site #1) are input pads for the VCO control voltage, the two core-signal multiplexer (MUX1) select signals, and the two output-signal multiplexer (MUX2) select signals. There are also two sets of power pads (VCC,VEE) that typically provide the main power source for the circuit. The leftmost pad is an output pad that provides a signal from the VCO core. This pad is intended to operate as a trigger signal for an oscilloscope.

The top pads (Probe Site #3) furnish the frequency divider select signals and another two sets of power pads. The pads are designed to float to VCC when no signal is applied in order to provide a default input value. The four pads listed as "Open" are actually connected via minimally sized wires to various points along the power rails in order to examine power rail droop. These pads do not contain any circuitry and are simply metal.

Probe Site #2 may be used to apply a high-speed differential input signal for testing the frequency multipliers and dividers in the event that the core oscillator is not operational. Due to the design of the Cascade differential microwave probe, only one power rail (VCC) may be fed through this site. The high-speed differential output is available from the left pads (Probe Site #4). As with the external clock input probe site, only VCC may be fed onto the chip from this site.

Figure 2.28 - VCO chip partitioning, pad locations and probe sites

Experimental Results

The VCO was designed to work at frequencies up to 20 GHz but the experimental results peaked at 13.66 GHz. The testing process was encumbered by a design error in the oscilloscope trigger output signal (the VCO CORE signal from Probe Site #1). A resistor in the output differential driver was undersized for the amount of current required and consequently it burned out during testing. Without a signal at the core frequency, the oscilloscope triggering became very erratic, especially at higher frequencies. Consequently, it became difficult to obtain stable measurements, which limited the amount of data acquired. Some experimental data are shown below in Table 2.1.

| VCO Location | Control Voltage (V) | Core Frequency (GHz) | 2X Core (GHz) | 4X Core (GHz) | VEE (V) | IEE (mA) |
|---|---|---|---|---|---|---|
| Wafer 8, Site 11, No. 2 | 0.6 | n/a | n/a | 13.66 | -6.06 | 360 |
| Wafer 8, Site 11, No. 2 | -1.06 | 3.33 | 6.67 | 13.33 | -6.17 | 378 |
| Wafer 8, Site 11, No. 1 | n/a | 2.04 | n/a | n/a | -6.14 | 372 |
| Wafer 8, Site 11, No. 1 | 0.92 | 2.08 | n/a | n/a | -6.78 | 456 |
| Wafer 8, Site 11, No. 1 | -1.56 | 0.60 | 1.20 | 1.20 | -6.46 | 420 |

Table 2.1 - Experimental results from VCO testing
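Two quantities can be derived directly from the entries of Table 2.1: the supply power at each operating point (the product of the VEE magnitude and IEE, with VCC taken as 0 V) and the ratio of the multiplied outputs to the core frequency. The sketch below computes both from the tabulated values. The tuple layout and function names are hypothetical, and `None` stands for an "n/a" entry.

```python
# (location, control V, core GHz, 2X GHz, 4X GHz, VEE V, IEE mA); None = "n/a"
ROWS = [
    ("Wafer 8, Site 11, No. 2", 0.6, None, None, 13.66, -6.06, 360),
    ("Wafer 8, Site 11, No. 2", -1.06, 3.33, 6.67, 13.33, -6.17, 378),
    ("Wafer 8, Site 11, No. 1", None, 2.04, None, None, -6.14, 372),
    ("Wafer 8, Site 11, No. 1", 0.92, 2.08, None, None, -6.78, 456),
    ("Wafer 8, Site 11, No. 1", -1.56, 0.60, 1.20, 1.20, -6.46, 420),
]


def supply_power_w(vee_v, iee_ma):
    """Supply power in watts, assuming VCC = 0 V so P = |VEE| * IEE."""
    return abs(vee_v) * iee_ma / 1000.0


def multiplier_ratios(core, x2, x4):
    """Ratios of the 2X and 4X outputs to the core frequency (None if n/a)."""
    if core is None:
        return None, None
    r2 = x2 / core if x2 is not None else None
    r4 = x4 / core if x4 is not None else None
    return r2, r4


if __name__ == "__main__":
    for loc, vctl, core, x2, x4, vee, iee in ROWS:
        power = supply_power_w(vee, iee)
        print(loc, vctl, f"{power:.2f} W", multiplier_ratios(core, x2, x4))
```

For the second row the ratios come out very close to 2 and 4, as expected for the frequency multipliers. For the last row the script reports the values exactly as tabulated, where the 2X and 4X entries are equal.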

From the results in Table 2.1 above, the maximum frequency is listed as 13.66 GHz while the minimum is 0.6 GHz; however, this does not accurately represent the actual bandwidth available from one chip. The maximum frequency was obtained from a different chip that had a minimum measured core frequency of 2.04 GHz. The minimum frequency came from a chip with a maximum frequency of 2.08 GHz. Consequently, using one chip for the maximum frequency and another for the minimum does not present a realistic picture of the oscillator bandwidth. Note that no measurements were made using the frequency divider circuit due to the lack of additional probe test fixtures.

A photograph of the fabricated chip is shown in Figure 2.29. The maximum measured frequency of the VCO was 13.66 GHz with a 90 mV minimum peak-to-peak voltage swing. A photograph of the waveform on an oscilloscope is shown below in Figure 2.30. The disturbance in the waveform is believed to be due to the triggering mechanism of the oscilloscope. To date, the 13.66 GHz signal is the highest frequency ever measured by the F-RISC group.

Figure 2.29 - Photograph of fabricated VCO chip

 

Figure 2.30 - Oscilloscope photograph of 13.66 GHz VCO output

Comparison of Experimental and Simulation Results

Initially, several simulations were performed quickly in an attempt to match the experimental results. At the time of testing, there were two device models available (the baseline 50 GHz fT model provided by Rockwell and the 30 GHz fT model fitted to experimental S-parameter data) and one interconnection model (which included the SiNx layer). In an attempt to account for the remaining difference in performance, the interconnection capacitance values were globally scaled to 145% of their extracted values (a factor of 1.45). This produced a signal at 13.75 GHz, or 100.66% of the experimental value. With further adjustment to the control voltage, the measured frequency could be obtained exactly.

| Device | Capacitance Model | Maximum Frequency (GHz) | Ring Oscillator Frequency (GHz) |
|---|---|---|---|
| Experimental data | (n/a) | 13.66 | 2.04 |
| 50 GHz (1990) | (no capacitance) | n/a (1) | 2.86 |
| 50 GHz (1990) | 2-D VTI Tools | 19.8 | 2.8 |
| 30 GHz | (no capacitance) | 16.3 | 2.0 |
| 30 GHz | 2-D VTI Tools | 14.5 | 1.98 |
| 30 GHz | 1.45 x 2-D VTI Tools | 13.75 | 1.96 |

(1) Because the VCO was optimized to compensate for significant amounts of capacitance, the output driving stage does not function properly at the high end of the frequency range without capacitance.

Table 2.2 - Comparison of experimental and simulated VCO performance using earlier SPICE models
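The agreement between each model combination and the measurement can be summarized as a percentage of the measured maximum frequency. The sketch below does this arithmetic for the entries of Table 2.2 and also computes the shortfall of the measured result relative to the 20 GHz design target; the dictionary layout and function names are hypothetical.

```python
MEASURED_MAX_GHZ = 13.66
DESIGN_TARGET_GHZ = 20.0

# (device model, capacitance model) -> simulated maximum frequency in GHz
SIMULATED_MAX_GHZ = {
    ("50 GHz (1990)", "2-D VTI Tools"): 19.8,
    ("30 GHz", "(no capacitance)"): 16.3,
    ("30 GHz", "2-D VTI Tools"): 14.5,
    ("30 GHz", "1.45 x 2-D VTI Tools"): 13.75,
}


def percent_of_measured(simulated_ghz):
    """Simulated maximum frequency as a percentage of the measured value."""
    return 100.0 * simulated_ghz / MEASURED_MAX_GHZ


def shortfall_percent(measured_ghz, target_ghz):
    """How far the measured frequency falls below the design target."""
    return 100.0 * (1.0 - measured_ghz / target_ghz)


if __name__ == "__main__":
    for (device, cap), f_sim in SIMULATED_MAX_GHZ.items():
        print(f"{device:15s} {cap:22s} {percent_of_measured(f_sim):6.2f}%")
    print(f"shortfall vs. target: "
          f"{shortfall_percent(MEASURED_MAX_GHZ, DESIGN_TARGET_GHZ):.1f}%")
```

With these numbers the baseline 50 GHz model with 2-D capacitance overestimates the measured maximum by roughly 45%, while the scaled-capacitance case lands within about 1%.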

Although I obtained a very close match between simulations and the experimental results, the method of multiplying the inaccurate capacitance numbers (generated from 2-D models) by 1.45 was not very appealing or even consistent. For this reason, a series of simulations was later performed using capacitance values obtained from QuickCap with both the isotropic and anisotropic reduced interlevel-dielectric interconnection models. The 2-sided base switching model was also used because it closely approximated the performance of the fabricated device when the area factor was doubled (recall that the new devices have the emitter area shrunk by 50%).

Figure 2.31 - Comparison between experimental and simulated VCO 4X signal

The experimental and simulated results are plotted above in Figure 2.31 for several combinations of device and interconnection models. Note that only one data point is available for each of the experimental measurements above 13.0 GHz. Due to the difficulty in triggering the oscilloscope, it was difficult to adjust the control voltage and observe the output waveform. As a result, data points were observed at discrete control voltages rather than over a finely grained sweep of the control voltage.

All of the simulations were able to match the 13.66 GHz experimental result but only two were close at the same control voltage of 0.6 V, specifically the Original Device Model & Anisotropic Capacitance and the Switching Device Model & Anisotropic Capacitance combinations. None were very close to the 13.33 GHz result when the control voltage was -1.0 V. The reason for the disparity between the control voltages is not known (due in part to the limited number of data points) but is surmised to be related to variations in the devices and passive elements.

Figure 2.32 - Comparison of simulated and experimental VCO core signal

The core oscillator signals are shown in Figure 2.32 above for the measured data points and the simulations (note again that there is only one data point for each experimental measurement). One of the experimental measurements is not matched at any control voltage while the other two have better results. Of these two, only one (the top one) may be approximately matched at the same control voltage.

Analysis of Experimental Results

While simulations were able to approximate most experimental results with varying degrees of success, there are still instances in which the measured data are quite different from the predictions. Consequently, it appears that the process may yet have some significant variation in circuit performance. It should be noted, however, that the limited number of experimental data points makes it difficult to establish whether the outlying experimental results are merely random occurrences or instead indicate wide or unpredictable variation in the process results.

Summary

As communications technology advances, there will be more need for high-speed, wide-bandwidth frequency generation and synchronization circuits. Conventional relaxation oscillators and common-base buffer amplifiers may provide accurate, stable high-frequency signals but they typically lack the wide bandwidth possible with a quadrature-output voltage-controlled ring-oscillator using frequency multipliers. This chapter has presented the analysis, development, implementation and testing of a high-speed, wide-bandwidth voltage-controlled oscillator that may be used in phase-locked loops or other frequency generation/synchronization applications. The VCO uses quadrature signal multiplication to generate signals at two and four times the core oscillation frequency. The inclusion of several toggle flip-flops has added frequency division capabilities as well, further increasing the VCO bandwidth. To date the fastest frequency obtained has been 13.66 GHz, or 45.5% of the device fT. For a chip with a maximum frequency of 13.33 GHz, the lowest measured frequency was 2.04 GHz. With the use of the (untested) frequency divider circuits, this should drop by a factor of 8 to 0.255 GHz, for an overall bandwidth of 13.66 GHz to 0.255 GHz. At 2.04 GHz to 13.66 GHz, this circuit has the widest bandwidth for any VCO circuit (digital or analog) ever reported in the literature [].
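The headline figures in this summary follow from simple arithmetic, sketched below. The sketch assumes the 30 GHz fitted device fT discussed earlier in the chapter and uses the 2.04 GHz minimum core frequency from Table 2.1; the constant and variable names are hypothetical.

```python
DEVICE_FT_GHZ = 30.0    # assumed: the 30 GHz fT model fitted to S-parameter data
MAX_FREQ_GHZ = 13.66    # highest measured 4X output
MIN_FREQ_GHZ = 2.04     # lowest measured core frequency used in the summary
DIVIDE_RATIO = 8        # division available from the (untested) divider

# Maximum frequency as a percentage of the device fT.
percent_of_ft = 100.0 * MAX_FREQ_GHZ / DEVICE_FT_GHZ

# Projected minimum frequency once the divider is used.
divided_min_ghz = MIN_FREQ_GHZ / DIVIDE_RATIO

# Ratio of highest to lowest frequency, without and with the divider.
tuning_ratio_measured = MAX_FREQ_GHZ / MIN_FREQ_GHZ
tuning_ratio_projected = MAX_FREQ_GHZ / divided_min_ghz

if __name__ == "__main__":
    print(f"fraction of fT:          {percent_of_ft:.1f}%")
    print(f"projected minimum:       {divided_min_ghz:.3f} GHz")
    print(f"measured tuning ratio:   {tuning_ratio_measured:.2f}")
    print(f"projected tuning ratio:  {tuning_ratio_projected:.1f}")
```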

VCO Statistics

| Parameter | Value |
|---|---|
| Size | 1.9 mm x 1.6 mm |
| Device Count | 412 HBTs |
| Power Supply | VCC = 0 V, VEE = -6.0 V |
| Power Dissipation | 2.45 W |
| Ring Oscillator Stages | 16 |

Table 2.3 - VCO statistics