A 161mW 32Gb/s ADC-Based NRZ SerDes Receiver Front End in 28nm

Runze Chi$^{1}$, Junkun Chen$^{1}$, Youzhi Gu$^{1}$, Jiangfeng Wu$^{1}$, Yongzhen Chen$^{*}$
$^{1}$School of Electronics and Information Engineering, Tongji University, Shanghai 200092, China

Abstract

A 32-Gb/s NRZ ADC-based SerDes receiver front end is presented in a TSMC $28 \mathrm{~nm}$ process. The front end consists of a continuous-time linear equalizer (CTLE) that combines a degenerated CML stage with a Gm-TIA stage and provides equalization, gain, and buffering at $16 \mathrm{GHz}$, followed by a 32-way time-interleaved analog-to-digital converter (TI-ADC) implemented in a $4 \times 8$ hierarchy. The Gm-TIA structure acts as an active inductor and, combined with a neutralization capacitor, achieves the required bandwidth without a passive inductor. Gain, offset, and sampling-clock error calibrations are implemented. The proposed front end operates at $32 \mathrm{~Gb} / \mathrm{s}$, achieving an SNR of $36.4 \mathrm{~dB}$ and an SFDR of $46.25 \mathrm{~dB}$ at low input frequency, and $29.65 \mathrm{~dB}$ and $41.52 \mathrm{~dB}$, respectively, at Nyquist, while consuming $161 \mathrm{~mW}$ in total.

Keywords: SerDes, ADC-Based Receiver, CTLE, active inductor, TI-SAR-ADC

1. Introduction

As wireline transceiver standards scale ever more aggressively, the demand for higher-bandwidth wireline transceivers continues to increase. SerDes receiver architectures are also gradually shifting from slicer-based to ADC-based structures in order to exploit digital signal processing (DSP) [1]. Since the DSP can only operate on the data delivered by the ADC, the performance of the analog front end (AFE), comprising the CTLE and the ADC, is critical to the wireline receiver. The main challenge for the AFE is the tradeoff between power and performance. Detection and calibration circuits are also included to reduce the impact of device mismatch. In this paper, we propose a SerDes receiver front end that achieves $16 \mathrm{GHz}$ bandwidth with an ENOB of $4.16$ in simulation.

2. Degenerated CML-Gm-TIA Hybrid CTLE

A CTLE boosts the high-frequency content of the input signal, since that is the most heavily attenuated part. Its frequency response therefore needs a peak at high frequency. The traditional way to obtain such peaking is a source-degenerated CML stage, whose frequency response, given by (1), inherently suppresses the gain at low frequency, where the input signal generally has a large amplitude, so that the following circuit blocks do not clip. The structure of the proposed CTLE is shown in Fig. 1.
$$H_{1}(s)=\frac{\left(g_{m} R_{D}\right)\left(1+s R_{s} C_{s}\right)}{\left(1+s R_{s} C_{s}+g_{m} R_{s} / 2\right)\left(1+s R_{D} C_{D}\right)} \tag{1}$$
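The peaking described by (1) can be checked numerically. The following is a minimal sketch that evaluates $|H_1(j\omega)|$ over frequency; the component values in `params` are hypothetical and chosen only for illustration, they are not taken from this design.

```python
import math

def h1(freq_hz, gm, rd, rs, cs, cd):
    """Evaluate the degenerated-CML response of equation (1) at s = j*2*pi*f."""
    s = 1j * 2 * math.pi * freq_hz
    num = gm * rd * (1 + s * rs * cs)
    den = (1 + s * rs * cs + gm * rs / 2) * (1 + s * rd * cd)
    return num / den

def db(x):
    """Magnitude in dB."""
    return 20 * math.log10(abs(x))

# Hypothetical values for illustration only (not from the paper).
params = dict(gm=20e-3, rd=100.0, rs=200.0, cs=100e-15, cd=20e-15)

# Zero and first pole set by the degeneration network.
zero_hz = 1 / (2 * math.pi * params["rs"] * params["cs"])
pole_hz = (1 + params["gm"] * params["rs"] / 2) * zero_hz

# Low-frequency gain is reduced by the factor (1 + gm*Rs/2).
dc_gain_db = db(h1(0.0, **params))

# Sweep 0.1 GHz to 100 GHz and locate the peak of the response.
sweep = ((i * 0.1e9, db(h1(i * 0.1e9, **params))) for i in range(1, 1001))
peak_freq_hz, peak_gain_db = max(sweep, key=lambda p: p[1])

print(f"zero {zero_hz/1e9:.2f} GHz, pole {pole_hz/1e9:.2f} GHz")
print(f"DC gain {dc_gain_db:.2f} dB, peak {peak_gain_db:.2f} dB at {peak_freq_hz/1e9:.1f} GHz")
```

The ratio of the first pole to the zero is $1 + g_m R_s / 2$, which is also the ideal amount by which the low-frequency gain is suppressed relative to $g_m R_D$.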
As its impedance transfer function (derived below) illustrates, the TIA structure can be regarded as an "active inductor" that extends the bandwidth by approximately $30 \%$ while keeping the common-mode voltage at the middle of the supply voltage. The neutralization capacitor is added to counter the output load and the devices' parasitic capacitance, further extending the bandwidth.
Fig. 1. The structure of hybrid CTLE
With little effort, the impedance of the TIA can be derived as:
$$Z_{L}=\frac{v_{t}}{i_{t}}=\frac{1}{g_{m}} \frac{1+s R\left(C_{g s}-C_{n}\right)}{1+s\left(C_{g s}-C_{n}\right) / g_{m}}$$
Using the above techniques, this design achieves $16 \mathrm{GHz}$ of bandwidth without a passive inductor, consuming $85 \mathrm{~mW}$ at the maximum GBW setting. Lower power can be achieved by enabling fewer slices, at the expense of bandwidth.
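The inductive behaviour of $Z_L$ can be illustrated with a short numerical sketch. Expanding the expression to first order in $s$ gives $Z_L \approx 1/g_m + s\,(C_{gs}-C_n)(R - 1/g_m)/g_m$, so for $R > 1/g_m$ the load looks like a resistance $1/g_m$ in series with an equivalent inductance. The device values below are hypothetical and only serve the illustration.

```python
import math

def z_load(freq_hz, gm, r, cgs, cn):
    """TIA load impedance Z_L from the expression above, at s = j*2*pi*f."""
    s = 1j * 2 * math.pi * freq_hz
    c_eff = cgs - cn  # neutralization reduces the effective capacitance
    return (1 / gm) * (1 + s * r * c_eff) / (1 + s * c_eff / gm)

def equivalent_inductance(gm, r, cgs, cn):
    """First-order series inductance: L_eq = (Cgs - Cn) * (R - 1/gm) / gm."""
    return (cgs - cn) * (r - 1 / gm) / gm

# Hypothetical values for illustration only (not from the paper).
dev = dict(gm=10e-3, r=500.0, cgs=30e-15, cn=10e-15)

z_dc = z_load(0.0, **dev)       # tends to 1/gm at low frequency
z_1g = z_load(1e9, **dev)       # positive reactance: inductive
z_hf = z_load(1e12, **dev)      # tends to R at high frequency
l_eq = equivalent_inductance(**dev)

print(f"|Z| at DC   : {abs(z_dc):.1f} ohm")
print(f"|Z| at 1 THz: {abs(z_hf):.1f} ohm")
print(f"L_eq        : {l_eq*1e9:.2f} nH")
```

The impedance rises from $1/g_m$ towards $R$ as frequency increases, which is what lets the stage extend bandwidth in place of a passive inductor.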

3. A 32GS/s Time-Interleaved ADC with Calibration

A compact, easy-to-drive ADC of low-to-medium resolution is preferable in this application, so we adopt the TI-SAR structure in this design. The resolution is chosen to be 6 bits to ease the loading on the CTLE while still providing adequate information for the subsequent DSP [2].

4. A. Structure of the 32GS/s Time-Interleaved ADC

The TI-ADC is realized in a $4 \times 8$ hierarchy composed of $1 \mathrm{GS} / \mathrm{s}$ 6-bit SAR ADCs, as illustrated in Fig. 2. This sampling structure relaxes the stringent clocking requirements: only a four-phase (quadrature) $8 \mathrm{GHz}$ sampling clock is required, and each phase is further divided by 8 by a shift register to serve the 8 sub-ADCs that follow it [3].
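The rate arithmetic of the $4 \times 8$ hierarchy can be sketched as follows. The round-robin order in which samples are handed to the sub-ADCs is an assumption made for this illustration; the text does not specify the exact ordering.

```python
N_PHASES = 4       # quadrature phases of the 8 GHz sampling clock
N_PER_PHASE = 8    # sub-ADCs served by each phase (divide-by-8 shift register)
F_PHASE_HZ = 8e9   # frequency of each clock phase

def sub_adc_for_sample(n):
    """Return (phase, slot) for aggregate sample index n.

    Assumes plain round-robin ordering: consecutive samples rotate through
    the four phases, and each phase rotates through its eight sub-ADCs.
    """
    phase = n % N_PHASES
    slot = (n // N_PHASES) % N_PER_PHASE
    return phase, slot

aggregate_rate_hz = N_PHASES * F_PHASE_HZ     # four 8 GHz phases interleaved
sub_adc_rate_hz = F_PHASE_HZ / N_PER_PHASE    # each sub-ADC after divide-by-8

assignments = [sub_adc_for_sample(n) for n in range(N_PHASES * N_PER_PHASE)]
print(f"aggregate rate: {aggregate_rate_hz/1e9:.0f} GS/s")
print(f"sub-ADC rate  : {sub_adc_rate_hz/1e9:.0f} GS/s")
print(f"distinct sub-ADCs used in 32 samples: {len(set(assignments))}")
```

Under this ordering every sub-ADC is visited exactly once per 32 aggregate samples, which matches the $1 \mathrm{GS}/\mathrm{s}$ per-channel rate stated above.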
The calibration of the TI-ADC covers gain, offset, and sampling-time errors. Gain and offset calibration is relatively straightforward: for each channel, the gain is estimated by averaging the absolute value of the output samples over a period, and the offset by averaging the signed values. These estimates are taken relative to a reference channel, which is selectable in the digital domain. The error of every other channel is then removed by dividing by its gain ratio or subtracting its offset difference, respectively.
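A minimal behavioural sketch of this gain and offset calibration is given below. It assumes four channels instead of 32 for brevity, a sinusoidal test input, and that the offset is removed before the absolute value is averaged; the function names and the mismatch values are hypothetical.

```python
import math

N_CH = 4  # hypothetical channel count for brevity; the design uses 32

def estimate(channel_samples):
    """Offset = mean of signed samples; gain = mean absolute deviation from it."""
    offset = sum(channel_samples) / len(channel_samples)
    gain = sum(abs(v - offset) for v in channel_samples) / len(channel_samples)
    return gain, offset

def calibrate(samples, ref=0):
    """Align every channel's gain and offset to the reference channel."""
    chans = [samples[k::N_CH] for k in range(N_CH)]
    est = [estimate(c) for c in chans]
    g_ref, o_ref = est[ref]
    out = list(samples)
    for k, (g, o) in enumerate(est):
        # Subtract the channel offset, divide by the gain ratio g / g_ref.
        out[k::N_CH] = [(v - o) * (g_ref / g) + o_ref for v in chans[k]]
    return out, est

# Hypothetical per-channel mismatch (channel 0 is the ideal reference).
true_gain = [1.00, 1.05, 0.97, 1.02]
true_offset = [0.00, 0.03, -0.02, 0.01]
F_IN = 0.0371  # input frequency in cycles per aggregate sample
n_total = N_CH * 4096

ideal = [math.sin(2 * math.pi * F_IN * n) for n in range(n_total)]
raw = [true_gain[n % N_CH] * ideal[n] + true_offset[n % N_CH] for n in range(n_total)]
corrected, est = calibrate(raw)

err_before = max(abs(r - i) for r, i in zip(raw, ideal))
err_after = max(abs(c - i) for c, i in zip(corrected, ideal))
print(f"max error before: {err_before:.4f}, after: {err_after:.4f}")
```

Because only ratios and differences relative to the reference channel are applied, the absolute scale of the averaged values does not need to be known.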
Fig. 2. Structure of the proposed TI-ADC

B. Clock Calibrations: Duty Cycle Detection and Correction; Quadrature Error Detection and Correction

To secure an ideal sampling position, the 4-phase $8 \mathrm{GHz}$ clock must be calibrated to maintain both its duty cycle and its quadrature relationship; errors in each are detected and corrected.
Both duty-cycle and quadrature errors are reflected in the voltage on detection capacitors, which essentially average the input clock to extract its common-mode voltage. Duty-cycle error appears as a common-mode voltage difference between a clock and its own inverse. Quadrature error can be converted to duty-cycle error by XORing two phase-adjacent clocks.
These errors are corrected by internal logic that adjusts the pull-up or pull-down resistance. The calibration forces the common-mode voltages of a clock and its inverse to be equal, and likewise the common-mode voltages of the XORs of phase-adjacent clocks. The calibration is performed iteratively and covers $\pm 7.5 \mathrm{ps}$ for each clock edge [4].
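The detection principle can be sketched with ideal square waves. The model below is an assumption for illustration only: clocks are rail-to-rail with levels 0 and 1, the detection capacitor is represented by a plain average over one period, and all function names are hypothetical.

```python
def clock(phase, duty=0.5, n=10000):
    """One period of an ideal square wave sampled at n points.

    phase and duty are given as fractions of the clock period.
    """
    return [1 if ((i / n - phase) % 1.0) < duty else 0 for i in range(n)]

def common_mode(wave):
    """Average level, standing in for the detection-capacitor voltage."""
    return sum(wave) / len(wave)

def duty_error(duty):
    """Common-mode difference between a clock and its inverse (2*duty - 1)."""
    clk = clock(0.0, duty)
    clk_b = [1 - v for v in clk]
    return common_mode(clk) - common_mode(clk_b)

def quadrature_error(skew):
    """Difference between the XOR averages of adjacent phase pairs.

    I is at phase 0, Q at 0.25 + skew, and I-bar at 0.5 of a period.
    The result is zero when Q sits exactly in quadrature.
    """
    i_clk, q_clk, ib_clk = clock(0.0), clock(0.25 + skew), clock(0.5)
    xor_iq = [a ^ b for a, b in zip(i_clk, q_clk)]
    xor_qib = [a ^ b for a, b in zip(q_clk, ib_clk)]
    return common_mode(xor_iq) - common_mode(xor_qib)

print(f"duty error at 50% duty : {duty_error(0.50):+.3f}")
print(f"duty error at 55% duty : {duty_error(0.55):+.3f}")
print(f"quadrature error, skew 0.00: {quadrature_error(0.00):+.3f}")
print(f"quadrature error, skew 0.02: {quadrature_error(0.02):+.3f}")
```

In this model both error signals are linear in the underlying timing error and cross zero at the ideal point, so a correction loop only needs their sign to step iteratively. For an $8 \mathrm{GHz}$ clock the period is $125 \mathrm{ps}$, so the stated $\pm 7.5 \mathrm{ps}$ range corresponds to a skew of $\pm 0.06$ of a period.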

5. Simulation Results

The simulated frequency response of the CTLE is shown in Fig. 3. The proposed structure offers a wide tuning range to accommodate different channels.
Fig. 4 shows a $32 \mathrm{~Gb} / \mathrm{s}$ PRBS9 input passing through several different PCB backplanes. Comparing the eye diagram at the CTLE output (pink) with that at its input (blue) demonstrates the effectiveness of the CTLE, even when the attenuation worsens to the point where the eye at the CTLE input is closed.
An FFT of the front-end output with a Nyquist-frequency input is shown in Fig. 5(a). With DCC and QEC correcting the sampling time and the inter-channel offset and gain calibration enabled, the performance is summarized in Fig. 5(b) and Table I.
Fig. 3. Frequency Response of CTLE under different settings
Fig. 4. Eye Diagram of CTLE input and output paired with PCB traces: (a) $12 \mathrm{~cm}$; (b) $40 \mathrm{~cm}$; (c) $57 \mathrm{~cm}$; (d) $97 \mathrm{~cm}$
Fig. 5. (a) FFT of the front end at Nyquist input frequency and (b) performance summary
Fig. 6. Layout of the proposed receiver front end
Fig. 6 shows the layout of this work. Each sub-ADC occupies only $30\,\mu\mathrm{m} \times 30\,\mu\mathrm{m}$, and the 32 ADCs are arranged in 4 columns. The CTLE is placed at the very front, flanked by the clock calibration modules. The design is $300\,\mu\mathrm{m}$ wide and $600\,\mu\mathrm{m}$ high.
TABLE I. ADC-BASED RECEIVER PERFORMANCE COMPARISON

| Specification | Cui [2] | Frans [3] | Wang [4] | This Work |
| --- | --- | --- | --- | --- |
| Technology (nm) | 28 | 16 | 16 | 28 |
| Power Supply (V) | N/A | 0.9, 1.2, 1.8 | 0.9, 1.2 | 1 |
| ADC Sample Rate (GS/s) | 16 | 28 | 32.1875 | 32 |
| ADC Structure | TI-SAR | TI-SAR | TI-Folding Flash | TI-SAR |
| ADC Resolution (bit) | 8 | 8 | 6 | 6 |
| ENOB @ Nyquist | 5.85 | 4.9 | 4.31 | 4.16 |
| AFE + ADC Power (mW) | 320 | 370 | 283.9 | 161 |
| AFE + ADC Power Efficiency (pJ/bit) | 10 | 4.9 | 4.41 | 5.03 |
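The power-efficiency row follows from dividing power by data rate, since $1\ \mathrm{mW} / (1\ \mathrm{Gb/s}) = 1\ \mathrm{pJ/bit}$. The short sketch below reproduces the entry for this work from the $161 \mathrm{~mW}$ and $32 \mathrm{~Gb}/\mathrm{s}$ figures stated in the abstract, and the entry for [2] using the $32 \mathrm{~Gb}/\mathrm{s}$ data rate given in that reference's title; the function name is hypothetical.

```python
def energy_per_bit_pj(power_mw, data_rate_gbps):
    """Energy per bit in pJ: mW divided by Gb/s."""
    return power_mw / data_rate_gbps

this_work = energy_per_bit_pj(161.0, 32.0)   # AFE + ADC power at 32 Gb/s NRZ
cui_2016 = energy_per_bit_pj(320.0, 32.0)    # figures for reference [2]

print(f"This work: {this_work:.2f} pJ/bit")
print(f"Cui [2]  : {cui_2016:.2f} pJ/bit")
```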

6. Acknowledgement

This work was supported by the Chinese Academy of Sciences Strategic Leading Science and Technology Project (XDC07020103).

7. References


[1] K. Zheng et al., "An inverter-based analog front end for a 56-Gb/s PAM4 wireline transceiver in 16-nm CMOS," Symp. VLSI Circuits Dig. Tech. Pap., p. 2, 2018.

[2] D. Cui et al., "A 320 mW 32 Gb/s 8 b ADC-based PAM-4 analog front-end with programmable gain control and analog peaking in 28 nm CMOS," in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Jan. 2016, pp. 58-59.

[3] Y. Frans et al., "A 56-Gb/s PAM4 wireline transceiver using a 32-way time-interleaved SAR ADC in 16-nm FinFET," IEEE J. Solid-State Circuits, vol. 52, no. 4, pp. 1101-1110, Apr. 2017.

[4] L. Wang, Y. Fu, M. LaCroix, E. Chong, and A. C. Carusone, "A 64 Gb/s PAM-4 transceiver utilizing an adaptive threshold ADC in 16 nm FinFET," in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2018, pp. 110-112.
