# A 20 Gb/s Wireline Receiver with Adaptive CTLE and Half-Rate DFE in 0.13 µm Technology

Yinhang Zhang, Qingsheng Hu<sup>(⊠)</sup>, and Yongzheng Zhan

Institute of RF- & OE-ICs, Southeast University, Nanjing, 210096, China

736671385@gq.com, qshu@seu.edu.cn

To cite this version: Yinhang Zhang, Qingsheng Hu, Yongzheng Zhan. A 20 Gb/s Wireline Receiver with Adaptive CTLE and Half-Rate DFE in 0.13 µm Technology. 15th International Conference on Wired/Wireless Internet Communication (WWIC), Jun 2017, St. Petersburg, Russia. pp. 292-303, 10.1007/978-3-319-61382-6_24. hal-01675415

HAL Id: hal-01675415, https://inria.hal.science/hal-01675415, submitted on 4 Jan 2018

Abstract. This paper presents a 20 Gb/s receive equalizer for high-speed serial links, including an adaptive continuous time linear equalizer (CTLE) and a 2-tap half-rate decision feedback equalizer (DFE) in 0.13 µm BiCMOS technology. The CTLE adaptively adjusts the ratio of high-frequency to low-frequency components by detecting the energy at both ends of a slicer and then generating a control signal with an integrator. The CTLE is followed by a half-rate DFE, which offers a better trade-off between working speed and design complexity, especially at 20 Gb/s or above.
The chip area including pads and chipguarding is about $0.72 \times 0.86~\mathrm{mm}^2$ and the power consumption is about 528 mW. Post-simulation results show that the horizontal eye opening of the equalized data can be up to 0.9 UI at 20 Gb/s.

Keywords: CTLE · DFE · Serial link · Design complexity

## 1 Introduction

The demand for high data rates in wireline communications keeps growing. Chip-to-chip and board-to-board links are moving towards 20 Gb/s or above. Channel loss, jitter, crosstalk and noise become more and more serious as the data rate increases [1-3]. To compensate for channel loss and combat inter-symbol interference (ISI), a variety of equalization schemes are widely used at the near or far end.

Partial response maximum likelihood equalization based on sequence detection can find the most probable transmitted sequence among all possible sequences and achieve the lowest bit error rate [4, 5]. However, because the Viterbi decoder is too complex and difficult to implement in high-speed circuits, it is seldom used in high-speed serial links.

In the receiver, the feed forward equalizer (FFE) has a simple structure and can cancel pre-cursor ISI and post-cursor ISI simultaneously. At high speed, e.g. 20 Gb/s or beyond, however, it is difficult to design a high-precision delay line, which is easily affected by process, voltage and temperature (PVT) variations [6-8]. Similarly, the continuous time linear equalizer (CTLE) can also reduce pre-cursor and post-cursor ISI by boosting the high-frequency components of the signal, but it potentially amplifies noise and crosstalk as well [9, 10].

The DFE is a nonlinear structure and is the most effective equalizer for eliminating post-cursor ISI. To remove the influence of the previous symbol on the current symbol, the previous symbol is fed back and subtracted from the current symbol.
Since the feedback signals are hard decisions, they do not enhance crosstalk or noise. The DFE structure is simple and can operate at relatively high rates [11-14].

The remainder of the paper is structured as follows. Section 2 reviews the principle of the adaptive CTLE based on slope detection and of the half-rate DFE. The circuit design of the key modules is introduced in Section 3. The circuit layout and post-simulation results are given in Section 4. Finally, conclusions are drawn in Section 5.

## 2 Architecture

The overall structure consists of an adaptive CTLE and a half-rate DFE, as shown in Fig. 1. The adaptive CTLE provides high-frequency gain and compensates for the loss caused by channel bandwidth degradation. In addition, the CTLE not only makes up for some of the shortcomings of the DFE, but also reduces the number of taps, which reduces power consumption [15-17]. Because of the inherent drawbacks of linear equalizers, the CTLE must be followed by a half-rate decision feedback equalizer.

Because the environment, the backplane material and the data rate all vary, the CTLE needs to be adaptive. In an adaptive CTLE based on the spectrum balancing method, the power detector consumes a large amount of power. To solve these problems, an adaptive CTLE based on slope detection is presented in this paper, as shown in Fig. 2. The signal transmitted through the channel is first sent to a linear equalizer with two paths: a high-pass path and an all-pass path. The high-pass path compensates for the high-frequency loss by high-frequency peaking, and the all-pass path adjusts the low-frequency gain. The optimum proportion of high-frequency and low-frequency components is set by the feedback control voltage $V_{\rm ctrl}$. The slicer converts the output of the linear equalizer into a signal with fixed swing and slope.
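
The two-path idea described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the circuit of this paper: the high-pass path is modelled as a first-order high-pass filter, and the names `alpha` (a normalized stand-in for $V_{\rm ctrl}$), `fc_hz` and `hp_gain`, as well as their values, are hypothetical. The boost is evaluated at 10 GHz, the Nyquist frequency of 20 Gb/s NRZ data.

```python
import math

NYQUIST_HZ = 10e9  # Nyquist frequency of 20 Gb/s NRZ data


def two_path_response(f_hz, alpha, fc_hz=5e9, hp_gain=4.0):
    """Complex gain of an all-pass path plus a first-order high-pass path.

    alpha in [0, 1) is a hypothetical, normalized stand-in for V_ctrl:
    it shifts weight from the all-pass path to the high-pass path.
    fc_hz and hp_gain are assumed values chosen for illustration only.
    """
    s = 1j * f_hz / fc_hz
    high_pass = hp_gain * s / (1 + s)
    return (1 - alpha) + alpha * high_pass


def boost_db(alpha):
    """High-frequency boost: gain at Nyquist relative to the DC gain, in dB."""
    dc = abs(two_path_response(0.0, alpha))
    hf = abs(two_path_response(NYQUIST_HZ, alpha))
    return 20 * math.log10(hf / dc)


alphas = (0.0, 0.25, 0.5, 0.75)
boosts = [boost_db(a) for a in alphas]
for a, b in zip(alphas, boosts):
    print(f"alpha={a:.2f}: boost at Nyquist = {b:.1f} dB")
```

In this model a larger `alpha` lowers the low-frequency gain and raises the high-frequency gain, so the boost grows monotonically with the control value, which is the behaviour an adaptation loop needs in order to converge on a suitable setting.
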
The slope detector and integrator detect the slope (energy) deviation between the slicer input and its output.

Fig. 1. A typical equalization architecture with CTLE and Half-rate DFE

Fig. 3 shows a simplified structure of the half-rate DFE, which is composed of an odd path and an even path. Each path consists of an adder and two flip-flops. The clock for the odd and even paths runs at half the input data rate, as illustrated in Fig. 4. The multiplexer restores the half-rate data to full-rate data. The half-rate structure has two advantages. Firstly, the timing constraint of the half-rate DFE is the same as that of the full-rate DFE, but the data duration is doubled, so it is easier for the adder to complete its operation accurately. Secondly, the half-rate architecture reduces the design difficulty of the clock and data recovery (CDR) circuit and the clock buffer.

Fig. 2. Adaptive CTLE

Fig. 3. Half-rate DFE

Fig. 4. Timing diagram of half-rate DFE

## 3 Circuit Design

### 3.1 Linear equalizer

The linear equalizer is mainly used to compensate for the loss caused by channel bandwidth degradation. Its design must consider two issues: 1) bandwidth, which determines the range of frequency compensation; 2) boost factor, which is determined by the loss of the channel [17]. The linear equalizer shown in Fig. 5 employs a high-pass path and an all-pass path to compensate for the loss of the channel. The bandwidth of the high-pass path is extended by the inductive peaking technique, and the bandwidth of the all-pass path is extended by pole-zero cancellation. The ratio of high-frequency to low-frequency gain is determined by the bias voltage of the tail transistor. In other words, the adaptation of the linear equalizer is performed by the feedback control voltage $V_{\rm ctrl}$, which is produced by the slope detector and integrator. Fig. 6 shows the frequency response of the linear equalizer for different values of $V_{\rm ctrl}$.

Fig. 5. Linear equalizer

Fig. 6.
Frequency response of linear equalizer with different $V_{\rm ctrl}$

### 3.2 Slope detector and integrator

Fig. 7 shows the slope detector and integrator. Two coupled differential pairs, $M_1$ and $M_2$, $M_3$ and $M_4$, are used to detect the energy of the slicer input and output, respectively. The total current flowing through the coupled differential pair $M_1$ and $M_2$ can be expressed as follows:

$$I_{\text{out1}} = I_{\text{ds1}} + I_{\text{ds2}} = \frac{\mu_{\text{n}} C_{\text{ox}}}{2} \left(\frac{W}{L}\right) \left(2\left(V_{\text{in,cm}} - V_{\text{in}}\right)^2 + \frac{V_{\text{in,dm}}^2}{2}\right) \quad (1)$$

where $I_{\rm ds1}$ and $I_{\rm ds2}$ are the drain currents of transistors $M_1$ and $M_2$, respectively, $V_{\rm in,cm}$ is the common-mode input voltage and $V_{\rm in,dm}$ is the differential-mode input voltage. The squared term in the formula shows that the energy of signal waveforms with different slopes can be measured by $I_{\rm out1}$. In Fig. 7, $M_6$ and $M_7$ are active load transistors; $M_5$ and $M_6$ form a current mirror, as do $M_7$ and $M_8$. Therefore, the feedback control voltage generated by the integrator can be described as follows:

$$V_{\text{ctrl}} = I_{\text{out2}} R_{\text{L}} - I_{\text{out1}} R_{\text{L}} = \frac{\mu_{\text{n}} C_{\text{ox}}}{4} \left(\frac{W}{L}\right) \left(V_{\text{in2}}^2 - V_{\text{in1}}^2\right) R_{\text{L}} \quad (2)$$

Here $V_{\rm in1}$ and $V_{\rm in2}$ denote the differential-mode input voltages of the pairs $M_1$/$M_2$ and $M_3$/$M_4$, i.e. the slicer input and output signals. According to Eq. (2), $V_{\rm ctrl}$ is determined by the difference between the signal energies at the two ends of the slicer. Fed back to the linear equalizer, $V_{\rm ctrl}$ adjusts its frequency response.

Fig. 7. Slope detector and integrator

### 3.3 Slicer

The slicer, implemented as two cascaded current mode logic (CML) stages, is shown in Fig. 8. It is a limiting amplifier that processes the edge and amplitude of the input signal and produces a signal of fixed swing and slope for the feedback loop.
The slicer is a high-resolution comparator. The edge of the input signal is shaped into an approximately ideal binary signal, so that the slope of the slicer output is almost independent of the slope of the equalizer output signal.

### 3.4 SCFL based Latch

The sensitivity of the latch has a great influence on the performance of the DFE. The D flip-flop consists of two latch stages. Each latch contains three pairs of transistors, as shown in Fig. 9: $Q_1$ and $Q_2$ are the clock input transistors, $Q_3$ and $Q_4$ are the sampling transistors, and $Q_5$ and $Q_6$ are the latch transistors. When $CK_p/CK_n$ is high/low, $Q_1$ is turned on and $Q_2$ is turned off, the current $I_{ss}$ flows through $Q_1$, and the data is sampled. When $CK_p/CK_n$ is low/high, the output signal is held by the latch transistors and the data remains unchanged. Two points matter for the latch. Firstly, the clock swing must be large enough to ensure that $Q_1$ or $Q_2$ can be turned off completely. Secondly, the performance of the latch is determined by the ratio between the latch transistors and the sampling transistors. Considering operating speed, data retention capability and bandwidth, the ratio adopted in this design is 0.8.

Fig. 8. Slicer

### 3.5 Summer

The CML-based summer is shown in Fig. 10. The input signal $V_{\rm in}$ and the feedback signals are summed in the form of currents and then converted into the output voltage $V_{\rm out}$, which can be expressed as follows:

$$V_{\text{out}} = [G_0 V_{\text{in}}(t) - G_1 V_{\text{in}}(t - T) - G_2 V_{\text{in}}(t - 2T)] R_L = [C_0 - C_1 - C_2] I_{\text{SS}} R_L \quad (3)$$

where $G_i$ ($i=0,1,2$) are the input transconductances and $C_i$ ($i=0,1,2$) are the tap coefficients. To eliminate the post-cursor ISI effectively and achieve good equalization, it is important to select appropriate tap coefficients $C_i$ ($i=1,2$).
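
As a behavioural illustration of the subtraction in Eq. (3), the following Python sketch runs a 2-tap DFE over a noiseless channel. It is a minimal full-rate model under stated assumptions, not the half-rate circuit of this paper: the pulse response `h` is an invented example, and the tap coefficients `c1` and `c2` are simply set equal to its two post-cursor values.

```python
import random


def channel(symbols, h):
    """Convolve +/-1 symbols with the pulse response h (main cursor first)."""
    return [
        sum(h[k] * symbols[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(symbols))
    ]


def dfe_2tap(samples, c1, c2):
    """Subtract the two previous hard decisions, weighted by c1 and c2, then slice."""
    d1 = d2 = 0.0  # previous decisions; zero before the first symbol arrives
    decisions = []
    for x in samples:
        y = x - c1 * d1 - c2 * d2    # summer: remove post-cursor ISI
        d = 1.0 if y >= 0 else -1.0  # slicer: hard decision
        decisions.append(d)
        d2, d1 = d1, d               # shift the decision history
    return decisions


random.seed(0)
h = [1.0, 0.7, 0.4]  # assumed pulse response: main cursor and two post-cursors
tx = [random.choice([-1.0, 1.0]) for _ in range(2000)]
rx = channel(tx, h)

raw = [1.0 if x >= 0 else -1.0 for x in rx]  # slicing without equalization
eq = dfe_2tap(rx, c1=h[1], c2=h[2])          # taps matched to the post-cursors

errors_raw = sum(a != b for a, b in zip(raw, tx))
errors_dfe = sum(a != b for a, b in zip(eq, tx))
print(f"errors without DFE: {errors_raw}, with 2-tap DFE: {errors_dfe}")
```

Because the assumed post-cursors sum to more than the main cursor, plain slicing makes errors whenever the two previous symbols both oppose the current one, while the DFE with matched taps recovers the noiseless sequence. With real noise, a wrong decision would also propagate through the feedback path, which this sketch does not model.
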
Typically, the tap coefficients can be obtained from the impulse response of the channel or by the adaptive least mean square (LMS) algorithm.

Fig. 9. SCFL based Latch

Fig. 10. Summer

Fig. 11. Multiplexer

### 3.6 Multiplexer

Fig. 11 shows the architecture of the multiplexer. The SCFL-based 2:1 multiplexer is mainly composed of two cross-coupled differential transistor pairs. The circuit is driven by the half-rate clock, the same clock that drives the master-slave flip-flop, and it restores the half-rate data to full-rate data. When $CK_p$ is high, the odd path is selected. When $CK_n$ is high, the even path is selected. To improve the stability of the current source, the gate length of the tail current transistor should be made larger.

### 3.7 Buffer

To drive an external 50 Ω load, a CML-based buffer is implemented in this work, as shown in Fig. 12. It is composed of multiple CML stages and has the advantages of a simple structure and sufficient bandwidth. To improve the driving ability, three stages are used. The sizes of the tail current transistor and the differential input transistors increase gradually from stage to stage, while the load resistance decreases gradually.

Fig. 12. Buffer

Fig. 13. Layout of the proposed equalizer

## 4 Layout and Post Simulation

Fig. 13 shows the layout of the proposed equalizer including the adaptive CTLE and the half-rate DFE. The total area including pads and chipguarding is 0.72×0.86 mm<sup>2</sup>. The circuit has been submitted for fabrication. Fig. 14 shows the simulation results of the feedback control voltages $V_{\rm ctrl+}$ and $V_{\rm ctrl-}$. It can be seen that the feedback control voltage becomes stable after 5 ns. Fig. 15(a) gives the input eye diagram. The output eye diagrams of the adaptive CTLE and the half-rate DFE are given in Fig. 15(b) and Fig. 15(c), respectively.
It can be seen that the eye opening is less than 0.6 UI after adaptive CTLE equalization, and that the eye opening is further widened by the half-rate DFE at 20 Gb/s.

Fig. 14. Simulation results of feedback control voltage ($V_{\rm ctrl+}$ and $V_{\rm ctrl-}$)

Fig. 15. Simulation results at 20 Gb/s: (a) input eye diagram, (b) output eye diagram of adaptive CTLE, (c) output eye diagram of half-rate DFE

## 5 Conclusions and Future Work

In this paper, to deal with higher channel loss, a half-rate DFE combined with an analog adaptive CTLE based on slope detection is employed at the receiver. The adaptive CTLE can adjust the range of high-frequency peaking and compensate for the channel loss. It effectively reduces the burden on the DFE and cuts down the number of taps. The eye opening is up to 0.9 UI after CTLE and half-rate DFE equalization at 20 Gb/s. This shows that the half-rate DFE can further improve the signal integrity of the serial link at 20 Gb/s.

In future work, on the one hand, because the ISI is mainly caused by the post-cursors, we will focus on the adaptation performance of the DFE. On the other hand, when the loss of the link reaches 35 dB at the Nyquist frequency at 25 Gb/s or above, higher-order modulation (such as PAM4) or other modulation methods (such as ENRZ) should be studied in depth.

## References

1. Yuan, G., Alaa R., Taee, A.: Design techniques for decision feedback equalization for multi-giga-bit-per-second serial data links: a state-of-the-art review. IET Circuits, Devices & Systems. **52**(8), 118-130 (2014)
2. Bulzacchelli, J.: Equalization for electrical links: current design techniques and future directions. IEEE Solid State Circuits Magazine. **7**(4), 23-31 (2015)
3. Wang, X., Hu, Q.: Analysis and optimization of combined equalizer for high speed serial link. In: 2015 IEEE 9th International Conference on Anti-counterfeiting, Security, and Identification, pp. 43-46.
Xiamen (2015)

4. Pervez, M., Cathy, Y., Adam, H.: Partial response maximum likelihood equalization and detection for DSP based SerDes with cross talk and practical equalization. In: DesignCon 2014, pp. 1-27. Santa Clara (2014)
5. Pervez, M., Cathy, Y., Adam, H.: Partial response and noise predictive maximum likelihood (PRML/NPML) equalization and detection for high speed serial link systems. In: DesignCon 2013, pp. 1160-1184. Santa Clara (2013)
6. Boesch, R., Zheng, K., Murmann, B.: A 0.003 mm<sup>2</sup> 5.2 mW/tap 20 GBd inductor-less 5-tap analog RX-FFE. In: IEEE Symposium on VLSI Circuits, pp. 1-2. HI (2016)
7. Kocaman, N., Ali, T., Rao, P.: A 3.8 mW/Gbps quad-channel 8.5–13 Gbps serial link with a 5 tap DFE and a 4 tap transmit FFE in 28 nm CMOS. IEEE Journal of Solid-State Circuits. **51**(4), 881-892 (2016)
8. Kao, S., Liu, S.: A 7.5-Gb/s one-tap-FFE transmitter with adaptive far-end crosstalk cancellation using duty cycle detection. IEEE Journal of Solid-State Circuits. **48**(2), 391-404 (2013)
9. Preibisch, J., Reuschel, J., Scharff, K.: Impact of continuous time linear equalizer variability on eye opening of high-speed links. In: 2016 IEEE 20th Workshop on Signal and Power Integrity, pp. 1-4. Turin (2016)
10. Feng, Z., Hu, Q.: A 6.25 Gb/s decision feedback equalizer in 0.18 μm CMOS technology for high-speed SerDes. In: 2011 7th International Conference on Wireless Communications, Networking and Mobile Computing, pp. 1-4 (2011)
11. Yuan, S., Wang, Z., Zheng, X.: A 10 Gb/s speculative decision feedback equalizer with a novel implementation of adaption in 65 nm CMOS technology. In: IEEE International Conference on Electron Devices and Solid-State Circuits, pp. 1-2. Chengdu (2014)
12. Parikh, S., Kao, T., Hidaka, Y., Jiang, J.: A 32 Gb/s wireline receiver with a low-frequency equalizer, CTLE and 2-tap DFE in 28 nm CMOS. In: IEEE Solid State Circuits Conference, pp. 28-29. Lisbon (2013)
13. Navid, R., Chen, E., Hossain, M.: A 40 Gb/s serial link transceiver in 28 nm CMOS technology. IEEE Journal of Solid-State Circuits. **50**(4), 814-827 (2015)
14. Zhang, G., Chaudhair, P., Green, M.: A BiCMOS 10 Gb/s adaptive cable equalizer. In: IEEE Solid State Circuits Conference, pp. 149-152. Bangkok (2003)
15. Jiang, C., Hu, Q.: A 6.25 Gb/s adaptive analog equalizer in 0.18 μm CMOS technology for high-speed SerDes. In: 2012 2nd International Conference on Computer Science and Network Technology, pp. 266-270. Changchun (2012)
16. Kim, Y., Lee, T., Kim, L.: A 21-Gbit/s 1.63-pJ/bit adaptive CTLE and one-tap DFE with single loop spectrum balancing method. IEEE Transactions on Very Large Scale Integration Systems. **24**(2), 789-793 (2016)
17. Ibrahim, S., Razavi, B.: Low-power CMOS equalizer design for 20-Gb/s systems. IEEE Journal of Solid-State Circuits. **46**(6), 1321-1336 (2011)