

# A Methodology for Evaluating SI Artefacts in DDR4-3DS PHY using Channel Modelling

Aditya S Kumar, Samsung Semiconductor India R & D (SSIR), Bangalore, INDIA (a2019.kumar@samsung.com)

Gowdra Bomanna Chethan, Samsung Semiconductor India R & D (SSIR), Bangalore, INDIA (gb.chethan@samsung.com)

Shivani Maurya, Samsung Semiconductor India R & D (SSIR), Bangalore, INDIA (shivani.m@samsung.com)

Anil Deshpande, Samsung Semiconductor India R & D (SSIR), Bangalore, INDIA (anil.pande@samsung.com)

Somasunder Kattepura Sreenath, Samsung Semiconductor India R & D (SSIR), Bangalore, INDIA (soma.ks@samsung.com)

*Abstract*— Data is the world's most valuable resource. The amount of data to be processed has exploded tremendously with the growth of IT and networking. With increasing diversity and range of application, the memory footprint has increased manifold requiring high data throughput resulting in miniaturization of devices and higher operating frequencies. Leveraging the benefits of advanced fabrication techniques, new memory designs involving 3D stacking have recently emerged. A Memory system consists of Memory Controller, PHY and DRAM devices connected externally on the system board. The PHY layer handles the interaction between Memory Controller and the external Memory. This paper proposes an in-house developed parameterized and configurable Channel Model that models silicon effects like cross-coupling, noise, skew, distortion etc happening during this interaction in order to thoroughly check the functionality of DDR-PHY. This modelling ensures a good signal-integrity for such emerging memory architectures involving complex geometries and replicates real time environment in DV.

Keywords— Crosstalk, Channel Modelling, Skew, Signal Integrity, Delay, Distortion, Design Under Test (DUT), Double Data Rate(DDR), DDR4-3DS, System on Chip(SoC), VLSI.

## I. INTRODUCTION

The rapidly shrinking feature size and increasing design density of Integrated Circuits (ICs), coupled with shorter design cycles, challenge Very Large Scale Integration (VLSI) circuit designers to come up with new memory architectures that address key requirements such as bandwidth, power and signal integrity. With increasing time-to-market pressure, bugs need to be caught early in the design cycle to minimize costs and avoid multiple re-spins. This places a greater responsibility on VLSI verification engineers to design their test-bench (TB) around worst-case scenarios and to rigorously test the design corners where failure is possible.

Memory is an integral part of a computing system. Data produced by a variety of sources like smartphones, PCs, IoT devices, wearables and other networking systems is stored and retrieved from memory modules. The increasing interactions between data, algorithms, and analytics of big data, connected data and individuals are opening enormous new prospects. With applications (like ML, Data Mining etc) becoming data intensive, a key factor affecting the performance of these systems is the sluggish performance of memory. This slow performance is attributed to the placement of memory away from the processor. Efforts have been made to place the memory close to the processor (HBM) and use complex architectures (3D stacking) to improve the performance and capacity. With the increasing diversity and range of applications that use memory, it becomes critical to design high density memories having greater reliability and better quality for next-generation platforms.

Memory module architectures have evolved from UDIMM, RDIMM and LRDIMM to three-dimensional (3D) LRDIMM. With each advancement, the chips and interconnects come closer together, making the cross-talk effects between them significant. The electrical and magnetic interactions affect the timing and signal integrity of the values driven on the wires, leading to a decrease in signal-to-noise ratio and unnecessary power consumption. These effects can be reduced by proper routing of lines and by using special encoding schemes and error detection and correction logic (like ECC), but they cannot be avoided. With data breaches and hacking becoming common, the impetus is on having good system reliability to tolerate these effects.

A Memory system consists of a Memory Controller, a PHY and DRAM devices connected externally on the system board. The PHY maintains signal integrity and translates instructions from the memory controller to the memory. Since a lot of routing is involved in this system, silicon effects like skew, distortion and cross talk have to be handled by the PHY.

This paper proposes a parameterized and configurable Channel Model [1] to evaluate the Signal Integrity (SI) artefacts arising in a Memory system with a complex memory architecture, and illustrates its advantage in Digital Verification (DV) of a DDR-PHY IP w.r.t. DDR4-3DS memory. The paper is divided into five sections. Section I explains the motivation and relevance of a Channel Model in a Memory System. Section II briefly describes the architectural features of DDR4-3DS. Section III presents the features of the proposed Channel Model and its placement in the TB. Section IV explains the usefulness of the proposed Channel Model in verification of DDR-PHY in the context of DDR4-3DS, while Section V concludes the paper with scope for future enhancements.

## II. DDR4-3DS MEMORY MODEL

3D integration using through-silicon vias (TSVs) is becoming an advantageous alternative to conventional device scaling and multi-die packaging for increasing density and improving performance. Stacking chips with TSV connections provides large bandwidth and shorter interconnects, which translates into lower power consumption, parasitic resistance and capacitance. The difference between a quad die package and 3DS TSV can be visualized from Fig. 1. TSV solutions are intended to meet the demand from edge computing applications, which require shorter reaction times and different chip-stacking structures. The TSV memory market has seen a variety of devices being released, with DDR4 3DS being one of the promising alternatives.



Figure 1. Architectural comparison of quad die and 3DS model [2]

3D stacked memory provides large storage capacity by way of multiple logical ranks (2, 4 or 8), which are stacked one on top of another and selected using the Chip ID (C0, C1, etc.). The decoding logic for selecting the logical ranks is given in Fig. 2 and Table I, where the inputs CS\_n, C0, ODT and CKE are chip select, chip ID, on-die termination enable and clock enable respectively. The bottom die is configured as master and the remaining dies in the stack are configured as slaves. The master die provides isolation (or buffering) to the slave dies, thus adding capacity without adding bus loading. Such systems have the potential for intra-stack operations which on one hand can improve timing, bus speeds and signal integrity, but on the other hand increase the complexity involved in verification.



Figure 2. 2-1-1-1(2H) device [3]



TABLE I. DECODING LOGIC FOR CHIP ID IN DDR4-3DS[3]

| Logical rank | CS_n | C0 |
|--------------|------|----|
| 0            | L    | L  |
| 1            | L    | H  |
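As a minimal sketch of the decoding in Table I for a 2H stack, the Python function below maps the CS\_n and C0 levels to a logical rank. The function name and the use of 0/1 for L/H are illustrative assumptions. Returning `None` when CS\_n is high reflects CS\_n being active low; that case is not listed in the table.

```python
from typing import Optional


def decode_logical_rank(cs_n: int, c0: int) -> Optional[int]:
    """Return the logical rank selected in a 2H stack, or None if deselected.

    Levels are encoded as 0 (L) and 1 (H), which is an assumption of this sketch.
    """
    if cs_n != 0:
        # CS_n is active low, so a high level selects no logical rank.
        return None
    # With CS_n low, C0 alone distinguishes logical rank 0 from logical rank 1.
    return 1 if c0 else 0


selected = [decode_logical_rank(0, c0) for c0 in (0, 1)]
```

Stacks with 4 or 8 logical ranks would need the additional Chip ID bits as inputs, which this sketch does not cover.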

Fig. 3 shows a read-write sequence which starts with an Activate command to activate the selected row in a bank and ends with a Precharge command to deactivate the row. Similar read-write sequences are issued with randomized values of Chip ID (CID), bank address (BA) and bank group (BG). In addition to randomizing the Chip ID to select different logical ranks, and the banks and bank groups within each rank as shown in Fig. 3, an in-house channel model has been used to replicate the conditions of an actual testing environment, where routing and interconnection delays play a critical role and cannot be eliminated. With the memory dies placed more closely in the design, there are more chances of cross talk and interference. In order to test the working of the PHY under such critical conditions, a Channel Model has been incorporated in the verification environment; it is discussed in the next Section.



Figure 3. Read-Write operation with randomized values of CID, BA, BG
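The randomized sequence described above can be sketched in Python as follows. This is only an illustration: the `Command` type, the function name, the default counts (2 logical ranks, 4 bank groups, 4 banks per group) and the order of the write and read between Activate and Precharge are assumptions, not details taken from the testbench.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    name: str  # "ACT", "WR", "RD" or "PRE"
    cid: int   # Chip ID selecting the logical rank
    bg: int    # bank group
    ba: int    # bank address within the bank group


def read_write_sequence(rng, num_logical_ranks=2, num_bank_groups=4, banks_per_group=4):
    """Build one Activate ... Precharge sequence aimed at a randomly chosen bank."""
    cid = rng.randrange(num_logical_ranks)
    bg = rng.randrange(num_bank_groups)
    ba = rng.randrange(banks_per_group)
    # Every command in the sequence targets the same rank, bank group and bank.
    return [Command(name, cid, bg, ba) for name in ("ACT", "WR", "RD", "PRE")]


rng = random.Random(2019)  # fixed seed so a failing scenario can be replayed
sequences = [read_write_sequence(rng) for _ in range(8)]
```

Seeding the generator keeps the stimulus reproducible across regression runs while still covering different logical ranks and banks.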

## III. CHANNEL MODEL (CM)

The proposed Channel model has been incorporated in the existing TB architecture for verification as shown in Fig 4.



Figure 4. PHY Testbench

The verification environment consists of four basic components: Memory Controller (MC), PHY, Channel Model and the Memory. The memory controller uses the DFI protocol to interact with the PHY, and an in-house DFI UVC model is used as the driver to the DDR-PHY IP. The PHY translates and conditions the signals it receives over DFI and sends them to the channel model. The channel model models the noise and skew caused by coupling capacitance and other factors affecting signal integrity. It also introduces glitches, distortion and crosstalk on signals by mimicking the actual board and line delays introduced in silicon. The transactions finally reach the DDR memory model, which is either developed in-house or taken from different memory VIP vendors and communicates with the PHY through the Channel Model. The Channel Model behaves similarly when the Memory interacts with the MC in the receive path. Functional checks, data integrity checks and coverage have also been integrated to ensure the quality of the design through regression testing.

The proposed Channel Model has the following features [1].

### A. Interface Line Delay

Timing relationship of a signal can be greatly altered by changes in the surrounding and routing path in a tightly coupled system. In addition to the finite circuit delay caused by the propagation delay of transistors, delay in interconnects due to finite inductive and capacitive reactance plays a critical role, particularly at deep submicron level.

The interface line delay component of the channel model takes into account the Placement and routing of DRAM in SoC (Transport Delay), External placement of DRAM in SoC (Board Delay) and DDR placement on DIMM. The delay can be applied on data and control lines going into and out of the DRAM within the limits given by the specifications. Support has been added for changing the delays at any given time in the simulation which allows the user to create multiple scenarios both directed and random.
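A minimal Python sketch of such a delay element is given below, assuming a signal is represented as a list of `(time, value)` edges in picoseconds. The class name, the picosecond unit and the schedule-based way of changing the delay during simulation are illustrative assumptions; the actual model is written with SystemVerilog constructs.

```python
from bisect import bisect_right


class LineDelay:
    """Hypothetical transport-delay element whose delay can change over time."""

    def __init__(self, initial_delay_ps, min_delay_ps, max_delay_ps):
        self.min_delay_ps = min_delay_ps
        self.max_delay_ps = max_delay_ps
        self._check(initial_delay_ps)
        # Sorted list of (effective_from_time_ps, delay_ps) entries.
        self._schedule = [(0, initial_delay_ps)]

    def _check(self, delay_ps):
        # Keep the delay within the limits allowed by the specification.
        if not self.min_delay_ps <= delay_ps <= self.max_delay_ps:
            raise ValueError(f"delay {delay_ps} ps is outside the allowed range")

    def set_delay(self, at_time_ps, delay_ps):
        """Change the delay for all edges launched at or after at_time_ps."""
        self._check(delay_ps)
        self._schedule.append((at_time_ps, delay_ps))
        self._schedule.sort()

    def delay_at(self, time_ps):
        """Return the delay in effect at time_ps (time_ps must be >= 0)."""
        times = [start for start, _ in self._schedule]
        return self._schedule[bisect_right(times, time_ps) - 1][1]

    def apply(self, events):
        """Shift each (time_ps, value) edge by the delay in effect at launch."""
        return [(t + self.delay_at(t), v) for t, v in events]


line = LineDelay(initial_delay_ps=100, min_delay_ps=0, max_delay_ps=500)
line.set_delay(at_time_ps=1000, delay_ps=250)  # delay changed mid-simulation
delayed = line.apply([(0, 1), (500, 0), (1000, 1), (1500, 0)])
```

The sketch does not handle edges that would overtake each other when the delay is reduced sharply; a fuller model would need a policy for that case.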

### B. Cross Talk Distortion

The phenomenon due to which a signal transmitted on one circuit or channel creates an undesired effect in another circuit or channel, like causing functional failure in the chips (signal value change or timing changes) is referred to as Cross talk [4]. The change in the timing due to cross talk has been considered in Interface line delay.

The distortion due to cross talk can reduce the eye (window) in which the signals are valid, commonly referred to as the valid window margin. A mathematical model has been developed which takes into account signal transitions on the neighbouring lines to apply appropriate distortion values on the left and right edges (similar to a capacitive coupling model). Since the cross talk applied on a signal depends on the values of its immediate left and right neighbouring wires, cross talk is applied on the signal if the value on either side is opposite to the value of the signal. A similar analysis can be extended to other neighbours for more accuracy.

### C. Setup and Hold Stresses

The valid window width of the data driven on the wires is prone to noise encountered during transfer between the PHY and DRAM (e.g., Vref fluctuation). Setup and hold violations can occur if the change in valid window width is beyond the permissible limit. To simulate such behaviour, 'X' or 'Z' is introduced on one edge or both edges of the valid window, which the PHY is expected to compensate for. The user can specify whether the distortion is introduced in the form of '0', '1', 'X' or 'Z'.

### D. Duty Cycle Variation

Due to fabrication differences between transistors, there is always a possibility of a difference in their rise and fall times, resulting in asymmetric on and off durations. With the usage of clock trees in ICs, buffers placed to meet timing requirements may also be a source of duty cycle variation. Jitter may also be introduced in the clock due to variation in temperature and while switching between frequencies.

The above mentioned sources of duty cycle variation have been used in DV to stress test the data eye. In order to stress the read and write data calibration feature, the duty cycle has been varied on the fly with configurable values. There are facilities to introduce duty cycle modulation separately in both read and write path with different values throughout the simulation.

### E. Glitch Insertion

With SoCs becoming complex, electrical interactions can cause significant effects on the performance e.g., signal integrity (cross-talk and power-supply noise), thermal effects, and process variations. Such interactions cause transient changes in signal values (glitches) and potential silicon bugs. To catch these bugs in the initial stages of DV, glitches are introduced from TB.
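As an illustration only, the Python sketch below inserts glitches into one unit interval (UI) of a sampled signal, driven by a glitch duration and a number of glitches per UI. Spacing the glitches evenly across the UI and modelling a glitch as a short inversion of the bit value are assumptions made for this sketch.

```python
def insert_glitches(value, ui_samples, glitch_samples, glitches_per_ui):
    """Return one UI of samples holding `value`, with short inverted pulses inserted.

    value is 0 or 1; durations are expressed in samples (an assumption here).
    """
    if value not in (0, 1):
        raise ValueError("value must be 0 or 1")
    wave = [value] * ui_samples
    if glitches_per_ui == 0:
        return wave
    slot = ui_samples // glitches_per_ui  # samples available to each glitch
    if glitch_samples < 1 or glitch_samples > slot:
        raise ValueError("glitches do not fit inside the unit interval")
    for g in range(glitches_per_ui):
        # Centre each glitch inside its own slot of the UI.
        start = g * slot + (slot - glitch_samples) // 2
        for k in range(start, start + glitch_samples):
            wave[k] = 1 - value
    return wave


glitched = insert_glitches(value=1, ui_samples=20, glitch_samples=2, glitches_per_ui=2)
```

A randomized start position inside each slot would be a natural extension for constrained-random testing.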

The channel model is developed in a parameterized way. For each signal an instance of the basic element of channel model is created by specifying some basic parameters - a) bit width of the signal b) number of ranks supported c) the signal is driven or received w.r.t PHY and d) a switch to enable/disable different features. Fig 5 illustrates the structure of a channel model element and how it functions in Drive and Receive Path. A CM feature specific list of parameters has been provided in Table II.
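The per-signal parameterization listed above could be captured, in a hedged Python sketch, as one configuration object per signal. The class, field and feature names below are hypothetical stand-ins for the four parameters a) to d); they are not the names used in the actual model.

```python
from dataclasses import dataclass
from enum import Enum

# Feature switches assumed for this sketch, one per channel model feature.
KNOWN_FEATURES = frozenset({"skew", "distortion", "cross_talk", "duty_cycle", "glitch"})


class Direction(Enum):
    DRIVE = "drive"      # signal driven by the PHY
    RECEIVE = "receive"  # signal received by the PHY


@dataclass(frozen=True)
class ChannelElementConfig:
    bit_width: int                    # a) bit width of the signal
    num_ranks: int                    # b) number of ranks supported
    direction: Direction              # c) driven or received w.r.t. the PHY
    features: frozenset = frozenset() # d) enabled features

    def __post_init__(self):
        if self.bit_width < 1 or self.num_ranks < 1:
            raise ValueError("bit_width and num_ranks must be positive")
        unknown = set(self.features) - KNOWN_FEATURES
        if unknown:
            raise ValueError(f"unknown features: {sorted(unknown)}")

    def is_enabled(self, feature):
        return feature in self.features


# One element instance per interface signal (signal choice is illustrative).
channel = {
    "DQ": ChannelElementConfig(8, 2, Direction.DRIVE, frozenset({"skew", "distortion"})),
    "CS_n": ChannelElementConfig(1, 2, Direction.DRIVE),
}
```

Keeping the configuration immutable means a scenario is fully described by the set of objects it was built with, which helps when reproducing a failing regression.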

The features mentioned above are generated by approximating the manifestation of these effects in digital time-domain simulation, using common SystemVerilog constructs such as interfaces and always, forever and assign statements. The parameter values of the CM are varied in accordance with the specifications of the PHY design and the timing values specified by JEDEC.



Figure 5. a) Basic Channel Model Element b) Channel model in Drive and Receive path

TABLE II. PARAMETER VALUES FOR EACH CM FEATURE

| Feature               | Parameters                                                                                   |
|-----------------------|----------------------------------------------------------------------------------------------|
| SKEW                  | a) MAX SKEW VALUE; b) MIN SKEW VALUE                                                         |
| DISTORTION            | a) DISTORTION VALUE (0, X or Z); b) MAX LEFT/RIGHT DISTORTION; c) MIN LEFT/RIGHT DISTORTION |
| CROSS TALK            | a) SCALING FACTOR; b) MAX DISTORTION; c) MIN DISTORTION                                      |
| DUTY CYCLE DISTORTION | a) MIN % OF DUTY DISTORTION; b) MAX % OF DUTY DISTORTION                                     |
| GLITCH                | a) GLITCH DURATION; b) NO. OF GLITCHES PER UI                                                |

Fig. 6 shows a pictorial representation of the types of skew and distortion that can be applied via the Channel Model as discussed above.





Figure 6. Channel Model applying a) Skew b) Distortion c) Duty cycle distortion d) Glitch on the signals

## IV. VERIFICATION OF PHY USING CHANNEL MODEL

The proposed parameterized and reconfigurable Channel Model discussed in Section III has been employed for the DV of a PHY supporting DDR4-3DS memory. A Channel Model is necessary in DDR-PHY verification because interface signals become misaligned due to several factors seen during silicon characterization of the SoC, such as routing, voltage variation, cross-talk and electrical interaction. In all these scenarios, the PHY should be able to tune itself to compensate for the effects of these silicon factors. Had the channel model not been employed, we would not have been able to verify the training algorithms supported by the PHY at RTL level and would have caught errors only in silicon. Thus, the Channel Model is an abstract model which represents the physical world and enables IO characterization and validation of the PHY by providing features like skew, distortion and glitch. The Channel Model can apply the same skew/distortion to all bits of a signal or a different skew/distortion to each bit. In the verification of DDR4-3DS, the Channel Model was used for all the interface signals, such as CK, DQ, DQS, A<sub>0-17</sub>, RAS\_n, CAS\_n, DM and CS\_n. One can also specify the same or different skew/distortion values for the multiple ranks present in the memory. This Section describes a few instances where the Channel Model proved useful for determining the tolerances supported by the PHY and catching training failures, resulting in RTL and Spec updates.

Memory performs the main function of storing and retrieving data. For correct data to be read and written, it is important that the data strobe (DQS) is aligned with the data eye. Write and read DQ calibration ensure that DQS is at the centre of the data eye. The different features supported by the Channel Model can be applied on the DQ lines to check that the PHY works under all such conditions.

Fig. 7 shows each DQ bit being delayed by a different value, with the skew on DQ[7] depicted clearly. It is also possible to apply skew that advances DQ relative to DQS.

Distortion is applied to decrease the valid window of the DQ signals before performing DQ calibration. The distortion can be due to static noise or to dynamic noise caused by cross talk. The dynamic noise model (xtalk model) is based on a capacitive coupling model. In this approach the value of each bit is compared with the values of the neighbouring bits to its left and right, and each neighbour whose value differs from that bit adds to a score that determines the dynamic noise applied. The table below works through this scoring for an 8-bit bus while considering the effect of only two neighbours on each side. Fig. 7 shows the block diagram of the cross talk model along with the waveform of the output signal, after application of static and dynamic noise on the left and right edges.



[Waveform: CK, the DQ bus before and after skew, and DQ_skewed[7] with the skew interval annotated]

Figure 7. Skew applied on DQ bus by Channel Model

| Bit index | Bit value | Neighbouring bits | Score | Score × scaling factor | Dynamic noise |
|-----------|-----------|-------------------|-------|------------------------|---------------|
| 0   | 1   | 1,2     | 1     | 1*5   | 5       |
| 1   | 0   | 0,2,3   | 2     | 2*5   | 10      |
| 2   | 1   | 0,1,3,4 | 3     | 3*5   | 15      |
| 3   | 0   | 1,2,4,5 | 1     | 1*5   | 5       |
| 4   | 0   | 2,3,5,6 | 1     | 1*5   | 5       |
| 5   | 0   | 3,4,6,7 | 1     | 1*5   | 5       |
| 6   | 0   | 4,5,7   | 1     | 1*5   | 5       |
| 7   | 1   | 5,6     | 2     | 2*5   | 10      |
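The scoring in the table above can be reproduced with a short Python sketch. The function name is hypothetical, and reading the factor of 5 as the cross talk scaling factor is an assumption; the bit values, the reach of two neighbours on each side and the resulting noise values are taken from the table.

```python
def crosstalk_noise(bits, reach=2, scaling_factor=5):
    """Return the dynamic noise per bit for a bus given as a list of 0/1 values.

    The score of a bit is the number of neighbours within `reach` positions
    whose value differs from it; the noise is the score times scaling_factor.
    """
    noise = []
    for i, value in enumerate(bits):
        lo = max(0, i - reach)               # clip at the low end of the bus
        hi = min(len(bits), i + reach + 1)   # clip at the high end of the bus
        score = sum(1 for j in range(lo, hi) if j != i and bits[j] != value)
        noise.append(score * scaling_factor)
    return noise


# Bit values for bit index 0..7, as listed in the table.
bus = [1, 0, 1, 0, 0, 0, 0, 1]
print(crosstalk_noise(bus))  # [5, 10, 15, 5, 5, 5, 5, 10]
```

Bits at the ends of the bus have fewer neighbours, which is why bit 0 and bit 7 are compared against only two other bits.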

Fig 8 shows distortion of 'X' being applied on the DQ bus. Fig. 8a) shows the distortion applied on DQ[7:0] bits from top view, while Fig. 8b) shows a magnified image indicating Left window distortion (LW1,LW2), Right window distortion (RW1,RW2) and Valid window(VW1,VW2) for DQ[5] bit. In the figure left window distortion LW1 of 33% of total window and right window distortion RW1 of 30% has been applied, leaving 37% duration for the valid data eye.



Figure 8. Distortion applied on DQ by Channel Model
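The window arithmetic described for DQ[5] can be sketched in Python as follows, assuming one bit period is represented as 100 samples. The function name and the sampled representation are illustrative assumptions; the 33% and 30% figures are the ones quoted above.

```python
def distort_window(value, left_pct, right_pct, fill="X", samples=100):
    """Return one bit period with its left and right edges replaced by `fill`.

    left_pct and right_pct are integer percentages of the total window.
    """
    if fill not in ("0", "1", "X", "Z"):
        raise ValueError("fill must be one of '0', '1', 'X' or 'Z'")
    if left_pct < 0 or right_pct < 0 or left_pct + right_pct > 100:
        raise ValueError("distortion percentages must be non-negative and sum to at most 100")
    left = samples * left_pct // 100
    right = samples * right_pct // 100
    valid = samples - left - right  # what remains is the valid data eye
    return [fill] * left + [value] * valid + [fill] * right


window = distort_window("1", left_pct=33, right_pct=30)
valid_pct = window.count("1")  # 37 samples out of 100 remain valid
```

With these inputs the remaining valid window is 37% of the bit period, matching the value given for the figure.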



The duty cycle of signals can also be varied to stress test the design and verify the amount of tolerance and the compensation values supported by the PHY design. Fig. 9 shows the duration of a DQ bit increased by 20% for D1 and decreased by the same amount for D2 in the read path.



Figure 9. Duty cycle variation applied on DQ bit by Channel Model
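A small Python sketch of this stretch-and-shrink behaviour is shown below. It assumes a nominal unit interval (UI) of 1000 ps and continues the D1/D2 pattern alternately over later bits; both are assumptions for illustration, as is the function name.

```python
def stretched_bit_edges(num_bits, ui_ps, percent):
    """Return bit boundary times when bits are alternately lengthened and shortened.

    Even-numbered bits (D1, D3, ...) grow by `percent` of a UI and
    odd-numbered bits (D2, D4, ...) shrink by the same amount.
    """
    if not 0 <= percent < 100:
        raise ValueError("percent must be in [0, 100)")
    delta_ps = ui_ps * percent // 100
    edges = [0]
    for bit in range(num_bits):
        width = ui_ps + delta_ps if bit % 2 == 0 else ui_ps - delta_ps
        edges.append(edges[-1] + width)
    return edges


edges = stretched_bit_edges(num_bits=4, ui_ps=1000, percent=20)
widths = [b - a for a, b in zip(edges, edges[1:])]  # [1200, 800, 1200, 800]
```

Because every lengthened bit is paired with an equally shortened one, the total duration of each bit pair stays at two nominal UIs.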

Although the Channel Model features have been described here only w.r.t. DQ signals, a combination of skew and distortion can be used on all the signal lines to find the corner cases where the PHY may behave unpredictably.

## V. CONCLUSION AND FUTURE ENHANCEMENTS

With IC technology evolving very rapidly, there is a need for a commensurate improvement in DV practices as well. Time-to-market pressures and improvements in fabrication and design methodologies have made it even more crucial to replicate real-time scenarios in the DV environment to catch potential bugs early in the design. Such practices can improve the quality of DV and the confidence in design quality. This paper illustrates how channel modelling is effective in increasing the DV confidence of PHY designs with DDR4 3DS memory. The Chip ID signal, which is a unique feature supported by the 3DS model, has been varied to select different logical ranks to rule out any design miss w.r.t. the PHY. The proposed channel model can be used with other memories and designs, with appropriate modifications for modelling the complex interaction between two modules. PHY RTL is designed to support a particular channel whose behavioural model is extracted from the analog world. The DV quality can be enhanced further by adding different X-models and a more complex X-talk (cross talk) model for the channel, to take into account in the digital domain continuous analog-domain effects like variable voltage, channel losses, attenuation and impedances.

## REFERENCES

[1] T. R. Jena, G. Kumar, G. B. Chethan, S. K. Sounderrajan, S. K. Sreenath, "DV methodology to model scalable/reusable component to handle IO Delays/Noise/Crosstalk in Multilane DDR PHY IF," DVCon India, 2019.

[2] J. S. Choi, "JEDEC Next Big Thing: DDR4 3DS Server Forum 2014," Samsung.

[3] JEDEC Standard, Addendum No. 1 to JESD79-4, 3D Stacked DRAM.

[4] "Crosstalk Analysis and Its Impact in 7nm Technology," eInfochips (An Arrow Company), Jan. 2019. [Online]. Available: https://medium.com/@einfochips/crosstalk-analysis-and-its-impact-on-timing-in-7nm-technology-abcfb795190f