

#### Translating and Adapting to the "real" world: SerDes Mixed Signal Verification using UVM

#### Akhila Madhu Kumar, Karl Herterich

Intel Of Canada







- Basics of SerDes Physical Media Attachment (PMA) layer
- Analog Behavioral model flow
- PMA Testbench architecture
- Translator module
- Pre-emphasis driver modelling
- Layered UVM adapter sequence
- Results
- Conclusion





### **Basics of SerDes**

- SERDES: SERializer DESerializer
- Used to transmit high-speed I/O data over a serial link at speeds greater than 2.5 Gbps
- Tx: serializes parallel data and transmits it over the high-speed serial link while preserving data integrity
- Rx: receives data from the serial link, recovers the clock using clock data recovery (CDR) circuits, and sends the parallel data to the next stage
- Tx and Rx data paths can have Built-in-Self-Test (BIST) engines to encode and decode a specific BIST data pattern in the stream and check error injection capability
- The datapath components make sure that the bit error rate (BER) is within the tolerance limit





### SerDes PMA Layer

- PMA: Physical Media Attachment Layer
- Primary application of PMA is to transmit the data in Tx and Rx data paths by doing parallel-to-serial and serial-to-parallel conversions respectively.
- The final Rx parallel data delivered to the Physical Coding Sublayer (PCS) should meet the bit error rate (BER) requirements, which calls for a sufficiently open eye at the receiver end.
- The data received at the Rx side is attenuated due to a lossy channel between transmitter and receiver.
- Continuous Time Linear Equalizer (CTLE) works as a high pass filter to compensate for channel attenuation.












## **APMA BMOD Development flow**

FAST Behavioral Models

- Netlisted down to major block level. Sub-blocks are modelled behaviorally
- Faster simulation performance and supports simulator flows like Xprop, hence improving code quality
- Very useful in finding bugs in the APMA<->DPMA interface and for quicker PMA simulation bring-up
- Interface data is of "logic" type

Accurate Behavioral models

- Netlisted down to leaf cell level. Transistor blocks are modelled
- Very close to the actual schematics from functionality standpoint
- Used to check schematic functionality and connectivity. Helps in finding schematic bugs
- Interface data is of "logic" type

AMS "mode" in Behavioral models

- Ideal models derived from analog schematic and system simulation data; they contain voltage and current information
- Verilog model enhanced to capture analog mixed signal behavior using a combination of "logic" and "real" signals
- Useful in validating critical features in PMA like calibrations, Rx adaptation and equalization
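To make the mix of "logic" and "real" signals concrete, here is a minimal sketch of an AMS-mode style block. The module and port names are hypothetical and do not come from the APMA models; the only point is that analog quantities travel as `real` values while control and decisions stay `logic`.

```systemverilog
// Hypothetical AMS-mode comparator: "real" analog inputs, "logic" control and output
module ams_comparator_sketch (
  input  var real vin_p,   // positive input voltage (V)
  input  var real vin_n,   // negative input voltage (V)
  input  logic    en,      // digital enable
  output logic    dout     // sliced decision
);
  // Decision is 1 when the positive leg is above the negative leg
  always_comb dout = en ? (vin_p > vin_n) : 1'b0;
endmodule

module ams_comparator_sketch_tb;
  real  vp, vn;
  logic en, dout;

  ams_comparator_sketch dut (.vin_p(vp), .vin_n(vn), .en(en), .dout(dout));

  initial begin
    en = 1'b1; vp = 0.171; vn = -0.171;
    #1 if (dout !== 1'b1) $error("expected 1");
    vp = -0.171; vn = 0.171;
    #1 if (dout !== 1'b0) $error("expected 0");
  end
endmodule
```

The ports are declared `input var real` so that they are unambiguously variables rather than nets.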










#### PMA Testbench Architecture (with non-AMS BMOD)







#### PMA Testbench Architecture (with AMS BMOD)












#### **Translator Blocks**

- Application: Keeps the end-to-end data checkers and pattern generators same across BMODs, with and without AMS "real" mode
- Functionality: To convert the "real" type data to "logic" type and vice versa, taking Tx
  equalization characteristics and the CTLE and DFE tap gain values into account





#### Real<->Logic Conversion Functions

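```
// logic2_t is assumed to be declared elsewhere as a two-element
// unpacked array, e.g. typedef logic logic2_t [2];
function logic2_t pam4Real2Logic (real p_norm, real m_norm);
  logic code[2];
  real lev_p_1, lev_p_0, lev_n_1, lev_n_0;
  begin
    // Decision levels can be overridden from the command line
    if ($test$plusargs("PAM_LEV_P_1")) begin
      void'($value$plusargs("PAM_LEV_P_1=%f", lev_p_1));
    end
    if ($test$plusargs("PAM_LEV_P_0")) begin
      void'($value$plusargs("PAM_LEV_P_0=%f", lev_p_0));
    end
    if ($test$plusargs("PAM_LEV_N_1")) begin
      void'($value$plusargs("PAM_LEV_N_1=%f", lev_n_1));
    end
    if ($test$plusargs("PAM_LEV_N_0")) begin
      void'($value$plusargs("PAM_LEV_N_0=%f", lev_n_0));
    end
    // Three-level slice per leg: the mid level is encoded as z,
    // which is what pam4Logic2Real expects on either leg
    code[0] = (p_norm >  lev_p_1) ? 1'b1 :
              (p_norm > -lev_p_0) ? 1'bz : 1'b0;
    code[1] = (m_norm >  lev_n_1) ? 1'b1 :
              (m_norm > -lev_n_0) ? 1'bz : 1'b0;
    return code;
  end
endfunction
```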

```
function logic2_t NRZReal2Logic (real p_norm, real m_norm, real threshold);
  logic code[2];
  begin
    code[0] = (p_norm >= threshold) ? 1'b1 : 1'b0;
    code[1] = (m_norm >= threshold) ? 1'b1 : 1'b0;
    return code;
  end
endfunction

// real2_t is assumed to be declared elsewhere as a two-element
// unpacked array, e.g. typedef real real2_t [2];
function real2_t pam4Logic2Real (logic p, logic n);
  real vnorm[2];
  begin
    // Case equality (===) is used so that z values compare exactly
    if      (p===1'b1 && n===1'b1) vnorm = '{ 1.0,      1.0};
    else if (p===1'b0 && n===1'b0) vnorm = '{-1.0,     -1.0};
    else if (p===1'b1 && n===1'b0) vnorm = '{ 1.0,     -1.0};
    else if (p===1'b1 && n===1'bz) vnorm = '{ 1.0/3.0, -1.0/3.0};
    else if (p===1'bz && n===1'b0) vnorm = '{ 1.0/3.0, -1.0/3.0};
    else if (p===1'b0 && n===1'bz) vnorm = '{-1.0/3.0,  1.0/3.0};
    else if (p===1'bz && n===1'b1) vnorm = '{-1.0/3.0,  1.0/3.0};
    else if (p===1'b0 && n===1'b1) vnorm = '{-1.0,      1.0};
    else                           vnorm = '{ 0.0,      0.0};
    return vnorm;
  end
endfunction
```
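The deck shows the NRZ real-to-logic direction but not its inverse. A minimal round-trip sketch is given below, assuming a hypothetical `nrz_logic2real` that maps 1 to +1.0 and 0 to -1.0 (in line with the normalized levels used elsewhere in this deck). The package, typedefs and function names are illustrative and are not taken from the actual translator.

```systemverilog
package xlator_sketch_pkg;
  // Assumed shapes for the two-element types used by the translator
  typedef logic logic2_t [2];
  typedef real  real2_t  [2];

  // Hypothetical NRZ logic-to-real mapping: 1 -> +1.0, 0 -> -1.0, x/z -> 0.0
  function automatic real2_t nrz_logic2real(logic p, logic n);
    real2_t v;
    v[0] = (p === 1'b1) ? 1.0 : (p === 1'b0) ? -1.0 : 0.0;
    v[1] = (n === 1'b1) ? 1.0 : (n === 1'b0) ? -1.0 : 0.0;
    return v;
  endfunction

  // Threshold slicer with the same behavior as NRZReal2Logic
  function automatic logic2_t nrz_real2logic(real p_norm, real m_norm, real threshold);
    logic2_t code;
    code[0] = (p_norm >= threshold) ? 1'b1 : 1'b0;
    code[1] = (m_norm >= threshold) ? 1'b1 : 1'b0;
    return code;
  endfunction
endpackage

module xlator_sketch_tb;
  import xlator_sketch_pkg::*;

  initial begin
    real2_t  v;
    logic2_t c;
    // Drive a differential 1 (p=1, n=0) through both conversions
    v = nrz_logic2real(1'b1, 1'b0);
    c = nrz_real2logic(v[0], v[1], 0.0);
    if (c[0] !== 1'b1 || c[1] !== 1'b0) $error("round trip failed");
    else $display("round trip ok");
  end
endmodule
```

Keeping both directions in one package is what lets the same checkers and pattern generators run against BMODs with and without "real" mode.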










#### **Emphasis and Equalization**

- Data transmission loss can be compensated for at the transmitting and the receiving end. At the transmitter, it can be compensated either by boosting the high-frequency content (pre-emphasis) or by attenuating the low-frequency content (de-emphasis).
- Pre-emphasis and equalization are techniques that prevent data loss by inverting the channel's frequency response, i.e. they act as the inverse of a low-pass filter.
- Typically implemented in Tx as a Feed Forward Equalizer (FFE) and in Rx as a CTLE and a Decision Feedback Equalizer (DFE)





### Tx Pre-Emphasized output

- Tx output voltage equation without pre-emphasis: $V_{outp} = |C_0| * x[n]$
  - Sample output levels: 0 V to $C_0 * Vpp$
- Tx output voltage equation with pre-emphasis: $V_{outp} = |C_{-1}| * x[n-1] + |C_0| * x[n] + |C_{+1}| * x[n+1]$
  - Sample output levels: 0 V, $C_0 * Vpp$, $(C_0 + C_{+1}) * Vpp$, etc.

| X[n-1], X[n], X[n+1] | Pre-emphasized Tx serial output   | Normalized voltage post translator block (with pre-emphasis mid-threshold) | Final decoded bit value at the Tx transactor |
|----------------------|-----------------------------------|-----------------------------------------------------------------------------|----------------------------------------------|
| 0 <mark>1</mark> 0   | C<sub>0</sub>                     | C<sub>0</sub> * Vpp                                                         | 1                                            |
| 0 <mark>1</mark> 1   | C<sub>0</sub> + C<sub>+1</sub>    | (C<sub>0</sub> + C<sub>+1</sub>) * Vpp                                      | 1                                            |
| 1 <mark>0</mark> 0   | C<sub>-1</sub>                    | C<sub>-1</sub> * Vpp                                                        | 0                                            |
| 1 <mark>0</mark> 1   | C<sub>-1</sub> + C<sub>+1</sub>   | (C<sub>-1</sub> + C<sub>+1</sub>) * Vpp                                     | 0                                            |



- The PCIe spec has a set of rules that determine the values of the coefficients (C<sub>-1</sub>, C<sub>0</sub>, C<sub>+1</sub>).
- C<sub>0</sub> is always greater than C<sub>+1</sub> and C<sub>-1</sub>
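The table can be reproduced with a small sketch that computes the pre-emphasized level for every 3-bit window and slices it with a mid-threshold. The coefficient values (0.1, 0.7, 0.2), the 0.9 V swing used here, and the way the threshold is chosen are assumptions made for illustration; they are not values from the PCIe spec or from the translator block itself.

```systemverilog
module preemph_sketch_tb;
  // Hypothetical coefficients and swing, chosen only for illustration
  localparam real C_M1 = 0.1;
  localparam real C_0  = 0.7;
  localparam real C_P1 = 0.2;
  localparam real VPP  = 0.9;

  // Pre-emphasized Tx level for the window {x[n-1], x[n], x[n+1]}
  function automatic real tx_level(bit x_prev, bit x_cur, bit x_next);
    return (C_M1 * x_prev + C_0 * x_cur + C_P1 * x_next) * VPP;
  endfunction

  // Assumed mid-threshold: halfway between the highest level without C_0
  // and the lowest level with C_0
  function automatic bit decode(real v);
    real thr;
    thr = ((C_M1 + C_P1) + C_0) * VPP / 2.0;
    return (v > thr);
  endfunction

  initial begin
    bit [2:0] x;
    real v;
    for (int w = 0; w < 8; w++) begin
      x = w[2:0];
      v = tx_level(x[2], x[1], x[0]);
      $display("window=%b level=%0.3f decoded=%0b", x, v, decode(v));
      // The decoded bit should always equal the current bit x[n]
      if (decode(v) !== x[1]) $error("decode mismatch for window %b", x);
    end
  end
endmodule
```

This threshold only separates the two groups because C<sub>0</sub> is larger than C<sub>-1</sub> + C<sub>+1</sub> for the values assumed here.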



Pre-emphasis data on Tx handled by Translator block





#### Pre-Emphasis Driver on Rx path

<u>Rx translator output</u>: For a Tx voltage swing of Vtxswing, a pre-emphasized Tx output voltage of tx<sub>p</sub> and a desired Rx voltage of Vrx<sub>p</sub>, the translator module output can be derived as:

$$rx_{xlator_p} = \frac{(tx_p - Vtxswing)}{Vtxswing} * Vrx_p$$

<u>Pre-emphasis driver output</u>: Apply the inverse of the scaling and normalization done in the Rx translator block (to cancel the translation) and output 1 if the result contains the C<sub>0</sub> term:

$$rx_{premph_{p}} = \frac{rx_{xlator_{p}}}{Vrx_{p}} * Vtxswing + Vtxswing$$

Pre-emphasis data on Rx handled by Pre-emphasis driver
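As a sanity check, substituting the translator output into the driver equation should return the original Tx value. The numbers below assume Vtxswing = Vpp/2 = 0.45 V and Vrx<sub>p</sub> = 0.171, which is what the NRZ scaling tables in this deck imply, together with a hypothetical pre-emphasized level of tx<sub>p</sub> = 0.63 V:

$$rx_{xlator_{p}} = \frac{(0.63 - 0.45)}{0.45} * 0.171 = 0.4 * 0.171 = 0.0684$$

$$rx_{premph_{p}} = \frac{0.0684}{0.171} * 0.45 + 0.45 = 0.18 + 0.45 = 0.63$$

The driver recovers the 0.63 V level seen at the Tx pads, so the two blocks cancel as intended.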












# High Level TB Architecture for PMA

#### APMA Behavioral Model (BMOD) Usage Flow












#### Results – Pre-emphasized Data With Loopback

- The first group is the **pre-emphasized output** from the APMA Tx serial pads
- The second group shows the Tx translated output from the transactor, which predicts the 0s and 1s correctly
- The third group shows the **Rx translated output**, which takes the Tx pad data and scales it according to the translator Rx path algorithm
- The fourth group shows the final output from the pre-emphasis driver, which corrects the swing based on CTLE requirements (0 translated to -0.171 and 1 translated to 0.171 in the example shown)



















#### Thank You!

Questions?











#### Future Work

- Mixed signal simulations with sub-modules replaced by schematics
- Extensive use of MSV control knobs present as part of APMA models
- Bring in channel models along with the pre-emphasis driver and fine-tune the settings accordingly





#### Fast And Accurate BMOD Examples

```
// FAST behavioral model: sub-block behavior described with assigns
module apma_rx_clkgen_FAST (
  output ock_ckd_n,
  output ock_ckd_p,
  output ock_divck_n,
  output ock_divck_p,
  input  ick_n,
  input  ick_p,
  input  pa_vss,
  input  vdd
);
  reg state, stateb;
  // . . . .
  assign ock_divck_n = ck_div1_n;
  assign ock_divck_p = ck_div1_p;
  assign pd_clk_n = ~pd_clk_p;
  // . . .
endmodule

// Accurate behavioral model: netlisted down to leaf cells
module apma_rx_clkgen (
  output ock_ckd_n,
  output ock_ckd_p,
  output ock_divck_n,
  output ock_divck_p,
  input  ick_n,
  input  ick_p,
  input  pa_vss,
  input  vdd
);
  reg state, stateb;
  // . . . .
  bfr I0 ( .qn(ock_divck_n), .qp(ock_divck_p),
           .dn(ck_div1_n), .dp(ck_div1_p),
           .pa_vss(pa_vss), .vdd(vdd));
  dff_2x1 I1 ( .q(state), .clk(pd_clk_p), .d(stateb),
               .pa_vss(pa_vss), .vdd(vdd));
  inv2_1x1 I2 ( .z(pd_clk_n), .a(pd_clk_p), .pa_vss(pa_vss),
                .vdd(vdd));
  // . . . .
endmodule
```





#### **UVM Adapter Sequence Snippet**

```
class pma_lane_adapter_seq extends uvm_sequence#(pma_transaction);
  pma_transaction pma_req;
  `uvm_object_utils(pma_lane_adapter_seq)
  `uvm_declare_p_sequencer(pma_lane_sequencer)
  // Handles for all register sequences
  pma_init_state_reg_seq     init_state_reg_seq;
  pma_set_spec_reg_seq       set_spec_reg_seq;
  pma_configure_lane_reg_seq configure_lane_reg_seq;
  virtual task body();
    forever begin
      p_sequencer.get_next_item(pma_req);
      if (pma_req.primitive_layer == pma_transaction::CTRL) begin
        drive_ctrl();
      end else if (pma_req.primitive_layer == pma_transaction::DATA) begin
        drive_data();
      end else begin
        drive_debug();
      end
      p_sequencer.item_done();
    end
  endtask : body
```



#### **UVM Adapter Sequence Snippet (contd.)**

```
task drive_ctrl();
  case (pma_req.primitive_name)
    pma_transaction::FORCE_PUP :
      begin
        if (pma_req.pma_cfg.ctrl_path == UVM_BACKDOOR) begin
          // Configuration through interface ports
          $cast(req_copy, pma_req.clone());
          req_copy.set_item_context(this, p_sequencer.cmn_ctrl_seqr);
          start_item(req_copy);
          finish_item(req_copy);
          return_response = 0;
        end else begin
          // Configuration through registers
          `uvm_do_on_with(force_pup_reg_seq, p_sequencer.reg_seqr,
                          { block == pma_req.primitive_target;
                            enable == pma_req.power_on; })
        end
      end
  endcase
endtask
```






#### **Translator Voltage Scaling**

- On Tx path: the translator scales the APMA output and sends it to the transactors

| Tx NRZ Output from APMA | Normalized Output(with Vpp/2) | Comparator Output (threshold is 0V) |
|-------------------------|-------------------------------|-------------------------------------|
| 0.9V (Vpp)              | 1V                            | 1                                   |
| 0V                      | -1V                           | 0                                   |

- On Rx path: without loopback
  - "logic" data is directly sent from Rx Serial UVC
  - This data is scaled before sending to APMA

| Rx NRZ Input from Data UVC | Scaled output | CTLE Comparator Output (threshold is 0V) |
|----------------------------|---------------|------------------------------------------|
| 1                          | 0.171         | 1                                        |
| 0                          | -0.171        | 0                                        |

- On Rx path: with loopback
  - Tx data is normalized with Vpp/2
  - This output is scaled before sending to APMA

| Tx NRZ Output from APMA | Normalized Output(with Vpp/2) | Scaled output | CTLE Comparator Output (threshold is 0V) |
|-------------------------|-------------------------------|---------------|------------------------------------------|
| 0.9V (Vpp)              | 1V                            | 0.171         | 1                                        |
| 0V                      | -1V                           | -0.171        | 0                                        |
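The loopback table can be expressed as three small functions chained together. The sketch below uses the 0.9 V swing and the 0.171 scale factor from the tables; the function and module names are hypothetical, and the real translator also accounts for equalization settings that are not modelled here.

```systemverilog
module xlator_scaling_sketch_tb;
  // Values taken from the tables; names are illustrative only
  localparam real VPP      = 0.9;    // Tx NRZ swing from APMA
  localparam real RX_SCALE = 0.171;  // amplitude presented to the CTLE

  // Tx path: map [0, VPP] onto [-1, +1] around Vpp/2
  function automatic real tx_normalize(real v);
    return (v - VPP / 2.0) / (VPP / 2.0);
  endfunction

  // Rx path: scale a normalized value to the Rx amplitude
  function automatic real rx_scale(real v_norm);
    return v_norm * RX_SCALE;
  endfunction

  // Comparator with a 0 V threshold
  function automatic bit compare(real v);
    return (v > 0.0);
  endfunction

  initial begin
    real v_hi, v_lo;
    v_hi = rx_scale(tx_normalize(VPP));  // Tx high level through both stages
    v_lo = rx_scale(tx_normalize(0.0));  // Tx low level through both stages
    $display("hi=%0.3f lo=%0.3f", v_hi, v_lo);
    if (!compare(v_hi) || compare(v_lo)) $error("unexpected comparator output");
  end
endmodule
```

Without loopback, only `rx_scale` applies, with the "logic" value from the Rx Serial UVC first mapped to +1 or -1.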





#### Conclusion

- Building a configurable translator module and the UVM layered structure can help reuse the blocks across IP variants and also across various analog models
- Using adapter sequences made it possible to create a large sequence library covering each data and lane configuration in the specification
- This flow can be used to build a verification framework that models critical design features effectively and helps focus verification on them





#### Layered UVM Adapter Sequence

| Features of Adapter Sequence                          | Remarks                                                                                                                                                                                    |
|-------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Lane specific traffic generation and routing          | Sending traffic across lanes, waiting for CDR lock and checking lane-to-lane interactions (skew, latency, etc.)                                                                            |
| Demarcation between control, data and debug transfers | Setting up beacon, sending burst, checking clocks or bit errors                                                                                                                            |
| Centralized coverage sampling across model types      | Coverage packet can be sent to a coverage class or sampled here directly for control and data transfers                                                                                    |
| Passing packet to scoreboard/monitors                 | Each transaction from the register or control sequencer passes through the lane adapter, so the packet contents can be used to add checks in the scoreboard or to monitor specification adherence |

- A central layer for transaction ordering and control; helps in generating effective lane specific traffic
- Each transaction takes in the direction (Tx or Rx) as well as the lane number as a parameter which can be set from the test specific virtual sequence
- Lane-to-lane skew and elecidle entry/exit controlled for each lane
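One way a test-level sequence could pass the direction and lane number down to the adapter layer is sketched below. The class names with the `_sketch` suffix, the `direction` and `lane` fields, the `DEBUG` enum value and the 4-lane constraint are all assumptions made for illustration; only `primitive_layer` with `CTRL` and `DATA` appears in the deck's snippets.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical transaction carrying the layer, direction and lane
class pma_transaction_sketch extends uvm_sequence_item;
  typedef enum {CTRL, DATA, DEBUG} layer_e;
  typedef enum {TX, RX} dir_e;

  rand layer_e      primitive_layer;
  rand dir_e        direction;
  rand int unsigned lane;

  constraint c_lane { lane < 4; }  // assumed 4-lane configuration

  `uvm_object_utils(pma_transaction_sketch)

  function new(string name = "pma_transaction_sketch");
    super.new(name);
  endfunction
endclass

// Sequence whose direction and lane are set by the test's virtual sequence
class lane_traffic_seq_sketch extends uvm_sequence #(pma_transaction_sketch);
  `uvm_object_utils(lane_traffic_seq_sketch)

  pma_transaction_sketch::dir_e direction = pma_transaction_sketch::TX;
  int unsigned                  lane      = 0;

  function new(string name = "lane_traffic_seq_sketch");
    super.new(name);
  endfunction

  virtual task body();
    pma_transaction_sketch tr = pma_transaction_sketch::type_id::create("tr");
    start_item(tr);
    // local:: selects the sequence's own fields rather than the item's
    if (!tr.randomize() with { primitive_layer == pma_transaction_sketch::DATA;
                               direction       == local::direction;
                               lane            == local::lane; })
      `uvm_error(get_type_name(), "randomize failed")
    finish_item(tr);
  endtask
endclass
```

A virtual sequence would set `direction` and `lane` on each instance before starting it on the sequencer that feeds the lane adapter.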

