#### **Portable Stimulus Tutorial**



SYSTEMS INITIATIVE

# Agenda

| Topic                                           | Presenter       | Affiliation |
|-------------------------------------------------|-----------------|-------------|
| Motivation                                      | Adnan Hamid     | Breker      |
| PSS Introduction                                | Tom Fitzpatrick | Siemens EDA |
| Developing Reusable Test Content at Block Level | Matthew Ballance | AMD        |
| Sub-system and SoC-level testing                | Sergey Khaikin  | Cadence     |
| Post-Silicon testing                            | Prabhat Gupta   | AMD         |
| PSS new features and conclusion                 | Tom Fitzpatrick | Siemens EDA |



**PSS Motivation** Why should UVM Engineers care about PSS

Adnan Hamid




# What does the data tell us?



#### **Project Resource Deployment**





## **UVM Engineers hold critical corporate knowledge**

- Only ones to fully understand IP functionality
- Key knowledge needed for full chip bring up in simulation / emulation / FPGA
- Key knowledge needed in firmware development
- Only a small number of UVM resources for each IP to help with debug

What if there was a way to capture that corporate knowledge in an abstract, portable, reusable form?



### UVM is the standard testbench methodology

- Large library of commercial interface VIPs
- Well established RAL register models
- Configuration DB

| Interface VIPs | UVM Environment     | Interface VIPs |
|----------------|---------------------|----------------|
|                | IP / Sub-System RTL |                |



### UVM does not help with creating test content

- Sequences are primarily directed with limited rand parameters
- Scoreboard checkers are manual
- Stimulus coverage critical but difficult
- Largely manual, time-consuming work
- Hard to scale to large complex blocks or subsystems





# **Difficult to randomize scenarios**

Difficult to manually synchronize tests across multiple ports and processor instruction streams

```
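task my_seq::body();
    repeat (10) begin
        req = my_xtn::type_id::create("req");
        start_item(req);
        // Check the randomize() result explicitly; an assert can be disabled
        if (!req.randomize() with {dst_addr == 48'hfffffffffff;})
            `uvm_error("MY_SEQ", "randomization failed")
        finish_item(req);
    end
endtask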
```



# UVM - Limited support for concurrency, resource and memory management

- Example: exercise every channel of a DMA concurrently
  - Each channel should use non-overlapping random memory addresses
  - Each channel transfer mode and size should be randomized
  - Every channel should start at the same time
- Possible but difficult and not scalable
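To make the bookkeeping behind these requirements concrete, here is a minimal Python sketch (not UVM or PSS) of what a test writer would otherwise hand-code: a random mode, a random size, and a non-overlapping random address range for each channel. The function name, the mode names, and the memory size are illustrative assumptions, not part of any DMA specification.

```python
import random

def plan_dma_channels(num_channels, mem_size, max_len, seed=0):
    """Pick a random mode, size and non-overlapping address range per channel."""
    rng = random.Random(seed)
    modes = ["mem2mem", "mem2dev", "dev2mem"]  # hypothetical transfer modes
    sizes = [rng.randint(1, max_len) for _ in range(num_channels)]
    slack = mem_size - sum(sizes)
    if slack < 0:
        raise ValueError("memory too small for requested transfers")
    # Sorted random gaps guarantee that consecutive regions never overlap.
    cuts = sorted(rng.randint(0, slack) for _ in range(num_channels))
    plan, used = [], 0
    for ch, (size, cut) in enumerate(zip(sizes, cuts)):
        base = cut + used  # accumulated gap plus bytes already claimed
        plan.append({"channel": ch, "mode": rng.choice(modes),
                     "base": base, "size": size})
        used += size
    return plan

plan = plan_dma_channels(num_channels=4, mem_size=4096, max_len=256, seed=1)
```

Even this toy version needs explicit allocation logic, and it still says nothing about starting all channels at the same time, which is where the manual approach stops scaling.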



# PSS - support for concurrency, resource and memory management

- Example: exercise every channel of a DMA concurrently
  - Each channel should use non-overlapping random memory addresses
  - Each channel transfer mode and size should be randomized
  - Every channel should start at the same time

DMA must be initialized before use



## Limited ability to combine scenarios in UVM

#### **UVM Sub-System Scaling Barriers**

- Okay to run IP sequences in parallel (with careful resource partitioning)
- Difficult to create end-to-end multi-IP scenarios



# HW/SW Interface of a Typical SoC

- UVM very hard to use at this level
- No reuse in full chip C-bench (simulation/emulation/post-silicon)



# **Very complex UVM TB Architectures**





### **PSS + UVM provides the required abstraction**





- More time spent thinking about scenarios
- Less time spent on implementation
- Flows as executable documentation
- PSS can be added into UVM environment to act as test content generator capability





## What benefits do UVM engineers report?

- 40%-60% effort reduction for new test benches (abstract model)
- 60%-80% effort reduction for IP revisions (improved reuse)
- Great reduction in boring repetitive work
- Reduction in debug of complex UVM control flows
- Easier to provide tests to other parts of the flow
- Flows as executable documentation



## **UVM sub-system: Easy Port to Full SoC**

- UVM tests may be provided to system team



#### **Block Verification**



# Seamless reuse across sim/emu/post-si




## What benefits do integration teams report?

- Streamlined test documentation
- Much better system level coverage
- Great reduction in effort for system level test content
- Ability to stress test end-to-end scenarios w/o involving firmware and software
- Target complex scenarios (e.g. coherency) not easily covered using real workloads





Tom Fitzpatrick




# **Methodology Shifts Require New Thinking**



- SystemVerilog brought a new approach to Verification
  - Standardized features from other proprietary languages
  - Directed testing  $\rightarrow$  Constrained-Random



- Constrained-Random requires Functional Coverage to know what happened



# **PSS is Declarative**

#### **Brings Constrained-Random Generation to the Scenario Level**



- Higher level of abstraction
  - Concise models
  - Describe a much larger set of tests
- Specifies rules to define the set of possible scenarios
- Scheduling constraints between actions
- Data flow requirements between actions
- Data constraints
- Target-specific resource constraints
- Tool generates code to execute on Target Platform
  - Each unique solution is effectively a directed test
- May infer action executions to meet rule requirements
- "Overlaying" tests effectively covers the desired test space
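The idea that rules define a set of scenarios, and that each solution is effectively a directed test, can be illustrated with a small Python sketch. It is not PSS and not how a PSS tool solves models; the action names and the `must_precede` rule format are assumptions made for illustration.

```python
from itertools import permutations

def legal_schedules(actions, must_precede):
    """Return every ordering of `actions` that honours all (before, after) rules."""
    result = []
    for order in permutations(actions):
        pos = {name: i for i, name in enumerate(order)}
        if all(pos[a] < pos[b] for a, b in must_precede):
            result.append(order)
    return result

# Declarative rules: initialization comes first, checking follows the write.
rules = [("init", "write"), ("init", "read"), ("write", "check")]
schedules = legal_schedules(["init", "write", "read", "check"], rules)
for s in schedules:
    print(" -> ".join(s))
```

Three short rules describe three distinct legal tests here; the model author never writes any of the orderings by hand.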



# What is a Portable Stimulus Model?



































- Behavior = Action
- Schedule = Activity
- Sequential Data = Buffer
- Parallel Data = Stream
- State info = State
- In UVM, PSS can create a set of sequences
  - Run in existing UVM env





Easier to specify constraints at scenario level



Can be considered rules for scenario generation



Rules allow scenarios to be inferred from partial specification





Scheduling built into generated test regardless of target





# The Rubber Meets the Road





The Abstract Model must be implemented on different targets



Atomic Actions → target code
Target code modeled in *exec* blocks



*Generator* assembles target code according to *Activity* schedule





# **Exec Blocks Define the Target Implementation**

Target Templates Define 1:1 Mapping



- Target templates require a separate exec block for each target language
- Managed in PSS via inheritance or extension



# **Exec Blocks Define the Target Implementation**

Procedural Interface Isolates Exec Block from Target Language



- Procedural Interface lets you have one exec block per action type
- Simplifies PSS code management



# **Exec Blocks Define the Target Implementation**

Procedural Constructs Move Complex Flow Control to PSS



- Algorithm is specified in exec block
- Imported methods called accordingly



- Complex control flow generated from PSS
- Language-specific code is much simpler



Procedural Constructs provide maximum reuse by simplifying the migration between languages



# **PSS Generalized Tool Flow**



# Generated Code Assembled According to Activity Schedule















### **Developing Reusable Test Content at Block Level**



Matthew Ballance

### **PSS Test Content at Block-IP level**

#### Goals and Requirements

- Create reusable content for IP consumer teams to use
  - Initialize IP in specific modes
  - Exercise key IP operations
- Exercise key configurations as requested by consumer team
  - Collect coverage metrics to confirm
- Test content must run in UVM and embedded-software environments

#### Benefits to the Block-level Testing

- More-easily create complex scenario-level tests
- Shared medium to discuss test scenarios with other teams
  - Architecture, SoC DV, firmware, driver, validation, etc



# **PSS Modeling and Realization**

#### Modeling

- Capture the 'what' of a test
- Capture relationships and requirements

#### Realization

- How do we carry out behavior?
- What functions do we call?
- What values (from the modeling layer) do we pass?
- Data selected in the Modeling layer used in Realization



```
cregs.src_addr.write_val(addr_value(dat_i.mem_h));
cregs.dst_addr.write_val(addr_value(dat_o.mem_h));
cregs.sz.write_field("TOT_SZ", dat_i.size);
cregs.status.write_field("EN", 1);
```

```
while (cregs.status.read().DONE == 0) {
   yield;
}
```



# Simplifying Register Programming with a RAL

#### Register Access Layers exist to simplify reading/writing registers and avoid mistakes

- Define mnemonics for registers and fields, so we don't have to remember addresses and bit positions

#### Most methodologies have one or more RAL

- C/C++: structs, unions, macros
- UVM: UVM register model
- PSS: PSS register-access layer

Most device-specific RALs are generated from a higher-level description



### **PSS RAL Overview**

- PSS defines data types for capturing a RAL in the Core Library
- The PSS RAL targets the requirements of bare-metal software tests
  - Light-weight, intended to enable tools to scale to huge register maps

#### The RAL for an IP is easily reused in a larger system context

- Self-contained and addressed relative to its parent

#### Provides access methods that simplify programming-sequence creation

- Read/write by integer value
- Read/write by bitfield view
- Read/modify/write operation to update fields with compact code



# **Defining PSS Register Layout**

- PSS packed struct specifies register field layout
  - Specify width of each field
  - Position specifies the offset within the register
- PSS register group collects registers
  - Contains reg fields defined in terms of packed structs
  - Implements a function to map fields to relative offsets

PSS RAL types are defined in the PSS Core Library

```
pure component uart_ctrl_regs : reg_group_c {
    reg_c<uart_ctrl_ua_cr_reg_s,READWRITE,32> ua_cr;
    reg_c<uart_ctrl_ua_mr_reg_s,READWRITE,32> ua_mr;
    pure function bit[64] get_offset_of_instance(string name) {
        if (name == "ua_cr") return 0x0;
        if (name == "ua_mr") return 0x4;
    }
}
```

```
struct uart_ctrl_ua_mr_reg_s : packed_s<> {
    bit[1] cclk;
    bit[2] chrl;
    bit[3] par;
    bit[2] nbstop;
    bit[2] chmode;
    bit[1] clks;
    bit[1] irmode;
    bit[20] reserved;
}
```
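How field widths and declaration order determine bit positions can be sketched in a few lines of Python. This is a model of the idea only: it assumes the first declared field occupies the least-significant bits, which should be checked against the packing order the register struct actually uses. The field list is copied from the `ua_mr` struct; the helper names are hypothetical.

```python
# (name, width) pairs in declaration order, copied from the ua_mr struct.
UA_MR_FIELDS = [("cclk", 1), ("chrl", 2), ("par", 3), ("nbstop", 2),
                ("chmode", 2), ("clks", 1), ("irmode", 1), ("reserved", 20)]

def pack(fields, values):
    """Pack field values into one integer, first field in the lowest bits (assumed)."""
    word, offset = 0, 0
    for name, width in fields:
        value = values.get(name, 0)
        if value >> width:
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << offset
        offset += width
    return word

def unpack(fields, word):
    """Split an integer back into named fields using the same layout."""
    out, offset = {}, 0
    for name, width in fields:
        out[name] = (word >> offset) & ((1 << width) - 1)
        offset += width
    return out

mr = pack(UA_MR_FIELDS, {"par": 1, "nbstop": 2})
print(hex(mr), unpack(UA_MR_FIELDS, mr)["nbstop"])
```

The widths sum to 32, matching the 32-bit register size given in the register group.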

# **Reading/Writing Registers with the PSS RAL**

#### PSS registers provide several read/write APIs

| Register Access Function            | Purpose               |
|-------------------------------------|-----------------------|
| `void write(packed_s reg_struct)`   | Write register struct |
| `packed_s read()`                   | Read register struct  |
| `void write_val(bit[SZ] reg_value)` | Write register value  |
| `bit[SZ] read_val()`                | Read register value   |

```
rand bit[2] stop_bits;
constraint stop_bits == 2;

exec body {
    ua_mr_reg_s mr_reg_temp;

    mr_reg_temp.par = 1;
    mr_reg_temp.nbstop = stop_bits;
    mr_reg_temp.chmode = 0;

    // Write Mode Register
    comp.regs.ua_mr_reg.write(mr_reg_temp);
}
```

#### Single-call read-modify-write simplifies programming sequences

| Register Access Function                                     | Purpose                     |
|--------------------------------------------------------------|-----------------------------|
| `void write_masked(R mask, R val)`                           | Masked write of a struct    |
| `void write_val_masked(bit[SZ] mask, bit[SZ] val)`           | Masked write of an integer  |
| `void write_field(string name, bit[SZ] val)`                 | Write a named field         |
| `void write_fields(list<string> names, list<bit[SZ]> vals)`  | Write a set of named fields |

```
rand bit[4] mode;
rand bit[16] coeff;
exec body {
   comp.regs.cr.write_fields(
      {"mode", "coeff"}, // field names
      {mode, coeff}); // values
}
```
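What a single-call masked write saves can be shown with a toy register model in Python. This is a sketch of the usual read-modify-write pattern, not the PSS implementation; the `Reg32` class and its access counter are illustrative assumptions, and only the method names mirror the table above.

```python
class Reg32:
    """Toy 32-bit register that counts bus accesses."""
    def __init__(self, value=0):
        self.value = value & 0xFFFFFFFF
        self.accesses = 0

    def read_val(self):
        self.accesses += 1
        return self.value

    def write_val(self, value):
        self.accesses += 1
        self.value = value & 0xFFFFFFFF

    def write_val_masked(self, mask, value):
        # Read-modify-write: only bits set in `mask` are taken from `value`.
        old = self.read_val()
        self.write_val((old & ~mask) | (value & mask))

reg = Reg32(0xFFFF0000)
reg.write_val_masked(0x000000F0, 0xABCDEFA5)  # update bits [7:4] only
print(hex(reg.value))
```

The caller expresses one intent (update these bits) while the helper performs the read and the write, so the surrounding bits cannot be clobbered by accident.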



# How is a PSS RAL Created?



# **Modeling: IP Behavior - Initialization**

#### Nearly all IPs need to be initialized before use

- DMA needs to configure channels and interrupts
- UART needs to set baud rate, etc

#### Typical to use a state object to store initialization data

- Accessible by any action using the IP component
- Prevents changes to the initialized mode while the IP is in use

#### Want to provide a variety of initialization actions

- Full-random initialization
- Fully-fixed 'sanity' initialization
- Fixed along specific axes, etc.

```
action uart_init {
   output uart_init_s init_o;
   // ...
}
```

```
state uart_init_s {
    rand bit[4] in [5..8] bits;
    rand bit stop;
    rand bit parity_en;
    rand bit parity_even;
    rand bit[32] in
       [9600,19200,38400,115200] baud;
}
```





# **Test Realization: UART Register Definition**

#### Focus on what we need for initialization





### **Test Realization: Programming Sequence**



# **Modeling: Specialized Initialization Actions**

#### The base UART initialization action is fully-random

- Can select any combination of values
- Typically, there is a set of common modes in which to initialize an IP
- Define these as specializations of the base (fully-random) initialization action
  - Some with fully-specified parameters
  - Others with some variability

```
action uart_init_sanity : uart_init {
   constraint init_o.baud == 9600;
   constraint init_o.parity_en == 0;
   constraint init_o.bits == 8;
   constraint init_o.stop == 1;
}
```

```
action uart_init_n81 : uart_init {
   constraint init_o.parity_en == 0;
   constraint init_o.bits == 8;
   constraint init_o.stop == 1;
}
```

- Automatically reuse register-programming sequence defined in the base action



# **Modeling: Requiring Initialization**

#### Encoding pre-conditions is a key aspect of creating reusable test content

- In this case, that the IP must be initialized, possibly in a specific mode

#### PSS state objects allow us to require IP initialization before use

- Initialization actions set the state to non-initial
- Behavior actions require a non-initial state



#### PSS tools detect if we attempt to use an IP before initialization

- Randomly infer a valid initialization action
- Report an error if no initialization action exists to be inferred



# **Methodology: Factoring Out Commonalities**

- It's likely that all of our behavior actions depend on a properly-initialized IP
- It's good practice to factor out core requirements like this to an abstract base action
  - Abstract means that the action is just a building block, and won't independently





### **Placing Requirements on Initialization Mode**

#### Thus far, we have just required the IP is initialized

- Any randomly-selected initialization mode is okay
- Often, we also need it to be initialized in some specific way
- Constraining the initialized state adds a requirement
  - Must initialize in high-speed mode to test large data transfer

```
action uart_tx_huge : uart_base {
    input mem_b dat_i;
    constraint dat_i.size >= 256*1024;
    constraint init_i.baud >= 115200;
}
```



# **Initialization Coverage - UART**

- One of our block-level deliverables is coverage of key initialization modes
- PSS covergroups collect coverage metrics
- Covergroups are sampled automatically
  - E.g., at the end of action execution
- Can predict coverage before tests run
  - Coverage is on stimulus fully under our control
  - Shortens time to identify and close coverage holes

```
action uart_init {
    output uart_init_s init_o;

    covergroup {
        baud_cp : coverpoint init_o.baud;
        bits_cp : coverpoint init_o.bits;
        baud_bits_cr : cross baud_cp, bits_cp;

        parity_en_cp : coverpoint init_o.parity_en;
        parity_even_cp : coverpoint init_o.parity_even
            iff (init_o.parity_en);
        parity_cr : cross parity_en_cp, parity_even_cp
            iff (init_o.parity_en);
    } uart_init_cov;
}
```
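Because the covered values are stimulus that is solved before the test runs, coverage of the baud/bits cross can be estimated from the generated values alone. The Python sketch below illustrates that idea only; the bin lists follow the `uart_init_s` domains, and the function name and input format are assumptions rather than any tool's API.

```python
from itertools import product

BAUDS = [9600, 19200, 38400, 115200]
BITS = [5, 6, 7, 8]

def predicted_cross_coverage(solved_inits):
    """Return (fraction of baud x bits bins hit, sorted list of unhit bins)."""
    all_bins = set(product(BAUDS, BITS))
    hit = {(cfg["baud"], cfg["bits"]) for cfg in solved_inits} & all_bins
    return len(hit) / len(all_bins), sorted(all_bins - hit)

generated = [{"baud": 9600, "bits": 8},
             {"baud": 9600, "bits": 8},
             {"baud": 115200, "bits": 5}]
coverage, holes = predicted_cross_coverage(generated)
print(f"{coverage:.1%} predicted, {len(holes)} holes")
```

The list of holes is available without running a single simulation cycle, which is what shortens the loop for closing coverage.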



# **DMA Behavior – Single DMA Transfer**

#### Our simplest DMA operation is a memory-to-memory copy

- It copies memory from a source memory block to a destination

#### Memory-to-memory pre-conditions

- DMA IP must have been initialized
- Action must have dedicated access to a DMA channel
- Action must be supplied source memory block to read

#### Memory-to-memory post-conditions

- Memory-to-memory operation produces a destination memory block





# **PSS Action Outline**

- Our PSS 'memory-to-memory' action captures these requirements
- And, enforces some required relationships



# **Modeling: Claiming Memory**

- The mem-to-mem action produces a block of memory
  - Memory is allocated via a memory *claim* within the memory buffer
  - The newly-created memory block is passed out via the output memory buffer
- Claimed memory is automatically allocated and freed
  - Ensures parallel activity uses unique memory regions
  - Avoids memory leaks

```
buffer mem_b {
    rand bit[32] size;
    addr_handle_t mem_h;
    rand addr_claim_s<> mem_claim;

    constraint mem_claim.size == size;

    exec post_solve {
        mem_h = make_handle_from_claim(mem_claim);
    }
}
```



- Think about consumer team
- Data producer?
- Accellera-std library in the works



# **Test Realization – DMA Single Transfer**

We can now implement the connection between action-level model and device registers



### **Modeling: Encapsulating Complex Behaviors**

#### IP operations often involve multiple steps that, as a group

- Place internal and external requirements around memory lifetime
- Place temporal requirements on resource availability

#### PSS enables encapsulating these behaviors with their requirements

- Ensures that the behaviors are internally consistent
- Ensures that usage is consistent with requirements

#### Goal is to deliver easy-to-use behaviors

- Expose top-level 'knobs' to enable control
- Hide details from end users



# Modeling: Encapsulating Complex Behaviors

#### The DMA Engine supports chained transfers via in-memory descriptors

- Each descriptor performs a copy between memory regions
- Descriptors 'linked' together into a descriptor chain
- Descriptor-chain memory must be valid for the duration of the transfer
- Data in source regions must be valid before transfer starts
- Data in destination regions is only legal once the full transfer completes

#### Goal: encapsulate task of creating and running a chained transfer

- Capture requirements around memory usage and lifetime
- Capture programming sequence for descriptor setup and transfer
- Provide a simple action to produce chained transfers of various lengths





# **Modeling: Building a Descriptor Chain**

- Key operation: add descriptor to the chain
- Two possibilities
  - Add last descriptor: 'next' pointer points to null
  - Add non-last descriptor: 'next' pointer points to the previously-built descriptor
- Key data: last descriptor pointer and accumulated memory blocks
  - Model with a *buffer*
- Two actions:
  - Initialize chain (marks end of the chain)
  - Add new descriptor





# Modeling: DMA 'chain' buffer

#### The descriptor chain is built starting at the end

- First descriptor built is the tail of the chain
- Last descriptor built is the head of the chain processed by DMA

#### Must manage two things while building the transfer

- Handle to the previously-built descriptor
  - "next" descriptor for the DMA engine to process
- Handle to memory regions used by the transfer
  - Prevents them from being freed until they've been used

#### Encapsulate this data in a buffer

- list to hold memory handle
- address handle pointing to the next descriptor
  - Or, null, if at the end of the chain
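The tail-first construction can be sketched in Python to show why only two pieces of data need to travel along: the handle of the previously built descriptor and the list of memory handles to keep alive. This is an illustrative model, not PSS; the descriptor base address, the 16-byte descriptor size, and the dictionary standing in for memory are assumptions.

```python
def build_chain(transfers, desc_base=0x1000, desc_size=16):
    """Build descriptors tail-first; return (head_addr, memory, held_handles)."""
    memory = {}       # descriptor address -> descriptor fields
    held = []         # handles kept alive until the whole transfer ends
    next_desc = None  # None marks the end of the chain
    for i, (src, dst, size) in enumerate(reversed(transfers)):
        addr = desc_base + i * desc_size
        memory[addr] = {"src": src, "dst": dst, "size": size, "next": next_desc}
        held += [addr, src, dst]
        next_desc = addr  # the descriptor just built is the next one's successor
    return next_desc, memory, held

def walk(head, memory):
    """Follow 'next' pointers the way the DMA engine would."""
    order, addr = [], head
    while addr is not None:
        order.append((memory[addr]["src"], memory[addr]["dst"]))
        addr = memory[addr]["next"]
    return order

transfers = [(0x100, 0x200, 64), (0x300, 0x400, 32), (0x500, 0x600, 16)]
head, memory, held = build_chain(transfers)
print([hex(s) for s, _ in walk(head, memory)])
```

The first descriptor built ends up as the tail with a null `next`, and the last one built is the head handed to the DMA engine, so the engine processes the transfers in their original order.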







# **Modeling: Add-Descriptor Action**

#### The add-descriptor action

- Claims source, destination, and descriptor memory
- Propagates the 'chain' buffer data
- Remember: just setting up the transfer
  - The full chained transfer runs later

Specify relationships around allocated memory

Update descriptor-chain 'head' and 'previous' references

Save all address handles to extend their lifetime to the end of the transfer

```
action dma_chain_add {
    input chain_b chain_i;
    output chain_b chain_o;
    input dat_b dat_i;
    output dat_b dat_o;
    rand addr_claim_s<> desc_claim;
    addr_handle_t desc_h;

    constraint desc_claim.size ==
        sizeof_s<dma_desc_s>::nbytes;
    constraint dat_i.size == dat_o.size;

    exec post_solve {
        desc_h = make_handle_from_claim(desc_claim);
        chain_o.next_desc = desc_h;

        chain_o.mem_h = chain_i.mem_h;
        chain_o.mem_h.push_back(desc_h);
        chain_o.mem_h.push_back(dat_i.data_h);
        chain_o.mem_h.push_back(dat_o.data_h);
    }
}
```

### **Test Realization: Descriptor Packed Struct**

| Represent in-memory descriptors with packed structs   | <pre>struct dma_desc_csr_s : packed_s&lt;&gt; {</pre>                                                                                     |
|-------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------|
| Use to model DMA-descriptor memory                    | <pre>bit[12] sz;<br/>bit[4] rsvd1;</pre>                                                                                                  |
| Model subfields when needed                           | <pre>bit dst, src;<br/>bit inc_dst, inc_src;<br/>bit eol;<br/>}</pre>                                                                     |
| Combine with fields of other<br>fixed-size data types | <pre>struct dma_desc_s : packed_s&lt;&gt; {<br/>    dma_desc_csr_s csr;<br/>    bit[32] src_addr;<br/>    bit[32] dst_addr;<br/>    bit[32] next;<br/>}</pre> |



### **Test Realization: Populating Descriptor Chain Link**



# **Test Realization: Running Chained Transfer**



# Modeling: Encapsulating Transfer-Chain Building

Now, let's create the reusable 'descriptor-chain transfer' action



# Modeling Wrap-up: IP-centric PSS Component

- We've been focused on the IP behaviors
  - Modeling pre-conditions and requirements
  - Modeling test realization targeting memory and registers
- We encapsulate those behaviors (actions) with required IP resources in a component

  - Reference to the IP register group
  - Pool of resources
  - State pool that holds the current initialized state
- IP component is independent of integration level
  - Same at IP, subsystem, and SoC

Slide code, as far as it can be recovered: `component dma_c { ref dma_regs_c regs; dma_chan_r [4] channels; pool dma_init_s init_s; action dma_m2m { /* ... */ } }`

Environment-specific details go in the containing component

# **PSS Environment Integration**

- Every verification environment has specific characteristics
  - Memory map
  - Mechanism used to access memory
  - ...
- Collect these specifics in a top-level PSS component





# **Connecting PSS to a SystemVerilog Testbench**

- Programming sequences interact with IPs via memory and memory-mapped registers
- PSS defines a standard set of read/write routines for accessing memory
  - May be called directly by user-defined test realization
  - Called indirectly when user-defined test realization reads/writes registers
- Implement these read/write functions in terms of target environment
- Direct to BFM
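The layering described above, where register accesses funnel into a small set of memory read/write routines that each environment implements differently, can be sketched in Python. This is a model of the pattern only; `RegGroup`, `FakeBfm`, and the base address are hypothetical stand-ins, not PSS or UVM APIs.

```python
class RegGroup:
    """Register accesses expressed through injected read/write primitives."""
    def __init__(self, base, offsets, write32, read32):
        self.base, self.offsets = base, offsets
        self.write32, self.read32 = write32, read32

    def write(self, name, value):
        self.write32(self.base + self.offsets[name], value)

    def read(self, name):
        return self.read32(self.base + self.offsets[name])

class FakeBfm:
    """Stand-in for a bus functional model; logs each transaction."""
    def __init__(self):
        self.mem, self.log = {}, []

    def write32(self, addr, data):
        self.log.append(("W", addr, data))
        self.mem[addr] = data

    def read32(self, addr):
        self.log.append(("R", addr))
        return self.mem.get(addr, 0)

bfm = FakeBfm()
uart = RegGroup(0x4000_0000, {"ua_cr": 0x0, "ua_mr": 0x4},
                bfm.write32, bfm.read32)
uart.write("ua_mr", 0x88)
print(bfm.log)
```

Swapping `FakeBfm` for a different backend (for example, pointer accesses on an embedded core) leaves the register-level code untouched, which is the portability argument in miniature.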





### **PSS at IP-Block Level: Summary**

### Captured

- Test content to Initialize our IPs
- Test content to exercise key behaviors
- Register-access layer to interact with IP registers
- Rules that document our actions' requirements

### Behaviors capture requirements for their execution

- IP must be initialized before use
- Resources required by each behavior
- Memory required by each behavior

### Requirements Capture+Test Realization = Portability

- Can automatically detect missing requirements (e.g., missing initialization)
- PSS processing tools can *infer* an action to satisfy the requirement
- Requirements provide automatic documentation





### Sub-system and SoC-level testing with PSS



Sergey Khaikin

## **PSS Beyond Block IP: History and Future**

### SoC-level was a classic application of PSS in the early days

- Embedded-C, bare metal SW-driven verification
- Abstract behavioral models layered on top of driver APIs
- Tape-out proven with initial industry adoption circa 2018



### PSS 2.1 opens new opportunities for SoC!

- Coreless environments
  - Customization of write and read primitives enables driving transactors and BFMs
- Shift left
  - `addr_reg_pkg` features enable register-based device programming when driver SW is not yet available
- Virtualization
  - `addr_value()` customization and memory region tagging enable modeling of address translation mechanisms



### **PSS for SoC / Sub-system: Goals and Requirements**

#### Focus on integration aspects that are often custom in SoC designs

- HW/FW logic that is *prone to bugs* 
  - Not verified in lower-level environments
- Some aspects of desired behavior not explicitly covered in formal specs
  - Subject to "soft" issues, such as overall power consumption and performance, not just clear-cut functional bugs
- Capture and drive System-Level Functional Coverage metrics

#### Typical examples of PSS test content at SoC and Sub-system levels:

- SoC integration: coalesce unit tests into cross-IP flows to exercise data paths and system concurrency
  - Can be instrumented for performance measurements
- Power management and chip bring-up: exercise IP power-cycle sequences and SoC boot flows
  - Generate directed-random sequences of power state transitions on cores/clusters and IPs/subsystems
  - Cross power-related flows with functional "traffic" tests: archetype of system use-cases
- Additional SoC-level aspects: exercise interrupt controllers, chip frequency switching ...



### Test Content for SoC / Sub-system: Challenges

### Facilitate portability across diverse execution platforms

- Simulation: coreless with BFMs or processor-driven
- Fast platforms (emulation): coreless with transactors/AVIPs or processor driven
- Post-Si: silicon board, ATE testers; processor driven

### Quickly initialize required IPs

- Generate complex and valid cross-IP traffic patterns
  - Parallel traffic avoiding resource conflicts
  - Memory allocation management
- Accommodate changes in register memory maps
- Prove coverage of key concurrent behaviors
- Unify scenario space model across all testing environments
  - Reuse abstract test content on transition from register to driver-based testing



#### Assembling PSS View of a Modern SoC Design: From Vision to Deployable Methodology and Production Use

**Typical PSS environment:** a hierarchy of abstract models (SoC, SubSys, IP) residing on top of a Test Realization Layer.

[Figure: layered diagram. Portable model hierarchy ("Diverse Scopes"): `component pss_top` instantiating sub-systems (`SS_A a; SS_B b;`); sub-system-level models (`SS_A`, `SS_B`, `SS_X`) instantiating IP components (`IP_uart uart; IP_dma dma;`); IP-level models `IP_uart` and `IP_dma` with `resource chnl{}` and actions `init`, `config`, `tx`, `rx`. Test Realization Layer: middleware (graphics, audio, etc.), OS and drivers, bare-metal SW, tests in C/C++, UVM; embedded or host-driven. Multi-platform target execution environment ("Diverse Platforms", "Horizontal Reuse"): virtual platform, simulation, emulation, FPGA, silicon board.]

- Pure verification intent is realized in different (integration) environments
- PSS is agile: any modeling approach is viable (top-to-bottom, bottom-to-top, or "somewhere in between")
- Next slides describe roles and interaction of these layers


## **PSS Modeling of SoC - Top to Bottom approach**

### PSS models formally span Test Spaces

- Rules of the game
  - Participating entities, actors and their properties
  - Behaviors, their properties and dependencies

### PSS activities traverse Test Spaces

- Specific Plays within the game
  - Interesting use-cases, per Test Plan
  - Naturally map onto System-Level Coverage Goals



### Modeling cross-IP flows at SoC Level



### **PSS Modeling of SoC – Let There be IP !**





### PSS Modeling of SoC – Devil is in the Detail ... Really?

IP models level may vary in level of detail: coarse, abstract ←→ fine-grained, detailed



### Methodology - Who Owns PSS IP-level Models

#### IP ownership methodology choice depends on project phase and degree of PSS technology adoption across different teams




### Assembling PSS View of SoC Design -



### Modeling cross-IP flows with PSS activity statements




PSS enables easy test creation for highly parallel, hard-to-schedule scenarios

### **PSS Coverage: Value-add, Differentiation and ROI**

### Value-add

- Construction and analysis of system-level functional coverage metrics
  - Portable
  - Abstract
- Predictability
  - Regression suite optimization, faster coverage closure with gen-time coverage prediction

### Differentiation

- Enabler of innovative verification methodologies and flows

### Cost

- Coverage methodology is consistent with PSS scenario modeling and test generation mindset
- Trainable, deployable

Proven Impact and Differentiation + Low Deployment Cost = High ROI



### **PSS Coverage: Current Capabilities and Applications**

#### Abstract and high-level, like PSS stimulus itself

- Required for applications in system level, use-case based and software-driven validation

#### Portability: critical enabler for applications on fast platforms with low observability

- ATE, Emulation, Silicon Boards, bare metal environments

#### Enables tools to predict coverage at generation time [details on slide 11]

- Possible because PSS scenarios are declarative, can be solved upfront
- Highly differentiated in comparison to UVM and other procedurally-driven environments
- Enables flows aimed at generation of exhaustive coverage regression suites
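One way to picture a flow that uses generation-time coverage prediction for regression optimization is a greedy selection over predicted bins. The Python sketch below is an assumption about how such a flow could work, not a description of any PSS tool; the test names and bin numbers are made up for illustration.

```python
def select_regression(predicted):
    """Greedy set cover: `predicted` maps test name -> set of coverage bins it hits."""
    remaining = set().union(*predicted.values()) if predicted else set()
    chosen = []
    while remaining:
        # Pick the test that closes the most still-open bins (ties: name order).
        best = max(sorted(predicted), key=lambda t: len(predicted[t] & remaining))
        chosen.append(best)
        remaining -= predicted[best]
    return chosen

predicted = {"t1": {1, 2, 3}, "t2": {3, 4}, "t3": {1, 2}, "t4": {4, 5}}
print(select_regression(predicted))
```

Here two of the four candidate tests already reach every bin that any test can reach, so the other two can be dropped from the suite before anything is simulated.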

#### Easy to define functional coverage spaces over PSS scenario attribute values

- Structurally, same as SV coverage sets of combinations expressed in terms of cover points, bins and crosses
  - Interoperable with eco-system: other coverage engines, e.g., Formal, SV, and test plan tracking databases
  - Low adoption barrier and deployment cost
- Can define coverage goals that span across multiple actions within a scenario
  - Enables specification, collection and tracking of system-level <u>behavioral</u> coverage goals [new in PSS 3.0]



### Coverage Maximization and Regression Optimization with PSS coverage



### New in PSS 3.0: Behavioral Coverage



### **Memory Allocation Consistency in PSS**



### Methodology: Unify Scenario Space Model Across Environments



### Summary: PSS Advantages for SoC Verification Engineers




### **Post-silicon testing with PSS**

Prabhat Gupta




## Post-silicon testing goals

#### Early silicon bring-up

- Screen for defective parts
- Ensure major features and data paths are working

#### Systematic feature coverage

- Run tests to verify each SoC feature in isolation to build a baseline

#### Stress testing

- Gradually build-up test complexity from feature coverage to multiple features together
- Create and run tests that mimic real-world scenarios

#### All feature enablement with OS and application

- Run real production use cases, measure power and performance

#### Production yield optimization

- Select a subset of the broad test suite to improve yield



### Streamline post-silicon testing with PSS

The benefits of PSS stimulus

#### Reduced test development cost

- Save time and money by reusing pre-silicon IP/SoC test content
- Transfer knowledge of the test space formally, as actions, so that anyone on the team can create complex SoC tests

#### Feature coverage reports

- Ensure comprehensive testing by using PSS coverage features to check test coverage both before and after the tests run

#### Failure debug in simulation or emulation

- Quickly root-cause stress-test failures or field failures using tests built from PSS Lego-like building blocks
- Work directly with IP experts, using PSS as the common language for efficient debugging

#### Use AI to find effective tests for stress testing and yield maximization

- Optimize your testing strategy with action-attribute control knobs, which provide better input for training an AI model to choose functional tests



### AI Engine IP – an example use case

- A 2D array of multiple AI tiles
  - Compute, memory and interface DMA tiles
- The grid of engines and the highly configurable network make creating post-silicon validation tests very challenging



Control Processor Subsystem



## Early silicon bring-up

#### Tests for power, reset, and clock sequencing

- Quickly create experimental test sequences to screen for metastability and manufacturing issues
- Tested in UVM, then ported to the processor by the PSS tool

#### Tests for major datapath and feature coverage

- Run simple, isolated tests for the datapath and major features to create a coverage baseline and to screen for manufacturing issues

#### Test coverage report

- Automatic coverage report with PSS coverage feature

Structural DFT tests don't find all manufacturing issues

PSS building block approach makes turnaround fast for experimental functional tests



## Array boot with control processor




# **AI Engine PSS model**

```
// set up the address map through the TLBs
action setup_tlb { };
```

```
// Power up a column
action release_clamp {
    rand int in [0..COLS-1] col;
    rand bool release;
};
```

```
// gate, un-gate the column clock
action clock_gate {
    rand int in [0..COLS-1] col;
    rand bool disable_cg;
};
```

```
// assert, de-assert the column reset
action reset_col {
   rand int in [0..COLS-1] col;
   rand bool deassert;
};
```

#### PSS model created by IP team

```
abstract action boot_base {
    output aie_state aie_state_out;
    constraint aie_state_out.aie_state_obj.boot == true;
};
```

```
// Normal boot sequence
action boot_array : boot_base {
    activity {
        do setup_tlb;
        repeat(c: COLS) {
            do reset_col with {col == c; deassert;};
        };
        repeat(c: COLS) {
            do release_clamp with {col == c; release;};
        };
        repeat(c: COLS) {
            do clock_gate with {col == c; disable_cg;};
        };
    };
};
```



# **AI Engine PSS model**

```
// Base action for all building block actions
abstract action aie_base {
    input aie_state aie_state_in;
    constraint aie_state_in.aie_state_obj.boot == true;
};
```

```
// A few random register accesses
action register_access : aie_base {
   rand int in [0..NOCS-1] noc;
};
```

```
// Program one circuit through the array
action setup_circuit : aie_base {
    output circuit_buf circuit_buf_out;
};
```

```
// DMA transfer from src to dst over a programmed circuit
action dma : aie_base {
    input circuit_buf circuit_buf_in;
    rand conn_s src;
    rand conn_s dst;
};
```

#### PSS model created by IP team

```
// Simplest early bringup test
action bringup_reg_test {
    activity {
        do register_access;
    };
};
```



## Solved bringup tests



## Nothing is working in the lab!

- Accesses to registers inside the AI array fail randomly
- Is the boot sequence wrong?
- Is there a manufacturing fault?
- Are we seeing metastability?
- Need to create many experimental boot-code variants and tests

```
// Experimental boot sequence
action exp_boot_array : boot_base {
    activity {
        do setup_tlb;
        repeat(c: COLS) {
            do release_clamp with {col == c; release;};
        };
        repeat(c: COLS) {
            do clock_gate with {col == c; disable_cg;};
        };
        repeat(c: COLS) {
            do reset_col with {col == c; deassert;};
        };
    };
};
```

- Building-block actions may carry complex sequencing rules that must not be violated
- Actions may have random attributes with constraints
- A person in the lab does not need to be an IP expert to create experimental tests
- The PSS tool can create a test from minimal user input
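
For instance, a lab engineer could pair the experimental boot sequence with the existing register-access building block in a few lines. This is only a sketch: the action name `exp_bringup_reg_test` is hypothetical, and it assumes the new action is declared in the same component as the model above.

```
// Hypothetical experimental test: boot with the alternative sequence,
// then run the same register accesses as bringup_reg_test
action exp_bringup_reg_test {
    activity {
        do exp_boot_array;   // explicit experimental boot instead of an inferred one
        do register_access;  // inherits the boot == true input constraint from aie_base
    }
}
```

Because `register_access` inherits the `boot == true` input constraint from `aie_base`, the tool would be expected to bind its input to the state produced by `exp_boot_array`. The exact binding depends on how the `aie_state` pool is set up, which the slides do not show.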



### New experimental tests quickly created in lab





### **Contract between IP and IP consumers**

#### IP public actions should specify most requirements for that action in PSS

- Non-expert users can quickly create new tests
- E.g., DMA action should add a dependency on DMA being powered up and initialized
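
One way to express such a dependency is with a state flow object that the public action takes as a constrained input. The sketch below is an assumption-laden illustration, not the IP team's model: `dma_state_s`, `dma_c`, `init_dma`, `dma_transfer`, and all field names are hypothetical.

```
// Hypothetical state flow object tracking DMA readiness
state dma_state_s {
    rand bool powered_up;
    rand bool initialized;
    // Assumed reset condition: the DMA starts powered down and uninitialized
    constraint initial -> (!powered_up && !initialized);
}

component dma_c {
    pool dma_state_s dma_state_p;
    bind dma_state_p *;

    // Powers up and initializes the DMA (realization details omitted)
    action init_dma {
        output dma_state_s next;
        constraint next.powered_up && next.initialized;
    }

    // Public action: the readiness requirement is part of its declaration
    action dma_transfer {
        input dma_state_s curr;
        constraint curr.powered_up && curr.initialized;
        rand int in [1..1024] num_words;  // assumed size range
    }
}
```

A consumer who writes only `do dma_transfer;` does not need to know about initialization. With no other producer of a ready state in the scenario, the tool would be expected to infer `init_dma` ahead of the transfer.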

#### IP PSS building blocks are designed by breaking down real world use cases

- PSS abstraction of behaviors and dependencies lowers the barrier to creating new tests
- Consumers work in a higher-abstraction, problem-domain language built from PSS actions

#### Common flow object types for easy interoperability with other IPs

- A company- or industry-wide methodology library



### Systematic feature coverage, stress testing, yield optimization

### Comprehensive Feature Coverage

- Ensure complete coverage of all datapaths and features using PSS building block actions
- Leverage PSS randomization to explore and cover a wide range of operational modes
- Eliminate the need for deep IP expertise for effective coverage

### Robust Stress Testing

- Easily design scenarios that accurately replicate real-world use cases
- Effortlessly create stress scenarios that rigorously exercise all aspects of the IP under test
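
A stress scenario of this kind might be sketched by running the IP team's building blocks concurrently. The action name `stress_test` and the iteration count are hypothetical, and the sketch assumes the model permits two `dma` actions to run in parallel; the slides do not show the resource rules that would decide this.

```
// Hypothetical stress test built from the IP team's building blocks
action stress_test {
    activity {
        repeat (8) {                 // assumed iteration count
            parallel {
                do dma;              // src and dst left to the solver
                do dma;
                do register_access;  // noc left to the solver
            }
        }
    }
}
```

Nothing here mentions booting the array or programming circuits. Since `dma` consumes a `circuit_buf` and every building block requires a booted `aie_state`, the tool would be expected to infer the needed `setup_circuit` and boot actions on its own.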

### Optimizing Manufacturing Yield

- Combine different IP tests with various modes, specifically constrained for the silicon testing environment
- Enhance yield outcomes by leveraging a comprehensive suite of generated tests for experimentation



### Conclusion

#### Embrace the Future with PSS

- PSS, the state-of-the-art unified pre- and post-silicon testing methodology, leverages randomization and high-level constructs to revolutionize post-silicon testing

#### Streamlined Test Documentation

- Formal test space documentation, coupled with action building blocks, empowers anyone to create tests quickly and efficiently

#### First-Class Automated Coverage Reporting

- With PSS, coverage reporting is elevated to a first-class citizen, ensuring comprehensive and accurate testing results

#### Comprehensive Test Suite Generation

- Generate a large suite of tests designed for rigorous stress testing, ensuring the robustness and reliability of your systems

#### Simplified Debugging Process

- PSS building blocks streamline the debugging process, making it easier to identify and rectify issues, enhancing overall efficiency and productivity





### **PSS new features and conclusion**

Tom Fitzpatrick




## What's New in 2.1

### See last year's tutorial for more details

### Test Realization

- Memory & Register Enhancements
  - Read-modify-write register-access functions to make programming sequences more compact
  - Support user modeling of address translation
  - Provide gen-time access to resolved allocation addresses
  - Support defining a shared storage region with different address-space-specific 'views'
- Allow concurrent execution to yield execution
- Add support for a comment directive in target-template strings

### Activity Modeling

- Atomic regions that exclude inferred actions
- Labeled anonymous-action traversals
- Enhanced pool-bind directive with support for arrays of components



## What's New in 2.1

#### Core Language

- REMOVED C++
- Floating-point data types
- Base types for enums
- Static functions in components

### I/O and Messaging

- Functions for reading/writing files
- Functions for formatting strings, displaying messages, reporting errors

### Randomization

- Procedural randomization in exec blocks and functions
- Randomization of list elements
- Weighted random-distribution directive



### What's Coming in 3.0: Introducing Scenario-Level Behavioral Coverage

- Given a stream of action executions, find out whether a given temporal scenario (query) occurs in this stream
- The cover statement specifies the interesting scenario
- A monitor encapsulates behaviors to be covered
  - A monitor may be implicit (in a cover statement) or explicit
- The answer is yes or no
  - Yes, if the top-level monitor has at least one match,
  - No, otherwise





#### WR: cover { do write; do read }



### **PSS + UVM provides the required abstraction**




- More time spent thinking about scenarios
- Less time spent on implementation
- Flows as executable documentation
- PSS can be added to a UVM environment as a test-content generator





### **PSS Enables True Block-to-SoC Reuse**

### Easily capture IP-specific knowledge

- Abstract model independent of target language/platform
- Define rules for reuse

### Compose complex scenarios

- PSS models are hierarchical
- Infer actions based on rules
- Model gets implemented on your target platform(s)
  - Correct-by-construction
  - Model defines memory/register rules and behaviors

### **Concise Language to Specify Verification Intent**

Scheduling built into generated test regardless of target





## **Thank You!**

Adnan Hamid – Breker

Tom Fitzpatrick – Siemens EDA

Matthew Ballance - AMD

Sergey Khaikin - Cadence

Prabhat Gupta - AMD




## QUESTIONS

