#### DESIGN AND VERIFICATION™ DVCON CONFERENCE AND EXHIBITION

#### **UNITED STATES**

# Extension of the power-aware IP reuse approach to ESL

- Antonio GENOV, NXP Semiconductors
- Loic LECONTE, NXP Semiconductors
- François VERDIER, University Cote d'Azur, CNRS, LEAT

NXP and the NXP logo are trademarks of NXP B.V. All other product or service names are the property of their respective owners. © 2020 NXP B.V.







# Agenda







### Early-stage power estimation: Current approaches

#### **RTL and below**

- Currently, power modeling starts at the RTL level.
- It is risky to neglect power in the preceding design steps.
- At the RTL level, power modeling is applied to each IP individually. This is efficient, but:
  - Top-level complexity is high and verification is difficult.
  - Simulation time at the RTL level is huge.

#### System level

- Only a rough idea of what the energy consumption might be.
- Switching between documents and digging for information (analytical approach).
- No tools or standard methodologies are available for power estimation and modeling at this level.
- Impossible to verify whether the chosen architecture is optimal (PPAC).








# Overview of PwClkARCH

- C++ classes and SystemC/TLM modules [1];
- Developed at LEAT;
- UPF-based high-level approach (TLM):
  - Design elements (DE), power domains, power switches, supply nets, Power State Table (PST).
- Added clock description [2]:
  - Clocks, clock domains, DPLLs, Clock State Table (CST).
- Power intent is built around the DEs.
- The power intent and the OPP table allow us to test power management techniques automatically controlled by a PMU (power/clock gating, DVFS).
- Functional model / power intent separation:
  - No intrusive power code  $\rightarrow$  reuse and efficiency.
- Parallel dynamic co-simulation:
  - Continuous monitoring of power consumption.
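To make the role of an OPP table concrete, here is a minimal sketch of a power domain whose operating point is switched by a power manager. It is not the PwClkARCH API: the class names, the operating-point names, the voltage and frequency values, and the use of the textbook switching-power formula (activity × C × V² × f) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """One row of a hypothetical OPP table: a voltage/frequency pair."""
    voltage_v: float
    frequency_hz: float


# Hypothetical OPP table; names and values are illustrative only.
OPP_TABLE = {
    "OFF": OperatingPoint(0.0, 0.0),        # power gated
    "LOW": OperatingPoint(0.8, 400e6),      # reduced voltage and frequency
    "NOMINAL": OperatingPoint(1.0, 800e6),  # full speed
}


class PowerDomain:
    """Hypothetical power domain whose operating point a PMU can change."""

    def __init__(self, name, capacitance_f, opp_name="NOMINAL"):
        self.name = name
        self.capacitance_f = capacitance_f
        self.opp_name = opp_name

    def set_opp(self, opp_name):
        # Only operating points listed in the table are legal.
        if opp_name not in OPP_TABLE:
            raise ValueError(f"unknown operating point: {opp_name}")
        self.opp_name = opp_name

    def dynamic_power_w(self, activity=1.0):
        # Textbook switching power: activity * C * V^2 * f.
        opp = OPP_TABLE[self.opp_name]
        return activity * self.capacitance_f * opp.voltage_v ** 2 * opp.frequency_hz


if __name__ == "__main__":
    domain = PowerDomain("interconnect", capacitance_f=1e-9)
    for opp_name in ("NOMINAL", "LOW", "OFF"):
        domain.set_opp(opp_name)
        print(opp_name, domain.dynamic_power_w())
```

Lowering both voltage and frequency (DVFS) reduces the dynamic power more than proportionally because of the squared voltage term, which is why testing such strategies early is useful.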














### Modeling & Reusability: Application and flow integration

The study was carried out on NXP i.MX8 SoC family platforms, and the proof of concept was executed in three main phases.







#### Modeling & Reusability: Initiation of modeling and proof of concept

#### Behavioral – functional modeling approach:

- Careful choice of granularity and abstractions
  - Portability and functional accuracy
- Basic skeleton modules for a uniform structure
  - Adapted for continuous refinement

#### **Communication** – configurable interfaces:

- Separation: protocol-related code is removed from the behavioral model
- Reusable communication interface module (AT, approximately timed)
- Protocols declared externally in derived specs
  - Protocols can be created and tested without code modifications
- Modeled AXI interfaces
  - Abstraction: only the features influencing the DUT power consumption are modeled
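The idea of declaring a protocol outside the interface code can be sketched as follows. This is an illustrative assumption, not the authors' implementation: the spec fields (`data_bytes`, `max_burst_beats`), the JSON form of the spec, and the `CommInterface` class are hypothetical, and only the burst structure of a transfer is modeled.

```python
import json
import math

# Hypothetical external protocol spec; in practice this could live in a separate file.
AXI_LIKE_SPEC = json.loads(
    '{"name": "axi_like", "data_bytes": 16, "max_burst_beats": 4}'
)


class CommInterface:
    """Generic interface whose behavior is driven entirely by a spec."""

    def __init__(self, spec):
        self.name = spec["name"]
        self.data_bytes = int(spec["data_bytes"])
        self.max_burst_beats = int(spec["max_burst_beats"])

    def bursts_for(self, total_bytes):
        """Return the number of beats in each burst needed to move total_bytes."""
        beats = math.ceil(total_bytes / self.data_bytes)
        full, rest = divmod(beats, self.max_burst_beats)
        return [self.max_burst_beats] * full + ([rest] if rest else [])


if __name__ == "__main__":
    wide = CommInterface(AXI_LIKE_SPEC)
    narrow = CommInterface({"name": "narrow", "data_bytes": 8, "max_burst_beats": 4})
    print(wide.bursts_for(100))
    print(narrow.bursts_for(100))
```

Swapping the spec changes the traffic shape seen by the device under test while the `CommInterface` code itself stays untouched.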

#### Behavioral modeling










### Modeling and Reusability: SM functional/power models (i.MX8QM and i.MX8QXP)

We have created the Switch Matrix (SM) power model following the UPF separation-of-concerns semantics.








Because the functional and power models are separate, we can easily change or reuse them, either together or independently.
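One way to picture this separation is an observer arrangement in which the functional model only emits events and a power model subscribes to them. The sketch below is an assumption for illustration, not PwClkARCH code: the class names and the simple energy-per-byte cost are hypothetical.

```python
class SwitchMatrixFunctional:
    """Functional model: routes transactions and knows nothing about power."""

    def __init__(self):
        self._observers = []
        self.routed = 0

    def attach(self, callback):
        # Any number of observers (power models, loggers) may subscribe.
        self._observers.append(callback)

    def route(self, n_bytes):
        self.routed += 1
        for notify in self._observers:
            notify(n_bytes)


class EnergyPerByteModel:
    """Power-side model: accumulates energy from observed transactions."""

    def __init__(self, joules_per_byte):
        self.joules_per_byte = joules_per_byte
        self.energy_j = 0.0

    def on_transaction(self, n_bytes):
        self.energy_j += n_bytes * self.joules_per_byte


if __name__ == "__main__":
    sm = SwitchMatrixFunctional()
    power = EnergyPerByteModel(joules_per_byte=2e-12)  # illustrative value
    sm.attach(power.on_transaction)
    sm.route(16)
    sm.route(16)
    print(sm.routed, power.energy_j)
```

The functional class runs unchanged with no observer attached, and a different power model can be attached without editing it.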





### Modeling and Reusability: First model calibration – i.MX8QM

- Model created in parallel with the development of the QM SoC (medium to final production phase).
- Important silicon measurements were available.
- Initial parameters estimated from measurements:
  1. Estimated current for each unit with and without traffic;
  2. Extracted capacitance and leakage resistance;
  3. Initial activity factors relative to the estimated capacitance and leakage resistance:
     - Active unit with traffic = 1.0;
     - Active unit without traffic = 0.7;
     - Inactive unit = 0.

#### Refined using physical data:

- CPM, cell count, toggle rates, ...
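The calibration steps above can be sketched numerically. The slides do not give the equations, so this example assumes a common first-order model: static power V²/R_leak and dynamic power activity × C × V² × f. The function names and all measured values are hypothetical; only the three activity factors come from the slide.

```python
# Activity factors taken from the slide.
ACTIVITY = {"traffic": 1.0, "no_traffic": 0.7, "inactive": 0.0}


def extract_parameters(voltage_v, frequency_hz, i_inactive_a, i_traffic_a):
    """Derive leakage resistance and effective capacitance from two currents.

    Assumes the inactive current is pure leakage and that the extra current
    with traffic is dynamic current at activity factor 1.0 (I_dyn = C * V * f).
    """
    r_leak_ohm = voltage_v / i_inactive_a
    capacitance_f = (i_traffic_a - i_inactive_a) / (voltage_v * frequency_hz)
    return capacitance_f, r_leak_ohm


def unit_power_w(state, voltage_v, frequency_hz, capacitance_f, r_leak_ohm):
    """Total power of one unit in a given state under the assumed model."""
    dynamic = ACTIVITY[state] * capacitance_f * voltage_v ** 2 * frequency_hz
    static = voltage_v ** 2 / r_leak_ohm
    return dynamic + static


if __name__ == "__main__":
    # Hypothetical measurements: 1.0 V, 800 MHz, 10 mA inactive, 90 mA with traffic.
    c, r = extract_parameters(1.0, 800e6, 0.010, 0.090)
    for state in ACTIVITY:
        print(state, unit_power_w(state, 1.0, 800e6, c, r))
```

In this reading, the "without traffic" factor of 0.7 scales only the dynamic part, while leakage remains whenever the unit is powered.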









### Modeling and Reusability: i.MX8QM simulation results – Memcopy & Display Refresh

Two major test cases:

- Memcopy:
  - 1 active traffic generator;
  - 256 KB/1 MB DRC/LPDDR4 read accesses;
  - Sequential memory accesses (16 bytes of data);
  - 4K interleaving for memory optimization.
- Display refresh:
  - One frame of transactions (= 259200 transactions);
  - Pixel size of 32 bits at a frequency of 138.5 MHz;
  - Refresh rate of 60 Hz (frame rate of 60 fps).

Observations:

- Power savings using power management.
- Memory activity observation.
- Spikes coming from the interleaving between the two memories, plus the power consumption of passing transactions.
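The interleaving behind the spikes can be illustrated with a small address-mapping sketch. It assumes that "4K interleaving" means consecutive 4096-byte granules alternate between the two memories; the function names and this mapping are illustrative assumptions rather than the documented i.MX8 address map.

```python
from collections import Counter

GRANULE_BYTES = 4096  # assumed meaning of "4K interleaving"
NUM_MEMORIES = 2      # the two DRC/LPDDR4 memories
ACCESS_BYTES = 16     # sequential 16-byte accesses, as in the Memcopy test case


def target_memory(address):
    """Index of the memory serving this address under granule interleaving."""
    return (address // GRANULE_BYTES) % NUM_MEMORIES


def access_distribution(start, total_bytes):
    """Count accesses per memory and the number of switches between memories."""
    targets = [
        target_memory(addr)
        for addr in range(start, start + total_bytes, ACCESS_BYTES)
    ]
    switches = sum(1 for prev, cur in zip(targets, targets[1:]) if prev != cur)
    return Counter(targets), switches


if __name__ == "__main__":
    counts, switches = access_distribution(0, 256 * 1024)
    print(dict(counts), switches)
```

Each switch marks a point where a sequential stream moves from one memory to the other, which is where a power trace would be expected to change shape under this assumption.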





### Modeling and Reusability: Correlations – Baylibre ACME

- Key State 1 (KS1): System idle
- Key State 2 (KS2): Leakage measurement for tester
- Key State 3 (KS3): System idle with display
- Key State 4 (KS4): Stream/Memcopy





| i.MX8QM Hardware units | Silicon/Simulation power correlation |
|------------------------|--------------------------------------|
| SM dynamic power       | 85-99%                               |
| SM static power        | 85-90%                               |
| 2 DRCs dynamic power   | 85-90%                               |
| 2 DRCs static power    | 90-96%                               |
| Total power            | 88-92%                               |

Stream/Memcopy (Blue) and Display ON (Orange) measurements
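The slides do not state how the correlation percentage is computed. One plausible reading, shown here purely as an assumption, is 100% minus the relative error of the simulated value against the silicon measurement. The function name and the sample values are hypothetical.

```python
def correlation_percent(simulated_mw, measured_mw):
    """Assumed metric: 100 * (1 - |simulated - measured| / measured)."""
    if measured_mw <= 0:
        raise ValueError("measured power must be positive")
    relative_error = abs(simulated_mw - measured_mw) / measured_mw
    return 100.0 * (1.0 - relative_error)


if __name__ == "__main__":
    # Hypothetical values: simulation under- and over-estimating by 10%.
    print(correlation_percent(90.0, 100.0))
    print(correlation_percent(110.0, 100.0))
```

Under this definition, under- and over-estimation by the same amount yield the same score, so the sign of the error has to be reported separately if it matters.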





### Modeling and Reusability: Correlations – LabView<sup>™</sup> & NI DAQ<sup>™</sup>

LabView NIDAQmx data acquisition model







### Modeling and Reusability: Interoperability & IP reuse, from i.MX8QM to i.MX8QXP

The functional, power, and IP-specific libraries of reusable units we created allowed us to easily model other platforms.







- Reused the already available components, including the power intent.
- Reconfigured them (without changing their code).
- Achieved a total power correlation of about 90%.





### Modeling and Reusability: Interoperability & IP reuse, from i.MX8QM to i.MX8nextGen

Reusing the same approach and combining the reusable units, we modeled three generations of i.MX8 SoC platforms.







### Modeling and Reusability: New i.MX8 SoC generation









Display Refresh: i.MX8 platforms comparison – Total power



Display Refresh: i.MX8 platforms comparison – Energy
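Total power and energy are related through integration over time. As a hedged illustration of how an energy figure could be derived from a monitored power trace, the sketch below applies the trapezoidal rule to sampled power values. The function names and sample data are hypothetical, and the slides do not say which integration method was used.

```python
def energy_joules(times_s, powers_w):
    """Integrate a sampled power trace with the trapezoidal rule."""
    if len(times_s) != len(powers_w):
        raise ValueError("times and powers must have the same length")
    energy = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        energy += 0.5 * (powers_w[i] + powers_w[i - 1]) * dt
    return energy


def average_power_w(times_s, powers_w):
    """Average power over the trace: total energy divided by duration."""
    duration = times_s[-1] - times_s[0]
    if duration <= 0:
        raise ValueError("trace must span a positive duration")
    return energy_joules(times_s, powers_w) / duration


if __name__ == "__main__":
    times = [0.0, 1.0, 2.0]   # seconds (illustrative)
    powers = [1.0, 1.0, 3.0]  # watts (illustrative)
    print(energy_joules(times, powers), average_power_w(times, powers))
```

A platform with higher peak power can still consume less energy for the same task if it finishes sooner, which is why both views are compared.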





### Modeling and Reusability: Interoperability & IP reuse, deliverable example

- PwClkARCH is provided as a compiled library (not open source).
- Combination of functional and power models:
  - Delivered as libraries or source files.
- PowerIntent and PowerManagement files are configurable header files.
- Replaceable Master module (user-defined power management strategy).
- Some flexibility regarding "open" information.







### Conclusion

For initial development, when functional IPs were available:

- i.MX8QM PwClkARCH adaptation (first effort): 2 weeks, 1 person

#### New platforms modeling with reusable blocks:

- i.MX8QXP functional model + verification: 1-2 days, 1 person
- i.MX8QXP PwClkARCH adaptation + verification: 1 day, 1 person
- i.MX8nextGen functional model + verification: 3-4 days, 1 person
- i.MX8nextGen PwClkARCH adaptation + verification: 2-3 days, 1 person

#### Total: 3-4 weeks and 1 person for the integration and initial analysis of a power-aware reusable IP in 3 different high-performance SoCs

#### Previous publications (related to this study):

- A. Genov, L. Leconte, F. Verdier, *"Mixed Electronic System Level Power/Performance Estimation using SystemC/TLM2.0 Modeling and PwClkARCH Library"*; DVCon EUROPE, Virtual, 27<sup>th</sup>–28<sup>th</sup> October 2020.
- A. Genov, A. Ben Ameur, F. Verdier, L. Leconte, *"Timing-Aware high-level power estimation of industrial interconnect module"*; DVCon EUROPE, Virtual, 27<sup>th</sup>–28<sup>th</sup> October 2020.





Thank you! Q&A



