Vighnesh Iyer

vighnesh.iyer@berkeley.edu
Github, LinkedIn


59th Design Automation Conference (DAC 2022)

This was my first DAC at Moscone West. It was a good experience - lots of vendors at the exhibition halls to talk to, several decent research paper sessions (but relatively poor engineering track sessions), and interesting panels (where every panelist brought a short set of slides to discuss their perspective on the question posed).

Sunday, 7/10/22

Workshop on Design Automation for the Certification of Autonomous Systems (DAC-AS)

This repository provides a framework for wrapping a pre-trained neural network with uncertainty estimates. It is designed to work with any PyTorch model. We implement several such wrappers in a general framework. Given a pretrained DNN model : torch.nn.Module, the distribution that the network parameterizes dist_fam : nn_ood.distributions.DistFam, and a PyTorch dataset containing the training data dataset : torch.utils.data.Dataset, we can construct an uncertainty-equipped version of the network as follows:

3rd ROAD4NN Workshop: Research Open Automatic Design for Neural Networks

The Fifth International Workshop on Design Automation for Cyber-Physical Systems (DACPS)

CAD for Hardware Security Workshop (CAD4Sec)

Nicole Fern of Riscure on "Hardware Shift-Left: Pre-silicon Fault Injection Evaluation and Power Side Channel Testing"

Cadence FV Presentation

  • State Machine Deadlock -> Denial of Service
  • Buffer Overflow -> Data Corruption, Unexpected Control Flow
  • Incorrect Register Access -> Secure Data Leakage or Corruption
  • Unexpected X-propagation -> Data Corruption, Unexpected Control Flow
  • Bus Protocol Violation -> Data Corruption
  • Improper ECO Implementation -> Vulnerability Insertion

Jason Oberg of Cycuity on his "Radix" Tool for Crypto Asset Tracking

A critical component of the security verification process is security analysis. This is crucial to ensure that the security requirements are concisely specified, as well as to assist in identifying unknown design weaknesses. By using Radix’s security analysis capabilities, we were able to validate that the random constant key never makes it to the output of the OTP controller in an unscrambled form, which is a good thing.

While applying Radix to the OTP controller, we also identified intermediate values of the random constant key appearing on the output of the scrambler. This is interesting and surprising, but was determined to be a low risk since the intermediate values are protected at the boundary of the OTP output. Even so, this information enabled OpenTitan to push a fix mitigating this leakage out of an abundance of caution to potential future threats.

EDA to Power Through Semiconductor Cycles - Charles Shi, Needham & Company

Historically, the semiconductor industry goes through a boom-bust cycle every 3-4 years. Despite the chip shortage headlines, Wall Street is increasingly skeptical about the longevity of the current semiconductor boom cycle. The outlook is still bright, but there are dark clouds on the horizon. If the chip shortage turns into a glut, will a downturn affect the EDA industry? Our answer is no. In this presentation, I will walk you through reasons why the EDA industry will power through semiconductor cycles and emerge on the other side stronger.

Monday, 7/11/22

Keynote: Advancing EDA Through the Power of AI and High-performance Computing

PowerPro Presentation @ Siemens Booth Level 2

Joules Discussion with Cadence Engineers @ Cadence Level 1

Discussion about the capabilities of Joules and comparison with PowerPro
  • Window size can be arbitrary and so can frame count - number of frames is just an attribute - it can be set to any value before running analysis
    • Daniel: doesn't seem that way in the documentation - frame count is specified as max 1000
    • Cadence: there is a way to do this, all of our customers use it to get arbitrary length cycle-by-cycle power traces
    • Later: we found out that there is indeed a secret batch flow that isn't described in any of Cadence's documents but was sent to us by an AE
  • Only Joules has a full featured synthesis flow baked in (all analysis is done at post-synthesis gate-level)
    • Genus is invoked: multi-Vt, logic rewriting / reduction, timing-aware synthesis, circuit selection, retiming, clock gate insertion, etc. are all captured
    • Genus is run with the default settings, except that optimization of the most critical timing paths is skipped (this supposedly accounts for most of the full Genus synthesis time). All other synthesis optimizations are performed as usual in the default Joules flow.
  • Joules has a "what-if" mode + clock gating suggestions
    • Unlike other tools, Joules actually synthesizes the proposed enable logic for a clock gate and determines if the power saved by gating actually exceeds the power drawn by the gating logic - they can also capture gating logic reuse and evaluate the need for logic duplication
    • They claim PowerPro often suggests gating logic that actually increases power on net - only caught after PnR power sims are performed and ends up wasting design time
  • Joules can take RTL waveforms and can rapidly replay them on GL netlists internally (no external GL sim tool required - we already knew this)
  • Joules has a fully-featured timer, unlike PowerPro
    • The timer considers clock constraints and performs timing-driven synthesis as usual
    • This means if clock constraints are updated, the underlying synthesized netlist will reflect that (more LVT/ULVT cell usage)
      • Supposedly PowerPro will give garbage power numbers if the clock constraints are updated since it uses an old GL netlist as the starting point and never re-performs synthesis
    • This also means that the power numbers from Joules are meaningful since they account for the actual clock period achievable - Joules' estimate will be much closer than PowerPro which just assumes you're running at the clock frequency specified in the constraints
  • PowerPro comparison (they were a little agitated about the claims the PowerPro people make and seem to understand what PowerPro actually does behind the scenes)
    • PowerPro: expects a GL netlist as the 'training' data from the get-go - I'm not sure about this
    • 3% power error from post-GL power sim to signoff isn't reasonable (only for average power maybe), peak and per-cycle will be very wrong (according to the Joules guys)
    • The way PowerPro and PowerArtist work is completely different from Joules
      • Joules: use Genus engine to perform quick synthesis -> actual mapped GL netlist -> accurate cap estimate (using CCS cap model) -> power estimate
      • PowerPro: map RTL to generic gates -> map generic gates to PDK -> sprinkle estimates about multi-Vt usage, CG insertion, sizing -> use NLDM model from .lib -> power estimate
    • Joules will take longer than PowerPro, but it comes with much more accuracy - there is no synthesis 'estimation' - it is actually performing synthesis
    • PowerPro uses statistical switching factor models to compute toggle rates at combinational nodes from register toggle rates - Joules performs actual trace-level analysis at sub-cycle granularity
  • Cadence admits that HW DSE isn't something that Joules is suitable for
    • Incremental synthesis within Joules is still not a feature (but will be soon when it lands in Genus) - they synthesize from scratch every time since accurate power recommendations are required
    • They talked about a one-off customer design they worked on to build a power model that was parameterized on a bus width but this is not some generic functionality
    • In general, tools like Joules are too slow for HW DSE - customers usually build manually parameterized architectural power models, fit by (linear) regression against golden subsystem-level power numbers from Joules, and use those models during architectural exploration
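
The regression-based architectural power model mentioned in the last bullet can be sketched roughly as below. This is only an illustration of the idea, not Cadence's or any customer's actual flow: the bus-width parameter echoes the one-off example from the discussion, while the data points, units, and function names (`fit_line`, `predict_power_mw`) are made up for this sketch.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder "golden" points, standing in for power numbers that would come
# from full Joules runs at a few bus widths. These values are NOT real data.
bus_widths_bits = [32, 64, 128, 256]
golden_power_mw = [12.0, 20.0, 36.0, 68.0]

slope, intercept = fit_line(bus_widths_bits, golden_power_mw)


def predict_power_mw(bus_width_bits):
    """Cheap architectural estimate used in place of a full synthesis run."""
    return slope * bus_width_bits + intercept


# Query an unsynthesized design point during exploration.
estimate_512 = predict_power_mw(512)
```

The fit is done once against a handful of slow, accurate runs; every later query during exploration is a single multiply-add, which is the property that makes this kind of model usable for DSE where the golden tool is not.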

Taming the Validation Dragon with Formal and Static Verification (Engineering Track)

Automating the Front End and Facing the Big Picture (Engineering Track)

ML Based Abnormal Simulation Detector in SoC Verification (Samsung)

Automatic Debug Knowledge Sharing Platform in SoC Verification (Samsung)

SMART Adaptive Regression Using Nearest Neighbors Algorithm

John Cooley's DAC Panel

Abstract: Come watch the EDA troublemakers answer the edgy, user-submitted questions about this year's most controversial issues! It's an old-style open Q&A from the days before corporate marketing took over every aspect of EDA company images.

Tuesday, 7/12/22

Updates from NVIDIA

Various things going on at NVR VLSI

RTLflow is a GPU acceleration flow for RTL simulation with batch stimulus. RTLflow first transpiles RTL into CUDA kernels that each simulate a partition of the RTL simultaneously across multiple stimulus. It also leverages CUDA Graph for efficient runtime execution. We build RTLflow atop Verilator to inherit its existing optimization facilities, such as variable reduction and partitioning algorithms, that have been rigorously tested for over 25 years in the Verilator community.
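
To make "batch stimulus" concrete, here is a toy sketch in plain Python, assuming a made-up 4-bit accumulator as the design. It is not RTLflow's implementation (which generates CUDA kernels from Verilator output); it only shows the lockstep structure in which the same cycle function is applied to many independent stimulus streams at once. The names `step` and `simulate_batch` are hypothetical.

```python
def step(state, stim):
    """One clock cycle of a toy 4-bit accumulator: state <= (state + stim) mod 16."""
    return (state + stim) & 0xF


def simulate_batch(stimuli, cycles):
    """Advance every stimulus stream one cycle at a time, in lockstep."""
    states = [0] * len(stimuli)  # one independent design state per stimulus
    for cycle in range(cycles):
        # On a GPU, each element of this comprehension would map to its own
        # thread; here it is just a sequential loop over the batch.
        states = [step(s, stim[cycle]) for s, stim in zip(states, stimuli)]
    return states


# Three independent stimulus streams, three cycles each.
batch = [[1, 2, 3], [15, 1, 0], [8, 8, 8]]
final_states = simulate_batch(batch, cycles=3)
```

Because every stream runs the same `step` function on its own state, the per-cycle work is data-parallel across the batch, which is what makes this style of simulation a natural fit for a GPU.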

  • They have a dedicated ML for EDA team now with 6 people, growing fast to over 10 this year

    • This is going to become a thing for all semi research houses soon
  • Ongoing work on using Intel PIN to instrument an architectural simulator (SystemC), capture dynamic dataflow graphs of execution (as CDFGs), perform some kind of graph embedding, and eventually target RTL coverage prediction

  • Report from the DV people: we don't really care about coverage closure, we always get it done eventually. However, the RTL bug localization problem is still the biggest bottleneck in verification throughput. Fix that!

Machine Learning for Synthesis and Synthesis for Machine Learning (Research Track)

High-Level Synthesis Performance Prediction using GNNs: Benchmarking, Modeling, and Advancing

Functionality Matters in Netlist Representation Learning

Enabling Automated FPGA Accelerator Optimization Using Graph Neural Networks

Iterate and Scale: Designing Stronger and Safer Embedded Systems

SCAIE-V: An Open-Source SCAlable Interface for ISA Extensions for RISC-V Processors

We present SCAIE-V, a highly portable and feature-rich ISAX interface that supports custom control flow, decoupled execution, multi-cycle-instructions, and memory transactions. The cost of the interface itself scales with the complexity of the ISAXes actually used.

Fantastic SoCs and What to Learn!

Chiplet Actuary: A Quantitative Cost Model and Multi-Chiplet Architecture Exploration

A Fast Parameter Tuning Framework via Transfer Learning and Multi-objective Bayesian Optimization

ML for Verification: Does it Work or Doesn’t It? (Panel)

Exhibition Floor

X-Epic

X-Epic was established in March 2020 and has obtained six rounds of financing according to reports, all of which are worth hundreds of millions of yuan (see Chinese EDA hopeful raises $30 million).

In November 2020 X-Epic released simulation technology supporting domestic computing architectures, and in November 2021 it launched four products: HuaPro-P1, an FPGA prototype verification system; GalaxSim-1.0, a digital simulator; GalaxPSS, a verification system; and GalaxFV, a formal verification tool based on word-level modelling.

Sigasi

VerifAI

An example csv file:

param0, param1, param2, cov_metric
0.5, 0.2, 0.4, 4
0.6, 0.1, 0.3, 5
0.2, 0.9, 0.2, 2
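
A file in this shape (parameter columns followed by a coverage metric column) is easy to consume with the Python standard library. The sketch below is not VerifAI's tooling; it only assumes the column names from the example, embeds the rows as a string for self-containment, and uses a hypothetical helper `load_rows` to pick out the parameter setting with the highest coverage metric.

```python
import csv
import io

# Example rows, with every field comma-separated.
CSV_TEXT = """param0, param1, param2, cov_metric
0.5, 0.2, 0.4, 4
0.6, 0.1, 0.3, 5
0.2, 0.9, 0.2, 2
"""


def load_rows(text):
    """Parse the CSV into a list of dicts mapping column name -> float."""
    # skipinitialspace drops the space that follows each comma.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return [{name: float(value) for name, value in row.items()} for row in reader]


rows = load_rows(CSV_TEXT)

# The parameter setting that achieved the highest coverage metric.
best = max(rows, key=lambda row: row["cov_metric"])
```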

Analog Innovation

Cadence

Open-Source EDA Birds-of-a-Feather Meeting at DAC

This repository contains RosettaStone, which leverages a standard physical design data model (LEF/DEF 5.8) and open-source database implementation (OpenDB in OpenROAD) to effectively connect the academic physical design field's past, present and future. RosettaStone's shared data model enables richer integrations, flow contexts, and assessments for research.

Cadence Scripts Can be Public Now!

Metrics4ML

In this repository, you will also find an overview of METRICS2.1, an open-source format for collecting design and tool metrics for an RTL-to-GDS flow.

Wednesday, 7/13/22

Breaking Down Physical Design Barriers with Open and Agile Flow Tools

SiliconCompiler Presentation and Demo @ Open Source Area

mflowgen Talk

HAMMER Talk

Designing and Verifying for Power at the Front End

Learning-based Power Modeling for Versal AI Engine

Machine Learning for Electronic Design Automation: Irrational Exuberance or the Dawn of a Golden age (Panel)

Posters

Coq Based DRAM Timing Model and Property Verifier

Deep RL Placer

Pulp-Based PMU FPGA Prototype

XLS (Google's HLS IR)

RVVI (RISC-V Verification Interface)

Thursday, 7/14/22

Teardown

SemiCon West

Automating Analog Layout - Has the time finally come?

Synopsys' Take

Elad's Take

Steve Burns' Take

NVIDIA's Take

So You Want a Better Design? Go with Faster Timing and Lower Power Please!

GATSPI: GPU Accelerated Gate-Level Simulation for Power Improvement

hgdb

Conclusion

The conference was much larger than I expected and there were many interesting sessions.

Some hangups:

Some followups for me:
  • NVIDIA
  • Intel (SDCs and SimCommand benchmark)
  • Coq DRAM model people