
Solving bus and software deadlock problems in complex SoCs

December 12, 2023

By: Siemens | Tessent Embedded Analytics | Author: Huw Geddes, Product Manager

Intermittent bus and software deadlocks are amongst the toughest problems for development teams to detect and diagnose, particularly on complex SoCs with hundreds of cores, many shared resources and complex systems that include 2.5D and 3D integration. Hard-to-find corner cases can cause complex SoCs to hang or stall intermittently and unpredictably, sometimes after days of continuous normal operation.

Conventional approaches either ignore the problem or attempt to deal with it by generating massive, unmanageable data sets. A smarter approach is required, one that focuses on generating meaningful, actionable information so that chip design teams can truly understand the behaviour of today’s complex SoCs and find the root cause of a problem quickly.

Hardware bus deadlocks

Bus deadlocks occur when a processor stalls while waiting for a response from another on-chip subsystem over an on-chip bus (such as AXI). Traditionally, the only way to isolate such problems has been to trace and output all bus activity continuously, which requires a high-bandwidth off-chip connection to gather the data and powerful offline software to work through the huge data sets that result. The Tessent Embedded Analytics solution instead uses a smart, protocol-aware on-chip bus monitor that can be triggered when the time taken for a bus transaction exceeds a programmable limit. When triggered by a deadlocked transaction, the system identifies the complete transaction ID and address, guiding the engineer to the problem transaction.
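To make the idea of a timeout trigger concrete, here is a minimal software sketch in Python. It is only a toy model, not the Tessent Embedded Analytics implementation or its API: the class name `BusTimeoutMonitor`, its methods, the cycle counts and the addresses are all hypothetical. It tracks outstanding transactions and reports the ID and address of any that have waited longer than a programmable limit.

```python
from dataclasses import dataclass, field


@dataclass
class BusTimeoutMonitor:
    """Toy model of a transaction-timeout trigger (all names are hypothetical)."""

    limit_cycles: int  # programmable limit on transaction latency
    outstanding: dict = field(default_factory=dict)  # txn_id -> (address, start_cycle)

    def request(self, txn_id, address, cycle):
        # Record when a transaction was issued on the bus.
        self.outstanding[txn_id] = (address, cycle)

    def response(self, txn_id):
        # A completed transaction is no longer tracked.
        self.outstanding.pop(txn_id, None)

    def check(self, cycle):
        # Return (txn_id, address, age) for every transaction older than the limit.
        return [
            (txn_id, address, cycle - start)
            for txn_id, (address, start) in self.outstanding.items()
            if cycle - start > self.limit_cycles
        ]


monitor = BusTimeoutMonitor(limit_cycles=1000)
monitor.request(txn_id=7, address=0x4000_0000, cycle=10)
monitor.request(txn_id=8, address=0x4000_0040, cycle=12)
monitor.response(txn_id=8)  # transaction 8 completes; transaction 7 never does

stalled = monitor.check(cycle=5000)
for txn_id, address, age in stalled:
    print(f"txn {txn_id} at {address:#x} outstanding for {age} cycles")
```

The point of the sketch is that only the stalled transaction is reported, rather than a trace of every bus event, which is what keeps the amount of data to inspect small.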

Software deadlocks

Software deadlocks are increasingly common in today’s SoCs, which contain multiple CPUs. In a typical scenario, two different software processes might use a locking mechanism to govern shared access to common on-chip resources: for example, another core, hardware peripherals or the capabilities of another software process. Some cores might share access to the same peripherals (perhaps a keyboard and a display), with problems arising when each CPU believes that the other has locked its access to the shared resources. In this case, Embedded Analytics provides an on-chip status monitor that can be used to detect the fault condition, halt the processors and initiate data capture to identify and isolate the problem. As multi-core systems and heterogeneous architectures become more common, this type of behaviour can become more prevalent. Tessent Embedded Analytics is a vendor-neutral architecture that supports many different bus protocols and processor families, making it possible to resolve these situations.
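The keyboard-and-display scenario can be illustrated with a classic circular-wait check. The Python sketch below is an assumption-laden illustration, not a description of how the on-chip status monitor works: the function `find_deadlock` and the `holders` and `waiting` tables are hypothetical names. It models which CPU owns each lock and which lock each CPU is blocked on, then follows the chain of waits to see whether it loops back on itself.

```python
def find_deadlock(holders, waiting):
    """Return the owners that form a circular wait, or [] if there is none.

    holders: maps each lock to the owner currently holding it.
    waiting: maps each blocked owner to the lock it is waiting for.
    """
    for start in waiting:
        path = []
        current = start
        # Follow "waits for the holder of" links until the chain ends or repeats.
        while current in waiting and current not in path:
            path.append(current)
            current = holders.get(waiting[current])
        if current in path:
            return path[path.index(current):]
    return []


# Hypothetical state: each CPU holds one peripheral and waits for the other.
holders = {"keyboard": "cpu0", "display": "cpu1"}
waiting = {"cpu0": "display", "cpu1": "keyboard"}

cycle = find_deadlock(holders, waiting)
print("deadlocked:", cycle)
```

A check of this kind only detects the condition. Halting the processors and capturing state at that moment, as described above, is what allows the cause to be identified.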

The Tessent Embedded Analytics solution

Tessent Embedded Analytics technology allows chip designers to intelligently and non-invasively monitor what’s happening inside their products while they operate under real-time workloads. Captured data can then be output over hardware interfaces to host software for analysis, allowing engineers to investigate particularly difficult conditions that can cause devices to fail intermittently and unpredictably, including bus and software deadlocks.

SoC debug and silicon validation are key challenges facing the global electronics industry today. Tessent Embedded Analytics technology creates an on-chip system monitoring and debug infrastructure that engineers and developers can use for pre- and post-silicon analysis. The combination of hardware IP and software helps to reduce many of the hard-to-detect risks inherent in the complex, many-core, multi-dimensional chip architectures currently under design, which improves time-to-market while increasing quality and reducing costs.

Learn more

Tessent Embedded Analytics from Siemens EDA offers an integrated range of hardware and software tools that accelerate debug of RISC-V based SoCs.

To learn more, visit: https://eda.sw.siemens.com/en-US/ic/tessent/embedded-analytics/