
AMD Vitis™ AI Engine Tutorials

Refer to the Vitis™ Development Environment on amd.com
Refer to the Vitis™ AI Development Environment on amd.com

Versal Emulation Waveform Analysis

Version: Vitis 2025.2

Introduction

Simulating a complete system in the Vitis unified software platform lets you execute a near-hardware run of a design without the hardware. Simulation has the added benefit of detailed waveform analysis during hardware emulation. You can use this to identify issues in the programmable logic (PL), AI Engine interfaces, and memory reads/writes that can be harder to debug on hardware.

This tutorial demonstrates how you can use the AMD Vivado™ Design Suite logic simulator (XSIM) waveform GUI and the Vitis analyzer to debug and analyze your design for an AMD Versal™ adaptive SoC. It steps through building a design for hardware emulation and launching emulation with waveform viewing. It also explains how to read the waveforms and how to use the Vitis analyzer to continue the analysis with the generated trace output waveforms and data.

It is strongly recommended to go through the Versal Integration Tutorial and the Versal System Design Clocking Tutorial before running this tutorial.

IMPORTANT: Before beginning the tutorial, install the Vitis 2025.2 software. The Vitis release includes all the embedded base platforms, including the VCK190 base platform used in this tutorial. Also download the Common Images for Embedded Vitis Platforms from this link. The common image package contains a prebuilt Linux kernel and root file system that you can use with the Versal board for embedded design development using Vitis.

Before starting this tutorial, run the following steps:

  1. Go to the directory where you have unzipped the Versal Common Image package.
  2. In a Bash shell, source the /Common Images Dir/xilinx-versal-common-v2025.2/environment-setup-cortexa72-cortexa53-amd-linux script. This script sets up the SDKTARGETSYSROOT and CXX variables. If the script is not present, you must first run /Common Images Dir/xilinx-versal-common-v2025.2/sdk.sh.
  3. Set up your ROOTFS and IMAGE to point to the rootfs.ext4 and Image files located in the /Common Images Dir/xilinx-versal-common-v2025.2 directory.
  4. Set up your PLATFORM_REPO_PATHS environment variable to $XILINX_VITIS/base_platforms/.
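The setup steps above can be collected into one shell fragment. This is a minimal sketch, not an official script: COMMON_IMAGE_DIR is a hypothetical variable standing in for the directory where you unzipped the Common Image package, and its default value is only a placeholder. Source the fragment (rather than executing it) so that the variables remain set in your shell.

```shell
# COMMON_IMAGE_DIR is a hypothetical variable: point it at the directory that
# contains the unzipped xilinx-versal-common-v2025.2 folder.
COMMON_IMAGE_DIR="${COMMON_IMAGE_DIR:-/path/to/common_images}"
IMAGE_ROOT="$COMMON_IMAGE_DIR/xilinx-versal-common-v2025.2"
ENV_SCRIPT="$IMAGE_ROOT/environment-setup-cortexa72-cortexa53-amd-linux"

# Step 2: source the environment script so that SDKTARGETSYSROOT and CXX are
# set in the current shell. If it is missing, sdk.sh has to be run first.
if [ -f "$ENV_SCRIPT" ]; then
    . "$ENV_SCRIPT"
else
    echo "Not found: $ENV_SCRIPT (run $IMAGE_ROOT/sdk.sh first)" >&2
fi

# Step 3: point ROOTFS and IMAGE at the prebuilt root file system and kernel.
export ROOTFS="$IMAGE_ROOT/rootfs.ext4"
export IMAGE="$IMAGE_ROOT/Image"

# Step 4: tell the tools where the base platforms are installed.
export PLATFORM_REPO_PATHS="${XILINX_VITIS:-}/base_platforms/"
```

If XILINX_VITIS is not set, PLATFORM_REPO_PATHS ends up pointing at /base_platforms/, which is a sign that the Vitis settings script has not been sourced yet.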

This tutorial targets the VCK190 production board with the 2025.2 release.

Objectives

After completing this tutorial, you can do the following:

  • Use XSIM as a live waveform viewer to view signals to and from the AI Engine including stream data and runtime parameters (RTP).
  • Read/understand Transaction Level Modeling (TLM) information in a waveform.
  • Use the Vitis analyzer to read trace and profile data.

Tutorial Overview

Design Overview

The design is a simple FIR filter that takes in random noise generated by the random_noise PL kernel. The host application sends asynchronous RTP updates to the AI Engine to change the FIR filter coefficients, and then reads the coefficients back to confirm that they were applied.

Block Diagram

Transaction Level Modeling

Transaction level modeling (TLM) models the Control, Interfaces, and Processing System (CIPS), network on chip (NoC), and AI Engine blocks in SystemC to show transaction-level communication in the waveform. It is cycle-approximate modeling that provides high-level information such as the address and data of the transactions to/from DDR memory or a specific PL kernel.

In the following diagram, the CIPS, NoC, and AI Engine are modeled in SystemC.

Model Block Diagram

Steps

Step 1: Building the Design

Step 2: Launching Emulation with the XSIM Waveform GUI

Step 3: Using XSIM Waveform GUI and QEMU

Step 4: Using Vitis Analyzer

Step 1: Build Design

  1. To build the design, run the following command.

    make aie
  2. After the ADF graph compiles, run the AI Engine simulator (aiesimulator) to get additional profile data. This ensures the design is simulating correctly and generates extra profile information for performance analysis and optimizing the kernels.

    To run the simulator, run the following commands.

    make aiesim

    Running the simulator creates a new directory, aiesimulator_output. This directory contains a file called aiesim_options.txt.

  3. Open aiesim_options.txt. The content is similar to the following.

    AIE_PKG_DIR=/path/to/<tutorial>/./Work
    AIE_DUMP_VCD=tutorial
    AIE_PROFILE=All
    
  4. Close the text file.

    NOTE: To view all the aiesim_options.txt values, refer to Reusing AI Engine Simulator Options.

  5. Run the rest of the build process using the following commands.

    make kernels
    make xsa
    make host
    make package
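Because the emulation launch in Step 2 passes aiesimulator_output/aiesim_options.txt to the launch script, it can help to confirm that the file exists and is complete before running the longer build targets. The following is a minimal sketch: check_aiesim_options is a hypothetical helper, and the three keys it looks for are taken only from the sample listing above, so a real file may contain more. The demonstration runs on a throwaway copy of that sample content.

```shell
# check_aiesim_options FILE: succeed only if FILE exists and defines every
# key that appears in the tutorial's sample listing.
check_aiesim_options() {
    file="$1"
    [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }
    for key in AIE_PKG_DIR AIE_DUMP_VCD AIE_PROFILE; do
        grep -q "^${key}=" "$file" || { echo "no $key in $file" >&2; return 1; }
    done
}

# Demonstration on a temporary copy of the sample content. In the tutorial,
# the real file is aiesimulator_output/aiesim_options.txt.
demo_dir="$(mktemp -d)"
printf '%s\n' \
    'AIE_PKG_DIR=/path/to/<tutorial>/./Work' \
    'AIE_DUMP_VCD=tutorial' \
    'AIE_PROFILE=All' > "$demo_dir/aiesim_options.txt"

check_aiesim_options "$demo_dir/aiesim_options.txt" && echo "options file looks complete"
```

In a real run you would call the helper on aiesimulator_output/aiesim_options.txt after make aiesim and before make kernels.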

Step 2: Launching Emulation with XSIM Waveform GUI

After building and packaging the design, you can run hardware emulation on it. Make sure that launch_hw_emu.sh is in the sw directory.

  1. To launch emulation with the XSIM Waveform GUI, run the following command.

    ./launch_hw_emu.sh -g -aie-sim-options ../aiesimulator_output/aiesim_options.txt

    OR

    make run_emu

    You can also include more options during emulation launch to view transaction logs:

    To view all the transactions generated by the PS (QEMU) to either the PL or the AI Engine, set the environment variable ENABLE_RP_LOGS=true. You can view the logs at sim/behav_waveform/xsim/rp_log.txt.

    NOTE: You cannot view the PS to DDR transactions here because QEMU has a direct backdoor connection into the DDR buffer.

    To capture AI Engine transaction logs generated at runtime, set the environment variable ENABLE_AIE_DBG_TRACE. The logs are created in the aie_log/ folder, for example, sw/sim/behav_waveform/xsim/aie_log/S00_AXI.log. These logs help debug the AI Engine SystemC models only, and contain transaction information per interface at each simulation cycle.

    When launching emulation, you can pass the -xtlm-aximm-log switch. This logs all the transactions generated from the CIPS to the AI Engine or PL in the xsc_report.log file, for example, sw/sim/behav_waveform/xsim/xsc_report.log.

    ./launch_hw_emu.sh -g -aie-sim-options ../aiesimulator_output/aiesim_options.txt -xtlm-aximm-log

    The terminal shows the following.

    Starting QEMU
    - Press <Ctrl-a h> for help
    Waiting for QEMU to start.
    running directly on console
    QEMU started. qemu_pid=3208
    Waiting for PMU to start.
    qemu-system-aarch64: -chardev socket,path=./qemu-rport-_pmc@0,server,id=ps-pmc-rp: info: QEMU waiting for connection on: disconnected:unix:./qemu-rport-_pmc@0,server
    PMC started. pmc_pid=3243
    qemu-system-aarch64: -chardev socket,id=pl-rp,host=127.0.0.1,port=7043,server: info: QEMU waiting for connection on: disconnected:tcp:127.0.0.1:7043,server
    XSIM started. xsim_pid=3300
    

    This shows QEMU starting and launching XSIM. QEMU and XSIM are linked together, so closing one closes the other. The -g flag opens the XSIM Waveform GUI, as shown in the following image, with two waveform configuration files (.wcfg and Untitled1).

    XSIM GUI Startup Default

    You can keep either file open (preferably close the .wcfg file) and continue adding signals or creating wave groups for waveform analysis.

    XSIM GUI Startup

    In this view, you can select the signals you want to watch from the Scope and Objects views.

  2. In the Tcl Console, at the bottom of the view, run the following command.

    source ../../../../tcl/add_waveforms.tcl

    The add_waveforms.tcl file removes any default signals provided by the simulation environment, and adds all the signals you want to view. Some signals are important to include, such as the NoC, DDR memory, PL kernel, and CIPS signals, because your design interacts with these components. Tracing signal changes from the CIPS to the NoC, to/from DDR memory, and then to your design is helpful in debugging data transfer issues. This file contains the following.

    ## Remove all waveforms before adding new ones
    remove_wave -of [get_wave_config] [get_waves -of [get_wave_config] -regexp ".*"]
    
    ## Set the appropriate paths based upon the platform being used
    set scope_path "/vitis_design_wrapper_sim_wrapper/vitis_design_wrapper_i/vitis_design_i"
    
    ## Create a wave group called CIPS and add all signals for the CIPS_0 to it
    set CIPS [add_wave_group CIPS]
    set cips_intf [get_objects -r $scope_path/CIPS_0/* -filter {type==proto_inst}]
    add_wave -into $CIPS $cips_intf
    
    ## Create a wave group called NOISE and add all signals of the random_noise_1 to it
    set NOISE [add_wave_group NOISE]
    set noise_intf [get_objects -r $scope_path/random_noise_1/* -filter {type==proto_inst}]
    add_wave -into $NOISE $noise_intf
    
    ## Create a wave group called S2MM and add all signals of the S2MM kernel to it
    set S2MM [add_wave_group S2MM]
    set s2mm_intf [get_objects -r $scope_path/s2mm_1/* -filter {type==proto_inst}]
    add_wave -into $S2MM $s2mm_intf
    
    ## Create a wave group called CIPS_NOC and all signals of the CIPS NoC to it
    set CIPS_NOC [add_wave_group CIPS_NOC]
    set cips_intf [get_objects -r $scope_path/cips_noc/* -filter {type==proto_inst}]
    add_wave -into $CIPS_NOC $cips_intf
    
    ## Create a wave group called DDR4 and all signals to/from DDR4
    set DDR4 [add_wave_group DDR4]
    set ddr4_intf [get_objects -r $scope_path/noc_ddr4/* -filter {type==proto_inst}]
    add_wave -into $DDR4 $ddr4_intf
    
    ## Create a wave group called AIENGINE and all signals of the AI Engine block to it
    set AIENGINE [add_wave_group AIENGINE]
    set aie_intf [get_objects -r $scope_path/ai_engine_0/* -filter {type==proto_inst}]
    add_wave -into $AIENGINE $aie_intf

    NOTE: This file can be executed automatically from the launch_hw_emu.sh command by using the -user-pre-sim-script add_waveforms.tcl option.

    IMPORTANT: Add all the signals you need before starting emulation. If you pause a running emulation and then add more signals, there is no data for the new signals.

    You will see a waveform view as shown in the following figure.

    XSIM with new signals

  3. Expand all the signal groups in the view to get the following view.

    Expanded signals

  4. The tutorial design runs quickly. You cannot view anything meaningful on this small scale.

    Adjust the scale to 100 µs.

    Proper Scale

    TIP: You can adjust the scale to fit your needs as emulation is running.

NOTE: For more information about this simulator view and how to use it, refer to the UG900 Vivado Design Suite User Guide: Logic Simulation.
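The logging options described in step 1 of this section can be combined in a small wrapper that is run from the sw directory. This is a minimal sketch under stated assumptions: launch_emu is a hypothetical helper, and because the tutorial names ENABLE_AIE_DBG_TRACE without giving a value, the value true used here is an assumption. When launch_hw_emu.sh is not present in the current directory, the wrapper only prints the command it would run.

```shell
# ENABLE_RP_LOGS=true is the setting named in this tutorial.
export ENABLE_RP_LOGS=true
# Assumption: the tutorial does not state a value for this variable.
export ENABLE_AIE_DBG_TRACE=true

# launch_emu ARGS...: run launch_hw_emu.sh with the given arguments when the
# script is present and executable; otherwise print the command as a dry run.
launch_emu() {
    if [ -x ./launch_hw_emu.sh ]; then
        ./launch_hw_emu.sh "$@"
    else
        echo "would run: ./launch_hw_emu.sh $*"
    fi
}

# Open the waveform GUI, reuse the aiesimulator options, and log AXI-MM
# transactions to xsc_report.log.
launch_emu -g -aie-sim-options ../aiesimulator_output/aiesim_options.txt -xtlm-aximm-log
```

Keeping the arguments in one call makes it easy to drop -xtlm-aximm-log or the environment variables again when you want emulation to run without the extra logging.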

Step 3: Using XSIM Waveform GUI and QEMU

A great benefit of having a waveform viewer showing live data is that you can view how the signals interact with each other. This includes the programming of the AI Engine and device traffic to/from the DDR memory and traffic to/from the PL kernels. You can also view RTP data writing to the AI Engine.

  1. Click the Run All button (run all).

  2. Click back to the terminal where ./launch_hw_emu.sh was launched. The QEMU instance has begun booting and when the following messages display, QEMU launch is complete.

    versal-rootfs-common-20222 login: root (automatic login)
    
    petalinux
    [   53.752390] audit: type=1006 audit(1666762798.812:2): pid=585 uid=0 old-auid=4294967295 auid=0 tty=(none) old-ses=4294967295 ses=1 res=1
    [   53.753038] audit: type=1300 audit(1666762798.812:2): arch=c00000b7 syscall=64 success=yes exit=1 a0=8 a1=ffffc55ea440 a2=1 a3=ffff94b186b0 items=0 ppid=1 pid=585 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=1 comm="(systemd)" exe="/lib/systemd/systemd" key=(null)
    [   53.753819] audit: type=1327 audit(1666762798.812:2): proctitle="(systemd)"
    root@versal-rootfs-common-20222:~# mount /dev/mmcblk0p1 /mnt
    root@versal-rootfs-common-20222:~# cd /mnt
    root@versal-rootfs-common-20222:/mnt#  ls
    BOOT.BIN          a.xclbin          data              host.exe
    Image             boot.scr          embedded_exec.sh
    
  3. Press Enter a few times to clear these messages until the shell prompt displays.

    From the /mnt directory shown in the previous listing, type the following command to launch the tutorial application.

    ./host.exe a.xclbin

    NOTE: This can take some time to complete because hardware emulation is collecting profiling and value change dump (VCD) data.

  4. Navigate back to the XSIM Waveform GUI and notice that signals are toggling. Scroll up and down to view all the signals that are starting to display data.

  5. Pause the execution of the design when all signals in the view stop toggling.

Exploring the Waveforms

One of the things the waveform viewer can help with is figuring out the order in which data transfers from a source to a destination. The following sections show how you can explore the various waveforms specific to certain communication/data transfer.

NOTE: If AI Engine kernels contain printf statements, the output shows up in the Tcl Console of the XSIM Waveform GUI. It is written to the simulate.log file after emulation closes.

Checking Proper Boot-up Using PMC

The first key step to make sure emulation is operating correctly is making sure that the PS can program the platform management controller (PMC). This system is responsible for booting and configuring the device. Seeing the signal through the CIPS, NoC, and the AI Engine is a sign that things are operating normally.

  1. To see this signal only, run the following Tcl script.

    source ../../../../tcl/bootup_signals.tcl

    Bootup Signals

  2. Zoom in to the first transactions by clicking and dragging the mouse from the upper left to the bottom right. You see something similar to the following.

    Bootup Zoomed In

    Expand these signals, and notice that the NoC and the CIPS signals all match. This shows that the CIPS is transferring configuration information to the PMC. These signals are TLM signals because the blocks of the device they are targeting are modeled in SystemC. In this view, a wide colored block might not be one transaction (because the timescale is in µs), so zoom in, and notice that more transactions occur in a short amount of time.

    Looking at the last interface, vitis_design_ai_engine_0_0_S00_AXI_tlm, the majority of the transactions are writes. These configure the AI Engine with the graph created in Step 1. The writes are specific to the Configuration Data Objects (CDO), which are commands passed to the Platform Loader and Manager (PLM) to configure the device and, in this interface, the AI Engine.

  3. Zoom the window to full by clicking the mouse on the lower right side, and drag to the upper left.

Transactions Generated by PS (QEMU) to PL/AIE

You can view the logs at sim/behav_waveform/xsim/rp_log.txt. You cannot view PS to DDR transactions here because QEMU has a direct backdoor connection into the DDR buffer.

PL to AI Engine

After bootup and device configuration, the application can begin to run. In this design, a PL kernel called random_noise generates data that goes directly into the AI Engine. The key here is looking at the s_axi_control interface to find out when the PS sends the run signal.

  1. To view the specific signals controlling the PL kernel, run the following command.

    source ../../../../tcl/pl_to_aie.tcl
  2. Expand the CIPS group down to the Row 0. Zoom into a specific region. In the following screenshot, the red area is the zoom region.

    Zoom region

    After the zoom in, something similar to the following displays.

    Zoomed in

    Here, notice that the PS uses the Full Power Domain (FPD) interface to send the AXI start signal. Two blocks are shown: this is the PS telling the random_noise and s2mm kernels to start running.

    If you zoom in more, you can view more specifics of the transactions.

    Zoomed in 2x

    Notice that s_axi_control has a read transaction slightly after the second transaction on the FPD interface has started.

  3. Expand the vitis_design_CIPS_0_0_M_AXI_FPD_tlm interface and the Outstanding Reads to see Row 0. If you move the mouse over #3 or #4, a context help menu shows signal information about where the data is transferring. Notice the ARADDR value of 0xa4060000 for #3 and 0xa4050000 for #4. These are the addresses of the PL kernels, which the Vitis linker assigns automatically during linking. From the host code, you can determine that these kernels are activated before the AI Engine, soon after the application starts, so it is safe to assume these transactions start them. Remember that these kernels are simpler than others; more complex kernels result in different transactions.

  4. Zoom to fit by clicking the Zoom Fit button (zoom fit). Expand the NOISE group and expand Out_r.

    Random Noise

  5. Notice the large green line after the random_noise kernel starts. This is a series of many transactions in which the PL kernel transfers data to the AI Engine. The waveform also contains a few red sections. Each red section is a link stall, where the kernel has stalled; the stall is caused by the AI Engine.

  6. Zoom to fit by clicking the Zoom Fit button.
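When comparing the two ARADDR values from step 3, shell arithmetic gives a quick way to see how far apart the kernel control addresses are. This is a small worked example using only the two values quoted above. The resulting spacing is an observation about these two addresses, not a statement about how the linker assigns addresses in general, and your own build may show different values.

```shell
# ARADDR values read from the context help for transactions #3 and #4.
addr_3=0xa4060000
addr_4=0xa4050000

# Shell arithmetic accepts hexadecimal literals directly.
offset=$(( $addr_3 - $addr_4 ))
printf 'offset between the two kernels: 0x%x (%d KiB)\n' "$offset" "$(( offset / 1024 ))"
# prints: offset between the two kernels: 0x10000 (64 KiB)
```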

AI Engine RTP Signals

As mentioned in the Design Overview, this design sends RTP values to the AI Engine through graph.update() calls in the host application. From the host code, notice that there are two updates occurring, both with an array size of 12. Because these updates apply only to the AI Engine kernel, they appear as write transactions on the AIENGINE/S00_AXI interface. However, other signals show the same values because they are the interfaces the data traverses on the way to its destination.

  1. Run the following Tcl script to view the AI Engine signal only.

    source ../../../../tcl/rtp_signals.tcl
  2. Expand the AIENGINE group, vitis_design_ai_engine_0_0_S00_AXI interface, and expand the Outstanding Writes. Some write transactions display; go to the second visible instance shown in the following figure.

    RTP zoom loc

  3. Zoom into the transaction of writes until something similar to the following screenshot displays.

    RTP Signals

    NOTE: Depending on the time the host application runs, the exact same times might not display.

    Here, notice that there are 12 writes transferring to the AI Engine. These are the RTP coefficients to be updated in the design.

    Expand Row 0, and hover the mouse where it says Data. Notice the pop-up as shown in the following figure.

    RTP One

    Notice that there is data presented here. It is in hexadecimal radix and reads 0xB4. Converting this to decimal gives 180, which is the first coefficient in the array for updates.

    TIP: There are two RTP updates occurring. If you follow the same write signal, you find the write transactions for the second update.

  4. Click the Zoom to Fit button.
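The hexadecimal-to-decimal conversion in step 3 can be checked from any shell, which is handy when matching waveform data against the coefficient values in the host code. This is a minimal sketch using only the 0xB4 value quoted above.

```shell
# Value shown in the Data pop-up for the first RTP write.
rtp_hex=0xB4

# Convert the hexadecimal waveform value to decimal.
rtp_dec=$(( $rtp_hex ))
echo "$rtp_hex = $rtp_dec"
# prints: 0xB4 = 180

# The reverse direction helps when looking for a known coefficient
# in the waveform, which displays values in hexadecimal.
printf '%d -> 0x%X\n' 180 180
# prints: 180 -> 0xB4
```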

AI Engine to PL to DDR Memory

After the RTP update is sent, you can start to see output data being written to DDR memory. In this design, the AI Engine sends data from the S00_AXIS interface to the s2mm kernel. This kernel is a FIFO written in HLS that writes the output to DDR memory.

  1. To view these signals run the following command.

    source ../../../../tcl/aie_to_ddr.tcl

    Expand S2MM, and you should see something similar to the following figure.

    AIE to DDR

    Notice the transactions in green are slightly ahead of the tan. This means those signals are going first. The datapath is the AI Engine kernel to the interface tile, then to the AIENGINE/M00_AXIS interface. Notice how AIENGINE/M00_AXIS and S2MM/s interfaces match, meaning they are connected. The same applies to the S2MM/m_axi_gmem and the DDR4/S00_AXI interfaces on the noc_ddr4 IP.

    After the data is stored into DDR memory, the host application can then access it.

  2. Expand the CIPS_NOC group. Locate the last transactions on the cips_noc_0_M00_AXI_tlm and the cips_noc_0_S00_AXI_tlm interfaces, as shown in the following screenshot. These transactions are the host application reading the data that was stored by the s2mm kernel.

    DDR to PS

    Zoom in and you should see the following.

    DDR to PS zoom in

  3. When emulation completes, close the XSIM GUI. This closes the QEMU and the emulation. Discard the waveform at the pop-up prompt.

  4. Navigate back to the terminal that launched emulation.

Limitations

Note the following limitations of the waveform viewer:

  • Use VCD to view signals internal to the AI Engine. They are not integrated in the general XSIM Waveform GUI.
  • The CIPS (QEMU model), which executes the software program, is purely a functional model with no timing accuracy. The NoC, DDR memory, and AI Engine are cycle-approximate models.
  • Bandwidth and latency estimation are approximate, based on the accuracy of the individual IP models.

Step 4: Using Vitis Analyzer

After emulation completes, you can look at the profiling and VCD trace data that was generated at the same time. If profiling and VCD signal features are not used, emulation runs faster.

Using the XSIM Waveform GUI to view waveforms is powerful in allowing you to view the datapath and flow of the design. You can also use it to debug potential issues like hangs. However, this only shows the PL side of the system. To investigate the AI Engine signals, you need to use the VCD trace in the Vitis analyzer. To use the Vitis analyzer, open up a .aierun_summary file.

  1. Open the run summary of the design by running the following command.

    vitis_analyzer sw/sim/behav_waveform/xsim/default.aierun_summary &

    When the summary is open, you should see something similar to the following.

    VA_overview

  2. Here you can see various reports: Summary, Trace, Profile, Graph, Array. Click Trace to open up the VCD data collected during hardware emulation.

    VA_overview_1

    Notice the inner traces of the graph through a tile hierarchy. Selecting a net, tile, function, or any object in this view cross-selects to various views. This can help with identifying specific nets and functions.

  3. Open the Graph view, and click the Buffers tab.

  4. To find the RTP buffers, click the Search button (search), and type in coeffs.

    A window like the following displays.

    VA_rtp_buffers

  5. Select the three coeffs buffers, and click the Trace view again. Notice that the lock signals are highlighted.

    VA_trace

  6. If you scroll up, notice that the FIR filter kernel begins to process data soon after the RTP is read.

    VA_trace_rtp

  7. Open up the Profile report. Here you can view specific information about the kernel and the tile it occupies.

    VA_profile

  8. Click Total Function Time, and notice the following:

    VA_func_time

    This information is useful because it helps determine how long the kernel runs. Use it with Trace to help determine if kernels are running optimally, or if there are stalls.

  9. You can analyze the AI Engine status and profile in hardware emulation in the same way as on hardware. The AI Engine status captured during hardware emulation can be analyzed in Vitis Analyzer for debug purposes. The xrt.ini file on the PS has the following entries, and must be packaged in the v++ package flow before launching emulation.

    aie_profile=true
    aie_status=true

As a result, two files are generated: aie_status_edge.json and aieshim_status_edge.json. You can copy these files from the embedded Linux file system back to the host machine, and then load them into Vitis Analyzer.

The AI Engine status is written to the following files while the host program is running:

  • xrt.run_summary: Run summary that contains a list of file information for use by Vitis Analyzer.
  • aie_status_edge.json: Status of AI Engine and AI Engine memory.
  • aieshim_status_edge.json: AI Engine interface tiles status.

For more details, refer to Analyzing AI Engine Status in Hardware Emulation.

  10. Close the Vitis analyzer.
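Before loading the AI Engine status described in step 9 into Vitis Analyzer, it can be useful to confirm that all the files listed above were copied back from the emulated Linux file system. The following is a minimal sketch: check_status_files is a hypothetical helper, and the directory you pass to it is wherever you copied the files on the host machine.

```shell
# check_status_files DIR: report which of the AI Engine status files named in
# this tutorial are present in DIR. Returns non-zero if any file is missing.
check_status_files() {
    dir="$1"
    missing=0
    for f in xrt.run_summary aie_status_edge.json aieshim_status_edge.json; do
        if [ -f "$dir/$f" ]; then
            echo "found:   $f"
        else
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Example: check the current directory.
check_status_files . || echo "copy the missing files before opening Vitis Analyzer"
```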

Summary

In this tutorial you learned:

  • To read the waveform viewer to follow the data flow path for a simple Versal adaptive SoC design.
  • To add/remove signals in the XSIM viewer to look at specific signals such as the NoC, DDR memory, PS, and AI Engine.
  • To view TLM signals and how they interact with the AI Engine and Versal adaptive SoC blocks.
  • To open and view Trace and Profile information in Vitis Analyzer.

Copyright © 2020–2025 Advanced Micro Devices, Inc.

Terms and Conditions