This demo shows actual results from emulating an MPEG-4 decoder, as described in the article below.
ZeBu, a pioneering hardware-assisted platform from Emulation and Verification Engineering (EVE), embodies the original “soft emulation prototype.” It is a combination of a prototyping board with a hardware emulator and a software debugger. Based on FPGAs with comprehensive visibility of the internal logic, ZeBu also meets most of the hardware designer’s needs.
ZeBu supports transaction-based verification. Hardware transactors enable rapid communication between a circuit mapped inside ZeBu and a testbench executed on the PC by moving the processing-intensive, time-consuming bit-level communication between testbench and design into the emulator.
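The idea behind a transactor can be illustrated with a small model. The sketch below is not ZeBu's API; `PinLevelBus`, `WriteTransactor`, and `write_burst` are hypothetical names invented for illustration. It only shows the principle: the testbench sends one transaction-level message, and the transactor expands it into many bit-level clock cycles on the design side.

```python
from dataclasses import dataclass, field


@dataclass
class PinLevelBus:
    """Toy stand-in for the design's bit-level interface (hypothetical)."""
    cycles: int = 0
    memory: dict = field(default_factory=dict)

    def clock(self, valid: bool, addr: int, data: int) -> None:
        # One clock edge: the design samples valid/addr/data.
        self.cycles += 1
        if valid:
            self.memory[addr] = data & 0xFF


class WriteTransactor:
    """Expands one transaction-level message into per-cycle pin activity."""

    def __init__(self, bus: PinLevelBus) -> None:
        self.bus = bus
        self.messages = 0

    def write_burst(self, base_addr: int, payload: bytes) -> None:
        # The testbench sends a single message...
        self.messages += 1
        # ...and the transactor drives one byte per clock cycle.
        for offset, byte in enumerate(payload):
            self.bus.clock(valid=True, addr=base_addr + offset, data=byte)
        # One idle cycle closes the burst.
        self.bus.clock(valid=False, addr=0, data=0)


bus = PinLevelBus()
transactor = WriteTransactor(bus)
transactor.write_burst(0x100, bytes(range(64)))
print(f"{transactor.messages} message -> {bus.cycles} bus cycles")
```

In this toy model a single message crosses the PC-to-emulator link while 65 bus cycles take place next to the design, which is the traffic reduction the paragraph above describes.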

Unlike older, costly hardware emulators, ZeBu is priced so that it can be deployed to every member of the team, from software developers to hardware engineers, which benefits the entire development process. The hardware design team maps the circuit into FPGAs and uses it for module testing of peripherals. Simultaneously, software developers run fragments of critical code and develop peripheral drivers. As both teams use the same model, every bug fixed in the circuit benefits the software developers. Similarly, peripheral drivers can be supplied to hardware designers as soon as they are written, enabling more comprehensive integration tests.
Ultimately, software developers and hardware designers work in cooperation, each using the interface to which they are accustomed. The software developer uses the “xt-gdb” C code debugger (the standard GNU tool modified for the Xtensa processor) without any knowledge of ZeBu, while ZeBu emulates the design. The hardware designer uses ZeBu’s graphical interface to control the logic part of the simulation. Waveforms and monitors of each circuit bus and signal are constantly accessible, and the contents of internal memories, even those deeply buried within the FPGAs, can be observed and modified at any time during execution. Circuit clocks are controlled from the graphical interface, whether the simulation is run cycle by cycle to observe subtle changes in the circuit’s behavior or set to run continuously for several million cycles at a time.
Sharing simulation control between software and hardware posed a challenge. While the software developer may need to stop the simulation by inserting a breakpoint in the C code, the hardware designer may want to stop the circuit when it reaches a specific state. ZeBu’s transactor support solved this problem.
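The article does not describe how the two stop conditions are arbitrated, so the following is only a conceptual sketch under assumed names (`run_until_stop`, `toy_step`, and the `pc` and `fifo_level` fields are all hypothetical). It shows one run loop in which either a software breakpoint on the program counter or a hardware condition on a circuit signal can halt execution, whichever occurs first.

```python
from typing import Callable, Dict, Optional, Set, Tuple

State = Dict[str, int]


def run_until_stop(
    step: Callable[[State], None],
    state: State,
    sw_breakpoints: Set[int],
    hw_trigger: Callable[[State], bool],
    max_cycles: int,
) -> Tuple[Optional[str], int]:
    """Advance the model one cycle at a time until either side asks to stop."""
    for cycle in range(1, max_cycles + 1):
        step(state)
        if state["pc"] in sw_breakpoints:
            return "software breakpoint", cycle
        if hw_trigger(state):
            return "hardware trigger", cycle
    return None, max_cycles


def toy_step(state: State) -> None:
    # Toy design: the program counter advances by 4, a FIFO fills by 1.
    state["pc"] += 4
    state["fifo_level"] += 1


state = {"pc": 0x1000, "fifo_level": 0}
reason, cycle = run_until_stop(
    toy_step,
    state,
    sw_breakpoints={0x1040},                    # software side: stop at this address
    hw_trigger=lambda s: s["fifo_level"] >= 8,  # hardware side: stop on this state
    max_cycles=1_000,
)
print(reason, cycle)
```

Here the hardware condition is met at cycle 8, before the program counter reaches the breakpoint address at cycle 16, so the hardware designer's trigger stops the run.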
As an example, consider the development of a low-power MPEG-4 decoder for wireless applications. To enable the viewing of film trailers on a mobile phone, it is necessary to design a small-scale MPEG-4 decoder that consumes little energy. The design of the decoder is founded on the reference C code for the MPEG-4 standard known as MoMuSys, issued by the International Organization for Standardization (ISO). The decoder was based on a configurable Xtensa processor developed by Tensilica. Two optimizations were added to decode an MPEG-4 data flow: the definition of an inverse discrete cosine transform (iDCT) accelerator, and a set of instructions for vector processing (SIMD), bit manipulation and extraction (bitstream), and YCbCr and RGB color format conversion.
The result met the requirements: a reusable component with approximately 120,000 gates, including a processor to run a program containing 200,000 lines of C code, and using less than 10 percent of processor resources. Running at a frequency of 200 MHz, the processor achieved real-time decoding of a video in QCIF format at a rate of 15 frames per second.
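To give a sense of the kind of per-pixel work that the color conversion instructions take over, here is a minimal software sketch of a YCbCr-to-RGB conversion. It assumes the full-range BT.601 coefficients used by JFIF; the article does not say which variant the custom instructions implement, and the function names are illustrative only.

```python
from typing import Tuple


def clamp8(value: float) -> int:
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))


def ycbcr_to_rgb(y: int, cb: int, cr: int) -> Tuple[int, int, int]:
    """Convert one pixel using full-range BT.601 (JFIF) coefficients.

    Assumption: 8-bit samples with chroma centered on 128.
    """
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)


# With neutral chroma (Cb = Cr = 128), the output is a gray equal to Y.
print(ycbcr_to_rgb(128, 128, 128))
print(ycbcr_to_rgb(255, 128, 128))
```

Each output pixel costs several multiplications, additions, and clamps in plain C, which is why a dedicated instruction pays off when the conversion runs on every pixel of every frame.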
The verification of the circuit presented some interesting challenges. Displaying a complete trailer for a film such as “Monsters, Inc.” requires approximately one billion clock cycles. Until now, no solution met the requirements of both software developers and hardware designers.
The MPEG-4 decoder was synthesized and mapped into ZeBu. Peripherals and the memory controller were validated at the hardware level, using simple software code. The full code for the decoder was loaded into the processor and connected to the software debugger. The transactor technology, which took four hours to create, was used to capture each frame of the image from the decoder’s frame buffer and display it in a PC window in real time.
The Xtensa processor in ZeBu runs at 30 MHz, generating a video picture at approximately 20 frames per second. Such performance produces real motion without any stuttering, and corresponds to an acceleration factor of around 50X compared to an instruction set simulator (ISS).
A verification platform like ZeBu makes it possible to optimize the joint use of hardware and software by modifying the way in which certain tasks are carried out. For a video decoder, for example, the test program in C normally includes the MPEG-4 data flow, which amounts to several megabytes of data. Loading this program via the JTAG interface can take 10 minutes. Transferring the data flow loading process to hardware saves time, because ZeBu has direct access to the circuit memories via the PCI bus. In just a few seconds, the MPEG-4 video is downloaded via the hardware interface, while the C program continues to load via the software debugger and the JTAG interface.
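A short calculation makes these figures concrete. It uses only numbers quoted in the article (one billion cycles for a trailer, a 30 MHz emulation clock, roughly 20 frames per second, and a 50X speedup over an ISS); the variable names are illustrative.

```python
CYCLES_FOR_TRAILER = 1_000_000_000  # approximate cycle count for a full trailer
EMULATION_CLOCK_HZ = 30_000_000     # Xtensa clock when mapped in ZeBu
SPEEDUP_OVER_ISS = 50               # approximate acceleration factor
FRAMES_PER_SECOND = 20              # approximate frame rate observed in ZeBu

# Wall-clock time to play the trailer in the emulator.
emulation_seconds = CYCLES_FOR_TRAILER / EMULATION_CLOCK_HZ

# The same run on an ISS that is 50 times slower.
iss_seconds = emulation_seconds * SPEEDUP_OVER_ISS

# Cycle budget implied per decoded frame at the emulation clock.
cycles_per_frame = EMULATION_CLOCK_HZ / FRAMES_PER_SECOND

print(f"Emulator: {emulation_seconds:.1f} s")
print(f"ISS:      {iss_seconds / 60:.1f} min")
print(f"Cycles per frame: {cycles_per_frame:,.0f}")
```

Under these figures the trailer plays in about 33 seconds on the emulator, compared with roughly 28 minutes on the ISS, with about 1.5 million cycles spent per frame.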
Finally, while ZeBu appears to be the perfect solution for hardware-software co-verification, a question may be raised about its ability to detect, isolate, and fix bugs effectively and efficiently.
An example of the tool’s ability to debug hardware was the speed with which it found a design error in a peripheral memory controller. The bug caused two pieces of data to be mixed during certain consecutive read and write sequences. The symptom of the problem, which remained undetected during module testing, was that after two million cycles the decoded image became streaky, rather like an undecoded satellite TV image.
By using a combination of three ZeBu functions to analyze the problem, we were able to rapidly detect and fix the bug. The three functions were: breakpoints in the C code to ascertain what the code does (software aspect), capture of the memory content to give an instant display of the incorrect words, and display of internal signals in the memory controller (hardware aspect).
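The memory capture step can be pictured as a comparison between the words read back from the emulated memory and the values the software expected to find there. The sketch below is a hypothetical illustration, not ZeBu functionality: `find_bad_words` and the sample data are invented, and the mixed words only mimic the kind of corruption described above.

```python
from typing import List, Tuple


def find_bad_words(
    captured: List[int],
    expected: List[int],
    base_addr: int = 0,
    word_bytes: int = 4,
) -> List[Tuple[int, int, int]]:
    """Return (address, captured, expected) for every mismatching word."""
    if len(captured) != len(expected):
        raise ValueError("captured and expected dumps differ in length")
    return [
        (base_addr + index * word_bytes, got, want)
        for index, (got, want) in enumerate(zip(captured, expected))
        if got != want
    ]


# Invented data: two neighboring words have had their halves mixed.
expected = [0x11111111, 0x22222222, 0x33333333, 0x44444444]
captured = [0x11111111, 0x22223333, 0x33332222, 0x44444444]

for addr, got, want in find_bad_words(captured, expected, base_addr=0x1000):
    print(f"0x{addr:04X}: captured 0x{got:08X}, expected 0x{want:08X}")
```

Knowing the addresses of the incorrect words narrows down which read and write sequences to inspect in the memory controller's internal signals.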
Without the possibility of combining hardware and software development simultaneously, this bug would not have been fixed in time and would have severely affected the project schedule.