Embench-IoT Results on GaZmusino

A microcontroller or RISC-V core can be benchmarked in many ways, but Embench-IoT has become a reference suite for embedded and IoT workloads. In this post, we compare our core, GaZmusino, simulated with Verilator, against the results Antmicro publishes on its Embench-Tester platform.

Quick summary: GaZmusino is competitive in most benchmarks, but division-heavy benchmarks run slower because the core has no hardware divider.


🔧 Methodology

The comparison draws on the following sources and setup:

  • GaZmusino: each benchmark was simulated with Verilator to obtain its execution time.
  • Antmicro: public results from the Embench-Tester platform, which reports the performance of various open-source cores.
  • Compiler and flags: riscv64-unknown-elf-gcc 14.2.0 with -O2 optimization. Other compiler versions may yield slightly different execution times.
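
To make the step from simulation to a comparable figure concrete, here is a minimal sketch of how a simulated cycle count could be turned into an execution time and then into a relative speed score. It assumes the common convention of dividing a reference time by the measured time, so higher is better. The function names, the 1 MHz clock, the cycle count, and the reference time are all hypothetical placeholders, not values from our runs or from Antmicro's platform.

```python
def cycles_to_ms(cycles: int, clock_hz: float) -> float:
    """Convert a simulated cycle count to milliseconds at an assumed clock."""
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    return cycles / clock_hz * 1000.0


def relative_speed(reference_ms: float, measured_ms: float) -> float:
    """Reference time divided by measured time; above 1.0 means faster than the reference."""
    if measured_ms <= 0:
        raise ValueError("measured_ms must be positive")
    return reference_ms / measured_ms


# Hypothetical inputs, chosen only to illustrate the arithmetic.
simulated_cycles = 2_000_000   # assumed cycle count reported by the simulation
assumed_clock_hz = 1_000_000   # assumed 1 MHz clock
assumed_reference_ms = 4000.0  # assumed reference time for the same benchmark

measured_ms = cycles_to_ms(simulated_cycles, assumed_clock_hz)
score = relative_speed(assumed_reference_ms, measured_ms)
print(f"measured: {measured_ms:.1f} ms, relative speed: {score:.2f}")
```

Because the score is a ratio, the assumed clock frequency matters: comparing cores fairly requires either the same clock or a per-MHz normalization.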

📊 Results

The table below lists the results for every benchmark and core.

Benchmark GaZmusino cv32e40p CVA5 VexRISCV serv picorv32 Ibex Rocket microwatt
cubic 10244623928152326
nbody 14459015972143952108
st 184386907211214562
minver 243882334213276277
nsichneu 573967111612315755
wikisort 65651662421423338598
crc32 774192922515407268
nettle-aes 773890712717294793
ud 773679731115275965
aha-mont64 754682131714317471
nettle-sha256 82409362395336041
picojpeg 863794941517256177
edn 89429090210247752
matmult-int 90399091313216244
primecount 1126093127918368185
sglib-combined 12145101981921337463
qrduino 123541151132122357980
slre 125551311392325398287
huffbench 126461151194428359676
md5sum 15964153137563265115115
tarfind 22838143771620338870
statemate 2985218120319244010195

Benchmarks are ordered by GaZmusino performance, from lowest to highest.
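
For readers who want to reproduce this kind of ordering on their own data, the following sketch sorts benchmarks by one core's score and computes a per-benchmark ratio against a second core. The benchmark names, core names, and numbers are hypothetical placeholders and are not taken from the table above; higher scores are assumed to be better.

```python
# Hypothetical scores (higher is better); placeholders, not values from the table.
scores = {
    "bench_a": {"core_x": 120.0, "core_y": 95.0},
    "bench_b": {"core_x": 15.0, "core_y": 60.0},
    "bench_c": {"core_x": 80.0, "core_y": 80.0},
}


def order_by_core(table, core):
    """Return benchmark names sorted by one core's score, lowest first."""
    return sorted(table, key=lambda name: table[name][core])


def ratio_to(table, core, other):
    """Per-benchmark ratio core/other; below 1.0 means `core` trails `other`."""
    return {name: row[core] / row[other] for name, row in table.items()}


ordered = order_by_core(scores, "core_x")
ratios = ratio_to(scores, "core_x", "core_y")
for name in ordered:
    print(f"{name}: core_x={scores[name]['core_x']:.0f}, ratio vs core_y={ratios[name]:.2f}")
```

Sorting by a single core makes its weakest benchmarks stand out at the top, while the ratio column shows whether a low score is specific to that core or shared with the core it is compared against.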


📝 Conclusions

  1. Strengths of GaZmusino

    • Very competitive in md5sum, qrduino, slre, huffbench, tarfind, and statemate.
    • Consistently outperforms lightweight cores such as serv, picorv32, and Ibex.
  2. Detected limitations

    • Low performance in cubic, nbody, st, and minver.
    • Main reason: the core has no hardware divider, so divisions are handled in software, which increases latency.
  3. Overall comparison

    • Medium-high performance level, approaching more complex cores such as VexRISCV and CVA5.
    • Solid in integer, memory, and logic workloads; weak in division-heavy tasks.
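
To illustrate why software division costs more than a hardware divider, here is a minimal sketch of an unsigned 32-bit shift-and-subtract (restoring) division, the general kind of loop a software routine may use. It is written in Python for readability; the function name soft_udiv32 is hypothetical, and the routine the compiler runtime actually links for GaZmusino may differ.

```python
def soft_udiv32(dividend: int, divisor: int):
    """Unsigned 32-bit restoring division using only shifts, compares, and subtracts.

    Returns (quotient, remainder).
    """
    dividend &= 0xFFFFFFFF
    divisor &= 0xFFFFFFFF
    if divisor == 0:
        raise ZeroDivisionError("division by zero")

    quotient = 0
    remainder = 0
    # One iteration per bit of the dividend, most significant bit first.
    for bit in range(31, -1, -1):
        # Bring the next dividend bit into the running remainder.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        # If the divisor fits, subtract it and set this quotient bit.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1 << bit
    return quotient, remainder


print(soft_udiv32(100, 7))
```

Each call runs 32 loop iterations, and every iteration expands to several instructions on the core, so a benchmark that divides frequently spends a large share of its time in this loop.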