Hello, RISC-V and QEMU

To experiment with RISC-V assembly, I've been trying to run and debug RISC-V binaries using QEMU.

This post covers things I'm unsure of and a topic I don't know much about. One might consider it learning in public. If I err — I probably do somehow — please don't hesitate to contact me.

Enter QEMU and a RISC-V toolchain

This obviously depends on your distribution. If you have QEMU installed, chances are you already have qemu-system-riscv64. Otherwise, or if your distribution does not ship an embedded RISC-V toolchain, pre-built toolchains and QEMU binaries are available from SiFive. You'll probably also want GDB. (A concise introduction to GDB may be found here.)

On NixOS, after skimming the wiki article on cross-compiling, I ended up with the following shell.nix. Unfortunately, it had me compile locally.

let
  pkgs = import <nixpkgs> {
    crossSystem = (import <nixpkgs/lib>).systems.examples.riscv64-embedded;
  };
  shell = { mkShell, gdb, qemu, dtc }: mkShell {
    nativeBuildInputs = [ gdb qemu dtc ];
  };
in pkgs.callPackage shell {}

Choosing a virtual machine

Naïvely, I ran my freshly built QEMU without any parameters.

$ qemu-system-riscv64
qemu-system-riscv64: Unable to load the RISC-V firmware "opensbi-riscv64-spike-fw_jump.elf"

Not sure if this is a distribution problem, but that file is apparently missing.

$ qemu-system-riscv64 -machine help
Supported machines are:
none                 empty machine
sifive_e             RISC-V Board compatible with SiFive E SDK
sifive_u             RISC-V Board compatible with SiFive U SDK
spike                RISC-V Spike Board (default)
virt                 RISC-V VirtIO board

I was apparently missing firmware for QEMU's default RISC-V machine, spike, and decided to use the VirtIO board instead.

$ qemu-system-riscv64 -machine virt

That started the QEMU monitor. (I run QEMU with -nographic and use Ctrl+a c to escape to the QEMU monitor, where one can quit.) Next, I made QEMU open a GDB server at localhost:1234 using -s and not start the CPU immediately using -S.

$ qemu-system-riscv64 -machine virt -s -S
$ riscv64-none-elf-gdb  # In another shell.
(gdb) target remote :1234
warning: No executable has been specified and target does not support
determining executable automatically.  Try using the "file" command.
0x0000000000001000 in ?? ()
(gdb) continue
Program received signal SIGINT, Interrupt.
0x0000000080000536 in ?? ()

What's in memory? What's it doing?

(gdb) x/32xw 0x1000
0x1000:	0x00000297	0x02828613	0xf1402573	0x0202b583
0x1010:	0x0182b283	0x00028067	0x80000000	0x00000000
0x1020:	0x87e00000	0x00000000	0x4942534f	0x00000000
0x1030:	0x00000002	0x00000000	0x00000000	0x00000000
0x1040:	0x00000001	0x00000000	0x00000000	0x00000000
0x1050:	0x00000000	0x00000000	0x00000000	0x00000000
0x1060:	0x00000000	0x00000000	0x00000000	0x00000000
0x1070:	0x00000000	0x00000000	0x00000000	0x00000000
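
With hindsight, the first six words of that dump look like instructions and the rest like data. The Python sketch below is a hand-rolled field decoder, not a real disassembler: it only knows the five RV64I encodings that occur here, and the names decode, sign_extend and ABI are made up for illustration.

```python
# Register names x0..x31 in the standard RISC-V ABI naming.
ABI = ["zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2",
       "s0", "s1", "a0", "a1", "a2", "a3", "a4", "a5",
       "a6", "a7", "s2", "s3", "s4", "s5", "s6", "s7",
       "s8", "s9", "s10", "s11", "t3", "t4", "t5", "t6"]


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    mask = 1 << (bits - 1)
    return (value ^ mask) - mask


def decode(word):
    """Decode one 32-bit word; anything unrecognized is shown as raw data."""
    opcode = word & 0x7F
    rd = (word >> 7) & 0x1F
    funct3 = (word >> 12) & 0x7
    rs1 = (word >> 15) & 0x1F
    imm_i = sign_extend(word >> 20, 12)  # I-type immediate, bits 31:20
    if opcode == 0x17:                   # AUIPC (U-type)
        return f"auipc {ABI[rd]}, {word >> 12:#x}"
    if opcode == 0x13 and funct3 == 0:   # ADDI
        return f"addi {ABI[rd]}, {ABI[rs1]}, {imm_i}"
    if opcode == 0x03 and funct3 == 3:   # LD (64-bit load)
        return f"ld {ABI[rd]}, {imm_i}({ABI[rs1]})"
    if opcode == 0x67 and funct3 == 0:   # JALR
        return f"jalr {ABI[rd]}, {imm_i}({ABI[rs1]})"
    if opcode == 0x73 and funct3 == 2:   # CSRRS, CSR number in bits 31:20
        return f"csrrs {ABI[rd]}, {word >> 20:#x}, {ABI[rs1]}"
    return f".word {word:#010x}"


# The first six words from the x/32xw dump at 0x1000.
words = [0x00000297, 0x02828613, 0xF1402573, 0x0202B583, 0x0182B283, 0x00028067]
for offset, word in enumerate(words):
    print(f"{0x1000 + 4 * offset:#x}: {decode(word)}")
```

Read that way, the stub sets t0 to 0x1000, points a2 at 0x1028, reads the hart ID into a0 (CSR 0xf14 is mhartid), loads a1 from 0x1020 (0x87e00000) and t0 from 0x1018 (0x80000000), then jumps to t0, i.e. to the start of RAM. The word 0x4942534f at 0x1028 spells "OSBI" when read as little-endian ASCII, which suggests that block is meant for OpenSBI; treat that last part as a guess.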

At the time, I didn't know what to make of this. I also realized that I had no idea about the VirtIO machine, its devices and memory mapping. In the QEMU documentation I could only find a page for the ARM virt machine, with a listing of supported devices and some valuable bare-metal programming information:

Hardware configuration information for bare-metal programming

The virt board automatically generates a device tree blob (“dtb”) which it passes to the guest. This provides information about the addresses, interrupt lines and other configuration of the various devices in the system. Guest code can rely on and hard-code the following addresses:

All other information about device locations may change between QEMU versions, so guest code must look in the DTB.

QEMU supports two types of guest image boot for virt, and the way for the guest code to locate the dtb binary differs:

I wasn't passing an ELF file, but a raw binary. Was I using the Linux kernel boot protocol? Is the start of memory the same on the RISC-V virt machine? Where can I find the device tree? A web search brought me to Tyler Wilcock's blog, mentioning the -machine dumpdtb=<file> option, which wasn't documented in the QEMU manual. (How does one figure this out? Searching for this option, I only found it mentioned in release notes and a mailing list posting.)

$ qemu-system-riscv64 -machine virt,dumpdtb=qemu-riscv64-virt.dtb
$ dtc qemu-riscv64-virt.dtb > qemu-riscv64-virt.dts

The part of the device tree I was interested in at first was:

    memory@80000000 {
        device_type = "memory";
        reg = <0x00 0x80000000 0x00 0x8000000>;
    };

As Tyler explains, this means the memory ranges from 0x8000_0000 to 0x8800_0000. It was time to write a program.
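
The arithmetic behind that range can be sketched in Python. This assumes the node's parent declares #address-cells = <2> and #size-cells = <2>, which is what the four cells in reg suggest but is not shown in the excerpt; decode_reg is a hypothetical helper name.

```python
def decode_reg(cells, address_cells=2, size_cells=2):
    """Combine 32-bit device-tree cells into a (base, size) pair.

    The cell counts are an assumption here; in a real device tree they
    come from the parent's #address-cells and #size-cells properties.
    """
    def join(parts):
        value = 0
        for cell in parts:
            value = (value << 32) | cell  # cells are stored most significant first
        return value

    base = join(cells[:address_cells])
    size = join(cells[address_cells:address_cells + size_cells])
    return base, size


# reg = <0x00 0x80000000 0x00 0x8000000> from the memory@80000000 node
base, size = decode_reg([0x00, 0x80000000, 0x00, 0x8000000])
print(f"RAM: {base:#x} .. {base + size:#x} ({size // 2**20} MiB)")
```

The end address is exclusive: the last valid byte is at 0x87ff_ffff.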

RISC-V assembly

I couldn't get my hands on a copy of The RISC-V Reader, but searching Hacker News I found some insightful lecture notes by Stephen Marz (they also write a series on RISC-V and Rust), and finally the brief RISC-V Assembly Programmer's Manual on GitHub.

I started with an infinite loop:

.section .init
.globl _start
_start:
    j _start

Which I assembled and converted to binary with:

$ riscv64-none-elf-as loop.s -g -o loop.elf
$ riscv64-none-elf-objcopy -O binary loop.elf loop.img

But after launching QEMU and attaching GDB, the TUI only gave me [ No Source Available ].

$ qemu-system-riscv64 -machine virt -kernel loop.img -s -S &
$ riscv64-none-elf-gdb loop.elf -tui
Reading symbols from src/loop.elf...
(gdb) target remote :1234
Remote debugging using :1234
0x0000000000001000 in ?? ()
(gdb) continue

Program received signal SIGINT, Interrupt.
0x0000000080200000 in ?? ()
(gdb) continue

Program received signal SIGINT, Interrupt.
0x0000000080200000 in ?? ()
(gdb) x/30xw 0x801ffff0
0x801ffff0:     0x00000000      0x00000000      0x00000000      0x00000000
0x80200000:     0x0000006f      0x00000000      0x00000000      0x00000000
0x80200010:     0x00000000      0x00000000      0x00000000      0x00000000
0x80200020:     0x00000000      0x00000000      0x00000000      0x00000000
0x80200030:     0x00000000      0x00000000      0x00000000      0x00000000
0x80200040:     0x00000000      0x00000000      0x00000000      0x00000000
0x80200050:     0x00000000      0x00000000      0x00000000      0x00000000
0x80200060:     0x00000000      0x00000000

I thought that the loop seemed to work, because the address remained constant at 0x8020_0000, an offset of 0x20_0000 or 2 MiB from the start of memory at 0x8000_0000. There, I found a lone 0x0000006f, and loop.img is 0x6f 0x00 0x00 0x00. That matches if the word is stored in little-endian byte order, which is what I'd expect on RISC-V, but I'm not fully confident about that. Anyway, I wanted to get debugging to work.
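
Both observations can be checked by hand. The Python sketch below packs the word in little-endian order and then unpicks the J-type immediate; decode_jal is a hypothetical helper that handles only the JAL opcode and nothing else.

```python
import struct

word = 0x0000006F

# Little-endian: the least significant byte comes first in memory (and in loop.img).
raw = struct.pack("<I", word)
print(raw.hex(" "))


def decode_jal(word):
    """Return (rd, byte_offset) for a JAL instruction, or None for anything else."""
    if word & 0x7F != 0x6F:
        return None
    rd = (word >> 7) & 0x1F
    imm = (((word >> 31) & 0x1) << 20      # imm[20]
           | ((word >> 12) & 0xFF) << 12   # imm[19:12]
           | ((word >> 20) & 0x1) << 11    # imm[11]
           | ((word >> 21) & 0x3FF) << 1)  # imm[10:1]
    if imm & (1 << 20):                    # sign bit set: negative offset
        imm -= 1 << 21
    return rd, imm


rd, offset = decode_jal(word)
print(f"jal x{rd}, {offset}")

# The load offset in mebibytes.
print((0x80200000 - 0x80000000) // 2**20, "MiB")
```

A jump with destination register x0 and offset 0 discards the return address and lands on itself, which is what `j _start` should assemble to when _start is the jump's own address.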

I tried linking the object file beforehand:

$ riscv64-none-elf-ld -o loop.linked.elf loop.elf
$ riscv64-none-elf-objcopy -O binary loop.linked.elf loop.linked.img
$ diff loop.img loop.linked.img

Unsurprisingly, the binaries were identical. But in the new object file, the .init section had a VMA of 0x10078. Why is that? GDB now displayed the source code up until I attached it to QEMU. It dawned on me that 0x10078 was outside of memory, and that there was no way ld could know where the memory was. After skimming the ld manual, I made it dump its default linker script.

$ riscv64-none-elf-ld --verbose > qemu-riscv64-virt.ld

And edited it to specify the location of memory, inserting the following region definition inside a MEMORY command:

  ram (rwxai) : ORIGIN = 0x80000000, LENGTH = 0x8000000

I linked anew.

$ riscv64-none-elf-ld -T qemu-riscv64-virt.ld -o loop.linked.elf loop.elf

But stuff still didn't work. Now, GDB would show the assembly code even once attached, but the memory contents at 0x80000000 clearly weren't the assembled loop.s. At that point I understood (I should have understood this way sooner, given the quote from the manual above) that I had to give QEMU the ELF file, not the binary. To which QEMU replied:

rom: requested regions overlap (rom phdr #0: src/loop.linked.elf. free=0x000000008000e240, addr=0x0000000080000000)
qemu-system-riscv64: rom check and register reset failed

This was because the OpenSBI firmware claimed the start of memory. I should have known, because it had printed the following ASCII art (to the serial console, I guess?):

OpenSBI v0.7
   ____                    _____ ____ _____
  / __ \                  / ____|  _ \_   _|
 | |  | |_ __   ___ _ __ | (___ | |_) || |
 | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
 | |__| | |_) |  __/ | | |____) | |_) || |_
  \____/| .__/ \___|_| |_|_____/|____/_____|
        | |

Platform Name          : QEMU Virt Machine
Platform HART Features : RV64ACDFIMSU
Current Hart           : 0
Firmware Base          : 0x80000000
Firmware Size          : 128 KB
Runtime SBI Version    : 0.2

MIDELEG : 0x0000000000000222
MEDELEG : 0x000000000000b109
PMP0    : 0x0000000080000000-0x000000008001ffff (A)
PMP1    : 0x0000000000000000-0xffffffffffffffff (A,R,W,X)
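
The clash can be reproduced with plain interval arithmetic. This Python sketch takes the firmware base and size from the banner above and assumes the linked ELF consists of a single 4-byte instruction at 0x8000_0000; it is not how QEMU itself performs the check, and overlaps is a hypothetical helper.

```python
def overlaps(a_start, a_size, b_start, b_size):
    """True if the half-open ranges [start, start + size) share at least one byte."""
    return a_start < b_start + b_size and b_start < a_start + a_size


FIRMWARE_BASE = 0x80000000   # "Firmware Base" in the OpenSBI banner
FIRMWARE_SIZE = 128 * 1024   # "Firmware Size : 128 KB"
KERNEL_SIZE = 4              # assumption: loop.s is one 4-byte instruction

# Where my linker script put _start: right on top of the firmware.
print(overlaps(FIRMWARE_BASE, FIRMWARE_SIZE, 0x80000000, KERNEL_SIZE))

# Where QEMU had placed the raw image earlier: clear of the firmware.
print(overlaps(FIRMWARE_BASE, FIRMWARE_SIZE, 0x80200000, KERNEL_SIZE))

print(f"firmware ends at {FIRMWARE_BASE + FIRMWARE_SIZE:#x}")
```

So there seem to be two ways out: link the program to an address past the firmware, or drop the firmware altogether.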

Okay, so there is firmware. Do I need it? What if I just disabled it?

$ qemu-system-riscv64 -machine virt -s -S -nographic -kernel loop.linked.elf -bios none
$ riscv64-none-elf-gdb loop.linked.elf  # In another shell.
Reading symbols from loop.linked.elf...
(gdb) target remote :1234
Remote debugging using :1234
0x0000000000001000 in ?? ()
(gdb) break _start
Breakpoint 1 at 0x80000000: file loop.s, line 4.
(gdb) continue
Breakpoint 1, _start () at loop.s:4
4	  j _start
(gdb) step
Breakpoint 1, _start () at loop.s:4
4	  j _start
(gdb) step
Breakpoint 1, _start () at loop.s:4
4	  j _start

It worked! My instructions were sitting nicely at 0x8000_0000, the loop worked, and GDB knew our location in the source file. But was I wrong in disabling the firmware? According to its own output, the firmware ends 128 KiB after the start of memory, i.e. at 0x8002_0000. But what does it even do? Change modes? Initialize "hardware"? According to the project README.md:

The RISC-V Supervisor Binary Interface (SBI) is the recommended interface between:

To be honest, I was groping in the dark and pretty glad things finally seemed to work. I didn't understand the SBI, nor was I eager to comply. I was merely poking around and, at this point, fancied a perceivable result. To that end, I needed to access I/O.

UART not alone

I found information about the serial console in qemu-riscv64-virt.dts.

    uart@10000000 {
        interrupts = <0x0a>;
        interrupt-parent = <0x03>;
        clock-frequency = <0x384000>;
        reg = <0x00 0x10000000 0x00 0x100>;
        compatible = "ns16550a";
    };

It was mapped into memory from 0x1000_0000 to 0x1000_0100, and compatible with the "ns16550a". This is apparently a very common model, but I didn't know anything about serial programming. A quick web search brought me to a Wikipedia article, of which the first paragraph is quoted below, and a data sheet by National Semiconductor.

The 16550 UART (universal asynchronous receiver/transmitter) is an integrated circuit designed for implementing the interface for serial communications. The corrected -A version was released in 1987 by National Semiconductor. It is frequently used to implement the serial port for IBM PC compatible personal computers, where it is often connected to an RS-232 interface for modems, serial mice, printers, and similar peripherals. It was the first serial chip used in the IBM PS/2 line, which were introduced in 1987.

And further down the article, below a list of features:

Both the computer hardware and software interface of the 16550 are backward compatible with the earlier 8250 UART and 16450 UART. The current version (since 1995) by Texas Instruments which bought National Semiconductor is called the 16550D.

Indeed, the data sheet for the NS16550A referred to its predecessor's documentation:

The reader is assumed to be familiar with the standard features of the NS16450, so this paper will concentrate mainly on the new features of the NS16550A. If the reader is unfamiliar with these UARTs it is advisable to start by reading their data sheets.

I could only find a scan of those on a not-too-official-looking website. Nevertheless, section 8 contained a tabular summary of the (byte-sized) registers. Instinctively, I tried writing a byte (0x48 for the "H" of "Hello, world!") to the THR, the transmitter holding register, located conveniently at offset 0.

.section .init
.globl _start
_start:
    li s1, 0x10000000 # s1 := 0x1000_0000
    li s2, 0x48       # s2 := 0x48
    sb s2, 0(s1)      # (s1) := s2

And it worked! H was printed to the console! Next, I had to scale this to the 14 characters of Hello, world!\n.

.section .init
.global _start
_start:
    li s1, 0x10000000 # s1 := 0x1000_0000
    la s2, message    # s2 := <message>
    addi s3, s2, 14   # s3 := s2 + 14
1:
    lb s4, 0(s2)      # s4 := (s2)
    sb s4, 0(s1)      # (s1) := s4
    addi s2, s2, 1    # s2 := s2 + 1
    blt s2, s3, 1b    # if s2 < s3, branch back to 1

.section .data
message:
  .string "Hello, world!\n"

After assembling and linking:

$ qemu-system-riscv64 -machine virt -bios none -kernel hello.linked.elf -nographic
Hello, world!
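
To trace what the loop does register by register, here is a small Python model of it. It is only a sketch: MESSAGE_ADDR is a hypothetical address (the real one is chosen by the linker), memory is a plain dictionary, and a store to the THR is modelled as appending a byte to a list.

```python
UART_THR = 0x10000000      # from the uart@10000000 node
MESSAGE_ADDR = 0x80001000  # hypothetical; the real address is chosen by the linker


def run_hello(memory, message_addr=MESSAGE_ADDR, length=14):
    """Step through the loop once per byte and return what reached the UART."""
    uart_output = []
    s1 = UART_THR                 # li   s1, 0x10000000
    s2 = message_addr             # la   s2, message
    s3 = s2 + length              # addi s3, s2, 14
    while True:                   # 1:
        s4 = memory[s2]           # lb   s4, 0(s2)
        uart_output.append(s4)    # sb   s4, 0(s1)  (the store to s1 transmits the byte)
        s2 += 1                   # addi s2, s2, 1
        if not s2 < s3:           # blt  s2, s3, 1b
            break
    return bytes(uart_output).decode("ascii")


# .string appends a terminating NUL, so the data section holds 15 bytes,
# of which the loop sends the first 14.
data = b"Hello, world!\n\0"
memory = {MESSAGE_ADDR + i: byte for i, byte in enumerate(data)}
printed = run_hello(memory)
print(repr(printed))
```

The model stops where the assembly has no more instructions. As far as I can tell, the real hart just keeps fetching whatever follows the branch, so a final `j .` would be a sensible addition to the program.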

Phew! That was fun, but I'm far from a robust serial driver. Back to reading Tanenbaum's Operating Systems: Design and Implementation. (I uploaded the assembly program, linker script, shell.nix and a Makefile to a repository on sourcehut.)