Running OpenGL Shaders on the Raspberry Pi

Mon Mar 08 2021

A shader running on the Raspberry Pi
A shader from Shadertoy running fullscreen on the Raspberry Pi official touchscreen

TL;DR: If you just want to try it, you can head to the project repository.

As far as I can remember, I’ve always felt attracted to computer graphics. I guess I approach the field from the angle of my Mathematics background, as I see it as a universal language between machines, arts, and possibly, nature [1].

[1] “How can it be that mathematics, being after all a product of human thought which is independent of experience, is so admirably appropriate to the objects of reality?” – Albert Einstein.

As a projection of that language, I find the OpenGL Shading Language an interesting case, at the intersection of computing and visual arts, that’s concise and expressive enough to demonstrate human ingenuity and creativity. You can find me browsing shadertoy.com [2] for hours, mesmerized by the visuals.

[2] Shadertoy was created by Iñigo Quilez, who publishes very interesting articles, tutorials, and other awesome resources at iquilezles.org.

Since I built my Kubernetes cluster with Raspberry Pis a year ago, I’ve kept, somewhere in a corner of my head, the idea that I could use the touchscreen monitor to play with OpenGL, making the coolest cluster ever to begin with 😎, and eventually turning it into a small gaming device 🕹👾 [3].

[3] As crazy as it may sound, it’s possible to develop games in GLSL, for example Arkanoid, “Pacman”, and Space Invaders.

So I’ve accepted the mission to run shaders from Shadertoy, on the Raspberry Pi!

The Linux Graphics Stack

I use a Raspberry Pi 4 as my cluster’s main node, which is connected to the touchscreen monitor. It runs the Lite version of Raspberry Pi OS, which means no windowing system, such as X11, is available.

I also have a few Raspberry Pi 3s, so I wanted a solution that would work on these as well, and possibly on any other Linux device with GPU hardware.

The following diagram gives a good understanding of the Linux graphics stack:

[Diagram: the Linux graphics stack, from applications and toolkits, through the Mesa rendering APIs (OpenGL, OpenGL ES, OpenVG, EGL) and libGBM, the X.Org and Wayland display servers, and libDRM, down to the kernel DRM and KMS sub-systems, and the CPU and GPU hardware]
Illustration of the Linux graphics stack
(by Shmuel Csaba Otto Traian, CC-BY-SA 4.0)

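The requirement to run without any windowing system, like X or Wayland, implies relying on either:

- the proprietary, vendor-specific drivers and graphics libraries, or
- the open-source Mesa drivers, on top of the kernel DRM/KMS sub-system.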

While the latter option requires more development work, it promises to work across the range of GPUs that have a Mesa driver available. It also brings the expected benefits of the open-source model, with a community and freely accessible documentation.

Now that the Linux stack is clearer, let’s continue on that mission, and find out what drivers exist for the Raspberry Pi …

The Raspberry Pi

The Raspberry Pi 3 Broadcom BCM2837 SoC includes the VideoCore IV GPU, which could initially only be used with the corresponding proprietary driver and a closed-source implementation of the graphics libraries. While some portions of that stack were released as open source in 2012, most of the work is still done in the closed-source runtime libraries and GPU code, as depicted in this diagram:

[Diagram: applications, the OpenGL ES, OpenVG, EGL and OpenMAX libraries, the kernel driver, and the VideoCore IV GPU, each marked as open source, closed source, or binary blob]
The VideoCore IV GPU driver stack
(CC BY-SA 3.0)

The C header files and libraries for these Broadcom specific implementations are located in the /opt/vc/include and /opt/vc/lib directories [5].

[5] These header files and libraries can be found at https://github.com/raspberrypi/firmware.

In 2014, Broadcom and the Raspberry Pi Foundation announced the documentation release for the VideoCore IV 3D graphics processor [6], as well as the source release of the graphics stack under a BSD license [7]. A few months after the announcement, the source code of a Gallium-based Mesa OpenGL driver for the Broadcom SoC GPU, written from scratch by Eric Anholt, was committed to the Mesa project [8]. This paved the way towards open-source drivers for the Raspberry Pi GPUs.

[6] The Architecture Reference Guide for the Broadcom VideoCore IV GPU is available at https://docs.broadcom.com/docs/12358545.
[7] The source code for the userland libraries can be found at https://github.com/raspberrypi/userland.
[8] The source code for the VC4 driver can be found in the src/gallium/drivers/vc4 directory of the Mesa repository.

The Raspberry Pi 4 Broadcom BCM2711 SoC (formerly BCM2838) now includes the VideoCore VI GPU, which is only supported by a Mesa driver [9]. The original Broadcom proprietary driver, specifically designed for the BCM2837 SoC GPU, does not work on the Raspberry Pi 4. This Mesa V3D (VideoCore VI) driver conforms to OpenGL ES 3.1 (as of March 2021), while the VideoCore VI GPU is capable of OpenGL ES 3.2.

[9] The source code for the V3D driver can be found in the src/gallium/drivers/v3d directory of the Mesa repository.

After this research phase, and its few historical findings, I’m convinced the way forward to succeed in my mission, is to rely on these open-source drivers …

The Programming

With these drivers, running OpenGL or OpenGL ES without X11 is possible using the DRM/KMS Linux kernel sub-system [10], in combination with the Mesa Generic Buffer Management (GBM) library.

[10] The Linux GPU Driver Developer’s Guide provides extensive documentation of the DRM/KMS sub-system.

Luckily, I stumbled upon kmscube, an example application written in C, which demonstrates how to use the KMS/GBM/EGL APIs to drive bare metal graphics, and provides an implementation of the mode-setting and page-flipping operations.

The basic idea is to use two triangles, covering the entire screen, that are rasterized by sampling the shader for every pixel. So that left me with:

1) Loading a copy of the Shadertoy shader from the file system:

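#define _GNU_SOURCE // for asprintf
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// The template used to input uniforms that are automatically added by Shadertoy,
// and to call the Shadertoy shader main method entrypoint.
// The shader source is inserted with "%.*s", as the mapped file is not NUL-terminated.
static const char *shadertoy_fs_tmpl =
	"precision mediump float;                                             \n"
	"uniform vec3      iResolution; // viewport resolution (in pixels)    \n"
	"uniform float     iTime;       // shader playback time (in seconds)  \n"
	"uniform int       iFrame;      // current frame number               \n"
	"                                                                     \n"
	"%.*s                                                                 \n"
	"                                                                     \n"
	"void main()                                                          \n"
	"{                                                                    \n"
	"    mainImage(gl_FragColor, gl_FragCoord.xy);                        \n"
	"}                                                                    \n";

// Creates the fragment shader from a local copy of the Shadertoy shader
static char *load_shader(const char *file) {
	struct stat statbuf;
	char *frag;
	int fd, ret;

	fd = open(file, O_RDONLY);
	if (fd < 0) {
		err(EXIT_FAILURE, "could not open '%s'", file);
	}

	ret = fstat(fd, &statbuf);
	if (ret < 0) {
		err(EXIT_FAILURE, "could not stat '%s'", file);
	}

	const char *text = mmap(NULL, statbuf.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	if (text == MAP_FAILED) {
		err(EXIT_FAILURE, "could not map '%s'", file);
	}

	// Copies the shader source into the template, bounded by the file size
	ret = asprintf(&frag, shadertoy_fs_tmpl, (int) statbuf.st_size, text);
	if (ret < 0) {
		errx(EXIT_FAILURE, "could not allocate the shader source for '%s'", file);
	}

	// The formatted copy is all that's needed from here
	munmap((void *) text, statbuf.st_size);
	close(fd);

	return frag;
}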

2) Compiling the shaders on the GPU, creating the OpenGL program, linking it, and initializing uniform variables and buffers:

// Holds the playback time and frame number uniform locations
GLint iTime, iFrame;

// The triangles, to be rasterized by sampling the shader for every pixel of the screen.
// Quads are not supported by OpenGL ES, so we have to use two triangles.
static const GLfloat vertices[] = {
	// First triangle:
	 1.0f,  1.0f,
	-1.0f,  1.0f,
	-1.0f, -1.0f,
	// Second triangle:
	-1.0f, -1.0f,
	 1.0f, -1.0f,
	 1.0f,  1.0f,
};

// The vertex shader, responsible for positioning the geometry.
// We simply need the identity in our case.
static const char *shadertoy_vs =
	"attribute vec3 position;                \n"
	"void main()                             \n"
	"{                                       \n"
	"    gl_Position = vec4(position, 1.0);  \n"
	"}                                       \n";

int init_shadertoy(const struct gbm *gbm, struct egl *egl, const char *file) {
	int ret;
	char *shadertoy_fs;
	GLuint program, vbo;
	GLint iResolution;

	// Loads the Shadertoy shader from the file system, and creates the fragment shader
	shadertoy_fs = load_shader(file);
	// Compiles the fragment and vertex shaders, and attaches them to the returned program.
	// create_program is assumed to return a negative value on failure.
	ret = create_program(shadertoy_vs, shadertoy_fs);
	if (ret < 0) {
		printf("failed to create program\n");
		return -1;
	}
	program = ret;
	// Binds the position attribute to index 0, which glVertexAttribPointer uses below
	glBindAttribLocation(program, 0, "position");
	// Links the program
	ret = link_program(program);
	if (ret) {
		printf("failed to link program\n");
		return -1;
	}

	// Matches the viewport width and height to the screen resolution
	glViewport(0, 0, gbm->width, gbm->height);
	glUseProgram(program);
	// Initializes the uniform variables
	iTime = glGetUniformLocation(program, "iTime");
	iFrame = glGetUniformLocation(program, "iFrame");
	iResolution = glGetUniformLocation(program, "iResolution");
	glUniform3f(iResolution, gbm->width, gbm->height, 0);
	// Initializes the vertices buffer that holds the triangles data
	glGenBuffers(1, &vbo);
	glBindBuffer(GL_ARRAY_BUFFER, vbo);
	glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), 0, GL_STATIC_DRAW);
	glBufferSubData(GL_ARRAY_BUFFER, 0, sizeof(vertices), &vertices[0]);
	glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, 0, (const GLvoid *) (intptr_t) 0);
	glEnableVertexAttribArray(0);

	// Provides the rendering method to be called for each frame
	egl->draw = draw_shadertoy;

	return 0;
}

3) Finally, rasterizing the triangles, so that every pixel from the screen gets sampled from the fragment shader:

// This is called to render every frame
static void draw_shadertoy(uint64_t start_time, unsigned frame) {
	// Sets the playback time uniform in seconds
	glUniform1f(iTime, (get_time_ns() - start_time) / (double) NSEC_PER_SEC);
	// Sets the current frame number uniform.
	// iFrame is declared as a signed int in the shader, hence glUniform1i.
	glUniform1i(iFrame, (GLint) frame);
	// Renders the triangles
	glDrawArrays(GL_TRIANGLES, 0, 6);
}
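
The get_time_ns helper and the NSEC_PER_SEC constant are not shown above. As a minimal sketch, assuming they are built on the POSIX monotonic clock, they could look like the following. The playback_time function is a hypothetical name, introduced here only to isolate the conversion to seconds that feeds the iTime uniform:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ull

// Returns a monotonic timestamp in nanoseconds.
// Assumed implementation: the actual helper may differ.
uint64_t get_time_ns(void) {
	struct timespec tv;
	int ret = clock_gettime(CLOCK_MONOTONIC, &tv);
	assert(ret == 0);
	(void) ret;
	return (uint64_t) tv.tv_sec * NSEC_PER_SEC + (uint64_t) tv.tv_nsec;
}

// Converts the time elapsed between two timestamps into seconds (hypothetical helper).
float playback_time(uint64_t start_time, uint64_t now) {
	return (float) ((now - start_time) / (double) NSEC_PER_SEC);
}
```

A monotonic clock is preferable here to the wall clock, so that the playback time never jumps backwards if the system time gets adjusted while the shader is running.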

The complete source code is available at https://github.com/astefanutti/kms-glsl.
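
To make the basic idea more tangible, here is a rough CPU-side sketch of what rasterizing the two full-screen triangles amounts to: the shader entrypoint gets evaluated once per pixel. The vec4 type and the main_image, to_byte and render_frame functions are hypothetical stand-ins written for illustration, not part of the project, and the gradient is an arbitrary example shader:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// RGBA color with components in [0, 1], mirroring the GLSL vec4 type
typedef struct { float r, g, b, a; } vec4;

// Hypothetical CPU counterpart of a Shadertoy mainImage entrypoint:
// colors each pixel from its coordinates, normalized by the resolution.
vec4 main_image(float frag_x, float frag_y, float res_x, float res_y) {
	vec4 color = { frag_x / res_x, frag_y / res_y, 0.5f, 1.0f };
	return color;
}

// Converts a [0, 1] color component into an 8-bit channel
static uint8_t to_byte(float v) {
	if (v < 0.0f) v = 0.0f;
	if (v > 1.0f) v = 1.0f;
	return (uint8_t) (v * 255.0f + 0.5f);
}

// Emulates the outcome of rasterizing the two full-screen triangles:
// the entrypoint is evaluated once for every pixel of the frame.
// Row 0 is the bottom row, following the gl_FragCoord convention.
void render_frame(uint8_t *rgba, size_t width, size_t height) {
	assert(rgba != NULL);
	for (size_t y = 0; y < height; y++) {
		for (size_t x = 0; x < width; x++) {
			// gl_FragCoord refers to pixel centers, hence the 0.5 offset
			vec4 c = main_image(x + 0.5f, y + 0.5f, (float) width, (float) height);
			uint8_t *px = rgba + 4 * (y * width + x);
			px[0] = to_byte(c.r);
			px[1] = to_byte(c.g);
			px[2] = to_byte(c.b);
			px[3] = to_byte(c.a);
		}
	}
}
```

On the GPU, the two nested loops disappear: the rasterizer generates one fragment per covered pixel, and the fragment shader invocations run in parallel, which is what makes Shadertoy-style shaders practical in real time.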

The Fun

Examples
A selection of shaders from Shadertoy that run successfully on the Raspberry Pi.
You can find copies of these in the examples directory of the project repository.

I’ve successfully run shaders on the RPi 3B+ and RPi 4, with Raspberry Pi OS Lite 2020-12-02, Linux kernel 5.4.79.

You can run the following instructions to build the CLI binary:

$ sudo apt update
# Install the build tools
$ sudo apt install gcc make
# Install the required DRM, GBM, EGL and OpenGL ES API headers
$ sudo apt install libdrm-dev libgbm-dev libegl-dev libgles2-mesa-dev
# Clone the repository
$ git clone https://github.com/astefanutti/kms-glsl.git && cd kms-glsl
# Build the glsl CLI binary
$ make

The VC4/V3D driver kernel module must be activated. Assuming you’ve installed Raspberry Pi OS, this can be achieved with the following steps:

1) Edit the /boot/config.txt file, e.g.:

$ sudo vi /boot/config.txt

2) Set the following properties:

# Required: Enable the firmware/fake DRM/KMS VC4/V3D driver
dtoverlay=vc4-fkms-v3d
# Optional: Increase the memory reserved for the GPU
#           16MB disables certain GPU features
gpu_mem=64
# Optional: Avoid GPU down-clocking below 500 MHz that slows FPS down
#           Should be set to 250 on the RPi 3
v3d_freq_min=500

3) Reboot your Raspberry Pi, so that the changes are taken into account, e.g.:

$ sudo reboot

You can then run shaders from the examples directory, e.g.:

$ ./glsl examples/stripey_torus_interior.glsl

OpenGL ES 2.x information:
  version: "OpenGL ES 3.1 Mesa 19.3.2"
  shading language version: "OpenGL ES GLSL ES 3.10"
  vendor: "Broadcom"
  renderer: "V3D 4.2"

Check that renderer: "V3D 4.2" is present in the console output, to confirm it’s set up correctly.

Soon after I started testing, I realized the framerate was fluctuating, and the V3D GPU frequency was dropping well below 500 MHz. This can be observed by running the following command from a separate terminal:

$ watch -n 1 vcgencmd measure_clock v3d

This issue has been reported in raspberrypi/linux#3935. It seems the default governor scales the GPU frequency down, despite instructions being scheduled into the GPU workload queue. A solution to prevent GPU frequency down-scaling is to set the minimum frequency, by adding v3d_freq_min=500 to the /boot/config.txt file [11].

[11] This is not optimal, as it increases power consumption while the GPU is idle.

This can also be used to overclock the GPU. I’ve successfully tested overclocking the V3D GPU to 600 MHz, which results in a noticeable FPS improvement.
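
If you’d rather detect down-clocking programmatically than watch a terminal, the output of vcgencmd measure_clock v3d could be parsed. The sketch below assumes the command prints a single line of the form frequency(46)=500000000, with the frequency in Hz, which may vary across firmware versions. The parse_clock_mhz and is_down_clocked functions are hypothetical names:

```c
#include <assert.h>
#include <stdio.h>

// Parses a line such as "frequency(46)=500000000" (assumed output format),
// and returns the frequency in MHz, or -1 if the line cannot be parsed.
long parse_clock_mhz(const char *line) {
	int id;
	long hz;
	assert(line != NULL);
	if (sscanf(line, "frequency(%d)=%ld", &id, &hz) != 2 || hz < 0) {
		return -1;
	}
	return hz / 1000000;
}

// Returns 1 if the measured frequency is below the expected minimum, 0 otherwise.
// Lines that cannot be parsed are not reported as down-clocked.
int is_down_clocked(const char *line, long min_mhz) {
	long mhz = parse_clock_mhz(line);
	return mhz >= 0 && mhz < min_mhz;
}
```

The line itself could be obtained by running the command with popen and reading its output with fgets, then passing it to is_down_clocked with the value set for v3d_freq_min.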

The Future

The vc4-fkms-v3d driver is known as the fake/firmware DRM/KMS driver, where the kernel driver still delegates the interactions with the display controller to the firmware. A newer vc4-kms-v3d driver, known as the full DRM/KMS driver, is now available, where the kernel drives the display controller directly.

I gave it a try, after an upgrade to the latest kernel version available (as of March 2021):

$ sudo apt full-upgrade
$ uname -a
Linux master 5.10.17-v7l+ #1403 SMP Mon Feb 22 11:33:35 GMT 2021 armv7l GNU/Linux

Unfortunately, I faced the issue reported in raspberrypi/linux#4020. I plan to try it again, once it’s fixed. It seems it may be possible to use it in combination with the touchscreen driver for the DSI display. So it could possibly enable interactivity, by feeding the mouse uniform, with touchscreen events from tslib.
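
I haven’t implemented this yet, but the mapping from touch events to the mouse uniform could be sketched as follows. It assumes touch positions are reported in pixels with the origin at the top-left corner, whereas shader coordinates have their origin at the bottom-left. The touch_sample struct is a hypothetical type, loosely modeled after the position and pressure that tslib reports, and the handling of the z and w components is a simplification that only records where the touch started, leaving out Shadertoy’s sign conventions:

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical touch sample: pixel position from the top-left corner,
// and a pressure that is zero when the finger is lifted.
struct touch_sample { int x, y; unsigned pressure; };

// Values for a Shadertoy-style mouse uniform, plus the touch state.
// x, y: current position; z, w: position where the touch started (simplified).
struct mouse_uniform { float x, y, z, w; int pressed; };

// Updates the mouse uniform values from a touch sample (hypothetical helper)
void update_mouse(struct mouse_uniform *m, const struct touch_sample *t, int height) {
	assert(m != NULL && t != NULL && height > 0);
	if (t->pressure == 0) {
		// Finger lifted: keep the last known position
		m->pressed = 0;
		return;
	}
	float x = (float) t->x;
	// Flips the vertical axis, as shader coordinates start from the bottom
	float y = (float) (height - 1 - t->y);
	if (!m->pressed) {
		// The touch just started: record its origin
		m->z = x;
		m->w = y;
		m->pressed = 1;
	}
	m->x = x;
	m->y = y;
}
```

The resulting values would then be uploaded before each frame, for example with glUniform4f, to a vec4 uniform added to the fragment shader template.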

There are also a few things that, I think, would be logical additions:

What started as a toy project, to end the year 2020 light-heartedly, turned out to be a small, yet very rewarding, journey into the world of open-source GPU programming. I think I can safely say it: mission accomplished!

I’d be happy to hear your feedback at https://github.com/astefanutti/kms-glsl!