eBPF’s User Ring Buffer: Introducing BPF_MAP_TYPE_USER_RINGBUF

TJ. Podobnik, @dorkamotorka
Published in Level Up Coding
5 min read · Apr 26, 2024


Lately, I’ve been spending a good amount of time tinkering with eBPF, trying out various features to understand how flexible and efficient it is. In this post, I want to shed some light on one BPF map type in particular: the User Ring Buffer. eBPF has two kinds of ring buffer, the kernel ring buffer and the user ring buffer, and together they allow efficient data exchange between user mode and kernel mode. I recently experimented with the user ring buffer, a relatively new feature, and couldn’t find many examples or much documentation on it, so I decided to share my findings here.

User vs. Kernel Ring Buffer

You may have encountered the Ring Buffer or its predecessor, the Perf Buffer. Both carry data from the kernel to user space. The User Ring Buffer differs in that it carries data in the opposite direction, from user space to the kernel.

⚠️ Note: This feature isn’t yet implemented in cilium/ebpf, so the example below uses the libbpf C library directly.

The ring buffer data structure is particularly useful for scenarios where the kernel needs to send data to user mode, such as transmitting kernel monitoring events, asynchronous notifications, or status updates. For instance, monitoring the status of numerous network service program ports requires real-time transmission of opening, closing, error, and other status updates to user space for processing. Additionally, the Linux kernel’s logging system and performance analysis tools frequently send large amounts of data to user space for user-friendly display and analysis. In such scenarios, the ring buffer demonstrates high efficiency in transferring data from the kernel to the user.

The User Ring Buffer is a newer map type based on the ring buffer, offering single user-space producer/single kernel consumer semantics. Its advantage lies in supporting asynchronous message passing from user space to the kernel: it minimizes unnecessary synchronization, optimizes user-to-kernel data transfer, and reduces system call overhead. It was introduced in kernel version 6.1, and its current use cases are somewhat limited.
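To make the single-producer/single-consumer idea concrete, here is a minimal toy model in plain C. It is only a sketch: `toy_ring`, `toy_ring_push`, `toy_ring_drain`, `sum_cb` and `toy_demo` are hypothetical names invented for illustration, the code is single-threaded, and it leaves out the memory barriers and shared-memory mapping a real ring buffer needs. It is not the kernel's or libbpf's implementation.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_RING_SIZE 8u                 /* must be a power of two */
#define TOY_RING_MASK (TOY_RING_SIZE - 1u)

/* Hypothetical toy ring: not the kernel's or libbpf's data structure. */
struct toy_ring {
    uint32_t producer_pos;   /* only the producer advances this */
    uint32_t consumer_pos;   /* only the consumer advances this */
    int slots[TOY_RING_SIZE];
};

/* Producer: publish one sample; returns 0, or -1 if the ring is full. */
static int toy_ring_push(struct toy_ring *r, int value)
{
    if (r->producer_pos - r->consumer_pos == TOY_RING_SIZE)
        return -1;
    r->slots[r->producer_pos & TOY_RING_MASK] = value;
    r->producer_pos++;
    return 0;
}

/* Consumer: hand every pending sample to cb; returns how many were drained. */
static long toy_ring_drain(struct toy_ring *r,
                           void (*cb)(int value, void *ctx), void *ctx)
{
    long drained = 0;

    while (r->consumer_pos != r->producer_pos) {
        cb(r->slots[r->consumer_pos & TOY_RING_MASK], ctx);
        r->consumer_pos++;
        drained++;
    }
    return drained;
}

/* Callback that accumulates samples into the caller's context. */
static void sum_cb(int value, void *ctx)
{
    *(long *)ctx += value;
}

/* Push three samples, drain them, and return their sum. */
static long toy_demo(void)
{
    struct toy_ring ring = {0};
    long sum = 0;

    toy_ring_push(&ring, 10);
    toy_ring_push(&ring, 20);
    toy_ring_push(&ring, 12);
    toy_ring_drain(&ring, sum_cb, &sum);
    return sum;
}
```

Because each position is written by exactly one side, neither side needs a lock in this model. The drain-with-callback shape loosely mirrors how the eBPF program consumes samples in the example further down.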

Whether it fits depends heavily on your specific requirements. If you do need it, here’s an example that may not be readily available elsewhere. I had to dive into the Linux kernel code myself to grasp it fully, as examples are scarce, if they exist at all.

  • Kernel Space code
// user_ringbuf.bpf.c
...

struct {
    __uint(type, BPF_MAP_TYPE_USER_RINGBUF);
    __uint(max_entries, 256 * 1024);
} user_ringbuf SEC(".maps");

static long extract_context(struct bpf_dynptr *dynptr, int *context) {
    // user_sample struct should be used for both producing samples and reading
    // them from the user ring buffer (check user space part of the program)
    struct user_sample *sample;

    sample = bpf_dynptr_data(dynptr, 0, sizeof(*sample));
    if (!sample)
        return 0;

    // Store value from the sample to the context variable
    *context = sample->pid;
    bpf_printk("Context value inside callback is: %d", *context);

    // Returning 0 lets the drain continue with the next sample
    return 0;
}

// As an example this could be attached to the TC Egress hook
SEC("tc") int tc_egress(struct __sk_buff *ctx) {
    // Drain samples from user ring buffer.
    // Samples are passed to the callback function "extract_context".
    // "context" variable is there to illustrate how to read
    // sample data back to the eBPF program main thread.
    // It is initialized so it never holds garbage if the callback bails out early.
    int context = 0;
    long ret = bpf_user_ringbuf_drain(&user_ringbuf, extract_context, &context, 0);

    // Return value equals number of drained samples
    if (ret > 0) {
        bpf_printk("Context value outside callback is: %d", context);
    }

    return 0;
}

...
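A note on `max_entries`: as far as I can tell from the kernel source, ring buffer maps are rejected at creation time unless their size is a power of two and a multiple of the page size. Treat that as an assumption to verify against your kernel version. The hypothetical helper below (`ringbuf_size_ok` is not a libbpf function) sketches that check so a bad size can be caught before loading:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if size looks acceptable as a ring buffer max_entries value:
 * a non-zero power of two that is also a multiple of page_size.
 * Hypothetical helper for illustration; the real check lives in the kernel. */
static bool ringbuf_size_ok(size_t size, size_t page_size)
{
    bool power_of_two = size != 0 && (size & (size - 1)) == 0;

    return power_of_two && page_size != 0 && size % page_size == 0;
}
```

With 4 KiB pages, the `256 * 1024` used above passes this check, while a value such as `1000` or `3 * 4096` would not.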
  • User Space Code
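// user_ringbuf.c
// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
/* Copyright (c) 2020 Facebook */
#include <argp.h>
#include <errno.h>
#include <signal.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <sys/resource.h>
#include <bpf/libbpf.h>
#include "user_ringbuf.skel.h"

#define TASK_COMM_LEN 16

/* Sample layout shared with the BPF program, which reads it in its drain callback */
struct user_sample {
    int pid;
    char comm[TASK_COMM_LEN];
};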

static void drain_current_samples(void) {
    printf("Draining current samples...\n");
}

static int write_samples(struct user_ring_buffer *ringbuf) {
    int n, err = 0;
    struct user_sample *entry;

    /* Reserve space for one sample; returns NULL and sets errno on failure */
    entry = user_ring_buffer__reserve(ringbuf, sizeof(*entry));
    if (!entry) {
        err = -errno;
        goto done;
    }

    entry->pid = getpid();
    printf("PID is: %d\n", entry->pid);

    /* Bounded copy of the name into the fixed-size comm field */
    n = snprintf(entry->comm, sizeof(entry->comm), "%s", "hello");
    if (n < 0) {
        /* Formatting failed: drop the reserved sample instead of submitting it */
        err = -EINVAL;
        user_ring_buffer__discard(ringbuf, entry);
        goto done;
    }

    /* Publish the sample so the BPF program can drain it */
    user_ring_buffer__submit(ringbuf, entry);

done:
    drain_current_samples();

    return err;
}

static int libbpf_print_fn(enum libbpf_print_level level, const char *format, va_list args) {
    return vfprintf(stderr, format, args);
}

static volatile bool exiting = false;

static void sig_handler(int sig) {
    exiting = true;
}

struct user_ring_buffer *user_ringbuf = NULL;

/* Called for every record the BPF side emits into kernel_ringbuf.
 * The definition of struct event is not shown in this snippet.
 */
static int handle_event(void *ctx, void *data, size_t data_sz) {
    const struct event *e = data;
    struct tm *tm;
    char ts[32];
    time_t t;

    time(&t);
    tm = localtime(&t);
    strftime(ts, sizeof(ts), "%H:%M:%S", tm);

    printf("%-8s %-5s %-16s %-7d\n",
           ts, "SIGN", e->comm, e->pid);

    /* Answer each kernel event by writing a sample back to the kernel */
    write_samples(user_ringbuf);
    return 0;
}

int main(int argc, char **argv) {
    struct ring_buffer *rb = NULL;
    struct user_ringbuf_bpf *skel;
    int err;

    /* Set up libbpf errors and debug info callback */
    libbpf_set_print(libbpf_print_fn);

    /* Cleaner handling of Ctrl-C */
    signal(SIGINT, sig_handler);
    signal(SIGTERM, sig_handler);

    /* Open BPF application */
    skel = user_ringbuf_bpf__open();
    if (!skel) {
        fprintf(stderr, "Failed to open BPF skeleton\n");
        return 1;
    }

    /* Initialize the BPF-side global "read" (declared in the BPF code not shown here) */
    skel->bss->read = 0;

    /* Load & verify BPF programs */
    err = user_ringbuf_bpf__load(skel);
    if (err) {
        fprintf(stderr, "Failed to load and verify BPF skeleton\n");
        goto cleanup;
    }

    /* Attach BPF programs */
    err = user_ringbuf_bpf__attach(skel);
    if (err) {
        fprintf(stderr, "Failed to attach BPF skeleton\n");
        goto cleanup;
    }

    /* Set up ring buffer polling (kernel -> user space) */
    rb = ring_buffer__new(bpf_map__fd(skel->maps.kernel_ringbuf), handle_event, NULL, NULL);
    if (!rb) {
        err = -1;
        fprintf(stderr, "Failed to create ring buffer\n");
        goto cleanup;
    }

    /* Set up the user ring buffer (user space -> kernel) */
    user_ringbuf = user_ring_buffer__new(bpf_map__fd(skel->maps.user_ringbuf), NULL);
    if (!user_ringbuf) {
        err = -errno;
        fprintf(stderr, "Failed to create user ring buffer\n");
        goto cleanup;
    }

    write_samples(user_ringbuf);

    while (!exiting) {
        err = ring_buffer__poll(rb, 100 /* timeout, ms */);
        /* Ctrl-C will cause -EINTR */
        if (err == -EINTR) {
            err = 0;
            break;
        }
        if (err < 0) {
            printf("Error polling ring buffer: %d\n", err);
            break;
        }
    }

cleanup:
    /* Clean up; the free/destroy functions accept NULL */
    ring_buffer__free(rb);
    user_ringbuf_bpf__destroy(skel);
    user_ring_buffer__free(user_ringbuf);

    return err < 0 ? -err : 0;
}

⚠️ Note: The code above is intentionally partial, as its implementation heavily relies on your specific eBPF setup. Be sure to read the comments for crucial details.

While it may appear lengthy, mastering certain patterns in eBPF can significantly simplify the process, particularly since the available features are still relatively limited.
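One pattern worth spelling out is the sample layout itself: the producer and the eBPF consumer must agree on `struct user_sample` byte for byte, and the fixed-size `comm` field must never be overrun. The sketch below shows one way to guard both. `fill_sample` is a hypothetical helper, not part of libbpf, and the static assertion assumes a platform where the struct needs no padding.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TASK_COMM_LEN 16

/* Same layout as in the example above; keep it in a header shared by both sides. */
struct user_sample {
    int pid;
    char comm[TASK_COMM_LEN];
};

/* Fail the build if the compiler inserts padding we did not expect. */
_Static_assert(sizeof(struct user_sample) == sizeof(int) + TASK_COMM_LEN,
               "user_sample must have no padding");

/* Hypothetical helper: fill a sample, truncating comm to fit and
 * always leaving it NUL-terminated. */
static void fill_sample(struct user_sample *s, int pid, const char *comm)
{
    s->pid = pid;
    snprintf(s->comm, sizeof(s->comm), "%s", comm);
}
```

Calling `fill_sample(entry, getpid(), "hello")` on a reserved entry would replace the unbounded `strcpy`, and an over-long name is cut to 15 characters plus the terminator rather than writing past the field.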

Conclusion

In conclusion, the User Ring Buffer offers efficient data transfer from user space to the kernel, complementing the existing Ring Buffer and Perf Buffer. While its current use cases may be limited, its advantages in asynchronous message passing and reduced system call overhead make it a valuable addition. Almost every kernel release brings new eBPF features, and with them come many more eBPF examples I plan to post, so stay tuned :)

To stay current with the latest cloud technologies, make sure to subscribe to my newsletter, Cloud Chirp. 🚀
