Overview of eBPF procfs kernel parameters

Or how to configure the JIT compiler

Why it matters

There are (at least) 5 relevant procfs settings that eBPF developers should know about. You don't have to navigate procfs by hand to see them: the bpftool feature probe command already reports them.
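If bpftool isn't installed, the same values can be read straight from procfs. Here's a minimal sketch that prints the five tunables covered in this post. The function name and its optional root-directory argument are my own invention for illustration, and some of these files may only be readable by root, so the sketch prints a placeholder instead of failing.

```shell
#!/bin/sh
# Print the five eBPF-related tunables discussed in this post.
# show_bpf_tunables is a hypothetical helper; the optional first argument
# overrides the procfs root and defaults to the real /proc/sys.
show_bpf_tunables() {
    root="${1:-/proc/sys}"
    for param in \
        kernel/unprivileged_bpf_disabled \
        net/core/bpf_jit_enable \
        net/core/bpf_jit_harden \
        net/core/bpf_jit_kallsyms \
        net/core/bpf_jit_limit
    do
        if [ -r "$root/$param" ]; then
            printf '%s = %s\n' "$param" "$(cat "$root/$param")"
        else
            # Missing on this kernel, or we lack permission to read it.
            printf '%s = (not readable)\n' "$param"
        fi
    done
}

show_bpf_tunables
```

Run it with sudo if you see "(not readable)" for the JIT settings.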

Most of the tunables of interest to eBPF developers are security-related so even if we're just getting started with eBPF they are worth knowing about for future production. Be sure to check your distro for the default values and tune them to your preferred security settings.

echo vs sysctl

For completeness, note that there are several ways of tuning your procfs values. The most common are bare echos or the friendlier sysctl. Honestly, I switch between both all the time. The benefit of using sysctl is that you can persist your settings in /etc/sysctl.conf and apply them with sysctl -p.

# 1: Using cat & echo
$ cat /proc/sys/net/core/bpf_jit_harden
1
$ echo 0 > /proc/sys/net/core/bpf_jit_harden
# If you need sudo:
$ sudo bash -c "echo 0 > /proc/sys/net/core/bpf_jit_harden"
# 2: Using sysctl (prepend sudo if needed)
$ sysctl net.core.bpf_jit_harden
net.core.bpf_jit_harden = 1
$ sysctl -w net.core.bpf_jit_harden=0 # -w writes the new value instead of just reading it
net.core.bpf_jit_harden = 0
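Values set with echo or sysctl -w are lost on reboot. To make them stick, one option is a drop-in file that sysctl loads at boot. The sketch below is only an illustration: the helper name, the file name 90-bpf.conf and the chosen values are assumptions, not recommendations for your distro.

```shell
#!/bin/sh
# write_bpf_sysctl_conf is a hypothetical helper that writes a sysctl
# fragment to the path given as its first argument.
write_bpf_sysctl_conf() {
    conf="$1"
    cat > "$conf" <<'EOF'
# eBPF JIT settings (example values, adjust to your environment)
net.core.bpf_jit_enable = 1
net.core.bpf_jit_harden = 2
EOF
}

# Usage, as root (the file name is an assumption; most distros read
# every *.conf file under /etc/sysctl.d/):
#   write_bpf_sysctl_conf /etc/sysctl.d/90-bpf.conf
#   sysctl --system    # reload all system configuration files
```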

Parameters

/proc/sys/kernel/unprivileged_bpf_disabled

This is arguably the most important one to know since it controls whether unprivileged users get access to the eBPF subsystem, and opening that up is currently not recommended. Even without exploiting kernel bugs, there are "unsafe" eBPF helpers that can be used to read & write memory.

If this is set to 0, then unprivileged (non-root) users can currently load and run eBPF programs.

You'll most likely want this set to 1 or 2, depending on the environment you're running eBPF programs in. Both restrict eBPF to privileged users, but with 2 a privileged user can change the value at any time. With 1, not even a privileged user can change the value on the running kernel.
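To make the three values concrete, here's a small sketch that turns the raw number into a sentence. The function name and the wording of the messages are mine; only the meaning of 0, 1 and 2 comes from the description above.

```shell
#!/bin/sh
# describe_unprivileged_bpf is a hypothetical helper that explains a
# value of kernel.unprivileged_bpf_disabled.
describe_unprivileged_bpf() {
    case "$1" in
        0) echo "unprivileged users may call bpf()" ;;
        1) echo "unprivileged bpf() disabled; locked on the running kernel" ;;
        2) echo "unprivileged bpf() disabled; a privileged user can still change it" ;;
        *) echo "unknown value: $1" ;;
    esac
}

f=/proc/sys/kernel/unprivileged_bpf_disabled
if [ -r "$f" ]; then
    describe_unprivileged_bpf "$(cat "$f")"
fi
```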

Update(07/01/22): This PATCH allows for more nuance on what this flag enforces. In upcoming versions of the kernel (most likely v5.19), privilege (CAP_BPF or CAP_SYS_ADMIN) will only be required on object (bpf programs and maps) creation.

/proc/sys/net/core/bpf_jit_enable

The JIT compiler is what gives eBPF programs a tremendous speedup by compiling a program's eBPF instructions to native CPU instructions. This happens on the fly so your program can be agnostic (kind of) of the CPU it'll eventually run on. There isn't a JIT compiler for every arch and there seems to be some trickiness with endianness, but that's the theory.

When this is set to 0, the JIT compiler is disabled. Your eBPF program will still be able to run but each instruction will be interpreted rather than compiled. This may be acceptable depending on your performance requirements.

This can be set to either 1 or 2 to enable JIT compilation, but most of the time you'll want to set it to 1. A value of 2 additionally emits debugging output to the kernel log (dmesg), which is intended for kernel JIT developers.

/proc/sys/net/core/bpf_jit_harden

eBPF programs are supposed to work within defined bounds and not allow unseen code to enter kernel memory. At one point, though, arbitrary CPU code could be tucked away in a program's constants (a technique known as JIT spraying) and then executed if the kernel was directed to that memory location (usually through a kernel bug). To prevent this, constant blinding was implemented.

This flag seems to be designed as a general catch-all for JIT hardening measures, but I believe constant blinding is the primary impetus for this setting.

When set to 0, hardening measures are disabled (this might be the default on your distro). Setting it to 1 enables them for unprivileged users and 2 enables them for all users.
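As a rough sketch of how a host check could combine bpf_jit_enable and bpf_jit_harden, the function below flags a JIT that runs without hardening. The function name, the messages and the idea that "no hardening" deserves a warning are my assumptions, not a kernel-provided policy. Reading bpf_jit_harden may require root.

```shell
#!/bin/sh
# check_jit_hardening is a hypothetical helper.
# $1 = value of bpf_jit_enable, $2 = value of bpf_jit_harden.
check_jit_hardening() {
    jit_enable="$1"
    jit_harden="$2"
    if [ "$jit_enable" -eq 0 ]; then
        echo "JIT disabled; nothing for JIT hardening to act on"
    elif [ "$jit_harden" -eq 0 ]; then
        echo "WARN: JIT enabled without hardening"
    elif [ "$jit_harden" -eq 1 ]; then
        echo "hardening enabled for unprivileged users only"
    else
        echo "hardening enabled for all users"
    fi
}

e=/proc/sys/net/core/bpf_jit_enable
h=/proc/sys/net/core/bpf_jit_harden
if [ -r "$e" ] && [ -r "$h" ]; then
    check_jit_hardening "$(cat "$e")" "$(cat "$h")"
fi
```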

/proc/sys/net/core/bpf_jit_kallsyms

When you run (load & attach) an eBPF program, it becomes a part of kernel space (for the most part), which means it can be useful to have it show up in places where kernel symbols are read (such as /proc/kallsyms). bpftool and perf actually use /proc/kallsyms to provide eBPF debugging information. Whether JITed programs are exported as kernel symbols is configurable through this procfs param.

Setting it to 0 disables it. Setting it to 1 enables JIT kallsyms export for privileged users.
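A quick way to see whether the export is working is to count the BPF entries in /proc/kallsyms. This sketch assumes JITed programs appear with a bpf_prog_ prefix in the symbol name; the helper name and its optional file argument are made up for illustration.

```shell
#!/bin/sh
# count_bpf_symbols is a hypothetical helper that counts symbol lines whose
# name starts with bpf_prog_ (an assumption about how JITed programs are
# named). The optional first argument overrides the symbol file.
count_bpf_symbols() {
    # grep -c exits non-zero when the count is 0, so ignore its status.
    grep -c ' bpf_prog_' "${1:-/proc/kallsyms}" || true
}

if [ -r /proc/kallsyms ]; then
    echo "JITed BPF symbols: $(count_bpf_symbols)"
fi
```

A count of 0 with programs loaded suggests the export is off, or that you're not running as a privileged user.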

/proc/sys/net/core/bpf_jit_limit

This is another parameter that can be used to limit JITed eBPF programs for unprivileged users. It sets a global limit on the memory the JIT compiler may allocate. Once that limit has been surpassed, JIT requests from unprivileged users are rejected.

Unlike the other flags, this value is not a mode: it is a size in bytes.
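Since the raw byte count is hard to eyeball, here's a tiny sketch that converts it to MiB. The helper name is hypothetical and the conversion rounds down; reading the file may require root.

```shell
#!/bin/sh
# bytes_to_mib is a hypothetical helper: integer division, rounds down.
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

f=/proc/sys/net/core/bpf_jit_limit
if [ -r "$f" ]; then
    echo "bpf_jit_limit: $(bytes_to_mib "$(cat "$f")") MiB"
fi
```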

Worth noting

The privileges you need depend on the task you're trying to complete. For certain kernel operations you'll need specific capabilities enabled. For eBPF, you'll most likely want both CAP_BPF and CAP_SYS_ADMIN enabled. I'm planning on writing another blog post explaining the nuance here.

bpf-feature (written in Rust)

I wrote a Rust library that encapsulates eBPF feature detection: bpf-feature. It's available on crates.io if you want to build eBPF feature detection into your application! If you want to interactively peruse your eBPF features, I recommend just using bpftool.

Going forward

Currently, I'm working on better visibility for recommended eBPF security practices such as the ones mentioned here at bpfdeploy.io. Ideally there should be tooling to warn engineers when either their host or eBPF programs (especially third-party vendor-provided ones) are doing less-than-ideal things.

References

If you see any inaccuracies, typos or have comments, please reach out @mdaverde.