Discovering new options of enum generation
While updating the bindings in libbpf-sys following the release of libbpf v0.8.0, I realized I wasn't a fan of the way libbpf's enum bpf_func_id was being generated by rust-bindgen, a tool that generates Rust FFI bindings to C libraries. This isn't an overview of how to use bindgen but a deep dive into its enum generation options.
The current bindgen translation of bpf_func_id looks like this:
pub const BPF_FUNC_unspec: bpf_func_id = 0;
pub const BPF_FUNC_map_lookup_elem: bpf_func_id = 1;
pub const BPF_FUNC_map_update_elem: bpf_func_id = 2;
pub const BPF_FUNC_map_delete_elem: bpf_func_id = 3;
pub const BPF_FUNC_probe_read: bpf_func_id = 4;
pub const BPF_FUNC_ktime_get_ns: bpf_func_id = 5;
You can ignore the weird casing and prefix since that occurs in C rather than bindgen.
What bothered me more was this: why couldn't I have it translated directly into a proper Rust enum?
Turns out, bindgen actually implements several different ways of mapping enums into Rust.
At the time of this writing, the latest version of rust-bindgen is v0.59.2.
A GitHub repo with examples
I ended up creating a repository dedicated to showing the different styles here: @mdaverde/bindgen-enum-flavors
You can see most of the options for enum generation in the repo's build.rs.
I think the API could be simplified but it's straightforward enough to understand quickly.
The generated bindings can be seen in src/bindings.rs and the original C header file is enum.h.
The flavors and descriptions of each
constified_enum
This is the default when no other style is specified.
Origin:
enum meals {
    breakfast,
    lunch,
    dinner
};
Bindgen:
pub const meals_breakfast: meals = 0;
pub const meals_lunch: meals = 1;
pub const meals_dinner: meals = 2;
pub type meals = ::std::os::raw::c_uint;
You can see that it generates just Rust consts with the same type specified. This is similar to how C enums work in practice.
In the options, you can also choose whether the enum name is prepended to each constant and whether the integer type is emitted as c_uint or as the fixed-size u32.
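To make the "just consts" point concrete, here is a minimal sketch of consuming these bindings from Rust. The constants are copied from the bindgen output above; meal_name is a hypothetical helper I made up for illustration, not something bindgen generates.

```rust
use std::os::raw::c_uint;

// Copied from the bindgen output above.
#[allow(non_camel_case_types)]
pub type meals = c_uint;
#[allow(non_upper_case_globals)]
pub const meals_breakfast: meals = 0;
#[allow(non_upper_case_globals)]
pub const meals_lunch: meals = 1;
#[allow(non_upper_case_globals)]
pub const meals_dinner: meals = 2;

// Hypothetical helper (an assumption for this example).
pub fn meal_name(meal: meals) -> &'static str {
    match meal {
        meals_breakfast => "breakfast",
        meals_lunch => "lunch",
        meals_dinner => "dinner",
        // `meals` is only an alias for an integer, so the compiler
        // requires a catch-all arm for every other possible value.
        _ => "unknown",
    }
}
```

Because meals is only a type alias, the compiler cannot tell the three named constants apart from any other integer, which is why the catch-all arm is mandatory.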
constified_enum_module
Bindgen:
pub mod game {
    pub type Type = ::std::os::raw::c_uint;
    pub const win: Type = 0;
    pub const lose: Type = 1;
    pub const draw: Type = 2;
}
This still keeps the const nature of the enum values but wraps them in a module for encapsulation and more Rust-like ergonomics: game::lose.
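As a rough illustration of those ergonomics, the sketch below matches on the module-scoped constants. The module is copied from the bindgen output above; the points function and its scoring values are hypothetical and exist only for this example.

```rust
// Copied from the bindgen output above.
#[allow(non_upper_case_globals)]
pub mod game {
    pub type Type = ::std::os::raw::c_uint;
    pub const win: Type = 0;
    pub const lose: Type = 1;
    pub const draw: Type = 2;
}

// Hypothetical scoring helper (an assumption for this example).
pub fn points(result: game::Type) -> Option<u32> {
    match result {
        game::win => Some(3),
        game::draw => Some(1),
        game::lose => Some(0),
        // `game::Type` is still a plain integer, so other values can show up.
        _ => None,
    }
}
```

The call sites read more like a Rust enum, but the underlying type is still an integer, so the wildcard arm is still required.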
newtype_enum
Bindgen:
impl planet {
    pub const earth: planet = planet(0);
}
impl planet {
    pub const jupiter: planet = planet(1);
}
impl planet {
    pub const saturn: planet = planet(2);
}
impl planet {
    pub const mars: planet = planet(3);
}
#[repr(transparent)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
pub struct planet(pub ::std::os::raw::c_uint);
You can deduce from the generated bindings that, because planet is its own struct type rather than an alias for an integer, you can implement traits for it, which is pretty useful for extending the enum with custom functionality.
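For instance, here is a minimal sketch of adding a Display implementation to the newtype. The struct and constants mirror the bindgen output above (with the four impl blocks merged into one for brevity); the Display impl itself is my own illustrative addition, not something bindgen emits.

```rust
use std::fmt;
use std::os::raw::c_uint;

// Mirrors the bindgen output above.
#[allow(non_camel_case_types)]
#[repr(transparent)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
pub struct planet(pub c_uint);

#[allow(non_upper_case_globals)]
impl planet {
    pub const earth: planet = planet(0);
    pub const jupiter: planet = planet(1);
    pub const saturn: planet = planet(2);
    pub const mars: planet = planet(3);
}

// Illustrative trait implementation (an assumption for this example).
impl fmt::Display for planet {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match *self {
            planet::earth => write!(f, "earth"),
            planet::jupiter => write!(f, "jupiter"),
            planet::saturn => write!(f, "saturn"),
            planet::mars => write!(f, "mars"),
            // Any other integer is still representable by the newtype.
            planet(other) => write!(f, "unknown planet ({})", other),
        }
    }
}
```

This would not be possible with the constified flavors, where the enum type is just an alias for a primitive integer that your crate does not own.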
bitfield_enum
This flavor is the newtype_enum but with bit operation traits implemented.
Bit impl example:
impl ::std::ops::BitAndAssign for animal {
    #[inline]
    fn bitand_assign(&mut self, rhs: animal) {
        self.0 &= rhs.0;
    }
}
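To show why the bit operation traits matter, here is a small sketch of combining and testing flags. The variant names and values (cat, dog, bird) are hypothetical since the original header isn't shown here, and the BitOr and BitAnd impls are hand-written stand-ins in the same style as the generated BitAndAssign above.

```rust
use std::os::raw::c_uint;

#[allow(non_camel_case_types)]
#[repr(transparent)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
pub struct animal(pub c_uint);

// Hypothetical flag values; each one occupies its own bit.
#[allow(non_upper_case_globals)]
impl animal {
    pub const cat: animal = animal(1);
    pub const dog: animal = animal(2);
    pub const bird: animal = animal(4);
}

impl std::ops::BitOr for animal {
    type Output = animal;
    #[inline]
    fn bitor(self, rhs: animal) -> animal {
        animal(self.0 | rhs.0)
    }
}

impl std::ops::BitAnd for animal {
    type Output = animal;
    #[inline]
    fn bitand(self, rhs: animal) -> animal {
        animal(self.0 & rhs.0)
    }
}

// Hypothetical helper: true if every bit of `flag` is set in `set`.
pub fn contains(set: animal, flag: animal) -> bool {
    (set & flag) == flag
}
```

With these in place, an expression like animal::cat | animal::dog stays typed as animal instead of decaying to a bare integer.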
rustified_enum
Bindgen:
#[repr(u32)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
pub enum color {
    purple = 0,
    red = 1,
    blue = 2,
    green = 3,
    yellow = 4,
    pink = 5,
    indigo = 6,
    brown = 7,
    black = 8,
    white = 9,
}
This is nice! So clean. But with potentially unsafe tradeoffs. With all the previous flavors, a user could use the type to generate values that aren't part of the original enum:
// constified_enum
let meals_supper: meals = 3; // Not part of the original enum
// constified_enum_module
let postponed: game::Type = 3; // Not part of the original enum
// newtype_enum
let pluto: planet = planet(4); // Sad
This is allowed by design. Why? Because in C it's allowed (whether it's recommended is another matter). Enums are basically just ints. You can return "foreign" values from a function whose signature specifies an enum.
This means that if you choose this enum flavor you have to know that this won't happen in your FFI functions or else it's undefined behavior. If you own the C library and you know that no foreign values will be returned when an enum is expected, then use this flavor. Otherwise, you're better off sticking to the other styles.
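If you still want a proper Rust enum on your side of the boundary, one common pattern is to receive the value as a plain integer and convert it with a checked conversion. This is a sketch of that general idea, not something bindgen generates: the enum is a shortened copy of color from above, and the TryFrom impl is hand-written for this example.

```rust
use std::convert::TryFrom;

// Shortened copy of the rustified enum above.
#[allow(non_camel_case_types)]
#[repr(u32)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
pub enum color {
    purple = 0,
    red = 1,
    blue = 2,
}

// Hand-written checked conversion (an assumption for this example).
impl TryFrom<u32> for color {
    // On failure, hand back the raw value that did not match any variant.
    type Error = u32;

    fn try_from(raw: u32) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(color::purple),
            1 => Ok(color::red),
            2 => Ok(color::blue),
            other => Err(other),
        }
    }
}
```

The raw integer is always a valid u32, so an unexpected value from C becomes an ordinary Err instead of an invalid enum value.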
rustified_non_exhaustive_enum
The same as rustified_enum but with the #[non_exhaustive] attribute added.
Update - 06/14/22: As per feedback, I wanted to add clarification to this.
This can also cause UB! The reason is that #[non_exhaustive] only forces users of the enum to handle the possibility of other variants; it does not require the compiler to assume other variants exist. In other words, the compiler can decide to leave the wildcard arm out of the final binary if it determines that the arm is dead code. Therefore, if your FFI function returns a value outside of the declared variants, this is undefined behavior. For more information, check out this issue and this example.
Priority of enum flavors
Bindgen actually applies a priority order to the enum flavors in case multiple are specified for the same enum (most likely through the use of patterns). If there are conflicts, this is the order:
constified_enum_module
bitfield_enum
newtype_enum
rustified_enum
rustified_non_exhaustive_enum
constified_enum (default)
You can change the default with the default_enum_style option.
Conclusion
I debated over these options for bpf-rs, but I realized it was ultimately more fruitful to write the enum for the eBPF helpers myself. The downside is that this enum will need to be maintained with future libbpf updates, but that shouldn't be often (famous last words), and I wrote a cargo test to help me catch mismatches.
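To give a rough idea of what such a mismatch check could look like, here is a minimal sketch. It is not the actual test from bpf-rs: the sys module is a stand-in for the bindgen-generated constants, and BpfHelper and mismatches are hypothetical names made up for this example.

```rust
// Stand-in for the bindgen-generated constants shown at the top of this post.
#[allow(non_camel_case_types, non_upper_case_globals, dead_code)]
mod sys {
    pub type bpf_func_id = u32;
    pub const BPF_FUNC_unspec: bpf_func_id = 0;
    pub const BPF_FUNC_map_lookup_elem: bpf_func_id = 1;
    pub const BPF_FUNC_map_update_elem: bpf_func_id = 2;
}

// Hypothetical hand-maintained enum.
#[repr(u32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BpfHelper {
    Unspec = 0,
    MapLookupElem = 1,
    MapUpdateElem = 2,
}

// Returns every pair whose hand-written discriminant differs from
// the generated constant. An empty result means the two are in sync.
pub fn mismatches() -> Vec<(BpfHelper, u32)> {
    let pairs = [
        (BpfHelper::Unspec, sys::BPF_FUNC_unspec),
        (BpfHelper::MapLookupElem, sys::BPF_FUNC_map_lookup_elem),
        (BpfHelper::MapUpdateElem, sys::BPF_FUNC_map_update_elem),
    ];
    pairs
        .iter()
        .copied()
        .filter(|(helper, raw)| *helper as u32 != *raw)
        .collect()
}
```

A cargo test could then simply assert that mismatches() is empty. Note that a table like this only catches changed values; it won't notice helpers that were newly added upstream.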