Tcx #921
cc @dave-tucker for some eyes
✅ Deploy Preview for aya-rs-docs ready!
73ad791
to
43b67e4
Compare
Hey @alessandrod, this pull request changes the Aya Public API and requires your review.
bfe585e
to
fc9f947
Compare
Thanks so much for doing all this work!
This seems fine, except the current definition isn't type safe:
This shouldn't be possible. We should try to make the API as type safe as possible. Among other things we should probably have a wrapper type for program ids, so that
This also seems fine
I don't think we can do this without giving up on a lot of type safety. If you have only one Link type, then all operations become possible on all links: xdp operations on tc links, kprobe operations on xdp links, etc. But if you have something in mind that would work I'd love to see it!
As a user of aya, I probably don't know anything about netlink or TCX, and I shouldn't really have to learn anything about them. I think we should make this as transparent as possible to users. As we do for XDP, we should probably detect whether we can attach as TCX or fall back to netlink and, in the default/common case, have sensible defaults for LinkOrdering and everything else.
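The "detect TCX, otherwise fall back to netlink" idea could be sketched roughly as below. This is only an illustration: `Version`, `AttachMechanism`, and `pick_mechanism` are hypothetical names, not aya's API, and comparing kernel versions is an assumption made for brevity (a real implementation might probe the kernel for the feature instead). The 6.6 threshold is the minimum kernel version stated elsewhere in this PR.

```rust
/// Hypothetical stand-in for a kernel version triple (not aya's `KernelVersion`).
/// Derived ordering compares major, then minor, then patch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u32,
    minor: u32,
    patch: u32,
}

/// Hypothetical: which attach mechanism a loader would pick.
#[derive(Debug, PartialEq, Eq)]
enum AttachMechanism {
    /// Link-based, multi-program TCX attachment.
    Tcx,
    /// Older netlink-based TC attachment.
    Netlink,
}

/// Minimum kernel version for TCX, per the docs added in this PR.
const TCX_MIN_VERSION: Version = Version { major: 6, minor: 6, patch: 0 };

/// Picks TCX when the running kernel is new enough, netlink otherwise.
fn pick_mechanism(running: Version) -> AttachMechanism {
    if running >= TCX_MIN_VERSION {
        AttachMechanism::Tcx
    } else {
        AttachMechanism::Netlink
    }
}

fn main() {
    let old = Version { major: 5, minor: 15, patch: 0 };
    let new = Version { major: 6, minor: 8, patch: 0 };
    println!("{old:?} -> {:?}", pick_mechanism(old));
    println!("{new:?} -> {:?}", pick_mechanism(new));
}
```

With a default like this, a caller who never mentions TCX or netlink still gets a working attachment on both old and new kernels.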
Hiya @alessandrod, I'm FINALLY getting back to this :) Everything looks good on your comments above, except I do have some questions on...

So the main problem with making this truly "transparent" is that TCX requires new return codes (https://github.com/torvalds/linux/blob/master/include/uapi/linux/bpf.h#L6413):

    /* (Simplified) user return codes for tcx prog type.
     * A valid tcx program must return one of these defined values. All other
     * return codes are reserved for future use. Must remain compatible with
     * their TC_ACT_* counter-parts. For compatibility in behavior, unknown
     * return codes are mapped to TCX_NEXT.
     */
    enum tcx_action_base {
        TCX_NEXT = -1,
        TCX_PASS = 0,
        TCX_DROP = 2,
        TCX_REDIRECT = 7,
    };

vs

    #define TC_ACT_UNSPEC (-1)
    #define TC_ACT_OK 0
    #define TC_ACT_RECLASSIFY 1
    #define TC_ACT_SHOT 2
    #define TC_ACT_PIPE 3
    #define TC_ACT_STOLEN 4
    #define TC_ACT_QUEUED 5
    #define TC_ACT_REPEAT 6
    #define TC_ACT_REDIRECT 7
    #define TC_ACT_TRAP 8 /* For hw path, this means "trap to cpu"
                           * and don't further process the frame
                           * in hardware. For sw path, this is
                           * equivalent of TC_ACT_STOLEN - drop
                           * the skb and act like everything
                           * is alright.
                           */
    #define TC_ACT_VALUE_MAX TC_ACT_TRAP

So technically the bytecode AND the attach mechanism have to change for TCX to work properly... However, this does seem to be backwards compatible, i.e. the old return codes will still work as expected, so we should be able to implement "Use TCX if available, if not fall back to netlink". Let me take a stab.
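The compatibility claim can be checked against the values quoted above. The sketch below copies those constants into Rust and models the rule from the header comment (unknown return codes map to `TCX_NEXT`). The function `tcx_effective_action` is a hypothetical illustration of that rule, not kernel or aya code.

```rust
// Values copied from the kernel header excerpts quoted in this comment.
const TCX_NEXT: i32 = -1;
const TCX_PASS: i32 = 0;
const TCX_DROP: i32 = 2;
const TCX_REDIRECT: i32 = 7;

const TC_ACT_UNSPEC: i32 = -1;
const TC_ACT_OK: i32 = 0;
const TC_ACT_SHOT: i32 = 2;
const TC_ACT_PIPE: i32 = 3;
const TC_ACT_REDIRECT: i32 = 7;

/// Hypothetical model of the documented rule: the four defined codes pass
/// through unchanged, and any other return code is treated as TCX_NEXT.
fn tcx_effective_action(ret: i32) -> i32 {
    match ret {
        TCX_NEXT | TCX_PASS | TCX_DROP | TCX_REDIRECT => ret,
        _ => TCX_NEXT,
    }
}

fn main() {
    // The four TC_ACT_* codes that share a value with a TCX code keep it.
    for (name, code) in [
        ("TC_ACT_UNSPEC", TC_ACT_UNSPEC),
        ("TC_ACT_OK", TC_ACT_OK),
        ("TC_ACT_SHOT", TC_ACT_SHOT),
        ("TC_ACT_REDIRECT", TC_ACT_REDIRECT),
    ] {
        println!("{name} ({code}) -> {}", tcx_effective_action(code));
    }
    // A TC-only code such as TC_ACT_PIPE falls through to TCX_NEXT.
    println!("TC_ACT_PIPE ({TC_ACT_PIPE}) -> {}", tcx_effective_action(TC_ACT_PIPE));
}
```

Under this reading, a classifier that only returns `TC_ACT_OK`, `TC_ACT_SHOT`, `TC_ACT_REDIRECT`, or `TC_ACT_UNSPEC` behaves the same under TCX, while the remaining `TC_ACT_*` codes would be treated as `TCX_NEXT`.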
b5d4989
to
1b9fd57
Compare
@alessandrod Can you take another look here? I changed up the API a bit, making it much more type safe :) LMK what you think, and if you're alright with it I'll continue by adding some integration tests before marking as ready for review.
Reviewed 1 of 1 files at r22, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @alessandrod and @astoycos)
test/integration-test/src/tests/tcx.rs
line 42 at r18 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
The way you've written this confers no benefit to it being a macro rather than a function. Exploiting the fact that it is a macro we can write (something like):
    diff --git a/test/integration-test/src/tests/tcx.rs b/test/integration-test/src/tests/tcx.rs
    index f334a02..73dfea8 100644
    --- a/test/integration-test/src/tests/tcx.rs
    +++ b/test/integration-test/src/tests/tcx.rs
    @@ -7,8 +7,8 @@ use test_log::test;

     use crate::utils::NetNsGuard;

    -#[test(tokio::test)]
    -async fn tcx() {
    +#[test]
    +fn tcx() {
         let kernel_version = KernelVersion::current().unwrap();
         if kernel_version < KernelVersion::new(6, 6, 0) {
             eprintln!("skipping tcx_attach test on kernel {kernel_version:?}");
    @@ -26,49 +26,60 @@ async fn tcx() {
         //
         // Yields a tuple of the `Ebpf` which must remain in scope for the duration
         // of the test, and the link ID of the attached program.
    -    macro_rules! attach_program_with_linkorder {
    -        ($link_order:expr) => {{
    +    macro_rules! attach_program_with_link_order {
    +        ($program_name:ident, $link_id_name:ident, $link_order:expr) => {
                 let mut ebpf = Ebpf::load(crate::TCX).unwrap();
    -            let program: &mut SchedClassifier =
    +            let $program_name: &mut SchedClassifier =
                     ebpf.program_mut("tcx_next").unwrap().try_into().unwrap();
    -            program.load().unwrap();
    -            let link_id = program
    +            $program_name.load().unwrap();
    +            let $link_id_name = $program_name
                     .attach_with_options(
                         "lo",
                         TcAttachType::Ingress,
                         TcAttachOptions::TcxOrder($link_order),
                     )
                     .unwrap();
    -            (ebpf, link_id)
    -        }};
    +        };
         }

    -    let (default, _) = attach_program_with_linkorder!(LinkOrder::default());
    -    let (first, _) = attach_program_with_linkorder!(LinkOrder::first());
    -    let (mut last, last_link_id) = attach_program_with_linkorder!(LinkOrder::last());
    +    attach_program_with_link_order!(default, default_link_id, LinkOrder::default());
    +    attach_program_with_link_order!(first, first_link_id, LinkOrder::first());
    +    attach_program_with_link_order!(last, last_link_id, LinkOrder::last());

    -    let default_prog: &SchedClassifier = default.program("tcx_next").unwrap().try_into().unwrap();
    -    let first_prog: &SchedClassifier = first.program("tcx_next").unwrap().try_into().unwrap();
    -    let last_prog: &mut SchedClassifier = last.program_mut("tcx_next").unwrap().try_into().unwrap();
    +    let last_link = last.take_link(last_link_id).unwrap();

    -    let last_link = last_prog.take_link(last_link_id).unwrap();
    -
    -    let (before_last, _) =
    -        attach_program_with_linkorder!(LinkOrder::before_link(&last_link).unwrap());
    -    let (after_last, _) =
    -        attach_program_with_linkorder!(LinkOrder::after_link(&last_link).unwrap());
    +    attach_program_with_link_order!(
    +        before_last,
    +        before_last_link_id,
    +        LinkOrder::before_link(&last_link).unwrap()
    +    );
    +    attach_program_with_link_order!(
    +        after_last,
    +        after_last_link_id,
    +        LinkOrder::after_link(&last_link).unwrap()
    +    );

    -    let (before_default, _) =
    -        attach_program_with_linkorder!(LinkOrder::before_program(default_prog).unwrap());
    -    let (after_default, _) =
    -        attach_program_with_linkorder!(LinkOrder::after_program(default_prog).unwrap());
    +    attach_program_with_link_order!(
    +        before_default,
    +        before_default_link_id,
    +        LinkOrder::before_program(default_prog).unwrap()
    +    );
    +    attach_program_with_link_order!(
    +        after_default,
    +        after_default_link_id,
    +        LinkOrder::after_program(default_prog).unwrap()
    +    );

    -    let (before_first, _) = attach_program_with_linkorder!(LinkOrder::before_program_id(unsafe {
    -        ProgramId::new(first_prog.info().unwrap().id())
    -    }));
    -    let (after_first, _) = attach_program_with_linkorder!(LinkOrder::after_program_id(unsafe {
    -        ProgramId::new(first_prog.info().unwrap().id())
    -    }));
    +    attach_program_with_link_order!(
    +        before_first,
    +        before_first_link_id,
    +        LinkOrder::before_program_id(unsafe { ProgramId::new(first.info().unwrap().id()) })
    +    );
    +    attach_program_with_link_order!(
    +        after_first,
    +        after_first_link_id,
    +        LinkOrder::after_program_id(unsafe { ProgramId::new(first.info().unwrap().id()) })
    +    );

         let expected_order = [
             before_first

I wasn't able to get the integration tests compiling locally sadly, so this probably doesn't quite compile, but it should be close.
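The point about "exploiting the fact that it is a macro" can be shown in isolation. In the minimal sketch below, a `macro_rules!` macro declares a binding in the caller's scope under a name the caller supplies, while keeping the value it borrows from alive in a hidden local. A function cannot do either. The names `borrow_from_hidden_owner` and `owner` are made up for this illustration and do not appear in the PR.

```rust
/// Declares `$name` in the caller's scope as a borrow of a hidden owner.
/// `owner` is hygienic: each expansion gets its own binding that the caller
/// cannot name, and it lives until the end of the enclosing block, so the
/// borrow stored in `$name` stays valid.
macro_rules! borrow_from_hidden_owner {
    ($name:ident, $text:expr) => {
        let owner = String::from($text);
        let $name: &str = owner.as_str();
    };
}

fn main() {
    // Each invocation expands to two `let` statements in this scope.
    borrow_from_hidden_owner!(first, "lo");
    borrow_from_hidden_owner!(second, "eth0");

    // Both borrows are still usable: the hidden owners were never dropped.
    println!("{first} {second}");
}
```

This mirrors the shape of the suggested `attach_program_with_link_order!`, where the loaded `Ebpf` plays the role of the hidden owner and the program reference is the caller-named binding.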
I have made this change.
Reminder to please squash on or before merge.
Thanks for the reminder @tamird. Commits squashed.
Reviewed 2 of 2 files at r23, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @alessandrod and @astoycos)
    /// # Example
    ///
    ///```no_run
    /// # let mut bpf = aya::Ebpf::load(&[])?;
    -/// # let mut bpf = aya::Ebpf::load(&[])?;
    +/// # let mut ebpf = aya::Ebpf::load(&[])?;
We're still using `let mut bpf` in all other examples. I'd prefer to stick to the established pattern, or change all of them to be consistent.
Fair, let's change them in a separate PR then.
aya-rs/aya#921 Do not merge in bpfman until the tcx support gets merged in Aya. Signed-off-by: Andre Fredette <[email protected]>
aya/src/programs/mod.rs
Outdated

    /// A checked integral type conversion failed.
    #[error(transparent)]
    TryFromIntError(#[from] TryFromIntError),
We shouldn't do this. We're exposing an implementation detail to the public
API: what if we stop using try_from/try_into in the implementation? Or if I get
the error, I get "TryFromIntError" with an error "an integral type conversion
failed" - what? what did I do wrong? how do I fix it?
Where is this used from? If we must fail, we should use a better, higher level
error variant.
I’ve made the error more descriptive.

I’m only using it in cases where an if index is being converted from `u32` to `i32`. The problem stems from the fact that the kernel isn’t very disciplined about whether to use a `u32` or `i32` for if index, so we need to convert sometimes. The existing code was using the unchecked `as i32` to do the conversion, but Tamir wanted that changed to a checked `try_into`.

What happens if it fails? In these cases, it should never fail because the value should have started out as an `i32` in the kernel, but got converted to a `u32` somewhere before we got it, so it should convert back to an `i32` just fine. In the unlikely event that it does fail, there’s a bug either in our code or in one of the external Linux functions, and the user would need to report it and/or fix it.
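The trade-off discussed here, a checked conversion with a descriptive error versus an unchecked `as i32` cast, can be sketched with the standard library alone. The `InvalidIfIndex` name and message follow the snippets quoted later in this review; the helper `if_index_to_i32` is a hypothetical name for the illustration, not a function from the PR.

```rust
use std::fmt;

/// Simplified stand-in for the error variant discussed in this thread.
#[derive(Debug, PartialEq, Eq)]
struct InvalidIfIndex {
    if_index: u32,
}

impl fmt::Display for InvalidIfIndex {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "value {} is too large to use as an interface index",
            self.if_index
        )
    }
}

impl std::error::Error for InvalidIfIndex {}

/// Hypothetical helper: checked `u32` -> `i32` conversion that reports the
/// offending value instead of surfacing a bare `TryFromIntError`.
fn if_index_to_i32(if_index: u32) -> Result<i32, InvalidIfIndex> {
    i32::try_from(if_index).map_err(|_| InvalidIfIndex { if_index })
}

fn main() {
    println!("{:?}", if_index_to_i32(2));

    match if_index_to_i32(u32::MAX) {
        Ok(value) => println!("converted to {value}"),
        Err(err) => println!("error: {err}"),
    }

    // For contrast, the unchecked cast silently wraps to a negative number.
    println!("{}", u32::MAX as i32);
}
```

The checked version fails loudly on values above `i32::MAX`, whereas the cast turns `u32::MAX` into `-1` without any signal to the caller.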
aya/src/programs/mod.rs
Outdated

    @@ -239,6 +245,20 @@ impl AsFd for ProgramFd {
    }

    /// The various eBPF programs.
Looks like this line was left here accidentally
Removed.
aya/src/programs/mod.rs
Outdated

    @@ -797,6 +819,57 @@ impl_fd!(
        CgroupDevice,
    );

    /// Defines the [`Program`] types which support the kernel's
A trait does not define.
Trait implemented by the [`Program`] types which support ...
Done.
aya/src/programs/mod.rs
Outdated

    impl_multiprog_fd!(SchedClassifier);

    /// Defines the [`Link`] types which support the kernel's
Same here
Done.
aya/src/programs/mod.rs
Outdated

    /// # Minimum kernel version
    ///
    /// The minimum kernel version required to use this feature is 6.6.0.
    pub trait MultiProgProgram {
MultiProgProgram kinda stutters... maybe MultiProgram?
I agree, and MultiProgram sounds better. Done.
    match self {
        Self::Ingress => Ok(BPF_TCX_INGRESS),
        Self::Egress => Ok(BPF_TCX_EGRESS),
        Self::Custom(tcx_attach_type) => Err(TcError::InvalidTcxAttach(*tcx_attach_type)),
I would return None in this case. Also, TcAttachType::parent should return None
for TCX, I think? TCX programs use LinkOrder; I don't think that parent makes sense.
See below.
    InvalidTcxAttach(u32),
    /// operation not supported for programs loaded via tcx
    #[error("operation not supported for programs loaded via tcx")]
    InvalidLinkOperation,
I would remove both this and InvalidTcxAttach and return None where necessary
instead. The way things are now, TcAttachType::parent returns "something" but
it's not really a valid value for TCX. Instead of adding another error for that
case, I'd just return None everywhere.
As I’ve worked with the code in this PR and in my bpfman implementation, I’ve started thinking that it might be better to have different attach types for TC and TCX, because they are really two different hook points and attach types with different kernel APIs and attributes. Trying to treat them as the same adds complication, as you are pointing out here. Since I wasn’t involved in the earlier design discussions, what’s implemented works, and it’s kind of late in the game, I was planning to see if I could come up with a cleaner solution after this PR merges. That said, with the current design, I think it’s best to leave these errors as-is for now. Some details follow.

`TcAttachType::tcx_attach_type()` should only be called from a TCX context, and the attach type should never be `Custom`. If the user makes an error, it's better to return an error here rather than requiring the caller to handle a `None` error case later.

`TcAttachType::parent` should only be used in a TC (netlink) context, which it currently is. However, we can't detect if it's mistakenly used in a TCX context, so we can’t return something different for TCX. To help clarify, I renamed `parent` to `tc_parent` to indicate that it applies only to TC.
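The two options debated here, returning an error versus returning `None` for `Custom`, can be compared side by side. The sketch below is a simplification under stated assumptions: the real method returns a kernel attach-type enum rather than `u32`, the numeric constants are placeholders for illustration, and `tcx_attach_type_opt` is a hypothetical name for the `None`-returning alternative.

```rust
/// Simplified copy of the attach type discussed in this thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TcAttachType {
    Ingress,
    Egress,
    Custom(u32),
}

/// Simplified error type carrying the rejected custom value.
#[derive(Debug, PartialEq, Eq)]
enum TcError {
    InvalidTcxAttach(u32),
}

// Placeholder numeric values, assumed for this sketch only.
const BPF_TCX_INGRESS: u32 = 46;
const BPF_TCX_EGRESS: u32 = 47;

impl TcAttachType {
    /// Error-returning variant, as in the snippet under review.
    fn tcx_attach_type(&self) -> Result<u32, TcError> {
        match self {
            Self::Ingress => Ok(BPF_TCX_INGRESS),
            Self::Egress => Ok(BPF_TCX_EGRESS),
            Self::Custom(value) => Err(TcError::InvalidTcxAttach(*value)),
        }
    }

    /// Hypothetical `None`-returning alternative suggested in review.
    fn tcx_attach_type_opt(&self) -> Option<u32> {
        self.tcx_attach_type().ok()
    }
}

fn main() {
    let cases = [
        TcAttachType::Ingress,
        TcAttachType::Egress,
        TcAttachType::Custom(42),
    ];
    for attach_type in cases {
        println!(
            "{attach_type:?}: result = {:?}, option = {:?}",
            attach_type.tcx_attach_type(),
            attach_type.tcx_attach_type_opt()
        );
    }
}
```

The `Result` form tells the caller which custom value was rejected, while the `Option` form leaves it to each call site to decide what a missing value means.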
Reviewed 1 of 1 files at r24, 4 of 4 files at r25, all commit messages.
Reviewable status: all files reviewed, 17 unresolved discussions (waiting on @alessandrod, @anfredette, @astoycos, @dave-tucker, and @vadorovsky)
aya/src/programs/mod.rs
line 222 at r25 (raw file):
    /// Value is too large to use as an interface index.
    #[error("Value {value} is too large to use as an interface index")]
nit: the other error strings aren't capitalized
aya/src/programs/mod.rs
line 225 at r25 (raw file):
    InvalidIfIndex {
        /// Value used in conversion attempt.
        value: u32,
nit: s/value/if_index/?
aya/src/programs/mod.rs
line 823 at r25 (raw file):
    );

    /// Trait implemented by the [`Program`] types which support the kernel's
nit: don't say "trait"? the code already says it (ditto below)
They should all be capitalized
I would rather say it. In general, I would always do what std does, e.g. https://doc.rust-lang.org/stable/std/borrow/trait.Borrow.html
    @@ -150,11 +205,11 @@ impl SchedClassifier {
        &mut self,
        interface: &str,
        attach_type: TcAttachType,
        options: TcOptions,
        options: TcAttachOptions,
With this API it's now possible to call
`attach_with_options("foo", TcAttachType::Custom(42), TcAttachOptions::TcxOrder(...))`,
which shouldn't be possible.
How badly blocked is bpfman on this? This PR has been going on for a while, so I
understand if you want to merge asap to unblock bpfman.
If it's not super urgent, I think this whole TcAttachType/TcAttachOptions needs
to be redone; I can try to come up with a better API.
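One way to make the invalid combination unrepresentable is to fold the attach type into the mechanism, so that a custom parent can only appear on the netlink side. The sketch below is purely illustrative: every type in it (`TcxDirection`, `Order`, `NetlinkParent`, `Attach`) is hypothetical, and it is not the redesign the reviewer had in mind, which this thread does not show.

```rust
/// Hypothetical: the only directions a TCX attachment accepts.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TcxDirection {
    Ingress,
    Egress,
}

/// Hypothetical, heavily simplified stand-in for the PR's `LinkOrder`.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Order {
    Default,
    First,
    Last,
}

/// Hypothetical: only netlink attachments may name a custom parent.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NetlinkParent {
    Ingress,
    Egress,
    Custom(u32),
}

/// Hypothetical: each variant carries only the options valid for it.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Attach {
    Tcx { direction: TcxDirection, order: Order },
    Netlink { parent: NetlinkParent },
}

impl Attach {
    /// Names the mechanism this request would use.
    fn mechanism(&self) -> &'static str {
        match self {
            Attach::Tcx { .. } => "tcx",
            Attach::Netlink { .. } => "netlink",
        }
    }
}

fn main() {
    let tcx = Attach::Tcx {
        direction: TcxDirection::Ingress,
        order: Order::First,
    };
    let custom = Attach::Netlink {
        parent: NetlinkParent::Custom(42),
    };
    println!("{tcx:?} attaches via {}", tcx.mechanism());
    println!("{custom:?} attaches via {}", custom.mechanism());
    // Pairing a custom parent with a TCX order no longer type-checks:
    // `Attach::Tcx` has no field that accepts `NetlinkParent::Custom(42)`.
}
```

The cost of a split like this is that callers must choose a mechanism explicitly, which pulls against the earlier goal of transparent fallback, so a real design would need to reconcile the two.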
Actually, let's just merge as is, with the errors instead of None. No need to
block this further; I'll clean up the API later.
@alessandrod It's perfectly fine with me to merge it as is. Thanks.
And, I'd be happy to work on an improved API later.
At the risk of extending this further, I pushed a couple more changes that we had discussed:
- Changed the two try_into's for i32 to "as i32" and deleted the InvalidIfIndex error. (I already had the change staged, and I think it's better.)
- Don't try to attach as TCX for TcAttachType::Custom. (This was a bug, so I thought it was worth fixing.)
LGTM
@alessandrod Is this ready to go now?
    /// let (revision, programs) = SchedClassifier::query_tcx("eth0", TcAttachType::Ingress)?;
    /// # Ok::<(), Error>(())
    /// ```
    pub fn query_tcx(
nit: this can be done in a follow-up, I don't want to delay this PR further, but:
- I think this should be a module-level function, not a method of `SchedClassifier`
- I don't think it should have "query" in its name. Query is the underlying syscall used, but we could call the method `attached_tcx_programs()` or something
b5a1f17
to
c333328
Compare
This commit adds the initial support for TCX bpf links. This is a new, multi-program attachment type that allows the caller to specify where they would like to be attached relative to other programs at the attachment point using the LinkOrder type.

Signed-off-by: astoycos <[email protected]>
Co-authored-by: Andre Fredette <[email protected]>
Co-authored-by: Dave Tucker <[email protected]>
Co-authored-by: Tamir Duberstein <[email protected]>
Fixes #918
This initial attempt works with the following example -> https://github.com/astoycos/tcxtest
Please see integration tests for example of how the new API works