Audio encoding - part 1 of N #524
@@ -61,6 +61,9 @@ function(make_torchcodec_libraries
AVIOContextHolder.cpp
FFMPEGCommon.cpp
SingleStreamDecoder.cpp
# TODO: lib name should probably not be "*_decoder*" now that it also
# contains an encoder
Encoder.cpp
Maybe it should be libtorchcodec_coreN.so?
src/torchcodec/_core/Encoder.cpp
Outdated
// We're allocating the stream here. Streams are meant to be freed by
// avformat_free_context(avFormatContext), which we call in the
// avFormatContext_'s destructor.
avStream_ = avformat_new_stream(avFormatContext_.get(), nullptr);
I want to make sure I understand the relationship here: avformat_new_stream() takes a pointer to an AVFormatContext. It creates a new stream, associates that stream with the provided AVFormatContext such that the AVFormatContext owns the stream, and returns a pointer to that newly created stream?
You're correct, or to be more precise: this is also what I understand from the FFmpeg docs.
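To make the ownership relationship discussed above concrete, here is a minimal sketch using toy stand-in types. `ToyFormatContext`, `ToyStream`, `newStream()` and `numStreams()` are hypothetical names invented for illustration; they are not FFmpeg types or calls, and the sketch only mirrors the pattern as described in this thread: the context creates and owns the stream, and the caller receives a non-owning pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-in for a stream. Hypothetical, not an FFmpeg type.
struct ToyStream {
  int index = 0;
};

// Toy stand-in for a format context that owns its streams.
class ToyFormatContext {
 public:
  // Creates a stream, stores it inside the context, and returns a
  // non-owning pointer to it. The caller must never delete the result.
  ToyStream* newStream() {
    auto stream = std::make_unique<ToyStream>();
    stream->index = static_cast<int>(streams_.size());
    streams_.push_back(std::move(stream));
    return streams_.back().get();
  }

  std::size_t numStreams() const { return streams_.size(); }

 private:
  // Every stream is destroyed when the context is destroyed.
  std::vector<std::unique_ptr<ToyStream>> streams_;
};
```

Under this model, a raw member pointer such as avStream_ needs no deleter of its own, as long as it does not outlive the context that created it.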
src/torchcodec/_core/Encoder.h
Outdated
void encode();

private:
void encode_inner_loop(
Nit: camelCase instead of snake_case.
For the record, I prefer snake case, but I think being consistent is more important. If I could wave a magic wand and make our repo all snake case for variable and function names, I would. Class names should still be pascal case, I believe.
I am more used to camel case too. If we really want to, surely there must exist some camelCase to snake_case auto converters?
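As a rough illustration of how small the core of such a converter is, here is a sketch of the name transformation itself. `camelToSnake` is a hypothetical helper written for this example, not an existing tool or part of the repo. It assumes ASCII identifiers and treats every uppercase letter as a word boundary, so acronyms are split letter by letter. A real migration would also need to rewrite every use site.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: converts a camelCase identifier to snake_case.
// Each uppercase ASCII letter is lowercased and, unless it is the first
// character, preceded by an underscore. All other characters are copied.
std::string camelToSnake(const std::string& name) {
  std::string out;
  for (std::size_t i = 0; i < name.size(); ++i) {
    const unsigned char c = static_cast<unsigned char>(name[i]);
    if (std::isupper(c)) {
      if (i > 0) {
        out.push_back('_');
      }
      out.push_back(static_cast<char>(std::tolower(c)));
    } else {
      out.push_back(static_cast<char>(c));
    }
  }
  return out;
}
```

For example, camelToSnake("encodeInnerLoop") yields "encode_inner_loop", the name used in the hunk above.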
Expect N to be large :)
This PR implements a basic audio encoder that is not yet feature complete and seems to work OK in some limited scenarios. There are a lot of TODOs left in the code.
This is only C++ and core APIs, nothing is public yet. The current design is to pass all the necessary parameters to the constructor. Namely, with the core API:
The reason is: all these parameters are required in order to initialize the AVFormatContext and the AVCodecContext. This may very well change, i.e. eventually we may decide that we don't even want to expose this as a C++ object but rather as a pure function? It'll be easier to decide later once we're more feature complete.