AV1 (CPU) #333
GoingOffRoading added the `status:awaiting-triage` and `type:plugin-request` (New plugin request) labels on Apr 21, 2023.
Would be helpful also.
We have to wait for jellyfin_ffmpeg to be updated to a more recent version: https://github.com/Unmanic/unmanic/blob/master/docker/Dockerfile Once it's updated, creating an AV1 plugin will be possible.
lyz-code added a commit to lyz-code/blue-book that referenced this issue on Dec 23, 2024:
OpenSanctions helps investigators find leads, allows companies to manage risk and enables technologists to build data-driven products. You can check [their datasets](https://www.opensanctions.org/datasets/).

feat(aleph#offshore-graph): offshore-graph

[offshore-graph](https://github.com/opensanctions/offshore-graph) contains scripts that will merge the OpenSanctions Due Diligence dataset with the ICIJ OffshoreLeaks database in order to create a combined graph for analysis.

The result is a Cypher script to load the full graph into the Neo4j database and then browse it using the Linkurious investigation platform. Based on name-based entity matching between the datasets, an analyst can use this graph to find offshore holdings linked to politically exposed and sanctioned individuals.

As a general alternative, you can easily export and convert entities from an Aleph instance to visualize them in Neo4j or Gephi using the ftm CLI: https://docs.aleph.occrp.org/developers/how-to/data/export-network-graphs/

feat(ccc): introduce the CCC

[Chaos Communication Congress](https://events.ccc.de/en/) is the best gathering of hacktivism in Europe.

**Prepare yourself for the congress**

You can follow [MacLemon's checklist](https://github.com/MacLemon/CongressChecklist).

**[Install the useful apps](https://events.ccc.de/congress/2024/hub/de/wiki/apps/)**

*The schedule app*

You can use either the Fahrplan app or Giggity. I've been using the latter for a while, so that's the one I use.

*The navigation app*

`c3nav` is an application to get around the congress. The official F-Droid package is outdated, so add [their repository](https://f-droid.c3nav.de/fdroid/repo/?fingerprint=C1EC2D062F67A43F87CCF95B8096630285E1B2577DC803A0826539DF6FB4C95D) to get the latest version.

**Reference**

- [Home](https://events.ccc.de/en/)
- [Engelsystem](https://engel.events.ccc.de/)
- [Angel FAQ](https://engel.events.ccc.de/faq)

feat(ccc#Angel's system): introduce the Angel's system

[Angels](https://angelguide.c3heaven.de/) are participants who volunteer to make the event happen. They are neither getting paid for their work nor do they get free admission.

**[Expectations](https://angelguide.c3heaven.de/#_expectations)**

Helping at our events also comes with some simple, but important expectations of you:

- Be on time for your shift or give Heaven early notice.
- Be well rested, sober and not hungry.
- Be open-minded and friendly in attitude.
- Live our moral values:
  - Be excellent to each other.
  - All creatures are welcome.

**[Quickstart](https://angelguide.c3heaven.de/#_quick_start)**

- Create yourself an [Engelsystem account](https://engel.events.ccc.de/)
- Arrive at the venue
- Find [the Heaven](https://c3nav.de/) and go there.
- Talk to a welcome angel or a shift coordinator to get your angel badge and get marked as arrived.
- If you have any questions, you can always ask the shift coordinators behind the counter.
- Attend an angel meeting
  - Announced in the Engelsystem news
- Click yourself an interesting shift
  - Read shift descriptions first
- Participate in your shift
  - Use the navigation to find the right place.
  - Arrive a little bit early at the meeting point
- Rest for at least one hour
- Repeat from step 5

And always, have a lot of fun. To get more insights read [this article](https://jascha.wtf/angels-at-chaos-about-volunteering-and-fitting-in/).

**[The engelsystem](https://angelguide.c3heaven.de/#_the_engelsystem)**

The [Engelsystem](https://engel.events.ccc.de/) is the central place to distribute the work to all the helping angels.
It can be a bit overwhelming at the beginning, but you will get used to it and find your way around. As you might have seen, there are many different shifts and roles for angels, some sounding more appealing than others.

There are shifts where you need to have some knowledge before you can take them. This knowledge is given in introduction meetings or by taking an unrestricted shift in the team and getting trained on the job. These introduction meetings are announced in the Engelsystem under the tab "Meetings".

Heaven and the teams try to make sure that there are only restrictions for shifts in place where they are absolutely needed. Most restrictions really only need a meeting or some unrestricted shifts at the team to get lifted. Harder restrictions are in place where volunteers need to have special certification, get access to certain systems with a huge amount of data (e.g. mail queues with emails from participants) or handle big piles of money. Usually the requirements for joining an angeltype are included in the description of the angeltype.

Especially the restricted shifts are tempting, because after all we want to get the event running, don't we? From our personal experience, what gets the event running are the most common things: guarding a door, collecting bottles and trash, washing dishes in the angel kitchen, being on standby to hop in when spontaneous help is needed, or checking the wristbands at the entrance.

If there are any further questions about angeltypes, the description of the angeltype usually includes contact data such as a DECT number or an e-mail address that can be used. Alternatively, you can also ask one of the persons of the respective angeltype mentioned under "Supporter".

**[Teams](https://angelguide.c3heaven.de/#_teams)**

Congress is organized by different teams, each with its own area of expertise. All teams are self-organized and provide their own set of services to the event. Teams spawn into existence by a need not fulfilled. They are seldom created by an authority.

Check out the [different teams](https://angelguide.c3heaven.de/#_teams) to see which one suits you best.

[Some people](https://jascha.wtf/angels-at-chaos-about-volunteering-and-fitting-in/) suggest not trying to fit into special roles at your first event. The roles will find you, not the other way around. Our community is not about personal growth but about contributing to each other and growing by doing this.

**Perks**

Being an angel also comes with some perks.
While we hope that participation is reward enough, here is a list of things that are exclusive to angels:

- Community acknowledgement
- Hanging out in Heaven and the angel hack center with its chill-out area
- Free coffee and (sparkling) water
- Warm drinks or similar to make the cold night shifts more bearable

**Rewards**

If you have contributed a certain amount of time, you may receive access to:

- Fantastic hot vegan and vegetarian meals
- The famous limited™ angel T-shirt in Congress design
- Maybe some other perks

feat(kubectl_commands#Delete pods that are stuck in terminating state for a while): Delete pods that are stuck in terminating state for a while

```bash
kubectl delete pod <pod-name> --grace-period=0 --force
```

fix(himalaya): tweak the bindings

Move forward and backwards in the history of emails:

```lua
vim.api.nvim_create_autocmd("FileType", {
  group = "HimalayaCustomBindings",
  pattern = "himalaya-email-listing",
  callback = function()
    vim.api.nvim_buf_set_keymap(0, "n", "b", "<plug>(himalaya-folder-select-previous-page)", { noremap = true, silent = true })
    vim.api.nvim_buf_set_keymap(0, "n", "f", "<plug>(himalaya-folder-select-next-page)", { noremap = true, silent = true })
  end,
})
```

Better bindings for the email list view:

```lua
-- Refresh emails
vim.api.nvim_buf_set_keymap(0, "n", "r", ":lua FetchEmails()<CR>", { noremap = true, silent = true })

-- Email list view bindings
vim.api.nvim_buf_set_keymap(0, "n", "b", "<plug>(himalaya-folder-select-previous-page)", { noremap = true, silent = true })
vim.api.nvim_buf_set_keymap(0, "n", "f", "<plug>(himalaya-folder-select-next-page)", { noremap = true, silent = true })
vim.api.nvim_buf_set_keymap(0, "n", "R", "<plug>(himalaya-email-reply-all)", { noremap = true, silent = true })
vim.api.nvim_buf_set_keymap(0, "n", "F", "<plug>(himalaya-email-forward)", { noremap = true, silent = true })
vim.api.nvim_buf_set_keymap(0, "n", "m", "<plug>(himalaya-folder-select)", { noremap = true, silent = true })
vim.api.nvim_buf_set_keymap(0, "n", "M", "<plug>(himalaya-email-move)", { noremap = true, silent = true })
```

feat(himalaya#Searching emails): Searching emails

You can use the `g/` binding from within nvim to search for emails. The query syntax supports filter and sort queries. I've tried changing it to `/` without success :'(

**Filters**

A filter query is composed of operators and conditions. There are 3 operators and 8 conditions:

- `not <condition>`: filter envelopes that do not match the condition
- `<condition> and <condition>`: filter envelopes that match both conditions
- `<condition> or <condition>`: filter envelopes that match one of the conditions
- `date <yyyy-mm-dd>`: filter envelopes that match the given date
- `before <yyyy-mm-dd>`: filter envelopes with date strictly before the given one
- `after <yyyy-mm-dd>`: filter envelopes with date strictly after the given one
- `from <pattern>`: filter envelopes with senders matching the given pattern
- `to <pattern>`: filter envelopes with recipients matching the given pattern
- `subject <pattern>`: filter envelopes with subject matching the given pattern
- `body <pattern>`: filter envelopes with text bodies matching the given pattern
- `flag <flag>`: filter envelopes matching the given flag

**Sorting**

A sort query starts with "order by", and is composed of kinds and orders.
There are 4 kinds and 2 orders:

- `date [order]`: sort envelopes by date
- `from [order]`: sort envelopes by sender
- `to [order]`: sort envelopes by recipient
- `subject [order]`: sort envelopes by subject
- `<kind> asc`: sort envelopes by the given kind in ascending order
- `<kind> desc`: sort envelopes by the given kind in descending order

**Examples**

- `subject foo and body bar`: filter envelopes containing "foo" in their subject and "bar" in their text bodies
- `order by date desc subject`: sort envelopes by descending date (most recent first), then by ascending subject
- `subject foo and body bar order by date desc subject`: combination of the 2 previous examples

feat(himalaya#Not there yet): List more detected issues

- [Replying to an email doesn't mark it as replied](https://github.com/pimalaya/himalaya-vim/issues/14)

feat(himalaya#Cannot install): Troubleshoot cannot install the program

Sometimes [the installation steps fail](https://github.com/pimalaya/himalaya/issues/513) as it's still not in stable. A workaround is to download the binary created by the [pre-release CI](https://github.com/pimalaya/himalaya/actions/workflows/pre-releases.yml). You can do it by:

- Click on the latest job
- Click on jobs
- Click on the job of your architecture
- Click on "Upload release"
- Search for "Artifact download URL" and download the file
- Unpack it and add it somewhere in your `$PATH`

feat(jellyfin#Enable hardware transcoding): Enable hardware transcoding

**[Enable NVIDIA hardware transcoding](https://jellyfin.org/docs/general/administration/hardware-acceleration/nvidia)**

*Remove the artificial limit of concurrent NVENC transcodings*

Consumer-targeted [GeForce and some entry-level Quadro cards](https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new) have an artificial limit on the number of concurrent NVENC encoding sessions (a maximum of 8 on most modern ones). This restriction can be circumvented by applying an unofficial patch to the NVIDIA Linux and Windows drivers.

To apply the patch, first check with `nvidia-smi` that your current driver version is supported. If it's not, try to upgrade the drivers to a supported one, or consider whether you really need more than 8 concurrent transcodes.

```bash
wget https://raw.githubusercontent.com/keylase/nvidia-patch/refs/heads/master/patch.sh
chmod +x patch.sh
./patch.sh
```

If you need to roll back the changes, run `./patch.sh -r`.

You can also apply the patch [within the container itself](https://github.com/keylase/nvidia-patch?tab=readme-ov-file#docker-support).

```yaml
services:
  jellyfin:
    image: jellyfin/jellyfin
    user: 1000:1000
    network_mode: 'host'
    volumes:
      - /path/to/config:/config
      - /path/to/cache:/cache
      - /path/to/media:/media
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

Restart the container and then check that you can access the graphics card with:

```bash
docker exec -it jellyfin nvidia-smi
```

Enable NVENC in Jellyfin and uncheck the unsupported codecs.

**Tweak the docker-compose**

The official Docker image doesn't include any NVIDIA proprietary driver. You have to install the NVIDIA driver and NVIDIA Container Toolkit on the host system to allow Docker access to your GPU.
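For reference, on a Debian-based host this usually boils down to something like the following sketch. It assumes NVIDIA's apt repository for the Container Toolkit is already configured; check the official install guide for your distribution:

```bash
# Install the toolkit (assumes NVIDIA's repository is already set up)
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Register the NVIDIA runtime with Docker and restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```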
refactor(life_review): into roadmap_adjustment

feat(nodejs#Using nvm): Install using nvm

```bash
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
nvm install 22

node -v # should print `v22.12.0`
npm -v # should print `10.9.0`
```

feat(linux_snippets#Convert an html to a pdf): Convert an html to a pdf

**Using weasyprint**

Install it with `pip install weasyprint PyMuPDF`, then:

```bash
weasyprint input.html output.pdf
```

It gave me better results than `wkhtmltopdf`.

**Using wkhtmltopdf**

To convert the given HTML into a PDF with proper styling and formatting using a simple method on Linux, you can use `wkhtmltopdf` with some custom options.

First, ensure that you have `wkhtmltopdf` installed on your system. If not, install it using your package manager (e.g., Debian: `sudo apt-get install wkhtmltopdf`). Then, convert the HTML to PDF using `wkhtmltopdf` with the following command:

```bash
wkhtmltopdf --page-size A4 --margin-top 15mm --margin-bottom 15mm --encoding utf8 input.html output.pdf
```

In this command:

- `--page-size A4`: Sets the paper size to A4.
- `--margin-top 15mm` and `--margin-bottom 15mm`: Adds top and bottom margins of 15 mm to the PDF.

After running the command, you should have a nicely formatted `output.pdf` file in your current directory. This method preserves most of the original HTML styling while providing a simple way to export it as a PDF on Linux.

If you need to zoom in, you can use the `--zoom 1.2` flag. For this to work you need your CSS to be using `em` sizes.

feat(linux_snippets#Format a drive to use a FAT32 system): Format a drive to use a FAT32 system

```bash
sudo mkfs.vfat -F 32 /dev/sdX
```

Replace `/dev/sdX` with your actual drive identifier.

feat(linux_snippets#Get the newest file of a directory with nested directories and files): Get the newest file of a directory with nested directories and files

```bash
find . -type f -printf '%T@ %p\n' | sort -n | tail -1 | cut -f2- -d" "
```

fix(linux_snippets#How to debug a CPU Throttling high alert): How to debug a CPU Throttling high alert

If the container is using fewer resources than its limits but the limits are still small (for example 0.1 CPUs), the issue may be that short CPU spikes are being throttled before they show up in the CPU usage graphs. The solution is then to increase the CPU limits.

# Create a systemd service for a non-root user

To set up a systemd service as a **non-root user**, you can create a user-specific service file under your home directory. User services are defined in `~/.config/systemd/user/` and can be managed without root privileges.
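As a minimal sketch (the `myapp.service` name and the `ExecStart` path are made-up placeholders), the unit file and its management commands could look like this:

```ini
# ~/.config/systemd/user/myapp.service
[Unit]
Description=Example user-level service

[Service]
ExecStart=/home/user/bin/myapp
Restart=on-failure

[Install]
# default.target is the user-session equivalent of multi-user.target
WantedBy=default.target
```

```bash
# Reload user units and start the service, no sudo needed
systemctl --user daemon-reload
systemctl --user enable --now myapp.service
systemctl --user status myapp.service
```

If the service should keep running while you're logged out, you may also need to enable lingering for your user with `loginctl enable-linger <user>`.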
feat(linux_snippets): Check the docker images sorted by size

```bash
docker images --format "{{.Repository}}:{{.Tag}}\t{{.Size}}" | sort -k2 -h
```

You can also use the builtin `docker system df -v` to get a better understanding of the usage of disk space.

feat(orgzly): Migrate from Orgzly to Orgzly Revived

- [Home](https://www.orgzlyrevived.com/)
- [F-droid page](https://f-droid.org/en/packages/com.orgzlyrevived/)
- [Source](https://github.com/orgzly-revived/orgzly-android-revived)
- [Old Home](https://orgzly.com/)

feat(parkour#Vaults): Introduce Vaults

**[Safety vault](https://yewtu.be/watch?v=f65H4Rr0oD0)**

I'm going to describe one way of doing it, but it could also be done as a mirror image:

- Advance towards the fence
- Place your left hand on the fence with your fingers pointing in the direction you're going
- Jump while placing your right foot on the fence
- Pass your left foot over the fence by pulling your knee close to your chest to prevent your foot from catching on the fence
- Once your left foot is over, let go of your right foot
- Advance with your right hand in the direction of movement (not upwards), until it rests on the bar (your hand should be facing opposite to the left, with fingers pointing downwards)
- Once both feet are free and hands supported, push yourself forward.

*[Thief vault](https://yewtu.be/watch?v=f65H4Rr0oD0)*

*[Speed vault](https://yewtu.be/watch?v=f65H4Rr0oD0)*

*[Kong vault or Cat](https://yewtu.be/watch?v=f65H4Rr0oD0)*

If you're thinking of starting to learn parkour as an adult, there are some things that you need to take into account:

refactor(roadmap_adjustment): Reorder the file

feat(transcoding): Introduce transcoding

**Big disclaimer!!!** I have absolutely no idea of what I'm doing. It's the first time I do this and I'm just recording what I'm learning. Use the contents of this page at your own risk.

I've made this document thinking mainly of using [AV1](#av1) as the video encoding algorithm, `ffmpeg` to do the transcoding and `jellyfin` to stream it, so the content is heavily biased towards that stack.

**Initial guidance into the transcoding world**

*Shall I transcode my library?*

There are some discrepancies on whether it makes sense to transcode your library. In my case I'm going to do it because disks are expensive (people compare with buying one disk, but to upgrade my NAS I need 5, so it would be around 1400$). Here are other opinions: [1](https://www.reddit.com/r/AV1/comments/ymrs5v/id_like_to_encode_my_entire_library_to_av1/)

**[Do's and don'ts](https://wiki.x266.mov/blog/av1-for-dummies#dos--donts)**

Due to a lot of misunderstandings about codecs and compression, there are a lot of common misconceptions that are held regarding video encoding. We'll start by outlining some bad practices:

- Don't encode the same video multiple times. This is a common mistake made by people new to video encoding. Every time you encode a video, you lose additional quality due to generation loss. This is because video codecs are lossy, and every time you encode a video, you lose more information. This is why it is important to keep the original video file if you frequently re-encode it.
- Don't blindly copy settings from others without understanding them. What works for one person's content and workflow may not work for yours. Even the default settings on many encoders are not optimal for most content.
- Don't assume that higher bitrate equates to better quality. Inefficient encoding can waste bits without improving visual quality, and efficient encoding can make lower bitrate video look drastically better than higher bitrate video using the same codec.
- Don't assume all encoders/presets/settings/implementations are created equal. Even given two encoding frameworks that use the same underlying encoder, you may achieve different results given encoder version mismatches or subtly different settings used under the hood.
- Don't use unnecessarily slow presets/speeds unless you have a specific need and ample time. While slower presets improve encoding efficiency most of the time, the quality gains reach a point of diminishing returns. Use the slowest preset you can tolerate, not the slowest preset available.
- Don't blindly trust metric scores.
  It is unfortunate how trusted VMAF is considering how infrequently it correlates with visual fidelity in practice now that it has become so popular. Even the beloved SSIMULACRA2 is not a perfect one-to-one with the human eye.

Now, let's move on to some good practices:

- Experiment with different settings and compare the results.
- Consider your content type when choosing encoding settings. Film, animation, and sports all have different characteristics that benefit from distinct approaches.
- Keep your encoding software up-to-date; the encoding world moves quickly.

**[Types of transcoding](https://jellyfin.org/docs/general/server/transcoding/#types-of-transcoding)**

There are four types of playback in Jellyfin, three of which involve transcoding. The type being used will be listed in the dashboard when playing a file. They are ordered below from lowest to highest load on the server:

- Direct Play: Delivers the file without transcoding. There is no modification to the file and almost no additional load on the server.
- Remux: Changes the container but leaves both audio and video streams untouched.
- Direct Stream: Transcodes audio but leaves the original video untouched.
- Transcode: Transcodes the video stream.

**Shall I use CPU or GPU to do the transcoding?**

The choice between **CPU** or **GPU** for transcoding depends on your specific needs, including speed, quality, and hardware capabilities. Both have their pros and cons, and the best option varies by situation.

*CPU-Based Transcoding*

Pros:

- **High Quality**: Software encoders like `libx264`, `libx265`, or `libsvtav1` produce better compression and visual quality at the same bitrate compared to GPU-based encoders.
- **Flexibility**: Supports a wider range of encoding features, tuning options, and codecs.
- **Optimized for Low Bitrates**: CPU encoders handle complex scenes more effectively, producing fewer artifacts at lower bitrates.
- **No Dedicated Hardware Required**: Works on any modern system with a CPU.

Cons:

- **Slower Speed**: CPU encoding is much slower, especially for high-resolution content (e.g., 4K or 8K).
- **High Resource Usage**: Consumes significant CPU resources, leaving less processing power for other tasks.

Best Use Cases:

- High-quality archival or master files.
- Transcoding workflows where quality is the top priority.
- Systems without modern GPUs or hardware encoders.

*GPU-Based Transcoding*

Pros:

- **Fast Encoding**: Hardware-accelerated encoders like NVIDIA NVENC, AMD VCE, or Intel QuickSync can encode video much faster than CPUs.
- **Lower Resource Usage**: Frees up the CPU for other tasks during encoding.
- **Good for High-Resolution Video**: Handles 4K or even 8K video with ease.
- **Low Power Consumption**: GPUs are optimized for parallel processing, often consuming less power per frame encoded.

Cons:

- **Lower Compression Efficiency**: GPU-based encoders often produce larger files or lower quality compared to CPU-based encoders at the same bitrate.
- **Limited Features**: Fewer tuning options and sometimes less flexibility in codec support.
- **Artifact Risk**: May introduce visual artifacts, especially in complex scenes or at low bitrates.

Best Use Cases:

- Streaming or real-time encoding.
- High-volume batch transcoding where speed matters more than maximum quality.
- Systems with capable GPUs (e.g., NVIDIA GPUs with Turing or Ampere architecture).
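To make the trade-off concrete, here is a hedged sketch of what each path looks like in ffmpeg. The flags assume a recent ffmpeg built with NVENC support, and the codecs and quality values are illustrative, not tuned:

```bash
# CPU (software) encode: slower, better quality per bit
ffmpeg -i input.mkv -c:v libsvtav1 -preset 5 -crf 30 cpu_out.mkv

# GPU (NVENC hardware) encode: much faster, less efficient compression
ffmpeg -i input.mkv -c:v hevc_nvenc -preset p5 -cq 28 gpu_out.mkv
```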
*Quality vs. Speed*

| **Factor**             | **CPU Encoding** | **GPU Encoding**            |
| ---------------------- | ---------------- | --------------------------- |
| **Speed**              | Slower           | Much faster                 |
| **Quality**            | Higher           | Good but not the best       |
| **Bitrate Efficiency** | Better           | Less efficient              |
| **Compatibility**      | Broader          | Limited by hardware support |
| **Power Consumption**  | Higher           | Lower                       |

*Conclusion*

- For **quality-focused tasks** such as a whole library encoding: use CPU-based encoding.
- For **speed-focused tasks** such as streaming with Jellyfin: use GPU-based encoding.

For more information read [1](https://www.reddit.com/r/PleX/comments/16w1hsz/cpu_vs_gpu_whats_the_smartest_choice/).

**[Video transcoding algorithms](https://jellyfin.org/docs/general/clients/codec-support/)**

[This guide](https://developer.mozilla.org/en-US/docs/Web/Media/Formats/Video_codecs) introduces the video codecs you're most likely to encounter or consider using. I'm only going to analyse the ones that I might use.

When deciding which one to use, check [Jellyfin's video compatibility table](https://jellyfin.org/docs/general/clients/codec-support/#video-compatibility) or [test your browser's compatibility for any codec profile](https://cconcolato.github.io/media-mime-support/). Leaving aside H.264 8-bit, AV1 is the codec with the broadest compatibility support, except on iOS (but who cares about them?).

**[AV1](https://wiki.x266.mov/blog/av1-for-dummies)**

[AV1](https://wiki.x266.mov/docs/video/AV1) is a royalty-free video codec developed by the Alliance for Open Media. It is designed to replace VP9 and presently competes with H.266. AV1 is known for its high compression efficiency, which the marketing will have you believe reduces file sizes by up to 50% compared to H.264 and up to 30% compared to H.265 across the board. It is supported by several major browsers and is widely used across many streaming services and video platforms.

Before we dive in, it is important to understand why you may want to use AV1 instead of other codecs. The reality is that AV1 is not better than H.264/5 in every single scenario; video encoding is a complicated field, and the best codec for you will depend on your specific needs. AV1 excels in:

- Low to medium-high fidelity encoding
- Higher resolution encoding
- Encoding content with very little grain or noise
- Slow, non-realtime contexts (e.g. offline encoding)

The enumeration above still consists of broad strokes, but the point is to understand that AV1 is not a silver bullet. It will not automatically make your videos smaller while preserving your desired quality. To make things more difficult, the x264 & x265 encoders are very mature, while AV1 encoding efforts designed to meet the extremely complicated needs of the human eye are still in their infancy.

**AV1 current problems**

The first problem I've seen is that Unmanic doesn't support AV1 very well; you need to write your own ffmpeg configuration (which is fine). Some issues that track this are [1](https://github.com/Unmanic/unmanic-plugins/issues/333), [2](https://github.com/Unmanic/unmanic/issues/181), [3](https://github.com/Unmanic/unmanic/issues/390), [4](https://github.com/Unmanic/unmanic/issues/471) or [5](https://github.com/Unmanic/unmanic-plugins/issues?q=is%3Aissue+is%3Aopen+av1).

**[AV1 probably outdated problems](https://gist.github.com/dvaupel/716598fc9e7c2d436b54ae00f7a34b95#current-problems)**

The original article dates from early 2022, so it's probably outdated given the speed of this world.
- 10-bit playback performance is not reliable enough on average consumer hardware.
- AV1 tends to smooth video noticeably, even at high bitrates. This is especially apparent in scenes with rain, snow etc., where it is very hard to conserve details.
- Grainy movies are a weak point of AV1. Even with film grain synthesis enabled, it is very hard to achieve satisfying results.

At the moment (early 2022, SVT-AV1 v0.9.0) it is a fairly promising, but not perfect, codec. It is great for most regular content, achieves great quality at small file sizes and apparently keeps its promise of being considerably more efficient than HEVC. The only area where it must still improve is grainy, detailed movie scenes. With such high-bitrate, bluray-quality source material it's hard to achieve visual transparency. If grain synthesis has improved enough and smooth decoding is possible on most devices, it can be generally recommended. For now, it is still in the late experimental phase.

feat(transcoding#AV1 encoders): AV1 encoders

The world of AV1 encoding is diverse and complex, with several open-source encoders available, each bringing its own set of strengths, weaknesses, and unique features to the table:

- SVT-AV1
- rav1e
- aomenc (libaom)
- SVT-AV1-PSY

Understanding these encoders is crucial for making informed decisions about what best suits your specific encoding needs.

**SVT-AV1**

[SVT-AV1](https://wiki.x266.mov/docs/encoders/SVT-AV1) (Scalable Video Technology for AV1) is an AV1 encoder library and application developed by Intel, Netflix, and others. It has gained significant popularity in the encoding community due to its impressive balance of speed, quality, and scalability.

Links:

- Wiki page: [SVT-AV1](https://wiki.x266.mov/docs/encoders/SVT-AV1)
- Git repository: https://gitlab.com/AOMediaCodec/SVT-AV1
- Documentation: https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/README.md

1. **Performance & Scalability**
   - SVT-AV1 is renowned for its encoding speed, particularly at higher speed presets.
   - It leverages parallel processing, making it exceptionally efficient on multi-core systems. Fun fact: SVT-AV1's parallel processing is lossless, so it doesn't compromise quality for speed.
2. **Quality-to-Speed Ratio**
   - SVT-AV1 strikes an impressive balance between encoding speed and output quality.
   - At faster presets, it usually outperforms other encoders in quality per unit of encoding time.
   - While it may not achieve the absolute highest *quality per bit* possible, its quality is generally considered impressive for its speed.
3. **Flexibility**
   - SVT-AV1 offers a wide range of encoding options and presets, allowing fine-tuned control over the encoding process.
   - It provides 14 presets (0-13), with 0 being the slowest and highest quality, and 13 being the fastest but lowest quality.
   - Advanced options allow users to adjust parameters like hierarchical levels, intra-refresh type, and tuning modes.
4. **Continuous Development**
   - SVT-AV1 receives frequent updates and optimizations, with new releases often coming alongside big changes.
   - The open-source nature of the project encourages community contributions and rapid feature development.

SVT-AV1 is an excellent choice for a wide range of encoding scenarios.
It's particularly well-suited for:

- High-volume encoding operations where speed is crucial
- Live or near-live encoding of high-resolution content
- Scenarios where a balance between quality and encoding speed is required
- Users with multi-core systems who want to leverage their hardware efficiently

Some downsides include:

- Higher memory usage compared to other encoders
- The developers assess quality via its performance on traditional legacy metrics, which harms its perceptual fidelity ceiling.

**rav1e**

[rav1e](https://wiki.x266.mov/docs/encoders/rav1e) is an AV1 encoder written in Rust & Assembly. Developed by the open-source community alongside Xiph, it brings a unique approach to AV1 encoding with its focus on safety and correctness.

Links:

- Wiki page: [rav1e](https://wiki.x266.mov/docs/encoders/rav1e)
- Git repository: https://github.com/xiph/rav1e
- Documentation: https://github.com/xiph/rav1e/tree/master/doc#readme

1. **Safety & Reliability**
   - Being written in Rust, rav1e emphasizes memory safety and thread safety.
   - This focus on safety translates to a more stable and reliable encoding process, with reduced risks of crashes or undefined behavior.
2. **High Fidelity**
   - At high fidelity targets, an area where AV1 usually lacks, rav1e is a strong contender compared to other encoders.
   - It excels in preserving fine details and textures, making it a good choice for high-fidelity encoding.
3. **Quality**
   - While not typically matching aomenc or SVT-AV1 in pure compression efficiency, rav1e can produce high-quality output videos.
   - It often achieves a good balance between quality and encoding time, especially at medium-speed settings.
4. **Perceptually Driven**
   - rav1e's development is driven by visual fidelity, without relying heavily on metrics.
   - This focus on perceptual quality leads to a stronger foundation for future potential improvements in visual quality, as well as making the encoder very easy to use as it does not require excessive tweaking.

rav1e is well-suited for:

- Projects where stability is paramount
- Users who prioritize a community-driven, open-source development approach
- Encoding tasks where a balance between quality and speed is needed, but the absolute fastest speeds are not required

Some limitations of rav1e include:

- Lagging development compared to other encoders
- Slower encoding speeds compared to SVT-AV1 at similar quality & size
- Fewer advanced options compared to other encoders

**aomenc (libaom)**

[aomenc](https://wiki.x266.mov/docs/encoders/aomenc), based on the libaom library, is the reference encoder for AV1. Developed by the Alliance for Open Media (AOM), it is the benchmark for AV1 encoding quality and compliance.

Links:

- Wiki page: [aomenc](https://wiki.x266.mov/docs/encoders/aomenc)
- Git repository: https://aomedia.googlesource.com/aom/

1. **Encoding Quality**
   - aomenc is widely regarded as the gold standard for AV1 encoding quality.
   - It often achieves high compression efficiency among AV1 encoders, especially at slower speed settings.
   - The encoder squeezes out nearly every last bit of efficiency from the AV1 codec, making it ideal for archival purposes or when quality per bit is critical.
2. **Encoding Speed**
   - aomenc is generally the slowest among major AV1 encoders.
   - It offers 13 CPU speed levels (0-12), but even at its fastest settings, it's typically slower than other encoders at their slower settings.
   - The slow speed is often considered a trade-off for its high compression efficiency.
3. **Extensive Options**
   - As the reference implementation, aomenc offers the most comprehensive encoding options.
   - It provides fine-grained control over nearly every aspect of the AV1 encoding process.
   - Advanced users can tweak many parameters to optimize for specific content types or encoding scenarios.
4. **Flexibility**
   - Being the reference encoder, aomenc produces highly standards-compliant AV1 bitstreams that take advantage of the full arsenal of AV1 features.
   - It supports 4:2:0 and 4:4:4 chroma subsampling, 8- to 12-bit color depth, and various other advanced features that more specialized encoders like SVT-AV1 do not support.

aomenc is ideal for:

- Scenarios where achieving the highest possible quality is the primary goal
- Archival encoding where compression efficiency is crucial
- Research and development in video compression
- Encoding projects where encoding time is not a significant constraint

Some drawbacks of aomenc include:

- Unresponsive development driven by legacy metrics, leading to slower adoption of new techniques and ignoring improvements communicated by people outside the Google development team
- Cripplingly difficult to use for beginners, with a culture of cargo-culting settings
- Slow encoding speeds compared to other AV1 encoders, which has less of an impact on the quality of the output than it used to have compared to maturing encoders like SVT-AV1

**SVT-AV1-PSY**

[SVT-AV1-PSY](https://wiki.x266.mov/docs/encoders/svt-av1-psy) is a community fork of the SVT-AV1 encoder focused on psychovisual optimizations to enhance perceived visual quality. It aims at closing the distance between SVT-AV1's high speeds and the perceptual quality of aomenc's slow brute-force approach.

Links:

- Wiki page: [SVT-AV1-PSY](https://wiki.x266.mov/docs/encoders/svt-av1-psy)
- Git repository: https://github.com/gianni-rosato/svt-av1-psy
- Documentation: https://github.com/gianni-rosato/svt-av1-psy/blob/master/Docs/PSY-Development.md

1. **Perceptual Optimizations**
   - SVT-AV1-PSY introduces various psychovisual enhancements to improve the perceived quality of encoded video.
   - These optimizations often result in output that looks better to the human eye, even if it might not always score as well in objective metrics.
2. **Additional Features**
   - Introduces new options like variance boost, which can help maintain detail in high-contrast scenes.
   - Offers alternative curve options for more nuanced control over the encoding process.
   - Extends the CRF (Constant Rate Factor) range to 70 (from 63 in mainline SVT-AV1), allowing for extremely low-bitrate encodes.
   - Introduces additional tuning options, including a new "SSIM with Subjective Quality Tuning" mode that can improve perceived quality.
3. **Visual Fidelity Focus**
   - Aims to produce more visually pleasing results, sometimes at the expense of metric performance.
   - Includes options like sharpness adjustment and adaptive film grain synthesis which can significantly impact the visual characteristics of the output.
   - Features modified defaults driven by perceptual quality considerations.
4. **Extended HDR Support**
   - Includes built-in support for Dolby Vision & HDR10+ encoding.
   - This makes it particularly useful for encoding HDR content without requiring additional post-processing steps or external tools.
5. **Performance**
   - Based on SVT-AV1, it retains the performance characteristics of its parent encoder.
   - Adds super slow presets (-2 and -3) for research purposes and extremely high-quality encoding.
     These additional presets can be useful for creating reference encodes or applications where encoding time is not a concern.

SVT-AV1-PSY is particularly well-suited for:

- Encoding scenarios where subjective visual quality is prioritized over pure metric performance
- HDR content encoding in Dolby Vision or HDR10+
- Users who want fine-grained control over psychovisual aspects of encoding
- Projects that require a balance between the speed of SVT-AV1 and enhanced visual quality
- Encoding challenging content with complex textures or high-contrast scenes

Some drawbacks are:

- Everything that applies to SVT-AV1, including the lack of support for 4:4:4 chroma subsampling and 12-bit color depth that are useful in specific scenarios
- There are no official ffmpeg builds with this fork, but there are community-built binaries:
  - https://github.com/Uranite/FFmpeg-Builds
  - https://github.com/Nj0be/HandBrake-SVT-AV1-PSY
  - https://www.reddit.com/r/AV1/comments/1fppk4d/ffmpeg_with_svtav1_psy_latest_for_windows_10_x64/

**Conclusion**

While SVT-AV1 is known for being fast, aomenc is renowned for its high-quality output, and rav1e is recognized for its safety and reliability, each encoder has strengths and weaknesses. The best encoder for you will depend on your specific needs and priorities.

I'll go with SVT-AV1 given that it is the standard maintained by the community, that SVT-AV1-PSY is not available by default in the tools that I am going to use (you'd need to compile it and maintain it), and that I probably won't notice the difference.

**x265 (HEVC)**

- It's not royalty-free: the HEVC codec is patent-encumbered.

**Conclusion**

I'll use AV1 because:

- It's an open-source, royalty-free codec
- It's one of the most modern encoders
- It has a wide range of compatibility support
- It has a better compression rate than x265 and it seems to be faster?

feat(transcoding#Transcoding parameter comparison): Transcoding parameter comparison

**Check the original file transcoding information**

To investigate the encoding details of an existing video file, you can use FFprobe (which comes with FFmpeg):

```bash
ffprobe -v quiet -print_format json -show_streams -show_format file_to_test.mp4
```

The metadata isn't always perfectly reliable, especially if the file has been modified or re-encoded multiple times.

**CRF comparison**

Try to use CRF (Constant Rate Factor) for offline encoding, as opposed to CBR (Constant Bitrate) or VBR (Variable Bitrate). While the latter two are effective for precisely targeting a particular bitrate, CRF is more effective at targeting a specific quality level efficiently. So avoid the `-b:v 0` `ffmpeg` flag.

Check these articles [1](https://ottverse.com/analysis-of-svt-av1-presets-and-crf-values/), [2](https://academysoftwarefoundation.github.io/EncodingGuidelines/EncodeAv1.html#crf-comparison-for-libsvtav1) and [3](https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/Ffmpeg.md) for an interesting comparison between `preset` and `crf` values when encoding in AV1. A good starting point for 1080p video is `crf=30`.

**Can you compare values of CRF between video algorithms?**

No, CRF (Constant Rate Factor) values are not directly comparable between different encoding algorithms like AV1 and H.265. Each codec has its own unique compression characteristics, so a CRF of 32 in AV1 will look different from a CRF of 32 in H.265, and similarly, a CRF of 21 will have different visual qualities.

Here's a more detailed explanation:

1. Codec-Specific Compression:
   - Each codec (AV1, H.265, H.264, etc.) has its own compression algorithm and efficiency
   - AV1 is generally more efficient at compression compared to H.265
   - This means a higher CRF value in AV1 might produce similar or better quality compared to a lower CRF in H.265
2. Rough Equivalence Guidelines: While not exact, here are some rough comparative CRF ranges:
   - H.264: CRF 18-28
   - H.265: CRF 20-28
   - AV1: CRF 25-35

**Script to see the quality difference by changing the CRF**

It's recommended to test encodes at different CRF values and compare for your specific content:

- File size versus visual quality
- Review objective quality metrics

You can use the next script for that:

```bash
input_video="$1"
initial_time="00:15:00"
probe_length="00:01:00"

ffmpeg -i "$input_video" -ss $initial_time -t "$probe_length" -c copy original_segment.mkv

crfs=(28 30 32 34 36)

echo "CRF Comparison:" >size_comparison.log
echo "Original Segment Size:" >>size_comparison.log
original_size=$(du -sh original_segment.mkv)
echo "$original_size" >>size_comparison.log
echo "-------------------" >>size_comparison.log

echo "CRF Quality Metrics Comparison:" >quality_metrics.log
echo "-------------------" >>quality_metrics.log

for crf in "${crfs[@]}"; do
  # Encode test segments
  ffmpeg -i "$input_video" \
    -ss "$initial_time" -t "$probe_length" \
    -c:v libsvtav1 \
    -preset 5 \
    -crf $crf \
    -g 240 \
    -pix_fmt yuv420p10le \
    preview_crf_${crf}.mkv

  # Create a side by side comparison to see the differences
  ffmpeg -i original_segment.mkv -i preview_crf_${crf}.mkv \
    -filter_complex \
    "[0:v][1:v]hstack=inputs=2[v]" \
    -map "[v]" \
    -c:v libx264 \
    side_by_side_comparison_original_vs_crf_${crf}.mkv

  # Log file size
  size=$(du -h preview_crf_${crf}.mkv | cut -f1)
  echo "CRF $crf: $size" >>size_comparison.log

  # Measure PSNR and SSIM
  ffmpeg -i original_segment.mkv -i preview_crf_${crf}.mkv \
    -filter_complex "[0:v][1:v]psnr=stats_file=psnr_${crf}.log;[0:v][1:v]ssim=stats_file=ssim_${crf}.log" -f null -

  psnr_value=$(grep "psnr_avg:" psnr_${crf}.log | awk -F'psnr_avg:' '{sum += $2; count++} END {if (count > 0) print sum / count; else print "N/A"}')
  ssim_value=$(grep "All:" ssim_${crf}.log | awk -F'All:' '{sum += $2; count++} END {if (count > 0) print sum / count; else print "N/A"}')

  # Log the results
  echo "CRF $crf:" >>quality_metrics.log
  echo "  PSNR: ${psnr_value:-N/A}" >>quality_metrics.log
  echo "  SSIM: ${ssim_value:-N/A}" >>quality_metrics.log
  echo "-------------------" >>quality_metrics.log
done

ffmpeg -i preview_crf_${crfs[1]}.mkv -i preview_crf_${crfs[-1]}.mkv \
  -filter_complex \
  "[0:v][1:v]hstack=inputs=2[v]" \
  -map "[v]" \
  -c:v libx264 \
  side_by_side_comparison_crf_${crfs[1]}_vs_crf_${crfs[-1]}.mkv

ffmpeg -i preview_crf_${crfs[1]}.mkv -i preview_crf_${crfs[-1]}.mkv \
  -filter_complex "[0:v][1:v]psnr=stats_file=psnr_crfs_max_difference.log;[0:v][1:v]ssim=stats_file=ssim_crfs_max_difference.log" -f null -

psnr_value=$(grep "psnr_avg:" psnr_crfs_max_difference.log | awk -F'psnr_avg:' '{sum += $2; count++} END {if (count > 0) print sum / count; else print "N/A"}')
ssim_value=$(grep "All:" ssim_crfs_max_difference.log | awk -F'All:' '{sum += $2; count++} END {if (count > 0) print sum / count; else print "N/A"}')

echo "CRF ${crfs[1]} vs CRF ${crfs[-1]}:" >>quality_metrics.log
echo "  PSNR: ${psnr_value:-N/A}" >>quality_metrics.log
echo "  SSIM: ${ssim_value:-N/A}" >>quality_metrics.log
echo "-------------------" >>quality_metrics.log

cat size_comparison.log
cat quality_metrics.log
```

It will:

- Create an `original_segment.mkv` segment so you can compare the output
- Create a series of `preview_crf_XX.mkv` files with a 1 minute preview of the content
- Create a list of `side_by_side_comparison` files to visually see the difference between the CRF factors
- Create a quality metrics report `quality_metrics.log` by analysing the SSIM and PSNR tests (explained below)
- Create a `size_comparison.log` file to see the differences in size

**Results**

**CRF vs file size**

I've run this script against three files with the next results:

- Big non-animation file: "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10", 1920x1080, profile high, bit rate 23439837

  ```
  Original Segment Size:
  152M original_segment.mp4
  -------------------
  CRF 28: 12M
  CRF 30: 9.9M
  CRF 32: 8.5M
  CRF 34: 7.4M
  CRF 36: 6.6M
  ```

- Big animation file: "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10", 1920x1038, profile high

  ```
  Original Segment Size:
  136M original_segment.mp4
  -------------------
  CRF 28: 12M
  CRF 30: 9.5M
  CRF 32: 8.0M
  CRF 34: 6.8M
  CRF 36: 6.0M
  ```

- Small non-animation file:

  ```
  Original Segment Size:
  5.9M original_segment.mp4
  -------------------
  CRF 28: 5.4M
  CRF 30: 4.9M
  CRF 32: 4.5M
  CRF 34: 4.1M
  CRF 36: 3.7M
  ```

**Compare the SSIM and PSNR results**

PSNR (Peak Signal-to-Noise Ratio) measures the quality of the reconstruction of the image. It calculates the ratio between the maximum possible signal (pixel values) and the noise (the error between the original and reconstructed image). Higher PSNR means better quality. The range typically goes between 30-50 dB, where 40 dB is considered excellent and 35 dB is good, while less than 30 dB indicates noticeable quality loss.

[SSIM (Structural Similarity Index Measure)](https://en.wikipedia.org/wiki/Structural_similarity_index_measure) evaluates the perceptual similarity between two images by considering luminance, contrast, and structure. It ranges from -1 to 1, where 1 means identical images. Typically, > 0.95 is considered very good.

- Big non-animation file:

  ```
  CRF Quality Metrics Comparison:
  -------------------
  CRF 28:
    PSNR: 37.6604
    SSIM: 0.963676
  -------------------
  CRF 30:
    PSNR: 37.6572
    SSIM: 0.96365
  -------------------
  CRF 32:
    PSNR: 37.6444
    SSIM: 0.963535
  -------------------
  CRF 34:
    PSNR: 37.6306
    SSIM: 0.963408
  -------------------
  CRF 36:
    PSNR: 37.6153
    SSIM: 0.963276
  -------------------
  CRF 30 vs CRF 36:
    PSNR: 51.0188
    SSIM: 0.996617
  -------------------
  ```

- Big animation file:

  ```
  CRF Quality Metrics Comparison:
  -------------------
  CRF 28:
    PSNR: 34.6944
    SSIM: 0.904112
  -------------------
  CRF 30:
    PSNR: 34.6695
    SSIM: 0.903986
  -------------------
  CRF 32:
    PSNR: 34.6388
    SSIM: 0.903787
  -------------------
  CRF 34:
    PSNR: 34.612
    SSIM: 0.903616
  -------------------
  CRF 36:
    PSNR: 34.5822
    SSIM: 0.903423
  -------------------
  CRF 30 vs CRF 36:
    PSNR: 49.5002
    SSIM: 0.99501
  -------------------
  ```

- Small non-animation file:

  ```
  CRF Quality Metrics Comparison:
  -------------------
  CRF 28:
    PSNR: 35.347
    SSIM: 0.957198
  -------------------
  CRF 30:
    PSNR: 35.3302
    SSIM: 0.957124
  -------------------
  CRF 32:
    PSNR: 35.3035
    SSIM: 0.956993
  -------------------
  CRF 34:
    PSNR: 35.2767
    SSIM: 0.956848
  -------------------
  CRF 36:
    PSNR: 35.2455
    SSIM: 0.95666
  -------------------
  CRF 30 vs CRF 36:
    PSNR: 49.4795
    SSIM: 0.995958
  -------------------
  ```

**CRF conclusion**

In all cases the PSNR and SSIM values are very good, and there is not much variability between the different CRF values; the metrics seem to change more based on the type of video, with worse results for animation. The side by side comparisons in the three cases returned similar results, and my untrained eye was not able to catch big issues.
I have to say that my screens are not very good and I'm not very picky.

The size reduction is in fact astonishing, being greater as the size of the original file increases, so it makes sense to transcode the biggest files first.

As I don't see many changes, and the recommended setting is CRF 30, I'll stick with it.

**Preset comparison**

This parameter governs the efficiency/encode-time trade-off. Lower presets will result in an output with better quality for a given file size, but will take longer to encode. Higher presets can result in a very fast encode, but will make some compromises on visual quality for a given CRF value.

Since SVT-AV1 0.9.0, supported presets range from 0 to 13, with higher numbers providing a higher encoding speed. Note that preset 13 is only meant for debugging and running fast convex-hull encoding. In versions prior to 0.9.0, valid presets are 0 to 8.

After checking the above articles [1](https://ottverse.com/analysis-of-svt-av1-presets-and-crf-values/), [2](https://academysoftwarefoundation.github.io/EncodingGuidelines/EncodeAv1.html#crf-comparison-for-libsvtav1), [3](https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/Ffmpeg.md) and [4](https://wiki.x266.mov/blog/svt-av1-third-deep-dive) I feel that a preset of `4` is the sweet spot, going to `3` for a little better quality or `5` for a little worse.

The AOMediaCodec people agree: "Presets between 4 and 6 offer what many people consider a reasonable trade-off between quality and encoding time".

And [wiki x266 also agrees](https://wiki.x266.mov/blog/svt-av1-third-deep-dive):

> It appears as if once again preset 2 through preset 4 remain the most balanced presets all-around in an efficient encoding scenario, with preset 3 not offering much improvements over preset 4 in average scores but nicely improving on consistency instead, and preset 2 offering a nice efficiency and consistency uplift on top. Clear quality gains can be observed as we decrease presets, until preset 2, however the effectiveness of dropping presets is noticeably less and less important as quality is increased.

**[Keyframe interval](https://trac.ffmpeg.org/wiki/Encode/AV1#Keyframeplacement)**

This is the `-g` flag in ffmpeg. It governs how many frames will pass before the encoder will add a key frame. Key frames include all information about a single image. Other (delta) frames store only differences between one frame and another. Key frames are necessary for seeking and for error resilience (in VOD applications). More frequent key frames will make the video quicker to seek and more robust, but they will also increase the file size.

For VOD, setting a key frame once per second or so is a common choice. In other contexts, less frequent key frames (such as 5 or 10 seconds) are preferred. Anything up to 10 seconds is considered reasonable for most content, so for 30 frames per second content one would use `-g 300`, for 60 fps content `-g 600`, etc.

By default, SVT-AV1's keyframe interval is 2-3 seconds, which is quite short for most use cases. Consider changing this up to 5 seconds (or higher) with the `-g` option; `-g 120` for 24 fps content, `-g 150` for 30 fps, etc. Using a higher GOP size via the `-g` ffmpeg parameter results in a more efficient encode in terms of quality per bitrate, at the cost of seeking performance.

A common rule of thumb among hobbyists is to use ten times the framerate of the video, but not more than `300`. The 3 files I'm analysing use 24 fps, so I'll use `-g 240`.
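As a quick sketch of that rule of thumb applied in an ffmpeg call (file names are placeholders; `preset` and `crf` are the values chosen in the previous sections):

```bash
# 10x the framerate, capped at 300 (24 fps -> -g 240)
fps=24
g=$(( fps * 10 < 300 ? fps * 10 : 300 ))

ffmpeg -i input.mkv \
  -c:v libsvtav1 -preset 4 -crf 30 -g "$g" \
  output.mkv
```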
**Film grain**

Consider using grain synthesis for grainy content, as AV1 can struggle with preserving film grain efficiently, especially when encoding high-quality content like old films or videos with a lot of grain detail. This is due to the fact that AV1's compression algorithms are optimized for clean, detailed content, and they tend to remove or smooth out film grain to achieve better compression ratios. Grain synthesis is the process of adding synthetic grain to a video to simulate or preserve the original grain structure.

*[SVT-AV1 guidelines](https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/Ffmpeg.md)*

The `film-grain` parameter enables this behavior. Setting it to a higher level does so more aggressively. Very high levels of denoising can result in the loss of some high-frequency detail, however.

[The film-grain synthesis](https://trac.ffmpeg.org/wiki/Encode/AV1#Filmgrainsynthesis) feature is invoked with `-svtav1-params film-grain=X`, where `X` is an integer from `1` to `50`. Higher numbers correspond to higher levels of denoising for the grain synthesis process and thus a higher amount of grain.

The grain denoising process can remove detail as well, especially at the high values that are required to preserve the look of very grainy films. This can be mitigated with the `film-grain-denoise=0` option, passed via `svtav1-params`. While by default the denoised frames are passed on to be encoded as the final pictures (`film-grain-denoise=1`), turning this off will lead to the original frames being used instead. Passing `film-grain-denoise=0` may result in higher fidelity by disabling source denoising. In that case, the correct film-grain level is important, because a more conservative smoothing process is used: too high a film-grain level can lead to noise stacking.

The AOMediaCodec people suggest that a value of 8 is a reasonable starting point for live-action video with a normal amount of grain. Higher values in the range of 10-15 enable more aggressive use of this technique for video with lots of natural grain. For 2D animation, lower values in the range of 4-6 are often appropriate. If the original video does not have natural grain, this parameter can be omitted.

There are two types of video that are called "animation": hand-drawn 2D and 3D animated. Both types tend to be easy to encode (meaning the resulting file will be small), but for different reasons. 2D animation often has large areas that do not move, so the difference between one frame and another is often small. In addition, it tends to have low levels of grain. Experience has shown that relatively high CRF values with low levels of film-grain produce 2D animation results that are visually good.

3D animation has much more detail and movement, but it sometimes has no grain whatsoever, or only small amounts that were purposely added to the image. If the original animated video has no grain, encoding without film-grain will increase encoding speed and avoid the possible loss of fine detail that can sometimes result from the denoising step of the synthetic grain process.
For more information on the synthetic grain check [this appendix](https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/Appendix-Film-Grain-Synthesis.md).

**Film grain analysis**

Using a similar script to the one for the CRF comparison, we can see the differences in film grain:

```bash
input_video="$1"
initial_time="00:15:00"
probe_length="00:01:00"

ffmpeg -i "$input_video" -ss $initial_time -t "$probe_length" -c copy original_segment.mkv

grains=(0 4 8 12)

echo "Grain Comparison:" >size_comparison.log
echo "Original Segment Size:" >>size_comparison.log
original_size=$(du -sh original_segment.mkv)
echo "$original_size" >>size_comparison.log
echo "-------------------" >>size_comparison.log

echo "Grain Quality Metrics Comparison:" >quality_metrics.log
echo "-------------------" >>quality_metrics.log

for grain in "${grains[@]}"; do
  # Encode test segments
  ffmpeg -i "$input_video" \
    -ss "$initial_time" -t "$probe_length" \
    -c:v libsvtav1 \
    -preset 4 \
    -crf 30 \
    -g 240 \
    -pix_fmt yuv420p10le \
    -c:a libopus \
    -b:a 192k \
    -svtav1-params tune=0:film-grain=${grain} \
    preview_grain_${grain}.mkv

  # Create a side by side comparison to see the differences
  ffmpeg -i original_segment.mkv -i preview_grain_${grain}.mkv \
    -filter_complex \
    "[0:v][1:v]hstack=inputs=2[v]" \
    -map "[v]" \
    -c:v libx264 \
    side_by_side_comparison_original_vs_grain_${grain}.mkv

  # Log file size
  size=$(du -h preview_grain_${grain}.mkv | cut -f1)
  echo "grain $grain: $size" >>size_comparison.log

  # Measure PSNR and SSIM
  ffmpeg -i original_segment.mkv -i preview_grain_${grain}.mkv \
    -filter_complex "[0:v][1:v]psnr=stats_file=psnr_${grain}.log;[0:v][1:v]ssim=stats_file=ssim_${grain}.log" -f null -

  psnr_value=$(grep "psnr_avg:" psnr_${grain}.log | awk -F'psnr_avg:' '{sum += $2; count++} END {if (count > 0) print sum / count; else print "N/A"}')
  ssim_value=$(grep "All:" ssim_${grain}.log | awk -F'All:' '{sum += $2; count++} END {if (count > 0) print sum / count; else print "N/A"}')

  # Log the results
  echo "grain $grain:" >>quality_metrics.log
  echo "  PSNR: ${psnr_value:-N/A}" >>quality_metrics.log
  echo "  SSIM: ${ssim_value:-N/A}" >>quality_metrics.log
  echo "-------------------" >>quality_metrics.log
done

ffmpeg -i preview_grain_${grains[1]}.mkv -i preview_grain_${grains[-1]}.mkv \
  -filter_complex \
  "[0:v][1:v]hstack=inputs=2[v]" \
  -map "[v]" \
  -c:v libx264 \
  side_by_side_comparison_grain_${grains[1]}_vs_grain_${grains[-1]}.mkv

ffmpeg -i preview_grain_${grains[1]}.mkv -i preview_grain_${grains[-1]}.mkv \
  -filter_complex "[0:v][1:v]psnr=stats_file=psnr_grains_max_difference.log;[0:v][1:v]ssim=stats_file=ssim_grains_max_difference.log" -f null -

psnr_value=$(grep "psnr_avg:" psnr_grains_max_difference.log | awk -F'psnr_avg:' '{sum += $2; count++} END {if (count > 0) print sum / count; else print "N/A"}')
ssim_value=$(grep "All:" ssim_grains_max_difference.log | awk -F'All:' '{sum += $2; count++} END {if (count > 0) print sum / count; else print "N/A"}')

echo "grain ${grains[1]} vs grain ${grains[-1]}:" >>quality_metrics.log
echo "  PSNR: ${psnr_value:-N/A}" >>quality_metrics.log
echo "  SSIM: ${ssim_value:-N/A}" >>quality_metrics.log
echo "-------------------" >>quality_metrics.log

cat size_comparison.log
cat quality_metrics.log
```

*Old anime movie*

The quality results give:

```
Grain Quality Metrics Comparison:
-------------------
grain 0:
  PSNR: 34.6503
  SSIM: 0.902562
-------------------
grain 4:
  PSNR: 34.586
  SSIM: 0.900882
-------------------
grain 8:
  PSNR: 34.446
  SSIM: 0.897354
-------------------
grain 12:
  PSNR: 34.2961
  SSIM: 0.893525
------------------- grain 4 vs grain 12: PSNR: 52.005 SSIM: 0.994478 ------------------- ``` The quality metrics show that with more grain the output file is less similar than the original, but the side by side comparison shows that even a grain of 12 is less noise than the original. It can be because the movie is old. `grain == 0` ![](anime-old-grain-0.jpg) `grain == 8` ![](anime-old-grain-8.jpg) And I'd say that grain 8 looks better than 0. *Old movie with low quality* ``` Grain Quality Metrics Comparison: ------------------- grain 0: PSNR: 35.3513 SSIM: 0.957167 ------------------- grain 4: PSNR: 35.3161 SSIM: 0.956516 ------------------- grain 8: PSNR: 35.275 SSIM: 0.955738 ------------------- grain 12: PSNR: 35.2481 SSIM: 0.955299 ------------------- grain 4 vs grain 12: PSNR: 59.7537 SSIM: 0.999178 ------------------- ``` With the increase of grain also the metrics differ from the original, but less drastically than the anime movie. I'm not able to see any notable difference in the side by side comparison at any grain level. It can be either because the low quality makes it undetectable or that the scenes of the sample don't represent well the noise contrast. **Conclusion** Given that I don't notice any notable change by tunning the parameter, I'll go with the suggested `film-grain=8`. **`pix_fmt` parameter** The `pix_fmt` parameter can be used to force encoding to 10 or 8 bit color depth. By default SVT-AV1 will encode 10-bit sources to 10-bit outputs and 8-bit to 8-bit. AV1 includes 10-bit support in its Main profile. Thus content can be encoded in 10-bit without having to worry about incompatible hardware decoders. To utilize 10-bit in the Main profile, use `-pix_fmt yuv420p10le`. For 10-bit with 4:4:4 chroma subsampling (requires the High profile), use `-pix_fmt yuv444p10le`. 12-bit is also supported, but requires the Professional profile. See ffmpeg -help encoder=libaom-av1 for the supported pixel formats. The [dummies guide](https://wiki.x266.mov/blog/av1-for-dummies) suggests to always use 10-bit color, even with an 8-bit source. AV1's internal workings are much more suited to 10-bit color, and you are almost always guaranteed quality improvements with zero compatibility penalty as 10-bit color is part of AV1's baseline profile. The AOMediaCodec people say that encoding with 10-bit depth results in more accurate colors and fewer artifacts with minimal increase in file size, though the resulting file may be somewhat more computationally intensive to decode for a given bitrate. If higher decoding performance is required, using 10-bit YCbCr encoding will increase efficiency, so a lower average bitrate can be used, which in turn improves decoding performance. In addition, passing the parameter `fast-decode=1` can help (this parameter does not have an effect for all presets, so check the parameter description for your preset). Last, for a given bitrate, 8-bit yuv420p can sometimes be faster to encode, albeit at the cost of some fidelity. I'll then use `-pix_fmt yuv420p10le` **[Tune parameter](https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/CommonQuestions.md#options-that-give-the-best-encoding-bang-for-buck)** This parameter changes some encoder settings to produce a result that is optimized for subjective quality (tune=0) or PSNR (tune=1). Tuning for subjective quality can result in a sharper image and higher psycho-visual fidelity. This is invoked with `-svtav1-params tune=0`. The default value is 1. 
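To check what bit depth a source actually has before encoding, a quick sketch with ffprobe (the file name is a placeholder):

```bash
# Print the pixel format of the first video stream, e.g. yuv420p (8-bit)
# or yuv420p10le (10-bit); file.mkv is a placeholder.
ffprobe -v error -select_streams v:0 \
  -show_entries stream=pix_fmt \
  -of default=noprint_wrappers=1:nokey=1 file.mkv
```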
**[Tune parameter](https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/CommonQuestions.md#options-that-give-the-best-encoding-bang-for-buck)**

This parameter changes some encoder settings to produce a result that is optimized for subjective quality (`tune=0`) or PSNR (`tune=1`). Tuning for subjective quality can result in a sharper image and higher psycho-visual fidelity. This is invoked with `-svtav1-params tune=0`. The default value is 1.

The use of subjective mode (`--tune=0`) often results in an image with greater sharpness and is intended to produce a result that appears to humans to be of high quality (as opposed to doing well on basic objective measures, such as PSNR). So I'll use it.

**[Multipass encoding](https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/CommonQuestions.md#multi-pass-encoding)**

PSY people suggest not encoding the same video multiple times. This is a common mistake made by people new to video encoding. Every time you encode a video, you lose additional quality due to generation loss. This is because video codecs are lossy, and every time you encode a video, you lose more information. This is why it is important to keep the original video file if you frequently re-encode it.

AV1 people say some encoder features benefit from or require the use of a multi-pass encoding approach. In SVT-AV1, in general, multi-pass encoding is useful for achieving a target bitrate when using VBR (variable bit rate) encoding, although both one-pass and multi-pass modes are supported. When using CRF (constant rate factor) mode, multi-pass encoding is designed to improve quality for corner case videos; it is particularly helpful in videos with high motion because it can adjust the prediction structure (to use closer references, for example). Multi-pass encoding, therefore, can be said to have an impact on quality in CRF mode, but is not critical in most situations. In general, multi-pass encoding is not as important for SVT-AV1 in CRF mode as it is for some other encoders. CBR (constant bit rate) encoding is always one-pass.

I won't use multipass encoding then.

**[Select the audio codec](https://jellyfin.org/docs/general/clients/codec-support/#audio-compatibility)**

If the audio codec is unsupported or incompatible (such as playing a 5.1 channel stream on a stereo device), the audio codec must be transcoded. This is not nearly as intensive as video transcoding.

When comparing audio encodings from a **compatibility** and **open-source** perspective, here are the key aspects to consider:

**MP3 (MPEG-1 Audio Layer III)**

*Compatibility*

- **Highly Compatible**: Supported on virtually all devices and platforms, including legacy systems.
- **Wide Adoption**: Default choice for audio streaming, portable devices, and browsers.

*Open Source*

- **Patented (Previously)**: MP3 was heavily patented until 2017. After patent expiration, it is now open for unrestricted use.
- **Decoders/Encoders**: Open-source implementations exist (e.g., `LAME`), but the format's history with patents may make some projects hesitant.

**AAC (Advanced Audio Codec)**

*Compatibility*

- **Widely Compatible**: Supported on most modern devices and platforms, including smartphones, streaming services, and browsers.
- **Less Legacy Support**: Older devices may not support AAC compared to MP3.

*Open Source*

- **Partially Proprietary**: AAC is still patent-encumbered, requiring licensing for certain use cases.
- **Limited Open Implementations**: While open-source decoders like `FAAD` exist, licensing concerns can restri…
It can be a bit overwhelming at the beginning but you will get used to it and find your way around. As you might have seen there are many different shifts and roles for angels, some sounding more appealing than others.

There are shifts where you need to have some knowledge before you can take them. This knowledge is given in introduction meetings or by taking an unrestricted shift in the team and getting trained on the job. These introduction meetings are announced in the Engelsystem under the tab "Meetings".

Heaven and the teams try to make sure that there are only restrictions for shifts in place where they are absolutely needed. Most restrictions really only need a meeting or some unrestricted shifts at the team to get lifted. Harder restrictions are in place where volunteers need to have special certification, get access to certain systems with a huge amount of data (e.g. mail-queues with emails from participants) or handle big piles of money. Usually the requirements for joining an angeltype are included in the description of the angeltype.

The restricted shifts are especially tempting because, after all, we want to get the event running, don't we? From our personal experience what gets the event running are the most common things: guarding a door, collecting bottles/trash, washing dishes in the angel kitchen, being on standby to hop in when spontaneous help is needed or checking the wristbands at the entrance.

If there are any further questions about angeltypes, the description of the angeltype usually includes contact data such as a DECT number or an e-mail address that can be used. Alternatively, you can also ask one of the persons of the respective angeltype mentioned under "Supporter".

**[Teams](https://angelguide.c3heaven.de/#_teams)**

Congress is organized by different teams, each with its own area of expertise. All teams are self-organized and provide their own set of services to the event. Teams spawn into existence by a need not fulfilled. They are seldom created by an authority.

Check out the [different teams](https://angelguide.c3heaven.de/#_teams) to see which one suits you best.

[Some people](https://jascha.wtf/angels-at-chaos-about-volunteering-and-fitting-in/) suggest not to try to fit into special roles at your first event. The roles will find you, not the other way around. Our community is not about personal growth but about contributing to each other and growing by doing this.

**Perks**

Being an angel also comes with some perks.
While we hope that participation is reward enough, here is a list of things that are exclusive to angels:

- Community acknowledgement
- Hanging out in Heaven and the angel hack center with its chill out area
- Free coffee and (sparkling) water
- Warm drinks or similar to make the cold night shifts more bearable

**Rewards**

If you have contributed a certain amount of time, you may receive access to:

- Fantastic hot vegan and vegetarian meals
- The famous limited™ angel T-shirt in Congress design
- Maybe some other perks

feat(kubectl_commands#Delete pods that are stuck in terminating state for a while): Delete pods that are stuck in terminating state for a while

```bash
kubectl delete pod <pod-name> --grace-period=0 --force
```

fix(himalaya): tweak the bindings

Move forward and backwards in the history of emails:

```lua
vim.api.nvim_create_autocmd("FileType", {
  group = "HimalayaCustomBindings",
  pattern = "himalaya-email-listing",
  callback = function()
    vim.api.nvim_buf_set_keymap(0, "n", "b", "<plug>(himalaya-folder-select-previous-page)", { noremap = true, silent = true })
    vim.api.nvim_buf_set_keymap(0, "n", "f", "<plug>(himalaya-folder-select-next-page)", { noremap = true, silent = true })
  end,
})
```

Better bindings for the email list view:

```lua
-- Refresh emails
vim.api.nvim_buf_set_keymap(0, "n", "r", ":lua FetchEmails()<CR>", { noremap = true, silent = true })

-- Email list view bindings
vim.api.nvim_buf_set_keymap(0, "n", "b", "<plug>(himalaya-folder-select-previous-page)", { noremap = true, silent = true })
vim.api.nvim_buf_set_keymap(0, "n", "f", "<plug>(himalaya-folder-select-next-page)", { noremap = true, silent = true })
vim.api.nvim_buf_set_keymap(0, "n", "R", "<plug>(himalaya-email-reply-all)", { noremap = true, silent = true })
vim.api.nvim_buf_set_keymap(0, "n", "F", "<plug>(himalaya-email-forward)", { noremap = true, silent = true })
vim.api.nvim_buf_set_keymap(0, "n", "m", "<plug>(himalaya-folder-select)", { noremap = true, silent = true })
vim.api.nvim_buf_set_keymap(0, "n", "M", "<plug>(himalaya-email-move)", { noremap = true, silent = true })
```

feat(himalaya#Searching emails): Searching emails

You can use the `g/` binding from within nvim to search for emails. The query syntax supports filtering and sorting queries. I've tried changing it to `/` without success :'(

**Filters**

A filter query is composed of operators and conditions. There are 3 operators and 8 conditions:

- `not <condition>`: filter envelopes that do not match the condition
- `<condition> and <condition>`: filter envelopes that match both conditions
- `<condition> or <condition>`: filter envelopes that match one of the conditions
- `date <yyyy-mm-dd>`: filter envelopes that match the given date
- `before <yyyy-mm-dd>`: filter envelopes with date strictly before the given one
- `after <yyyy-mm-dd>`: filter envelopes with date strictly after the given one
- `from <pattern>`: filter envelopes with senders matching the given pattern
- `to <pattern>`: filter envelopes with recipients matching the given pattern
- `subject <pattern>`: filter envelopes with subject matching the given pattern
- `body <pattern>`: filter envelopes with text bodies matching the given pattern
- `flag <flag>`: filter envelopes matching the given flag

**Sorting**

A sort query starts with "order by", and is composed of kinds and orders.
There are 4 kinds and 2 orders:

- `date [order]`: sort envelopes by date
- `from [order]`: sort envelopes by sender
- `to [order]`: sort envelopes by recipient
- `subject [order]`: sort envelopes by subject
- `<kind> asc`: sort envelopes by the given kind in ascending order
- `<kind> desc`: sort envelopes by the given kind in descending order

**Examples**

- `subject foo and body bar`: filter envelopes containing "foo" in their subject and "bar" in their text bodies
- `order by date desc subject`: sort envelopes by descending date (most recent first), then by ascending subject
- `subject foo and body bar order by date desc subject`: combination of the 2 previous examples

feat(himalaya#Not there yet): List more detected issues

- [Replying an email doesn't mark it as replied](https://github.com/pimalaya/himalaya-vim/issues/14)

feat(himalaya#Cannot install): Troubleshoot cannot install the program

Sometimes [the installation steps fail](https://github.com/pimalaya/himalaya/issues/513) as it's still not in stable. A workaround is to download the binary created by the [pre-release CI](https://github.com/pimalaya/himalaya/actions/workflows/pre-releases.yml). You can do it by:

- Click on the latest job
- Click on jobs
- Click on the job of your architecture
- Click on "Upload release"
- Search for "Artifact download URL" and download the file
- Unpack it and add it somewhere in your `$PATH`

feat(jellyfin#Enable hardware transcoding): Enable hardware transcoding

**[Enable NVIDIA hardware transcoding](https://jellyfin.org/docs/general/administration/hardware-acceleration/nvidia)**

*Remove the artificial limit of concurrent NVENC transcodings*

Consumer targeted [Geforce and some entry-level Quadro cards](https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new) have an artificial limit on the number of concurrent NVENC encoding sessions (max of 8 on most modern ones). This restriction can be circumvented by applying an unofficial patch to the NVIDIA Linux and Windows driver.

To apply the patch, first check that your current driver version is supported with `nvidia-smi`; if it's not, try to upgrade the drivers to a supported one, or think about whether you really need more than 8 concurrent transcodings.

```bash
wget https://raw.githubusercontent.com/keylase/nvidia-patch/refs/heads/master/patch.sh
chmod +x patch.sh
./patch.sh
```

If you need to rollback the changes run `./patch.sh -r`.

You can also patch it [within the docker itself](https://github.com/keylase/nvidia-patch?tab=readme-ov-file#docker-support)

```yaml
services:
  jellyfin:
    image: jellyfin/jellyfin
    user: 1000:1000
    network_mode: 'host'
    volumes:
      - /path/to/config:/config
      - /path/to/cache:/cache
      - /path/to/media:/media
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

Restart the docker and then check that you can access the graphics card with:

```bash
docker exec -it jellyfin nvidia-smi
```

Enable NVENC in Jellyfin and uncheck the unsupported codecs.

**Tweak the docker-compose**

The official Docker image doesn't include any NVIDIA proprietary driver. You have to install the NVIDIA driver and NVIDIA Container Toolkit on the host system to allow Docker access to your GPU.
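A minimal sketch of that host-side setup, assuming a Debian-based host with NVIDIA's apt repository already configured (the `nvidia-ctk` command comes from the NVIDIA Container Toolkit):

```bash
# Install the toolkit, register the nvidia runtime with Docker and
# restart the daemon; assumes NVIDIA's apt repository is already set up.
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```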
refactor(life_review): into roadmap_adjustment

feat(nodejs#Using nvm): Install using nvm

```bash
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
nvm install 22
node -v # should print `v22.12.0`
npm -v # should print `10.9.0`
```

feat(linux_snippets#Convert an html to a pdf): Convert an html to a pdf

**Using weasyprint**

Install it with `pip install weasyprint PyMuPDF`

```bash
weasyprint input.html output.pdf
```

It gave me better results than `wkhtmltopdf`.

**Using wkhtmltopdf**

To convert the given HTML into a PDF with proper styling and formatting using a simple method on Linux, you can use `wkhtmltopdf` with some custom options. First, ensure that you have `wkhtmltopdf` installed on your system. If not, install it using your package manager (e.g., Debian: `sudo apt-get install wkhtmltopdf`).

Then, convert the HTML to PDF using `wkhtmltopdf` with the following command:

```bash
wkhtmltopdf --page-size A4 --margin-top 15mm --margin-bottom 15mm --encoding utf8 input.html output.pdf
```

In this command:

- `--page-size A4`: Sets the paper size to A4.
- `--margin-top 15mm` and `--margin-bottom 15mm`: Adds top and bottom margins of 15 mm to the PDF.

After running the command, you should have a nicely formatted `output.pdf` file in your current directory. This method preserves most of the original HTML styling while providing a simple way to export it as a PDF on Linux.

If you need to zoom in, you can use the `--zoom 1.2` flag. For this to work your CSS needs to use `em` sizes.

feat(linux_snippets#Format a drive to use a FAT32 system): Format a drive to use a FAT32 system

```bash
sudo mkfs.vfat -F 32 /dev/sdX
```

Replace `/dev/sdX` with your actual drive identifier.

feat(linux_snippets#Get the newest file of a directory with nested directories and files): Get the newest file of a directory with nested directories and files

```bash
find . -type f -printf '%T@ %p\n' | sort -n | tail -1 | cut -f2- -d" "
```

fix(linux_snippets#How to debug a CPU Throttling high alert): How to debug a CPU Throttling high alert

If the docker is using fewer resources than its limits but the limits are still small (for example 0.1 CPUs), the issue may be that the CPU spikes are being throttled before they show up in the CPU usage. The solution is then to increase the CPU limits.

feat(linux_snippets#Create a systemd service for a non-root user): Create a systemd service for a non-root user

To set up a systemd service as a **non-root user**, you can create a user-specific service file under your home directory. User services are defined in `~/.config/systemd/user/` and can be managed without root privileges.
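A minimal sketch, assuming a hypothetical script `~/bin/myscript.sh` you want to run (the unit and script names are placeholders):

```bash
# Create the user unit; %h is systemd's specifier for the user's home.
mkdir -p ~/.config/systemd/user
cat > ~/.config/systemd/user/myscript.service <<'EOF'
[Unit]
Description=Example user service

[Service]
ExecStart=%h/bin/myscript.sh

[Install]
WantedBy=default.target
EOF

# Load and start it without root
systemctl --user daemon-reload
systemctl --user enable --now myscript.service
```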
feat(linux_snippets): Check the docker images sorted by size

```bash
docker images --format "{{.Repository}}:{{.Tag}}\t{{.Size}}" | sort -k2 -h
```

You can also use the builtin `docker system df -v` to get a better understanding of the usage of disk space.

feat(orgzly): Migrate from Orgzly to Orgzly Revived

- [Home](https://www.orgzlyrevived.com/)
- [F-droid page](https://f-droid.org/en/packages/com.orgzlyrevived/)
- [Source](https://github.com/orgzly-revived/orgzly-android-revived)
- [Old Home](https://orgzly.com/)

feat(parkour#Vaults): Introduce Vaults

**[Safety vault](https://yewtu.be/watch?v=f65H4Rr0oD0)**

I'm going to describe one way of doing it, but it could also be done as a mirror image.

- Advance towards the fence
- Place your left hand on the fence with your fingers pointing in the direction you're going
- Jump while placing your right foot on the fence
- Pass your left foot over the fence by pulling your knee close to your chest to prevent your foot from catching on the fence
- Once your left foot is over, let go of your right foot
- Advance with your right hand in the direction of movement (not upwards), until it rests on the bar (your hand should be facing opposite to the left, with fingers pointing downwards)
- Once both feet are free and hands supported, push yourself forward.

*[Thief vault](https://yewtu.be/watch?v=f65H4Rr0oD0)*

*[Speed vault](https://yewtu.be/watch?v=f65H4Rr0oD0)*

*[Kong vault or Cat](https://yewtu.be/watch?v=f65H4Rr0oD0)*

If you're thinking of starting to learn parkour when you're an adult there are some things that you need to take into account:

refactor(roadmap_adjustment): Reorder the file

feat(transcoding): Introduce transcoding

**Big disclaimer!!!** I have absolutely no idea of what I'm doing. It's the first time I do this and I'm just recording what I'm learning. Use the contents of this page at your own risk.

I've made this document thinking mainly of using [AV1](#av1) as a video encoding algorithm, `ffmpeg` to do the transcoding and `jellyfin` to stream it, so the content is heavily biased towards that stack.

**Initial guidance into the transcoding world**

*Shall I transcode my library?*

There are some discrepancies about whether it makes sense to transcode your library. In my case I'm going to do it because disks are expensive (people compare with buying one disk, but to upgrade my NAS I need 5, so it would be around 1400$). Here are other opinions: [1](https://www.reddit.com/r/AV1/comments/ymrs5v/id_like_to_encode_my_entire_library_to_av1/)

**[Do's and don'ts](https://wiki.x266.mov/blog/av1-for-dummies#dos--donts)**

Due to a lot of misunderstandings about codecs and compression, there are many common misconceptions regarding video encoding. We'll start by outlining some bad practices:

- Don't encode the same video multiple times. This is a common mistake made by people new to video encoding. Every time you encode a video, you lose additional quality due to generation loss. This is because video codecs are lossy, and every time you encode a video, you lose more information. This is why it is important to keep the original video file if you frequently re-encode it.
- Don't blindly copy settings from others without understanding them. What works for one person's content and workflow may not work for yours. Even the default settings on many encoders are not optimal for most content.
- Don't assume that higher bitrate equates to better quality. Inefficient encoding can waste bits without improving visual quality, and efficient encoding can make lower bitrate video look drastically better than higher bitrate video using the same codec.
- Don't assume all encoders/presets/settings/implementations are created equal. Even given two encoding frameworks that use the same underlying encoder, you may achieve different results given encoder version mismatches or subtly different settings used under the hood.
- Don't use unnecessarily slow presets/speeds unless you have a specific need and ample time. While slower presets improve encoding efficiency most of the time, the quality gains reach a point of diminishing returns beyond a certain point. Use the slowest preset you can tolerate, not the slowest preset available.
- Don't blindly trust metric scores.
It is unfortunate how trusted VMAF is, considering how infrequently it correlates with visual fidelity in practice now that it has become so popular. Even the beloved SSIMULACRA2 is not a perfect one-to-one with the human eye.

Now, let's move on to some good practices:

- Experiment with different settings and compare the results.
- Consider your content type when choosing encoding settings. Film, animation, and sports all have different characteristics that benefit from distinct approaches.
- Keep your encoding software up-to-date; the encoding world moves quickly.

**[Types of transcoding](https://jellyfin.org/docs/general/server/transcoding/#types-of-transcoding)**

There are four types of playback in Jellyfin, three of which involve transcoding. The type being used will be listed in the dashboard when playing a file. They are ordered below from lowest to highest load on the server:

- Direct Play: Delivers the file without transcoding. There is no modification to the file and almost no additional load on the server.
- Remux: Changes the container but leaves both audio and video streams untouched.
- Direct Stream: Transcodes audio but leaves original video untouched.
- Transcode: Transcodes the video stream.

**Shall I use CPU or GPU to do the transcoding?**

The choice between **CPU** or **GPU** for transcoding depends on your specific needs, including speed, quality, and hardware capabilities. Both have their pros and cons, and the best option varies by situation.

*CPU-Based Transcoding*

Pros:

- **High Quality**: Software encoders like `libx264`, `libx265`, or `libsvtav1` produce better compression and visual quality at the same bitrate compared to GPU-based encoders.
- **Flexibility**: Supports a wider range of encoding features, tuning options, and codecs.
- **Optimized for Low Bitrates**: CPU encoders handle complex scenes more effectively, producing fewer artifacts at lower bitrates.
- **No Dedicated Hardware Required**: Works on any modern system with a CPU.

Cons:

- **Slower Speed**: CPU encoding is much slower, especially for high-resolution content (e.g., 4K or 8K).
- **High Resource Usage**: Consumes significant CPU resources, leaving less processing power for other tasks.

Best Use Cases:

- High-quality archival or master files.
- Transcoding workflows where quality is the top priority.
- Systems without modern GPUs or hardware encoders.

*GPU-Based Transcoding*

Pros:

- **Fast Encoding**: Hardware-accelerated encoders like NVIDIA NVENC, AMD VCE, or Intel QuickSync can encode video much faster than CPUs.
- **Lower Resource Usage**: Frees up the CPU for other tasks during encoding.
- **Good for High-Resolution Video**: Handles 4K or even 8K video with ease.
- **Low Power Consumption**: GPUs are optimized for parallel processing, often consuming less power per frame encoded.

Cons:

- **Lower Compression Efficiency**: GPU-based encoders often produce larger files or lower quality compared to CPU-based encoders at the same bitrate.
- **Limited Features**: Fewer tuning options and sometimes less flexibility in codec support.
- **Artifact Risk**: May introduce visual artifacts, especially in complex scenes or at low bitrates.

Best Use Cases:

- Streaming or real-time encoding.
- High-volume batch transcoding where speed matters more than maximum quality.
- Systems with capable GPUs (e.g., NVIDIA GPUs with Turing or Ampere architecture).

*Quality vs. Speed*

| **Factor**             | **CPU Encoding** | **GPU Encoding**             |
|------------------------|------------------|------------------------------|
| **Speed**              | Slower           | Much faster                  |
| **Quality**            | Higher           | Good but not the best        |
| **Bitrate Efficiency** | Better           | Less efficient               |
| **Compatibility**      | Broader          | Limited by hardware support  |
| **Power Consumption**  | Higher           | Lower                        |

*Conclusion*

- For **quality-focused tasks** such as a whole library encoding: use CPU-based encoding.
- For **speed-focused tasks** such as streaming with jellyfin: use GPU-based encoding.

For more information read [1](https://www.reddit.com/r/PleX/comments/16w1hsz/cpu_vs_gpu_whats_the_smartest_choice/).

**[Video transcoding algorithms](https://jellyfin.org/docs/general/clients/codec-support/)**

[This guide](https://developer.mozilla.org/en-US/docs/Web/Media/Formats/Video_codecs) introduces the video codecs you're most likely to encounter or consider using. I'm only going to analyse the ones that I might use.

When deciding which one to use check [jellyfin's video compatibility table](https://jellyfin.org/docs/general/clients/codec-support/#video-compatibility) or [test your browser's compatibility for any codec profile](https://cconcolato.github.io/media-mime-support/). Leaving aside H264 8-bit, AV1 is the one with the widest compatibility support, except for iOS (but who cares about them?).

**[AV1](https://wiki.x266.mov/blog/av1-for-dummies)**

[AV1](https://wiki.x266.mov/docs/video/AV1) is a royalty-free video codec developed by the Alliance for Open Media. It is designed to replace VP9 and presently competes with H.266. AV1 is known for its high compression efficiency, which the marketing will have you believe reduces file sizes by up to 50% compared to H.264 and up to 30% compared to H.265 across the board. It is supported by several major browsers and is widely used across many streaming services and video platforms.

Before we dive in, it is important to understand why you may want to use AV1 instead of other codecs. The reality is that AV1 is not better than H.264/5 in every single scenario; video encoding is a complicated field, and the best codec for you will depend on your specific needs. AV1 excels in:

- Low to medium-high fidelity encoding
- Higher resolution encoding
- Encoding content with very little grain or noise
- Slow, non-realtime contexts (e.g. offline encoding)

The enumeration above still consists of broad strokes, but the point is to understand that AV1 is not a silver bullet. It will not automatically make your videos smaller while preserving your desired quality. To make things more difficult, the x264 & x265 encoders are very mature, while AV1 encoding efforts designed to meet the extremely complicated needs of the human eye are still in their infancy.

**AV1 current problems**

The first problem I've seen is that Unmanic doesn't support AV1 very well, you need to write your own ffmpeg configuration (which is fine). Some issues that track this are [1](https://github.com/Unmanic/unmanic-plugins/issues/333), [2](https://github.com/Unmanic/unmanic/issues/181), [3](https://github.com/Unmanic/unmanic/issues/390), [4](https://github.com/Unmanic/unmanic/issues/471) or [5](https://github.com/Unmanic/unmanic-plugins/issues?q=is%3Aissue+is%3Aopen+av1)

**[AV1 probably outdated problems](https://gist.github.com/dvaupel/716598fc9e7c2d436b54ae00f7a34b95#current-problems)**

The original article dates from early 2022 so it's probably outdated given the speed of this world.
- 10 bit playback performance is not reliable enough on average consumer hardware.
- AV1 tends to smooth video noticeably, even at high bitrates. This is especially apparent in scenes with rain, snow etc, where it is very hard to conserve details.
- Grainy movies are a weak point of AV1. Even with film grain synthesis enabled, it is very hard to achieve satisfying results.

At the moment (early 2022, SVT-AV1 v0.9.0) it is a fairly promising, but not perfect, codec. It is great for most regular content, achieves great quality at small file sizes and apparently keeps its promise of being considerably more efficient than HEVC. The only area where it must still improve is grainy, detailed movie scenes. With such high-bitrate, bluray-quality source material it's hard to achieve visual transparency. If grain synthesis has improved enough and smooth decoding is possible in most devices, it can be generally recommended. For now, it is still in the late experimental phase.

feat(transcoding#AV1 encoders): AV1 encoders

The world of AV1 encoding is diverse and complex, with several open-source encoders available, each bringing its own set of strengths, weaknesses, and unique features to the table.

- SVT-AV1
- rav1e
- aomenc (libaom)
- SVT-AV1-PSY

Understanding these encoders is crucial for making informed decisions about what best suits your specific encoding needs.

**SVT-AV1**

[SVT-AV1](https://wiki.x266.mov/docs/encoders/SVT-AV1) (Scalable Video Technology for AV1) is an AV1 encoder library and application developed by Intel, Netflix, and others. It has gained significant popularity in the encoding community due to its impressive balance of speed, quality, and scalability.

Links:

- Wiki page: [SVT-AV1](https://wiki.x266.mov/docs/encoders/SVT-AV1)
- Git repository: https://gitlab.com/AOMediaCodec/SVT-AV1
- Documentation: https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/README.md

1. **Performance & Scalability**
   - SVT-AV1 is renowned for its encoding speed, particularly at higher speed presets.
   - It leverages parallel processing, making it exceptionally efficient on multi-core systems. Fun fact: SVT-AV1's parallel processing is lossless, so it doesn't compromise quality for speed.
2. **Quality-to-Speed Ratio**
   - SVT-AV1 strikes an impressive balance between encoding speed and output quality.
   - At faster presets, it usually outperforms other encoders in quality per unit of encoding time.
   - While it may not achieve the absolute highest *quality per bit* possible, its quality is generally considered impressive for its speed.
3. **Flexibility**
   - SVT-AV1 offers a wide range of encoding options and presets, allowing fine-tuned control over the encoding process.
   - It provides 14 presets (0-13), with 0 being the slowest and highest quality, and 13 being the fastest but lowest quality.
   - Advanced options allow users to adjust parameters like hierarchical levels, intra-refresh type, and tuning modes.
4. **Continuous Development**
   - SVT-AV1 receives frequent updates and optimizations, with new releases often coming alongside big changes.
   - The open-source nature of the project encourages community contributions and rapid feature development.

SVT-AV1 is an excellent choice for a wide range of encoding scenarios.
It's particularly well-suited for:

- High-volume encoding operations where speed is crucial
- Live or near-live encoding of high-resolution content
- Scenarios where a balance between quality and encoding speed is required
- Users with multi-core systems who want to leverage their hardware efficiently

Some downsides include:

- Higher memory usage compared to other encoders
- The developers assess quality via its performance on traditional legacy metrics, which harms its perceptual fidelity ceiling.

**rav1e**

[rav1e](https://wiki.x266.mov/docs/encoders/rav1e) is an AV1 encoder written in Rust & Assembly. Developed by the open-source community alongside Xiph, it brings a unique approach to AV1 encoding with its focus on safety and correctness.

Links:

- Wiki page: [rav1e](https://wiki.x266.mov/docs/encoders/rav1e)
- Git repository: https://github.com/xiph/rav1e
- Documentation: https://github.com/xiph/rav1e/tree/master/doc#readme

1. **Safety & Reliability**
   - Being written in Rust, rav1e emphasizes memory safety and thread safety.
   - This focus on safety translates to a more stable and reliable encoding process, with reduced risks of crashes or undefined behavior.
2. **High Fidelity**
   - At high fidelity targets (an area where AV1 usually lacks) rav1e is a strong contender compared to other encoders.
   - It excels in preserving fine details and textures, making it a good choice for high-fidelity encoding.
3. **Quality**
   - While not typically matching aomenc or SVT-AV1 in pure compression efficiency, rav1e can produce high-quality output videos.
   - It often achieves a good balance between quality and encoding time, especially at medium-speed settings.
4. **Perceptually Driven**
   - rav1e's development is driven by visual fidelity, without relying heavily on metrics.
   - This focus on perceptual quality leads to a stronger foundation for future potential improvements in visual quality, as well as making the encoder very easy to use as it does not require excessive tweaking.

rav1e is well-suited for:

- Projects where stability is paramount
- Users who prioritize a community-driven, open-source development approach
- Encoding tasks where a balance between quality and speed is needed, but the absolute fastest speeds are not required

Some limitations of rav1e include:

- Lagging development compared to other encoders
- Slower encoding speeds compared to SVT-AV1 at similar quality & size
- Fewer advanced options compared to other encoders

**aomenc (libaom)**

[aomenc](https://wiki.x266.mov/docs/encoders/aomenc), based on the libaom library, is the reference encoder for AV1. Developed by the Alliance for Open Media (AOM), it is the benchmark for AV1 encoding quality and compliance.

Links:

- Wiki page: [aomenc](https://wiki.x266.mov/docs/encoders/aomenc)
- Git repository: https://aomedia.googlesource.com/aom/

1. **Encoding Quality**
   - aomenc is widely regarded as the gold standard for AV1 encoding quality.
   - It often achieves high compression efficiency among AV1 encoders, especially at slower speed settings.
   - The encoder squeezes out nearly every last bit of efficiency from the AV1 codec, making it ideal for archival purposes or when quality per bit is critical.
2. **Encoding Speed**
   - aomenc is generally the slowest among major AV1 encoders.
   - It offers 13 CPU speed levels (0-12), but even at its fastest settings, it's typically slower than other encoders at their slower settings.
   - The slow speed is often considered a trade-off for its high compression efficiency.
3. **Extensive Options**
   - As the reference implementation, aomenc offers the most comprehensive encoding options.
   - It provides fine-grained control over nearly every aspect of the AV1 encoding process.
   - Advanced users can tweak many parameters to optimize for specific content types or encoding scenarios.
4. **Flexibility**
   - Being the reference encoder, aomenc produces highly standards-compliant AV1 bitstreams that take advantage of the full arsenal of AV1 features.
   - It supports 4:2:0 and 4:4:4 chroma subsampling, 8- to 12-bit color depth, and various other advanced features that more specialized encoders like SVT-AV1 do not support.

aomenc is ideal for:

- Scenarios where achieving the highest possible quality is the primary goal
- Archival encoding where compression efficiency is crucial
- Research and development in video compression
- Encoding projects where encoding time is not a significant constraint

Some drawbacks of aomenc include:

- Unresponsive development driven by legacy metrics, leading to slower adoption of new techniques and ignoring improvements communicated by people outside the Google development team
- Cripplingly difficult to use for beginners, with a culture of cargo-culting settings
- Slow encoding speeds compared to other AV1 encoders, which has less of an impact on the quality of the output than it used to compared to maturing encoders like SVT-AV1

**SVT-AV1-PSY**

[SVT-AV1-PSY](https://wiki.x266.mov/docs/encoders/svt-av1-psy) is a community fork of the SVT-AV1 encoder focused on psychovisual optimizations to enhance perceived visual quality. It aims to close the gap between SVT-AV1's high speeds and the perceptual quality of aomenc's slow brute force approach.

Links:

- Wiki page: [SVT-AV1-PSY](https://wiki.x266.mov/docs/encoders/svt-av1-psy)
- Git repository: https://github.com/gianni-rosato/svt-av1-psy
- Documentation: https://github.com/gianni-rosato/svt-av1-psy/blob/master/Docs/PSY-Development.md

1. **Perceptual Optimizations**
   - SVT-AV1-PSY introduces various psychovisual enhancements to improve the perceived quality of encoded video.
   - These optimizations often result in output that looks better to the human eye, even if it might not always score as well in objective metrics.
2. **Additional Features**
   - Introduces new options like variance boost, which can help maintain detail in high-contrast scenes.
   - Offers alternative curve options for more nuanced control over the encoding process.
   - Extends the CRF (Constant Rate Factor) range to 70 (from 63 in mainline SVT-AV1), allowing for extremely low-bitrate encodes.
   - Introduces additional tuning options, including a new "SSIM with Subjective Quality Tuning" mode that can improve perceived quality.
3. **Visual Fidelity Focus**
   - Aims to produce more visually pleasing results, sometimes at the expense of metric performance.
   - Includes options like sharpness adjustment and adaptive film grain synthesis which can significantly impact the visual characteristics of the output.
   - Features modified defaults driven by perceptual quality considerations.
4. **Extended HDR Support**
   - Includes built-in support for Dolby Vision & HDR10+ encoding.
   - This makes it particularly useful for encoding HDR content without requiring additional post-processing steps or external tools.
5. **Performance**
   - Based on SVT-AV1, it retains the performance characteristics of its parent encoder.
   - Adds super slow presets (-2 and -3) for research purposes and extremely high-quality encoding.
These additional presets can be useful for creating reference encodes or applications where encoding time is not a concern.

SVT-AV1-PSY is particularly well-suited for:

- Encoding scenarios where subjective visual quality is prioritized over pure metric performance
- HDR content encoding in Dolby Vision or HDR10+
- Users who want fine-grained control over psychovisual aspects of encoding
- Projects that require a balance between the speed of SVT-AV1 and enhanced visual quality
- Encoding challenging content with complex textures or high-contrast scenes

Some drawbacks are:

- Everything that applies to SVT-AV1, including the lack of support for 4:4:4 chroma subsampling and 12-bit color depth that are useful in specific scenarios
- There are no official ffmpeg builds with this fork, but there are community built binaries:
  - https://github.com/Uranite/FFmpeg-Builds
  - https://github.com/Nj0be/HandBrake-SVT-AV1-PSY
  - https://www.reddit.com/r/AV1/comments/1fppk4d/ffmpeg_with_svtav1_psy_latest_for_windows_10_x64/

**Conclusion**

While SVT-AV1 is known for being fast, aomenc is renowned for its high-quality output, and rav1e is recognized for its safety and reliability, each encoder has strengths and weaknesses. The best encoder for you will depend on your specific needs and priorities.

I'll go with SVT-AV1 given that it is the standard maintained by the community, that SVT-AV1-PSY is not available by default in the tools that I am going to use (you'd need to compile it, and maintain it), and that I probably won't notice the difference.

**x265 (HEVC)**

- Is not open source

**Conclusion**

I'll use AV1 because:

- It's an open source, royalty-free codec
- It's one of the most modern encoders
- It has a wide range of compatibility support
- It has a better compression rate than x265 and it seems to be faster?

feat(transcoding#Transcoding parameter comparison): Transcoding parameter comparison

**Check the original file transcoding information**

To investigate the encoding details of an existing video file, you can use FFprobe (which comes with FFmpeg).

```bash
ffprobe -v quiet -print_format json -show_streams -show_format file_to_test.mp4
```

The metadata isn't always perfectly reliable, especially if the file has been modified or re-encoded multiple times.

**CRF comparison**

Try to use CRF (Constant Rate Factor) for offline encoding, as opposed to CBR (Constant Bitrate) or VBR (Variable Bitrate). While the latter two are effective for precisely targeting a particular bitrate, CRF is more effective at targeting a specific quality level efficiently. So avoid the `-b:v 0` `ffmpeg` flag.

Check these articles [1](https://ottverse.com/analysis-of-svt-av1-presets-and-crf-values/), [2](https://academysoftwarefoundation.github.io/EncodingGuidelines/EncodeAv1.html#crf-comparison-for-libsvtav1) and [3](https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/Ffmpeg.md) for an interesting comparison between `presets` and `crf` when encoding in AV1. A good starting point for 1080p video is `crf=30`.

**Can you compare values of CRF between video algorithms?**

No, CRF (Constant Rate Factor) values are not directly comparable between different encoding algorithms like AV1 and H.265. Each codec has its own unique compression characteristics, so a CRF of 32 in AV1 will look different from a CRF of 32 in H.265, and similarly, a CRF of 21 will have different visual qualities. Here's a more detailed explanation:
1. Codec-Specific Compression:
   - Each codec (AV1, H.265, H.264, etc.) has its own compression algorithm and efficiency
   - AV1 is generally more efficient at compression compared to H.265
   - This means a higher CRF value in AV1 might produce similar or better quality compared to a lower CRF in H.265
2. Rough Equivalence Guidelines: While not exact, here are some rough comparative CRF ranges:
   - H.264: CRF 18-28
   - H.265: CRF 20-28
   - AV1: CRF 25-35

**Script to see the quality difference by changing the CRF**

It's recommended to test encodes at different CRF values and compare for your specific content:

- File size versus visual quality
- Review objective quality metrics

You can use the next script for that:

```bash
input_video="$1"
initial_time="00:15:00"
probe_length="00:01:00"

ffmpeg -i "$input_video" -ss $initial_time -t "$probe_length" -c copy original_segment.mkv

crfs=(28 30 32 34 36)

echo "CRF Comparison:" >size_comparison.log
echo "Original Segment Size:" >>size_comparison.log
original_size=$(du -sh original_segment.mkv)
echo "$original_size" >>size_comparison.log
echo "-------------------" >>size_comparison.log

echo "CRF Quality Metrics Comparison:" >quality_metrics.log
echo "-------------------" >>quality_metrics.log

for crf in "${crfs[@]}"; do
  # Encode test segments
  ffmpeg -i "$input_video" \
    -ss "$initial_time" -t "$probe_length" \
    -c:v libsvtav1 \
    -preset 5 \
    -crf $crf \
    -g 240 \
    -pix_fmt yuv420p10le \
    preview_crf_${crf}.mkv

  # Create a side by side comparison to see the differences
  ffmpeg -i original_segment.mkv -i preview_crf_${crf}.mkv \
    -filter_complex \
    "[0:v][1:v]hstack=inputs=2[v]" \
    -map "[v]" \
    -c:v libx264 \
    side_by_side_comparison_original_vs_crf_${crf}.mkv

  # Log file size
  size=$(du -h preview_crf_${crf}.mkv | cut -f1)
  echo "CRF $crf: $size" >>size_comparison.log

  # Measure PSNR and SSIM
  ffmpeg -i original_segment.mkv -i preview_crf_${crf}.mkv \
    -filter_complex "[0:v][1:v]psnr=stats_file=psnr_${crf}.log;[0:v][1:v]ssim=stats_file=ssim_${crf}.log" -f null -

  psnr_value=$(grep "psnr_avg:" psnr_${crf}.log | awk -F'psnr_avg:' '{sum += $2; count++} END {if (count > 0) print sum / count; else print "N/A"}')
  ssim_value=$(grep "All:" ssim_${crf}.log | awk -F'All:' '{sum += $2; count++} END {if (count > 0) print sum / count; else print "N/A"}')

  # Log the results
  echo "CRF $crf:" >>quality_metrics.log
  echo "  PSNR: ${psnr_value:-N/A}" >>quality_metrics.log
  echo "  SSIM: ${ssim_value:-N/A}" >>quality_metrics.log
  echo "-------------------" >>quality_metrics.log
done

ffmpeg -i preview_crf_${crfs[1]}.mkv -i preview_crf_${crfs[-1]}.mkv \
  -filter_complex \
  "[0:v][1:v]hstack=inputs=2[v]" \
  -map "[v]" \
  -c:v libx264 \
  side_by_side_comparison_crf_${crfs[1]}_vs_crf_${crfs[-1]}.mkv

ffmpeg -i preview_crf_${crfs[1]}.mkv -i preview_crf_${crfs[-1]}.mkv \
  -filter_complex "[0:v][1:v]psnr=stats_file=psnr_crfs_max_difference.log;[0:v][1:v]ssim=stats_file=ssim_crfs_max_difference.log" -f null -

psnr_value=$(grep "psnr_avg:" psnr_crfs_max_difference.log | awk -F'psnr_avg:' '{sum += $2; count++} END {if (count > 0) print sum / count; else print "N/A"}')
ssim_value=$(grep "All:" ssim_crfs_max_difference.log | awk -F'All:' '{sum += $2; count++} END {if (count > 0) print sum / count; else print "N/A"}')

echo "CRF ${crfs[1]} vs CRF ${crfs[-1]}:" >>quality_metrics.log
echo "  PSNR: ${psnr_value:-N/A}" >>quality_metrics.log
echo "  SSIM: ${ssim_value:-N/A}" >>quality_metrics.log
echo "-------------------" >>quality_metrics.log

cat size_comparison.log
cat quality_metrics.log
```
It will:

- Create an `original_segment.mkv` segment so you can compare the output
- Create a series of `preview_crf_XX.mkv` files with a 1 minute preview of the content
- Create a list of `side_by_side_comparison` files to visually see the difference between the CRF factors
- Create a quality metrics report `quality_metrics.log` by analysing the SSIM and PSNR tests (explained below)
- Create a `size_comparison.log` file to see the differences in size.

**Results**

**CRF vs file size**

I've run this script against three files with the next results:

- Big non animation file: "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10", 1920x1080, profile high, bit rate 23439837

  ```
  Original Segment Size:
  152M original_segment.mp4
  -------------------
  CRF 28: 12M
  CRF 30: 9.9M
  CRF 32: 8.5M
  CRF 34: 7.4M
  CRF 36: 6.6M
  ```

- Big animation file: "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10", 1920x1038, profile high

  ```
  Original Segment Size:
  136M original_segment.mp4
  -------------------
  CRF 28: 12M
  CRF 30: 9.5M
  CRF 32: 8.0M
  CRF 34: 6.8M
  CRF 36: 6.0M
  ```

- small non animation file:

  ```
  Original Segment Size:
  5.9M original_segment.mp4
  -------------------
  CRF 28: 5.4M
  CRF 30: 4.9M
  CRF 32: 4.5M
  CRF 34: 4.1M
  CRF 36: 3.7M
  ```

**Compare the SSIM and PSNR results**

PSNR (Peak Signal-to-Noise Ratio) measures the quality of the reconstruction of the image. It calculates the ratio between the maximum possible signal (pixel values) and the noise (the error between the original and reconstructed image). Higher PSNR means better quality. The range typically goes between 30-50 dB, where 40 dB is considered excellent and 35 dB is good, while less than 30 dB indicates noticeable quality loss.

[SSIM (Structural Similarity Index Measure)](https://en.wikipedia.org/wiki/Structural_similarity_index_measure) evaluates the perceptual similarity between two images by considering luminance, contrast, and structure. It ranges from -1 to 1, where 1 means identical images. Typically, > 0.95 is considered very good.
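For reference (this is the standard definition, not something taken from the articles above), PSNR is derived from the mean squared error between the two images:

```latex
\mathrm{PSNR} = 10 \cdot \log_{10}\left(\frac{\mathrm{MAX}^2}{\mathrm{MSE}}\right)
```

where `MAX` is the maximum possible pixel value (255 for 8-bit video) and `MSE` is the mean squared error between the original and the encoded frame.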
- Big non animation file:

  ```
  CRF Quality Metrics Comparison:
  -------------------
  CRF 28:
    PSNR: 37.6604
    SSIM: 0.963676
  -------------------
  CRF 30:
    PSNR: 37.6572
    SSIM: 0.96365
  -------------------
  CRF 32:
    PSNR: 37.6444
    SSIM: 0.963535
  -------------------
  CRF 34:
    PSNR: 37.6306
    SSIM: 0.963408
  -------------------
  CRF 36:
    PSNR: 37.6153
    SSIM: 0.963276
  -------------------
  CRF 30 vs CRF 36:
    PSNR: 51.0188
    SSIM: 0.996617
  -------------------
  ```

- Big animation file:

  ```
  CRF Quality Metrics Comparison:
  -------------------
  CRF 28:
    PSNR: 34.6944
    SSIM: 0.904112
  -------------------
  CRF 30:
    PSNR: 34.6695
    SSIM: 0.903986
  -------------------
  CRF 32:
    PSNR: 34.6388
    SSIM: 0.903787
  -------------------
  CRF 34:
    PSNR: 34.612
    SSIM: 0.903616
  -------------------
  CRF 36:
    PSNR: 34.5822
    SSIM: 0.903423
  -------------------
  CRF 30 vs CRF 36:
    PSNR: 49.5002
    SSIM: 0.99501
  -------------------
  ```

- small non animation file:

  ```
  CRF Quality Metrics Comparison:
  -------------------
  CRF 28:
    PSNR: 35.347
    SSIM: 0.957198
  -------------------
  CRF 30:
    PSNR: 35.3302
    SSIM: 0.957124
  -------------------
  CRF 32:
    PSNR: 35.3035
    SSIM: 0.956993
  -------------------
  CRF 34:
    PSNR: 35.2767
    SSIM: 0.956848
  -------------------
  CRF 36:
    PSNR: 35.2455
    SSIM: 0.95666
  -------------------
  CRF 30 vs CRF 36:
    PSNR: 49.4795
    SSIM: 0.995958
  -------------------
  ```

**CRF conclusion**

In all cases the PSNR and SSIM values are very good. There is not much variability between the different CRF values, and it looks like the metrics change more based on the type of video, working worse for animation.

The side by side comparisons in the three cases returned similar results, and my untrained eye was not able to catch big issues. I have to say that my screens are not very good and I'm not very picky.

The size reduction is in fact astonishing, growing as the size of the original file increases. So it will make sense to transcode the biggest files first.

As I don't see many changes and the recommended setting is CRF 30, I'll stick with it.

**Preset comparison**

This parameter governs the efficiency/encode-time trade-off. Lower presets will result in an output with better quality for a given file size, but will take longer to encode. Higher presets can result in a very fast encode, but will make some compromises on visual quality for a given crf value.

Since SVT-AV1 0.9.0, supported presets range from 0 to 13, with higher numbers providing a higher encoding speed. Note that preset 13 is only meant for debugging and running fast convex-hull encoding. In versions prior to 0.9.0, valid presets are 0 to 8.

After checking the above articles [1](https://ottverse.com/analysis-of-svt-av1-presets-and-crf-values/), [2](https://academysoftwarefoundation.github.io/EncodingGuidelines/EncodeAv1.html#crf-comparison-for-libsvtav1), [3](https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/Ffmpeg.md) and [4](https://wiki.x266.mov/blog/svt-av1-third-deep-dive) I feel that a preset of `4` is the sweet spot, going to `3` for a little better quality or `5` for a little worse.

AOMediaCodec people agree: "Presets between 4 and 6 offer what many people consider a reasonable trade-off between quality and encoding time".

And [wiki x266 also agrees](https://wiki.x266.mov/blog/svt-av1-third-deep-dive): it appears as if once again preset 2 through preset 4 remain the most balanced presets all-around in an efficient encoding scenario, with preset 3 not offering much improvement over preset 4 in average scores but nicely improving on consistency instead, and preset 2 offering a nice efficiency and consistency uplift on top. Clear quality gains can be observed as we decrease presets, until preset 2, however the effectiveness of dropping presets is noticeably less and less important as quality is increased.

**[keyframe interval](https://trac.ffmpeg.org/wiki/Encode/AV1#Keyframeplacement)**

`-g` flag in ffmpeg. This parameter governs how many frames will pass before the encoder will add a key frame. Key frames include all information about a single image. Other (delta) frames store only differences between one frame and another. Key frames are necessary for seeking and for error-resilience (in VOD applications). More frequent key frames will make the video quicker to seek and more robust, but it will also increase the file size.

For VOD, setting a key frame once per second or so is a common choice. In other contexts, less frequent key frames (such as 5 or 10 seconds) are preferred. Anything up to 10 seconds is considered reasonable for most content, so for 30 frames per second content one would use `-g 300`, for 60 fps content `-g 600`, etc.

By default, SVT-AV1's keyframe interval is 2-3 seconds, which is quite short for most use cases. Consider changing this up to 5 seconds (or higher) with the `-g` option; `-g 120` for 24 fps content, `-g 150` for 30 fps, etc.

Using a higher GOP via the `-g` ffmpeg parameter results in a more efficient encode in terms of quality per bitrate, at the cost of seeking performance. A common rule-of-thumb among hobbyists is to use ten times the framerate of the video, but not more than `300`.
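As a sketch of that rule of thumb (file names are placeholders; ffprobe's `r_frame_rate` field returns a fraction like `24/1`):

```bash
# Derive -g as ten times the framerate, capped at 300.
fps=$(ffprobe -v error -select_streams v:0 \
  -show_entries stream=r_frame_rate \
  -of default=noprint_wrappers=1:nokey=1 input.mkv)  # e.g. "24/1"
gop=$(( 10 * ${fps%/*} / ${fps#*/} ))
(( gop > 300 )) && gop=300
ffmpeg -i input.mkv -c:v libsvtav1 -preset 4 -crf 30 -g "$gop" output.mkv
```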
The 3 files I'm analysing use 24 fps, so I'll use `-g 240`.

**Film grain**

Consider using grain synthesis for grainy content, as AV1 can struggle to preserve film grain efficiently, especially when encoding high-quality content like old films or videos with a lot of grain detail. This is because AV1's compression algorithms are optimized for clean, detailed content, and they tend to remove or smooth out film grain to achieve better compression ratios. Grain synthesis is the process of adding synthetic grain to a video to simulate or preserve the original grain structure.

*[SVT-AV1 guidelines](https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/Ffmpeg.md)*

The `film-grain` parameter enables this behaviour; setting it to a higher level does so more aggressively. Very high levels of denoising can result in the loss of some high-frequency detail, however.

[The film-grain synthesis](https://trac.ffmpeg.org/wiki/Encode/AV1#Filmgrainsynthesis) feature is invoked with `-svtav1-params film-grain=X`, where `X` is an integer from `1` to `50`. Higher numbers correspond to higher levels of denoising for the grain synthesis process and thus a higher amount of synthesized grain.

The grain denoising process can remove detail as well, especially at the high values that are required to preserve the look of very grainy films. This can be mitigated with the `film-grain-denoise=0` option, passed via `svtav1-params`. While by default the denoised frames are passed on to be encoded as the final pictures (`film-grain-denoise=1`), turning this off leads to the original frames being used instead. Passing `film-grain-denoise=0` may result in higher fidelity by disabling source denoising. In that case, choosing the correct `film-grain` level is important: because a more conservative smoothing process is used, too high a `film-grain` level can lead to noise stacking.

The AOMediaCodec people suggest that a value of 8 is a reasonable starting point for live-action video with a normal amount of grain. Higher values in the range of 10-15 enable more aggressive use of this technique for video with lots of natural grain. For 2D animation, lower values in the range of 4-6 are often appropriate. If the original video does not have natural grain, this parameter can be omitted.

There are two types of video that are called "animation": hand-drawn 2D and 3D animated. Both types tend to be easy to encode (meaning the resulting file will be small), but for different reasons. 2D animation often has large areas that do not move, so the difference between one frame and another is often small; in addition, it tends to have low levels of grain. Experience has shown that relatively high CRF values with low levels of `film-grain` produce 2D animation results that are visually good. 3D animation has much more detail and movement, but it sometimes has no grain whatsoever, or only small amounts that were purposely added to the image. If the original animated video has no grain, encoding without `film-grain` will increase encoding speed and avoid the possible loss of fine detail that can sometimes result from the denoising step of the synthetic grain process.
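To make the flags concrete, a sketch of how these options are passed to ffmpeg for a grainy live-action source, using the recommended level and disabling source denoising (file names are placeholders; whether `film-grain-denoise=0` helps depends on the source):

```bash
# Synthesize grain at the suggested live-action level while keeping
# the original (non-denoised) frames as the encoder's source
ffmpeg -i input.mkv \
  -c:v libsvtav1 -preset 4 -crf 30 \
  -svtav1-params film-grain=8:film-grain-denoise=0 \
  -c:a copy \
  output.mkv
```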
For more information on the synthetic grain check [this appendix](https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/Appendix-Film-Grain-Synthesis.md).

**Film grain analysis**

Using a script similar to the one for the CRF analysis, we can see the differences between film grain levels:

```bash
#!/bin/bash
# Compare several film-grain levels on a 1 minute sample of the input video.

input_video="$1"
initial_time="00:15:00"
probe_length="00:01:00"

# Extract the reference segment without re-encoding
ffmpeg -i "$input_video" -ss "$initial_time" -t "$probe_length" -c copy original_segment.mkv

grains=(0 4 8 12)

echo "Grain Comparison:" >size_comparison.log
echo "Original Segment Size:" >>size_comparison.log
original_size=$(du -sh original_segment.mkv)
echo "$original_size" >>size_comparison.log
echo "-------------------" >>size_comparison.log

echo "Grain Quality Metrics Comparison:" >quality_metrics.log
echo "-------------------" >>quality_metrics.log

for grain in "${grains[@]}"; do
  # Encode test segments
  ffmpeg -i "$input_video" \
    -ss "$initial_time" -t "$probe_length" \
    -c:v libsvtav1 \
    -preset 4 \
    -crf 30 \
    -g 240 \
    -pix_fmt yuv420p10le \
    -c:a libopus \
    -b:a 192k \
    -svtav1-params tune=0:film-grain=${grain} \
    preview_grain_${grain}.mkv

  # Create a side by side comparison to see the differences
  ffmpeg -i original_segment.mkv -i preview_grain_${grain}.mkv \
    -filter_complex \
    "[0:v][1:v]hstack=inputs=2[v]" \
    -map "[v]" \
    -c:v libx264 \
    side_by_side_comparison_original_vs_grain_${grain}.mkv

  # Log file size
  size=$(du -h preview_grain_${grain}.mkv | cut -f1)
  echo "grain $grain: $size" >>size_comparison.log

  # Measure PSNR and SSIM
  ffmpeg -i original_segment.mkv -i preview_grain_${grain}.mkv \
    -filter_complex "[0:v][1:v]psnr=stats_file=psnr_${grain}.log;[0:v][1:v]ssim=stats_file=ssim_${grain}.log" -f null -

  # Average the per-frame values of the stats files
  psnr_value=$(grep "psnr_avg:" psnr_${grain}.log | awk -F'psnr_avg:' '{sum += $2; count++} END {if (count > 0) print sum / count; else print "N/A"}')
  ssim_value=$(grep "All:" ssim_${grain}.log | awk -F'All:' '{sum += $2; count++} END {if (count > 0) print sum / count; else print "N/A"}')

  # Log the results
  echo "grain $grain:" >>quality_metrics.log
  echo "  PSNR: ${psnr_value:-N/A}" >>quality_metrics.log
  echo "  SSIM: ${ssim_value:-N/A}" >>quality_metrics.log
  echo "-------------------" >>quality_metrics.log
done

# Compare the lowest and highest non-zero grain levels against each other
ffmpeg -i preview_grain_${grains[1]}.mkv -i preview_grain_${grains[-1]}.mkv \
  -filter_complex \
  "[0:v][1:v]hstack=inputs=2[v]" \
  -map "[v]" \
  -c:v libx264 \
  side_by_side_comparison_grain_${grains[1]}_vs_grain_${grains[-1]}.mkv

ffmpeg -i preview_grain_${grains[1]}.mkv -i preview_grain_${grains[-1]}.mkv \
  -filter_complex "[0:v][1:v]psnr=stats_file=psnr_grains_max_difference.log;[0:v][1:v]ssim=stats_file=ssim_grains_max_difference.log" -f null -

psnr_value=$(grep "psnr_avg:" psnr_grains_max_difference.log | awk -F'psnr_avg:' '{sum += $2; count++} END {if (count > 0) print sum / count; else print "N/A"}')
ssim_value=$(grep "All:" ssim_grains_max_difference.log | awk -F'All:' '{sum += $2; count++} END {if (count > 0) print sum / count; else print "N/A"}')

echo "grain ${grains[1]} vs grain ${grains[-1]}:" >>quality_metrics.log
echo "  PSNR: ${psnr_value:-N/A}" >>quality_metrics.log
echo "  SSIM: ${ssim_value:-N/A}" >>quality_metrics.log
echo "-------------------" >>quality_metrics.log

cat size_comparison.log
cat quality_metrics.log
```
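If the script is saved as, say, `grain_test.sh` (a hypothetical name), running it against a movie is just:

```bash
bash grain_test.sh movie.mkv
```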
*Old Anime movie*

The quality results give:

```
Grain Quality Metrics Comparison:
-------------------
grain 0:
  PSNR: 34.6503
  SSIM: 0.902562
-------------------
grain 4:
  PSNR: 34.586
  SSIM: 0.900882
-------------------
grain 8:
  PSNR: 34.446
  SSIM: 0.897354
-------------------
grain 12:
  PSNR: 34.2961
  SSIM: 0.893525
-------------------
grain 4 vs grain 12:
  PSNR: 52.005
  SSIM: 0.994478
-------------------
```

The quality metrics show that with more grain the output is less similar to the original, but the side-by-side comparison shows that even a grain of 12 has less noise than the original. This may be because the movie is old.

`grain == 0`

![](anime-old-grain-0.jpg)

`grain == 8`

![](anime-old-grain-8.jpg)

And I'd say that grain 8 looks better than 0.

*Old movie with low quality*

```
Grain Quality Metrics Comparison:
-------------------
grain 0:
  PSNR: 35.3513
  SSIM: 0.957167
-------------------
grain 4:
  PSNR: 35.3161
  SSIM: 0.956516
-------------------
grain 8:
  PSNR: 35.275
  SSIM: 0.955738
-------------------
grain 12:
  PSNR: 35.2481
  SSIM: 0.955299
-------------------
grain 4 vs grain 12:
  PSNR: 59.7537
  SSIM: 0.999178
-------------------
```

As the grain increases, the metrics diverge more from the original, though less drastically than with the anime movie. I'm not able to see any notable difference in the side-by-side comparison at any grain level. It could be that the low quality makes the grain undetectable, or that the sampled scenes don't represent the noise contrast well.

**Conclusion**

Given that I don't notice any notable change by tuning the parameter, I'll go with the suggested `film-grain=8`.

**`pix_fmt` parameter**

The `pix_fmt` parameter can be used to force encoding to 10 or 8 bit color depth. By default SVT-AV1 encodes 10-bit sources to 10-bit outputs and 8-bit to 8-bit.

AV1 includes 10-bit support in its Main profile, so content can be encoded in 10-bit without having to worry about incompatible hardware decoders. To utilize 10-bit in the Main profile, use `-pix_fmt yuv420p10le`. For 10-bit with 4:4:4 chroma subsampling (requires the High profile), use `-pix_fmt yuv444p10le`. 12-bit is also supported, but requires the Professional profile. See `ffmpeg -help encoder=libaom-av1` for the supported pixel formats.

The [dummies guide](https://wiki.x266.mov/blog/av1-for-dummies) suggests always using 10-bit color, even with an 8-bit source. AV1's internal workings are much better suited to 10-bit color, and you are almost always guaranteed quality improvements with zero compatibility penalty, as 10-bit color is part of AV1's baseline profile.

The AOMediaCodec people say that encoding with 10-bit depth results in more accurate colors and fewer artifacts with a minimal increase in file size, though the resulting file may be somewhat more computationally intensive to decode for a given bitrate. If higher decoding performance is required, using 10-bit YCbCr encoding increases efficiency, so a lower average bitrate can be used, which in turn improves decoding performance. In addition, passing the parameter `fast-decode=1` can help (this parameter does not have an effect for all presets, so check the parameter description for your preset). Lastly, for a given bitrate, 8-bit yuv420p can sometimes be faster to encode, albeit at the cost of some fidelity.
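To check whether a given source is 8- or 10-bit before deciding, a sketch with ffprobe (placeholder file name):

```bash
# Prints e.g. yuv420p (8-bit) or yuv420p10le (10-bit)
ffprobe -v error -select_streams v:0 \
  -show_entries stream=pix_fmt \
  -of default=noprint_wrappers=1:nokey=1 input.mkv
```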
I'll then use `-pix_fmt yuv420p10le`.

**[Tune parameter](https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/CommonQuestions.md#options-that-give-the-best-encoding-bang-for-buck)**

This parameter changes some encoder settings to produce a result that is optimized for subjective quality (`tune=0`) or for PSNR (`tune=1`). Tuning for subjective quality can result in a sharper image and higher psycho-visual fidelity. It is invoked with `-svtav1-params tune=0`; the default value is 1.

The use of subjective mode (`tune=0`) often results in an image with greater sharpness, and is intended to produce a result that appears to humans to be of high quality (as opposed to doing well on basic objective measures, such as PSNR). So I'll use it.

**[Multipass encoding](https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/CommonQuestions.md#multi-pass-encoding)**

The PSY people suggest not encoding the same video multiple times. This is a common mistake made by people new to video encoding: because video codecs are lossy, every time you encode a video you lose additional quality to generation loss. This is why it is important to keep the original video file if you frequently re-encode.

The AV1 people say that some encoder features benefit from or require a multi-pass encoding approach. In SVT-AV1, multi-pass encoding is mainly useful for achieving a target bitrate when using VBR (variable bit rate) encoding, although both one-pass and multi-pass modes are supported. When using CRF (constant rate factor) mode, multi-pass encoding is designed to improve quality for corner-case videos; it is particularly helpful in videos with high motion because it can adjust the prediction structure (to use closer references, for example). Multi-pass encoding can therefore have an impact on quality in CRF mode, but it is not critical in most situations. In general, multi-pass encoding is not as important for SVT-AV1 in CRF mode as it is for some other encoders. CBR (constant bit rate) encoding is always one-pass.

I won't use multipass encoding then.

**[Select the audio codec](https://jellyfin.org/docs/general/clients/codec-support/#audio-compatibility)**

If the audio codec is unsupported or incompatible (such as playing a 5.1 channel stream on a stereo device), the audio must be transcoded. This is not nearly as intensive as video transcoding.

When comparing audio encodings from a **compatibility** and **open-source** perspective, here are the key aspects to consider:

**MP3 (MPEG-1 Audio Layer III)**

*Compatibility*

- **Highly Compatible**: Supported on virtually all devices and platforms, including legacy systems.
- **Wide Adoption**: Default choice for audio streaming, portable devices, and browsers.

*Open Source*

- **Patented (Previously)**: MP3 was heavily patented until 2017. After patent expiration, it is now open for unrestricted use.
- **Decoders/Encoders**: Open-source implementations exist (e.g., `LAME`), but the format's history with patents may make some projects hesitant.

**AAC (Advanced Audio Coding)**

*Compatibility*

- **Widely Compatible**: Supported on most modern devices and platforms, including smartphones, streaming services, and browsers.
- **Less Legacy Support**: Older devices may not support AAC as widely as MP3.

*Open Source*

- **Partially Proprietary**: AAC is still patent-encumbered, requiring licensing for certain use cases.
- **Limited Open Implementations**: While open-source decoders like `FAAD` exist, licensing concerns can restri…
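Putting together the parameters chosen above (CRF 30, preset 4, `-g 240`, 10-bit, `tune=0`, `film-grain=8`) with the Opus audio settings used in the test script, the final encode command would look roughly like this (a sketch; file names are placeholders):

```bash
ffmpeg -i input.mkv \
  -c:v libsvtav1 \
  -preset 4 \
  -crf 30 \
  -g 240 \
  -pix_fmt yuv420p10le \
  -svtav1-params tune=0:film-grain=8 \
  -c:a libopus \
  -b:a 192k \
  output.mkv
```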
**What is your new plugin request?**

The AV1 encoder from ffmpeg as an available encoder plugin. This is the next generational jump, from x264 to x265 to AV1.

**Additional Context**

https://trac.ffmpeg.org/wiki/Encode/AV1