RFC: youtube: Polymer UI and JSON endpoints for playlists #151

wlritchi · 2020-11-10T08:51:48Z

Before submitting a pull request make sure you have:

At least skimmed through adding new extractor tutorial and youtube-dl coding conventions sections
Searched the bugtracker for similar pull requests
Checked the code with flake8

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

I am the original author of this code and I am willing to release it under Unlicense
I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

Bug fix
Improvement
New extractor
New feature

Description of your pull request and other information

We already had a few copies of the Polymer-style pagination handling logic for specific cases, but we are now forced to use it for all playlists because Polymer can no longer be disabled. This PR refactors that logic into the parent class for all entry lists (including search results, feeds, and lists of playlists) and generalizes it slightly to cover the child classes' use cases.

Fixes #148.
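
The refactoring described above can be sketched roughly as follows. This is only an illustration of the pattern (one pagination loop in a parent class, with child classes supplying small hooks), not the code in this PR: the names `PagedEntriesBase`, `_fetch_page`, `_extract_items`, and `FakePlaylist` are hypothetical, and the pages are held in memory rather than fetched over HTTP.

```python
import itertools


class PagedEntriesBase:
    """Hypothetical parent class that owns the pagination loop."""

    def _fetch_page(self, page_num):
        """Return the raw data for one page, or None when no pages are left."""
        raise NotImplementedError

    def _extract_items(self, page):
        """Return the entries contained in one page of raw data."""
        raise NotImplementedError

    def _entries(self):
        # Shared loop: child classes never reimplement the pagination itself.
        for page_num in itertools.count(0):
            page = self._fetch_page(page_num)
            if page is None:
                break
            for item in self._extract_items(page):
                yield item


class FakePlaylist(PagedEntriesBase):
    """Stand-in child class backed by in-memory pages (assumption for the demo)."""

    def __init__(self, pages):
        self._pages = pages

    def _fetch_page(self, page_num):
        return self._pages[page_num] if page_num < len(self._pages) else None

    def _extract_items(self, page):
        return page['videos']


playlist = FakePlaylist([{'videos': ['a', 'b']}, {'videos': ['c']}])
print(list(playlist._entries()))
```

A search or feed extractor would, under the same assumptions, only override the two hooks and inherit `_entries` unchanged.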

wlritchi · 2020-11-10T08:53:45Z

So far this is not tested for many use cases, including ones that we know will be affected (search, feeds, list of playlists). I've only tested it for downloading all of a channel's uploaded videos.

someziggyman · 2020-11-10T09:00:32Z

Will test this fix with the following:
Regular playlist ID: PLszW2az_oxFd7dFeCb1FFhk7c_eEer5n1
Mix playlist ID: RDKR9wGi7gVLQ
Search playlist: https://www.youtube.com/results?search_query=linkin+park+numb
Channel playlist: https://www.youtube.com/user/TheLinuxFoundation/playlists

We now have two fixes released at almost the same time; see also #150.

wlritchi · 2020-11-10T09:03:58Z

See also #150 (edit: someone already mentioned it while I was typing this, apparently), which appears to be doing essentially the same thing. There are a few things it does better than this one, including noticing that the base extractor already has a _get_yt_initial_data method and using that, rather than borrowing the copy from the search implementation. But I'm out of cycles for the night. Edits welcome!
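
For readers unfamiliar with the "initial data" mentioned above, here is a minimal sketch of pulling an embedded JSON object out of a page. It is not the extractor's `_get_yt_initial_data`; the function name `find_initial_data` is hypothetical, and the sketch assumes the page embeds the data as a JavaScript assignment to `ytInitialData`.

```python
import json
import re

# Assumption: the data appears as `window["ytInitialData"] = {...};`
# or `var ytInitialData = {...};` somewhere in the page source.
_MARKER = re.compile(r'ytInitialData"?\]?\s*=\s*')


def find_initial_data(webpage):
    """Return the JSON object assigned to ytInitialData, or None if absent."""
    match = _MARKER.search(webpage)
    if match is None:
        return None
    try:
        # raw_decode parses exactly one JSON value starting at the given
        # index and ignores whatever JavaScript follows it.
        data, _ = json.JSONDecoder().raw_decode(webpage, match.end())
    except ValueError:
        return None
    return data


page = '<script>window["ytInitialData"] = {"contents": {"ids": [1, 2]}};</script>'
print(find_initial_data(page))
```

Using `raw_decode` instead of a non-greedy regular expression avoids truncating the object when a string value happens to contain `};`.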

blackjack4494 · 2020-11-10T21:09:53Z

@wlritchi this doesn't work with
python3 -m youtube_dlc "https://www.youtube.com/results?search_query=linkin+park+numb" -F -v
It throws the following error:

  File "D:\gitkraken\yt-dlc\youtube_dlc\__main__.py", line 19, in <module>
    youtube_dlc.main()
  File "D:\gitkraken\yt-dlc\youtube_dlc\__init__.py", line 487, in main
    _real_main(argv)
  File "D:\gitkraken\yt-dlc\youtube_dlc\__init__.py", line 477, in _real_main
    retcode = ydl.download(all_urls)
  File "D:\gitkraken\yt-dlc\youtube_dlc\YoutubeDL.py", line 2104, in download
    url, force_generic_extractor=self.params.get('force_generic_extractor', False))
  File "D:\gitkraken\yt-dlc\youtube_dlc\YoutubeDL.py", line 830, in extract_info
    ie_result = ie.extract(url)
  File "D:\gitkraken\yt-dlc\youtube_dlc\extractor\common.py", line 532, in extract
    ie_result = self._real_extract(url)
  File "D:\gitkraken\yt-dlc\youtube_dlc\extractor\youtube.py", line 3450, in _real_extract
    data_json = self._process_initial_data(webpage)
AttributeError: 'YoutubeSearchURLIE' object has no attribute '_process_initial_data'

wlritchi · 2020-11-10T22:30:07Z

Ah, yup, that's definitely an artifact left over from refactoring. I had intended to let search just process the initial data without requesting any further pages (which I think matches what it did before), but specifying the number of pages to fetch is probably more useful. I think your change as presented above will cap all playlist downloads to the first page, though.

blackjack4494 · 2020-11-10T22:46:19Z

Yup, you are right. I hadn't thought about that, lol.
Any ideas, @wlritchi? D:
This seems to work, but it's rather hacky:
for page_num in (range(n) if n>0 else itertools.count(1)):

wlritchi · 2020-11-10T22:51:42Z

Basically doing this except defaulting to None, and adjusting the counting so it always starts from 1 (convention: page 0 is the initial page and continuations start from 1). This way 0 can be passed to say "only download the initial page with no continuations". Also open to tweaking the counting so the initial page counts as page 1.
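
A minimal sketch of the counting convention described above, assuming the parameter counts continuation pages only. The helper name `continuation_page_numbers` is hypothetical and `max_pages` is used here purely for illustration; this is not the code that was committed.

```python
import itertools


def continuation_page_numbers(max_pages=None):
    """Return an iterator over continuation page numbers, starting at 1.

    Page 0 is the initial page and is never produced here.
    max_pages=None -> unlimited continuations
    max_pages=0    -> initial page only, no continuations
    max_pages=n    -> continuation pages 1..n
    """
    if max_pages is None:
        return itertools.count(1)
    return iter(range(1, max_pages + 1))


print(list(continuation_page_numbers(0)))
print(list(continuation_page_numbers(3)))
print(list(itertools.islice(continuation_page_numbers(), 4)))
```

Defaulting to None rather than 0 keeps "unlimited" and "no continuations" distinguishable, which the `n>0` check in the hacky version above does not.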

wlritchi · 2020-11-10T22:54:04Z

Oh, hmm, search with 0 continuations returns nothing. I thought it would be the first page of results.

blackjack4494 · 2020-11-10T22:58:38Z

That's why max_pages=5 is needed in class YoutubeSearchURLIE.
If you set it to 1 there will be 30 results, but I thought it would be good to get ~100, so I set it to 5 pages 😆

wlritchi · 2020-11-10T22:59:55Z

Oh I see, YoutubeSearchIE has its own implementation of _entries still, which interprets the parameter differently. That should also be merged into the superclass - will do.

blackjack4494 · 2020-11-10T23:01:27Z

I've set it to 5 now. If you plan any more changes, hold off for now. I will merge this into master and release it. I really have to sleep, I can barely keep my eyes open :o

wlritchi · 2020-11-10T23:02:43Z

👍 I'll put anything else on a separate branch since it's more maintenance/refactoring than bugfixing anyway. While I'm at it - thoughts on whether the initial page should be page 0, or page 1 (i.e. do most users prefer 0-based or 1-based indexing)?

Fabian42 · 2020-11-10T23:19:28Z

I prefer 0-based indexing, at least, but I'm a programmer, so maybe not very representative. I don't think there is anything paginated with page numbers left on YouTube, but if there were, it would likely start at 1. Things like "--playlist-start" are already 1-based.

ultrasound1372 · 2020-11-10T23:35:31Z

If things like --playlist-start are already 1-based, then it should start at 1.

tygore587 · 2020-11-11T18:30:24Z

@blackjack4494 There is a bug in here. The playlist title for the output format is not set correctly and is always NA.

someziggyman mentioned this pull request Nov 10, 2020

[youtube] fix: playlist #150

Draft

8 tasks

manually set limit for youtubesearchurl

0f8566e

blackjack4494 marked this pull request as ready for review November 10, 2020 22:21

blackjack4494 self-requested a review November 10, 2020 22:21

blackjack4494 approved these changes Nov 10, 2020

Merge branch 'master' into youtube-playlist-polymer

c54f4aa

[youtube] max_pages=5 for search, unlimited for everything else

73ac856

Also drop a few leftover methods in search that are no longer used.

ytsearchurl 5 pages for around 100 results

104bfdd

blackjack4494 merged commit 00c38ef into blackjack4494:master Nov 10, 2020

pukkandan mentioned this pull request Nov 12, 2020

[Broken] Youtube Playlist download only sees first 100 videos #172

Closed

6 tasks

coletdjnz mentioned this pull request Nov 15, 2020

'uploader' information is null in YouTube playlist extractor #173

Closed

5 tasks
