| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Also moves some microformat extraction from
_extract_watch_info_mobile to extract_watch_info where it belongs.
_extract_watch_info_mobile is really only for stuff visible on the
page, and thus specialized for either mobile or desktop.
Signed-off-by: Jesús <heckyel@hyperbola.info>
|
| |
|
|
|
|
|
|
| |
Information from additional_info was being overrided with None.
Signed-off-by: Jesús <heckyel@hyperbola.info>
|
|
|
|
|
|
|
| |
Uses previous and next buttons. Now can view more than just
first page of playlists page
Signed-off-by: Jesús <heckyel@hyperbola.info>
|
|
|
|
|
|
| |
Fixes channel pages > 1
Signed-off-by: Jesús <heckyel@hyperbola.info>
|
|
|
|
|
|
| |
Some searches have these renderers instead of the usual ones
Signed-off-by: Jesús <heckyel@hyperbola.info>
|
|
|
|
|
|
|
| |
They cannot be viewed on their own, so change url in items to
go to the video+playlist instead
Signed-off-by: Jesús <heckyel@hyperbola.info>
|
| |
|
|
|
|
|
|
|
|
|
|
| |
for instance, urls that start with // become https://
adjustment required in comments.py because the url was left as a
relative url in yt_data_extract by mistake and was using URL_ORIGIN
prefix as fix.
see #31
|
|
|
|
|
| |
This was causing an exception in subscriptions when it tried
to estimate the unix timestamp for the upload time
|
|
|
|
|
|
|
|
|
|
|
|
| |
[something]Continuation renderers, all of which are junk
except one. Check the items in each one until the one which
contains the items being sought is found.
The usage in extract_comments_info needed to be changed to
specify the items being sought. It was unspecified before which
is strictly incorrect since extract_items by default looks for
video/playlist/channel thumbnail items. It was relying on this
special case for continuations. But now that wouldn't work
anymore.
|
|
|
|
| |
Youtube added some pointless variation in variable names
|
| |
|
| |
|
|
|
|
| |
Also change superfluous use of multi_get to item.get nearby
|
|
|
|
|
| |
By checking first if it's in item_types rather than checking if it can be dug into first.
For example: this allows extracting things like sectionListRenderer
|
|
|
|
|
|
| |
independently
extract_items_from_renderer will extract given just a renderer rather than a response
|
|
|
|
|
|
|
| |
Change usage of multi_deep_get to multi_get where possible
Remove checking of type from calls to get functions (because it's very unlikely Youtube suddenly changes the type without changing the name of the variable or anything, and it takes up unnecessary space)
Remove all default=None arguments from get functions, since those are superflous.
Remove list_types constant since it's no longer in use.
|
| |
|
|
|
|
| |
Philosophically, a channel doesn't create itself.
|
|
|
|
|
|
|
|
| |
more robust
For example, "354 subscribers" wasn't being extracted correctly be extract_approx_int.
Make extract_approx_int and extract_int only extract integers that are words.
So e.g. 342 will not be extracted from internetuser342
|
|
|
|
|
|
| |
names. Extract subscriber_count correctly.
Don't just shove english strings into info['stats']. Actually give semantic names for the stats.
|
|
|
|
| |
It wasn't working because decimals such as 15.1M weren't considered, so it was extracting "1M"
|
|
|
|
| |
yt_data_extract
|
|
|
|
|
|
| |
internal helper function names
Move get_captions_url in watch_extraction to bottom next to other exported, public functions
|
| |
|
|
|