path: root/youtube/yt_data_extract
Commit message | Author | Age | Files | Lines
* Use hqdefault thumbnail in related videos | Jesús | 2021-09-14 | 1 | -2/+3
* Redo av codec settings & selections to accommodate webm | James Taylor | 2021-09-06 | 1 | -1/+1
    Allows for ranked preferences for h264, av1, and vp9 codecs in settings, along with equal preferences, which are tie-broken by smaller file size. For each quality, gives av-merge a list of video sources and audio sources sorted by preference and file size. It will pick the first one that the browser supports.
    Closes #84
    Signed-off-by: Jesús <heckyel@hyperbola.info>
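The ordering described in the entry above can be sketched roughly as follows. This is an illustration only, not the project's code: the function name `sort_sources`, the `codec_rank` mapping, and the `codec` / `file_size` keys are all assumptions made for the example.

```python
def sort_sources(sources, codec_rank):
    """Order sources by codec preference, then by smaller file size.

    codec_rank is a hypothetical mapping where a higher number means
    "more preferred"; codecs with equal rank are tie-broken by size.
    """
    return sorted(
        sources,
        key=lambda s: (-codec_rank.get(s['codec'], 0), s['file_size']),
    )


# Hypothetical settings: h264 and av1 equally preferred, vp9 less so.
codec_rank = {'h264': 1, 'av1': 1, 'vp9': 0}
sources = [
    {'codec': 'vp9', 'file_size': 900},
    {'codec': 'h264', 'file_size': 1200},
    {'codec': 'av1', 'file_size': 800},
]
ordered = sort_sources(sources, codec_rank)
```

A consumer such as av-merge would then walk `ordered` and take the first entry the browser can play.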
* Fix videos added to playlist from channel missing author_id | James Taylor | 2021-08-31 | 1 | -1/+5
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Support more audio and video qualities | James Taylor | 2021-08-31 | 2 | -2/+13
    Adds support for AV1-encoded videos, which includes any videos above 1080p. These weren't getting included because they did not have a quality entry in the format table at the top of watch_extraction.py. So get the quality from the quality labels of the format if it's not there.
    Because YouTube often includes BOTH AV1 and H.264 (AVC) for each quality, after these are included there will be far too many quality options, and the code needs to choose which one to use. The choice is somewhat hard: AV1 is encoded in fewer bytes than H.264 and is patent-free; however, it has less hardware support, so it might be more difficult to play. For instance, on my system, AV1 does not work on 1080p, but H.264 does. Adds a setting for which to prefer, set to H.264 as the default.
    Also adds support for the lower-quality mp4 audio, which now gets used at 144p to save network bandwidth. For similar reasons, this was not getting included because it did not have an audio_bitrate entry in the table. Prefer bitrate instead for the quality.
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Add support for more qualities, merging video+audio using MSE | James Taylor | 2021-08-29 | 1 | -2/+10
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Revert "Add support for more qualities, merging video+audio using MSE" | Jesús | 2021-08-29 | 1 | -10/+2
    This reverts commit d56df02e7b1eba86baf511289208295b1f6c5a50.
* Add support for more qualities, merging video+audio using MSE | James Taylor | 2021-08-29 | 1 | -2/+10
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Fix comment reply url extraction due to youtube changes (tag: 0.1.0) | James Taylor | 2021-08-23 | 1 | -3/+7
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Fix comments extraction due to new response continuation key name | James Taylor | 2021-08-23 | 1 | -2/+6
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Fix description extraction in search results | James Taylor | 2021-08-09 | 1 | -1/+5
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Fix (dis)like, music list extraction due to YouTube changes (again) | James Taylor | 2021-08-09 | 2 | -9/+56
    YouTube reverted the changes they made that prompted f9f5d5ba. In case they change their minds again, this adds support for both formats.
    The liberal_update and conservative_update functions needed to be modified to handle the cases of empty lists, so that a successfully extracted 'music_list': [{'Author':...},...] will not be overwritten by 'music_list': [] in the calls to liberal_dict_update.
    Signed-off-by: Jesús <heckyel@hyperbola.info>
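The empty-list handling described in the entry above might look something like the sketch below. The function names come from the commit message, but the bodies are an assumption written for illustration and may differ from the project's actual implementation.

```python
def liberal_update(obj, key, value):
    """Overwrite obj[key], unless the new value is None or an empty list.

    Assumed behavior: a missing key is always filled in, even with an
    empty value, so that later lookups do not raise KeyError.
    """
    if value not in (None, []) or key not in obj:
        obj[key] = value


def conservative_update(obj, key, value):
    """Set obj[key] only if nothing useful (None or []) is stored yet."""
    if obj.get(key) in (None, []):
        obj[key] = value


info = {'music_list': [{'Author': 'Example Artist'}]}
# A later, failed extraction must not wipe out the earlier result.
liberal_update(info, 'music_list', [])
```

After the call, `info['music_list']` still holds the successfully extracted entry.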
* Switch to new comments api now that old one is being disabled | James Taylor | 2021-08-09 | 2 | -11/+34
    watch_comment api periodically gives the error "Top level comments mweb servlet is turned down."
    The continuation items for the new api are in a different arrangement in the json, so changes were necessary to the extract_items function.
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* New age restriction bypass method since get_video_info was disabled | James Taylor | 2021-07-28 | 1 | -7/+2
    From https://github.com/yt-dlp/yt-dlp/issues/574#issuecomment-887171136
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Fix missing likes, dislikes, & music list due to Youtube changes | James Taylor | 2021-07-28 | 2 | -60/+121
    Also moves some microformat extraction from _extract_watch_info_mobile to extract_watch_info where it belongs. _extract_watch_info_mobile is really only for stuff visible on the page, and thus specialized for either mobile or desktop.
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Capitalize app name | Jesús | 2021-06-10 | 2 | -2/+2
* Use extract_approx_int for comment likes | James Taylor | 2021-06-10 | 1 | -2/+2
    Full digits no longer available
    Closes #64
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Fix comment like extraction due to Youtube changes | James Taylor | 2021-05-17 | 1 | -0/+2
    Variable name changed from likeCount to voteCount
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Fix videos added to playlist from channel page not having author | James Taylor | 2021-05-17 | 1 | -2/+3
    Information from additional_info was being overridden with None.
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Channel about: Add http:// to links without it | James Taylor | 2021-05-06 | 1 | -0/+2
    So that the link is not interpreted as a relative link
    Signed-off-by: Jesús <heckyel@hyperbola.info>
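A minimal sketch of the idea in the entry above, assuming a link counts as "having it" when it starts with any URL scheme followed by `://`. The helper name `ensure_scheme` and the regular expression are illustrative choices, not taken from the project.

```python
import re

# Matches a leading URL scheme such as "http://" or "https://".
SCHEME_RE = re.compile(r'^[a-zA-Z][a-zA-Z0-9+.-]*://')


def ensure_scheme(link):
    """Prefix http:// so the browser does not treat the link as relative."""
    if SCHEME_RE.match(link):
        return link
    return 'http://' + link
```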
* Channel: Allow going to next pages of playlists page | James Taylor | 2021-03-15 | 2 | -1/+9
    Uses previous and next buttons. Now can view more than just first page of playlists page
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Use new channel api endpoint now that browse_ajax is disabled | James Taylor | 2021-03-03 | 1 | -0/+5
    Fixes channel pages > 1
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Fix comment replies | James Taylor | 2021-02-26 | 1 | -0/+6
    Comment reply protobuf now requires the channel id of the uploader of the video. Otherwise the endpoint returns 500.
    Instead of making the protobuf ourselves and passing this data around through query parameters, just use the ctoken provided to us but modify the max_replies field from 10 to 250.
    Fixes #53
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Fix signature decryption due to new base.js minifier rules | James Taylor | 2021-02-23 | 1 | -7/+10
    YouTube now includes e.g. {"fe": ...} instead of just {fe: ...} in the javascript object entries in the object holding the operation definitions.
    Fixes #2
    Signed-off-by: Jesús <heckyel@hyperbola.info>
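One way to tolerate both key styles mentioned in the entry above is to make the quotes optional in the pattern that finds the operation names. The regular expression and the sample JavaScript below are assumptions for illustration; the project's real parsing code is not shown in this log.

```python
import re

# Optional double quotes around the property name, then ": function(".
OPERATION_KEY_RE = re.compile(r'"?([A-Za-z_$][\w$]*)"?\s*:\s*function\s*\(')


def operation_names(js_object_source):
    """Return property names whose values are functions, quoted or not."""
    return OPERATION_KEY_RE.findall(js_object_source)


# Made-up fragment in the style of a minified operations object.
sample = 'var Xy={"fe":function(a,b){a.splice(0,b)},ab:function(a){a.reverse()}};'
names = operation_names(sample)
```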
* yt_data_ext: support richGrid&richItem sometimes used on search | James Taylor | 2021-02-13 | 1 | -1/+3
    Some searches have these renderers instead of the usual ones
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Fix youtube mixes | James Taylor | 2020-12-18 | 1 | -0/+5
    They cannot be viewed on their own, so change url in items to go to the video+playlist instead
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* channel: replace page #s w/ next page button using provided ctoken | James Taylor | 2020-12-18 | 1 | -0/+2
    Since yt doesn't accept page #'s when sorting by oldest
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Improve ytInitialPlayerResponse extraction | James Taylor | 2020-12-17 | 1 | -2/+10
    Makes it work if there are additional JavaScript statements after the playerResponse variable
    Signed-off-by: Jesús <heckyel@hyperbola.info>
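A sketch of one way to pull a JSON object out of a script even when more statements follow it, using the standard library's `json.JSONDecoder.raw_decode`, which parses a single value and ignores whatever comes after. This is not necessarily how the project does it; the function name and the regular expression are assumptions.

```python
import json
import re

# Consumes the assignment and any whitespace, so the match ends at "{".
PLAYER_RESPONSE_RE = re.compile(r'ytInitialPlayerResponse\s*=\s*')


def extract_player_response(html):
    """Return the parsed player response object, or None if not found."""
    match = PLAYER_RESPONSE_RE.search(html)
    if match is None:
        return None
    try:
        # raw_decode stops at the end of the first complete JSON value,
        # so trailing statements such as ";var meta = 1;" are ignored.
        obj, _end = json.JSONDecoder().raw_decode(html, match.end())
    except json.JSONDecodeError:
        return None
    return obj


page = ('<script>var ytInitialPlayerResponse = '
        '{"videoDetails": {"title": "a;b"}};var meta = 1;</script>')
player_response = extract_player_response(page)
```

Unlike splitting on the first `;`, this approach is not confused by semicolons inside string values.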
* Always extract from html watch page to get base.js url | James Taylor | 2020-12-12 | 2 | -14/+71
    Youtube removed the url from the pbj responses. They are now only in the html page. Replaces previous fix for the missing base.js issue.
* Retrieve base.js url from html watch page when it's missing | James Taylor | 2020-12-09 | 2 | -1/+15
    Fixes failure mode 3 in #22
* yt_data_ext: watch playlist: Fix missing author_url if no author_id | James Taylor | 2020-11-08 | 1 | -3/+2
    Embedded playlist info was missing author_url key if author_id was None. This caused KeyError in watch.py when it expected that key
    Closes #37
* Redo fix for failure mode 1 in issue #22 | James Taylor | 2020-10-21 | 1 | -4/+4
    Previous fix didn't work. Should work now. The non-embedded player response can still be present but the urls will be missing.
* remove trailing whitespaces | zrose584 | 2020-10-21 | 3 | -3/+3
* Use get_video_info to get video urls if player response missing | James Taylor | 2020-10-19 | 1 | -2/+8
    Fixes failure mode 1 in #22
* yt_data_extract: normalize thumbnail and author urls | James Taylor | 2020-10-19 | 2 | -12/+17
    For instance, urls that start with // become https://
    Adjustment required in comments.py because the url was left as a relative url in yt_data_extract by mistake, and comments.py was using the URL_ORIGIN prefix as a fix.
    See #31
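The one normalization rule stated in the entry above (protocol-relative urls become https) can be sketched as below. The helper name `normalize_url` is used here for illustration, and any further rules the project applies are not covered.

```python
def normalize_url(url):
    """Turn protocol-relative urls ("//host/path") into https urls."""
    if url is None:
        return None
    if url.startswith('//'):
        return 'https:' + url
    return url
```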
* Specify video height in html so page doesn't shift down after load | James Taylor | 2020-09-24 | 1 | -2/+9
    Use true video height extracted from youtube to handle videos shorter than their quality size. (e.g. widescreen videos)
* yt_data_extract: Fix time_published picking up 'Streaming' string | James Taylor | 2020-08-12 | 1 | -1/+5
    This was causing an exception in subscriptions when it tried to estimate the unix timestamp for the upload time
* Switch to mobile api endpoint to fix 'Unknown error' blockage | James Taylor | 2020-08-11 | 1 | -9/+18
    See https://github.com/iv-org/invidious/issues/1319#issuecomment-671732646
* extract_items: Handle case where continuation has multiple | James Taylor | 2020-08-11 | 2 | -11/+23
    [something]Continuation renderers, all of which are junk except one. Check the items in each one until the one which contains the items being sought is found.
    The usage in extract_comments_info needed to be changed to specify the items being sought. It was unspecified before, which is strictly incorrect since extract_items by default looks for video/playlist/channel thumbnail items. It was relying on this special case for continuations, but now that wouldn't work anymore.
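The search strategy in the entry above (try each continuation renderer until one holds the sought items) could be sketched as follows. The JSON layout, the `contents` / `items` keys, and the function name are hypothetical; they only stand in for whatever structure extract_items actually walks.

```python
def find_sought_items(continuation_contents, item_types):
    """Return items from the first renderer that holds a sought item type.

    continuation_contents: dict mapping "[something]Continuation" names
    to renderer dicts (hypothetical layout).
    item_types: set of renderer names the caller is looking for.
    """
    for renderer in continuation_contents.values():
        if not isinstance(renderer, dict):
            continue
        items = renderer.get('contents') or renderer.get('items') or []
        matching = [
            item for item in items
            if isinstance(item, dict) and item_types.intersection(item)
        ]
        if matching:
            return matching
    return []


# Made-up response: the first renderer is junk, the second has comments.
response = {
    'junkContinuation': {'contents': [{'messageRenderer': {}}]},
    'itemSectionContinuation': {
        'contents': [{'commentThreadRenderer': {'id': 1}}],
    },
}
comments = find_sought_items(response, {'commentThreadRenderer'})
```

Requiring the caller to pass `item_types` explicitly mirrors the change to extract_comments_info described above.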
* extract_channel_info: Improve error extraction | James Taylor | 2020-08-11 | 1 | -3/+6
    Use extract_str function since it's not always 'simpleText'
    Make sure we don't output an empty error message if we don't know what it is.
    channel.py: Don't check if error message is empty, check if it's None
* Fix hls_manifest_url not included when there's no other formats | James Taylor | 2020-06-28 | 1 | -2/+6
    Since there are no formats, it was retrying with the non-embedded playerResponse, which resulted in the hls_manifest_urls from the embedded player_response being overwritten with None. So use conservative_update instead
* Add dialog for copying urls to external player for livestreams | James Taylor | 2020-06-28 | 2 | -11/+53
    Also for livestreams that have ended but whose other sources aren't present or aren't ready yet.
* Handle case where embedded player response missing | James Taylor | 2020-06-28 | 1 | -2/+10
    Change so it extracts other stuff from regular playerResponse
    Extract formats from embedded player response, but fallback to regular one if that doesn't work.
    Sometimes there is no 'player' at top_level and the urls are in the regular playerResponse
* Do not override previous playability error if unknown | James Taylor | 2020-06-28 | 1 | -1/+1
* Fix previously live videos labeled as live | James Taylor | 2020-05-29 | 1 | -1/+3
* Fix broken signature decryption | James Taylor | 2020-05-27 | 1 | -1/+2
    The base.js url format changed, so the identifier at the end was no longer unique. As a result, it was using the wrong cached decryption function.
    Changes the identifier to just be the whole url so this won't happen again.
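Keying the cache on the whole url, as the entry above describes, can be illustrated with a small sketch. The cache dict, the function name, and the `build` callback are hypothetical stand-ins for the project's real decryption-function cache.

```python
# Maps the full base.js url to its (expensively built) decryption function.
_decrypt_cache = {}


def get_decryption_function(base_js_url, build):
    """Return the cached function for this exact url, building it once.

    build is a hypothetical callback that downloads and parses base.js.
    """
    if base_js_url not in _decrypt_cache:
        _decrypt_cache[base_js_url] = build(base_js_url)
    return _decrypt_cache[base_js_url]
```

Two urls that differ anywhere, not just in a trailing identifier, now get separate cache entries.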
* Fix urls sometimes not extracted due to youtube changes | James Taylor | 2020-05-27 | 1 | -1/+2
    The 'cipher' parameter which contains the url is sometimes called 'signatureCipher' instead now.
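Accepting either parameter name, as the entry above describes, might be sketched like this. It assumes the value is a query string with percent-encoded fields, one of which is `url`; the function name and the sample data are made up for the example.

```python
from urllib.parse import parse_qs


def extract_cipher_fields(fmt):
    """Return the decoded cipher fields from a format dict, or None.

    Looks under 'cipher' first, then under the newer 'signatureCipher'.
    """
    cipher = fmt.get('cipher') or fmt.get('signatureCipher')
    if cipher is None:
        return None
    # parse_qs returns lists of values; keep the first value of each field.
    return {key: values[0] for key, values in parse_qs(cipher).items()}


# Made-up format entry using the newer parameter name.
fmt = {
    'signatureCipher':
        's=ABC&sp=sig&url=https%3A%2F%2Fexample.com%2Fvideoplayback%3Fid%3D1',
}
fields = extract_cipher_fields(fmt)
```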
* Fix error getting exit node ip if format urls are None | James Taylor | 2020-05-27 | 1 | -1/+1
* Fix comment count & disabled extraction not working sometimes | James Taylor | 2020-04-10 | 1 | -3/+14
    Because of an A/B test.
* Fix related video extraction sometimes failing | James Taylor | 2020-04-10 | 1 | -2/+10
    Youtube added some pointless variation in variable names
* Fix exception due to missing 'playlist' key in extracted info | James Taylor | 2020-04-05 | 1 | -0/+3
    Happens when there's an error on the page and nothing visible was on the page. 'playlist' wasn't set to None in that case.