path: root/youtube/yt_data_extract
Commit message | Author | Age | Files | Lines
* Fix comment count extraction due to 'K/M' postfixes | Astound | 2024-01-22 | 1 | -3/+3
    YouTube now displays 2K comments instead of 2359, for instance
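A minimal sketch of how an abbreviated count such as "2K" could be turned into an approximate integer. The function name `parse_approx_count` and the suffix table are illustrative assumptions, not the project's actual code.

```python
import re

# Assumed multipliers for the abbreviated suffixes (illustrative only).
_SUFFIX_MULTIPLIERS = {'K': 1_000, 'M': 1_000_000, 'B': 1_000_000_000}


def parse_approx_count(text):
    """Return an int for strings like '2K', '1.5M' or '2,359'; None if no number is found."""
    # The suffix must directly follow the digits and end at a word boundary,
    # so a following word such as "Comments" is not mistaken for a suffix.
    match = re.search(r'(\d[\d,]*(?:\.\d+)?)([KMB]\b)?', text)
    if match is None:
        return None
    number = float(match.group(1).replace(',', ''))
    multiplier = _SUFFIX_MULTIPLIERS.get(match.group(2), 1)
    return int(round(number * multiplier))
```

The result is only approximate once a suffix is present: "2K" yields 2000 even if the exact count is 2359.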
* ylist: show 100 videos per page instead of 20 | Astound | 2024-01-22 | 1 | -3/+14
    Also add an option to the internal playlist ctoken function for filtering out shorts, to be used in future anti-shorts features
* Fix comment count not extracted sometimes | Jesus | 2023-09-11 | 1 | -1/+2
    YouTube created a new key 'commentCount' in addition to 'headerText'
* Fix related vids, like_count, playlist sometimes missing | Jesus | 2023-09-11 | 1 | -9/+13
    Cause is that some pages have the onResponseReceivedEndpoints key at the top level with useless stuff in it, and the extract_items function was searching in that instead of the 'contents' key.

    Change to use if blocks instead of elif blocks in the extract_items function.
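The difference between the two control-flow styles can be illustrated with a simplified sketch. The response shape, the helper `_video_items`, and both function names are hypothetical; the real extract_items function is more involved.

```python
def _video_items(renderers):
    """Keep only entries that look like video items (assumed, simplified shape)."""
    return [r for r in renderers if 'videoRenderer' in r]


def extract_with_elif(response):
    # Only the first key that is present gets searched, even if it holds nothing useful.
    if 'onResponseReceivedEndpoints' in response:
        return _video_items(response['onResponseReceivedEndpoints'])
    elif 'contents' in response:
        return _video_items(response['contents'])
    return []


def extract_with_if(response):
    # Every key that is present gets searched, and the results are combined.
    items = []
    if 'onResponseReceivedEndpoints' in response:
        items.extend(_video_items(response['onResponseReceivedEndpoints']))
    if 'contents' in response:
        items.extend(_video_items(response['contents']))
    return items


# A response where the first key is present but carries nothing of interest.
response = {
    'onResponseReceivedEndpoints': [{'unrelated': {}}],
    'contents': [{'videoRenderer': {'videoId': 'abc'}}],
}
```

With this sample response, `extract_with_elif` finds nothing because it never looks at 'contents', while `extract_with_if` returns the video item.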
* Filter out translated audio tracks | Jesus | 2023-09-11 | 1 | -0/+7
    See comment in code
* Set hqdefault thumbnail images | Jesus E | 2023-06-18 | 1 | -1/+1
* watch_extraction.py: fix conditional | Jesus E | 2023-06-17 | 1 | -0/+1
* Fix minor formatting issues | Jesus E | 2023-06-17 | 1 | -5/+5
* Merge short and video parsing even further | Jesus E | 2023-06-17 | 1 | -29/+24
    Use multi_get and multi_deep_get for tag differences
    Replace the duration check with conservative_update
* Merge short and video parsing | Jesus E | 2023-06-17 | 1 | -43/+25
* Fix parsing shorts | Jesus E | 2023-06-17 | 1 | -7/+7
    Add check for extracting duration for shorts
    Make short duration extraction stricter
    Fix handling shorts with no views
* Add functional but preliminary channel tab support | Jesus E | 2023-06-17 | 2 | -1/+48
    Add channel tabs to the channel template and script
    Update continuation token to request different tabs
    Add support for 'reelItemRenderer' format required to extract shorts
* Music list extraction: read from SONG field | Jesus E | 2023-05-28 | 1 | -1/+3
    This one is used when there is no corresponding YouTube video for the track
* Fix music list extraction | Jesus E | 2023-05-28 | 2 | -0/+32
    Closes #160
* Partially fix age restricted videos | Jesus E | 2023-05-28 | 2 | -2/+2
    Does not work for videos that require decryption because decryption is not working (giving 403) for some reason.

    Related invidious issue for decryption not working:
    https://github.com/iv-org/invidious/issues/3245

    Partial fix for #146
* Update channel to new ctoken format | Jesus E | 2023-05-28 | 2 | -7/+12
    Huge thanks to @michaelweiser

    Different sortings still don't work for videos and playlists
* Fix failure to detect vp9.2 and mp4v.20.3 codecs | Jesus E | 2023-05-28 | 1 | -4/+3
* Fix fmt extraction mime_type regex failure as well as exceptions | Jesus E | 2023-05-28 | 1 | -2/+4
* Remove leftover print statement | Jesus E | 2023-05-28 | 1 | -1/+0
* Fix likes count | Jesus E | 2023-05-28 | 1 | -21/+42
* Fix preview_thumbnails | Jesús | 2022-05-30 | 1 | -2/+1
    use 'deep_get' for storyboard
* Extract captions base_url using different method when missing | James Taylor | 2022-03-30 | 1 | -0/+19
    The base url will be randomly missing. Take one of the listed captions urls which already has the &lang and automatic specifiers. Then remove these specifiers.

    Signed-off-by: Jesús <heckyel@hyperbola.info>
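A rough sketch of removing specifiers from a captions URL to recover a base URL. The helper name `strip_query_params`, the example URL, and the parameter names `lang` and `kind` are assumptions made for illustration; the commit does not state which query parameters are involved.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_query_params(url, names):
    """Return url with every query parameter whose name is in `names` removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in names
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical captions URL; 'lang' and 'kind' are assumed parameter names.
listed_url = 'https://example.com/api/timedtext?v=abc&lang=en&kind=asr&fmt=vtt'
base_url = strip_query_params(listed_url, {'lang', 'kind'})
```

Parsing the query string avoids the pitfalls of string replacement, such as a parameter appearing first or last in the query.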
* Fix exception when _captions_base_url is not present | James Taylor | 2022-03-30 | 2 | -1/+6
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* handle missing storyboard | zrose584 | 2022-01-17 | 1 | -1/+2
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* add preview thumbnails | zrose584 | 2022-01-09 | 1 | -0/+2
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* update formats | Jesús | 2021-12-27 | 1 | -4/+8
* Disable dislikes | Jesús | 2021-12-26 | 1 | -5/+0
    Ref: https://blog.youtube/news-and-events/update-to-youtube/
* Revert "Usage hqdefault thumbnail in related videos" | Jesús | 2021-09-14 | 1 | -3/+2
    This reverts commit a0c3ca0159136d17eefa129176ae1904110238b8.
* Usage hqdefault thumbnail in related videos | Jesús | 2021-09-14 | 1 | -2/+3
* Redo av codec settings & selections to accommodate webm | James Taylor | 2021-09-06 | 1 | -1/+1
    Allows for ranked preferences for h264, av1, and vp9 codecs in settings, along with equal preferences which are tiebroken using smaller file size.

    For each quality, gives av-merge a list of video sources and audio sources sorted based on preference & file size. It will pick the first one that the browser supports.

    Closes #84

    Signed-off-by: Jesús <heckyel@hyperbola.info>
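One way to express "ranked preference, ties broken by smaller file size" is a two-part sort key. This is only a sketch: the function name `sort_sources`, the dictionary fields `codec` and `file_size`, and the rank table are assumptions, not the project's data model.

```python
def sort_sources(sources, codec_rank):
    """Order sources by codec rank (lower is preferred), then by smaller file size.

    Codecs missing from codec_rank sort last.
    """
    return sorted(
        sources,
        key=lambda src: (codec_rank.get(src['codec'], float('inf')), src['file_size']),
    )


# Hypothetical settings: h264 preferred, av1 and vp9 equally ranked.
codec_rank = {'h264': 1, 'av1': 2, 'vp9': 2}
sources = [
    {'codec': 'av1', 'file_size': 40},
    {'codec': 'h264', 'file_size': 60},
    {'codec': 'vp9', 'file_size': 50},
]
ordered = sort_sources(sources, codec_rank)
```

A consumer can then walk the ordered list and take the first source the browser reports as supported.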
* Fix videos added to playlist from channel missing author_id | James Taylor | 2021-08-31 | 1 | -1/+5
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Support more audio and video qualities | James Taylor | 2021-08-31 | 2 | -2/+13
    Adds support for AV1-encoded videos, which includes any videos above 1080p. These weren't getting included because they did not have a quality entry in the format table at the top of watch_extraction.py. So get the quality from the quality labels of the format if it's not there.

    Because YouTube often includes BOTH AV1 and H.264 (AVC) for each quality, after these are included, there will be way too many quality options and the code needs to choose which one to use. The choice is somewhat hard: AV1 is encoded in fewer bytes than H.264 and is patent-free, however, it has less hardware support, so might be more difficult to play. For instance, on my system, AV1 does not work on 1080p, but H.264 does. Adds a setting about which to prefer, set to H.264 as the default.

    Also adds support for the lower quality mp4 audio quality, which now gets used at 144p to save network bandwidth. For similar reasons, this was not getting included because it did not have an audio_bitrate entry in the table. Prefer bitrate instead for the quality.

    Signed-off-by: Jesús <heckyel@hyperbola.info>
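The fallback from a lookup table to the quality label might look roughly like this. The table `KNOWN_QUALITIES`, its single entry, and the field names `format_id` and `quality_label` are placeholders invented for the sketch; the real format table in watch_extraction.py is not shown in this log.

```python
import re

# Hypothetical stand-in for the format table; the entry is a placeholder.
KNOWN_QUALITIES = {'example-format-a': 720}


def quality_for_format(fmt):
    """Return the vertical resolution for a format dict, or None if it cannot be determined."""
    quality = KNOWN_QUALITIES.get(fmt.get('format_id'))
    if quality is not None:
        return quality
    # Fall back to a label such as '2160p60' and take the digits before the 'p'.
    match = re.match(r'(\d+)p', fmt.get('quality_label') or '')
    return int(match.group(1)) if match else None
```

Formats missing from the table are then no longer dropped as long as they carry a parseable label.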
* Add support for more qualities, merging video+audio using MSE | James Taylor | 2021-08-29 | 1 | -2/+10
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Revert "Add support for more qualities, merging video+audio using MSE" | Jesús | 2021-08-29 | 1 | -10/+2
    This reverts commit d56df02e7b1eba86baf511289208295b1f6c5a50.
* Add support for more qualities, merging video+audio using MSE | James Taylor | 2021-08-29 | 1 | -2/+10
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Fix comment reply url extraction due to youtube changes [tag: 0.1.0] | James Taylor | 2021-08-23 | 1 | -3/+7
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Fix comments extraction due to new response continuation key name | James Taylor | 2021-08-23 | 1 | -2/+6
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Fix description extraction in search results | James Taylor | 2021-08-09 | 1 | -1/+5
    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Fix (dis)like, music list extraction due to YouTube changes (again) | James Taylor | 2021-08-09 | 2 | -9/+56
    YouTube reverted the changes they made that prompted f9f5d5ba. In case they change their minds again, this adds support for both formats.

    The liberal_update and conservative_update functions needed to be modified to handle the cases of empty lists, so that a successfully extracted 'music_list': [{'Author':...},...] will not be overwritten by 'music_list': [] in the calls to liberal_dict_update.

    Signed-off-by: Jesús <heckyel@hyperbola.info>
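The empty-list handling can be sketched as follows. The function names liberal_update and conservative_update come from the commit message, but the bodies below and the helper `_is_empty` are a guess at the described behaviour, not the project's code.

```python
def _is_empty(value):
    """Treat None and empty lists as 'nothing extracted' (assumed definition)."""
    return value is None or (isinstance(value, list) and len(value) == 0)


def liberal_update(obj, key, value):
    """Overwrite obj[key], unless the new value is empty and the key already exists."""
    if not _is_empty(value) or key not in obj:
        obj[key] = value


def conservative_update(obj, key, value):
    """Set obj[key] only when it is missing or currently empty."""
    if _is_empty(obj.get(key)):
        obj[key] = value


info = {'music_list': [{'Author': 'Example'}]}
liberal_update(info, 'music_list', [])      # ignored: would replace data with nothing
conservative_update(info, 'like_count', 5)  # set: key was missing
conservative_update(info, 'like_count', 7)  # ignored: a value is already present
```

With both helpers treating `[]` like `None`, a later extraction pass that finds nothing cannot erase an earlier successful result.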
* Switch to new comments api now that old one is being disabled | James Taylor | 2021-08-09 | 2 | -11/+34
    watch_comment api periodically gives the error "Top level comments mweb servlet is turned down."

    The continuation items for the new api are in a different arrangement in the json, so changes were necessary to the extract_items function.

    Signed-off-by: Jesús <heckyel@hyperbola.info>
* New age restriction bypass method since get_video_info was disabled | James Taylor | 2021-07-28 | 1 | -7/+2
    From https://github.com/yt-dlp/yt-dlp/issues/574#issuecomment-887171136

    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Fix missing likes, dislikes, & music list due to Youtube changes | James Taylor | 2021-07-28 | 2 | -60/+121
    Also moves some microformat extraction from _extract_watch_info_mobile to extract_watch_info where it belongs. _extract_watch_info_mobile is really only for stuff visible on the page, and thus specialized for either mobile or desktop.

    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Capitalize name app | Jesús | 2021-06-10 | 2 | -2/+2
* Use extract_approx_int for comment likes | James Taylor | 2021-06-10 | 1 | -2/+2
    Full digits no longer available

    Closes #64

    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Fix comment like extraction due to Youtube changes | James Taylor | 2021-05-17 | 1 | -0/+2
    Variable name changed from likeCount to voteCount

    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Fix videos added to playlist from channel page not having author | James Taylor | 2021-05-17 | 1 | -2/+3
    Information from additional_info was being overridden with None.

    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Channel about: Add http:// to links without it | James Taylor | 2021-05-06 | 1 | -0/+2
    So that the link is not interpreted as a relative link

    Signed-off-by: Jesús <heckyel@hyperbola.info>
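A small sketch of prefixing a scheme onto links that lack one, so that a browser does not resolve them relative to the current page. The function name `ensure_scheme` and the handling of protocol-relative links are assumptions for illustration.

```python
import re

# Matches a leading URL scheme such as 'http://' or 'https://'.
_SCHEME_RE = re.compile(r'^[a-zA-Z][a-zA-Z0-9+.-]*://')


def ensure_scheme(url):
    """Return url unchanged if it already has a scheme, otherwise prefix http."""
    if _SCHEME_RE.match(url):
        return url
    if url.startswith('//'):
        # Protocol-relative link: only the scheme itself is missing.
        return 'http:' + url
    return 'http://' + url
```

For example, a channel link written as `example.com/page` becomes `http://example.com/page`.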
* Channel: Allow going to next pages of playlists page | James Taylor | 2021-03-15 | 2 | -1/+9
    Uses previous and next buttons. Now can view more than just first page of playlists page

    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Use new channel api endpoint now that browse_ajax is disabled | James Taylor | 2021-03-03 | 1 | -0/+5
    Fixes channel pages > 1

    Signed-off-by: Jesús <heckyel@hyperbola.info>
* Fix comment replies | James Taylor | 2021-02-26 | 1 | -0/+6
    Comment reply protobuf now requires the channel id of the uploader of the video. Otherwise the endpoint returns 500.

    Instead of making the protobuf ourselves and passing this data around through query parameters, just use the ctoken provided to us but modify the max_replies field from 10 to 250.

    Fixes #53

    Signed-off-by: Jesús <heckyel@hyperbola.info>