path: root/youtube
Commit message | Author | Age | Files | Lines
* Extraction: Correctly extract view_count for vids with 0 views. | James Taylor | 2019-12-30 | 1 | -1/+9
|   Also change superfluous use of multi_get to item.get nearby
* extract_items: allow extracting items that are normally dug into for more | James Taylor | 2019-12-26 | 1 | -5/+5
|   By checking first if it's in item_types rather than checking if it can be dug into first.
|   For example: this allows extracting things like sectionListRenderer
* yt_data_extract: Split up extract_items so renderer extraction works independently | James Taylor | 2019-12-26 | 1 | -47/+48
|   extract_items_from_renderer will extract given just a renderer rather than a response
* yt_data_extract.common: Simplify usage of get functions and remove dead code | James Taylor | 2019-12-26 | 1 | -18/+11
|   Change usage of multi_deep_get to multi_get where possible.
|   Remove checking of type from calls to get functions (because it's very unlikely Youtube suddenly changes the type without changing the name of the variable or anything, and it takes up unnecessary space).
|   Remove all default=None arguments from get functions, since those are superfluous.
|   Remove list_types constant since it's no longer in use.
* yt_data_extract: Simplify extract_items so it needs only 1 while loop | James Taylor | 2019-12-26 | 1 | -32/+31
|
* items: commatize channel video count and playlist video count | James Taylor | 2019-12-24 | 1 | -2/+2
|
* extract_item_info: Don't extract author, author_id, etc. for channel items | James Taylor | 2019-12-24 | 1 | -7/+8
|   Philosophically, a channel doesn't create itself.
* Fix extract_approx_int not working for non-approx ints, make extract_int more robust | James Taylor | 2019-12-24 | 1 | -2/+2
|   For example, "354 subscribers" wasn't being extracted correctly by extract_approx_int.
|   Make extract_approx_int and extract_int only extract integers that are whole words.
|   So e.g. 342 will not be extracted from internetuser342
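The whole-word rule described in this commit can be sketched as follows. This is a minimal illustration, not the project's actual code: the two function names come from the commit message, but the regular expressions, the suffix table, and returning None on failure are all assumptions.

```python
import re
from decimal import Decimal

# Assumed suffix multipliers; the real project may support others.
_SUFFIXES = {'k': 10**3, 'm': 10**6, 'b': 10**9}

# \b on both sides means the number must stand alone as a word,
# so the "342" inside "internetuser342" is not matched.
_INT_RE = re.compile(r'\b\d[\d,]*\b')
_APPROX_RE = re.compile(r'\b(\d[\d,]*(?:\.\d+)?)([kmb])?\b', re.IGNORECASE)


def extract_int(text):
    """Return the first whole-word integer in text, or None if there is none."""
    match = _INT_RE.search(text)
    if match is None:
        return None
    return int(match.group(0).replace(',', ''))


def extract_approx_int(text):
    """Return counts such as "15.1M" or plain "354" as an int, or None."""
    match = _APPROX_RE.search(text)
    if match is None:
        return None
    # Decimal avoids float rounding when scaling values like 15.1
    number = Decimal(match.group(1).replace(',', ''))
    suffix = (match.group(2) or '').lower()
    return int(number * _SUFFIXES.get(suffix, 1))
```

Under these assumptions, extract_approx_int("354 subscribers") gives 354, extract_approx_int("15.1M subscribers") gives 15100000, and extract_int("internetuser342") gives None.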
* Channel searching: indicate if there are no results | James Taylor | 2019-12-23 | 1 | -1/+5
|
* Regression: Fix channel extraction 'items' key not present when there are no items. | James Taylor | 2019-12-23 | 1 | -2/+3
|   Examples: Empty channels, no search results
* Channel: Change search results to use next and previous page buttons | James Taylor | 2019-12-23 | 5 | -39/+83
|   Youtube doesn't give the number of search results, so the previous behavior would give an error if an out-of-range page number was selected.
* Subscriptions: Cleaner error message when checking terminated channels | James Taylor | 2019-12-22 | 1 | -1/+3
|   Don't display a nasty traceback in that case.
* Subscriptions: Make uploader name clickable, with link to channel | James Taylor | 2019-12-22 | 1 | -2/+4
|
* Finally fix video count on channels accessed through general urls, rather than just channel id. | James Taylor | 2019-12-22 | 1 | -19/+34
|   It was set to a fake value of 1000 previously in order to ensure there would be enough page buttons.
|   This was because two sequential requests are necessary (one to get the channel id corresponding to the custom url, another to get the number of videos from the "all uploaded videos" playlist, the url for which can be generated from the channel id).
|   Since Tor has a high latency, I thought at the time that this would be too slow, but in practice it's not too big of a deal.
|
|   Introduces cachetools dependency in order to cache the function which gets the number of videos.
|
|   The get_channel_id function has also been fixed since the ajax api seems to have been removed.
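The two-request flow and the caching described in this commit might look roughly like the sketch below. Several things here are assumptions made for illustration: the two fetch functions are hypothetical stand-ins that simulate network requests, the "UC" to "UU" prefix swap for the uploads playlist id is a commonly observed convention rather than something the commit states, and functools.lru_cache from the standard library is swapped in for the cachetools dependency the commit actually introduces.

```python
from functools import lru_cache

request_log = []  # records simulated requests, for illustration only


def fetch_channel_id(custom_url):
    """Hypothetical stand-in for request 1: resolve a custom url to a channel id."""
    request_log.append(('channel_id', custom_url))
    return 'UC' + 'x' * 22


def fetch_playlist_video_count(playlist_id):
    """Hypothetical stand-in for request 2: read the video count of a playlist."""
    request_log.append(('video_count', playlist_id))
    return 1234


def uploads_playlist_id(channel_id):
    # Assumed convention: "UC..." channel id -> "UU..." uploads playlist id.
    return 'UU' + channel_id[2:]


# lru_cache is a standard-library stand-in for the cachetools-based cache.
@lru_cache(maxsize=128)
def get_number_of_videos(channel_id):
    """Cached, so repeat page loads for the same channel skip request 2."""
    return fetch_playlist_video_count(uploads_playlist_id(channel_id))


def video_count_for_custom_url(custom_url):
    channel_id = fetch_channel_id(custom_url)   # request 1, always made
    return get_number_of_videos(channel_id)     # request 2, made once per channel
```

Calling video_count_for_custom_url twice for the same url issues the channel id request twice but the video count request only once, which is the saving that matters on a high-latency connection.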
* channel.py: Refactor channel_id route logic into general channel url logic. | James Taylor | 2019-12-22 | 1 | -53/+21
|   Deduplicates the code. channel_id logic was previously separate because of the need to get the number of videos and different page numbers.
|   Also makes search work for general urls, not just channel_id urls
* Rewrite channel extraction with proper error handling and new extraction names. | James Taylor | 2019-12-21 | 3 | -47/+48
|   Extract subscriber_count correctly.
|   Don't just shove English strings into info['stats']. Actually give semantic names for the stats.
* Fix extract_approx_int. Fixes incorrect subscriber count on channels. | James Taylor | 2019-12-21 | 1 | -2/+2
|   It wasn't working because decimals such as 15.1M weren't considered, so it was extracting "1M"
* Watch: Add padding in description box and urlize links | James Taylor | 2019-12-20 | 1 | -1/+2
|
* Watch: display comment count and whether comments are disabled | James Taylor | 2019-12-20 | 2 | -8/+19
|
* Better error handling for incorrect watch page urls | James Taylor | 2019-12-20 | 1 | -2/+4
|   - Correctly handle /embed, /watch with no video ids
|   - Correctly report error for this and for too short video ids
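The url checks listed in this commit could be sketched along these lines. This is only an illustration under stated assumptions: the function name, the (video_id, error) return shape, the error strings, and the 11-character id length are all hypothetical choices, not taken from the project.

```python
from urllib.parse import urlparse, parse_qs

VIDEO_ID_LENGTH = 11  # assumed length of a Youtube video id


def video_id_from_url(url):
    """Return (video_id, error); exactly one of the two is None."""
    parsed = urlparse(url)
    path = parsed.path.rstrip('/')
    if path == '/watch':
        # parse_qs drops blank values, so "/watch?v=" also yields ''
        video_id = parse_qs(parsed.query).get('v', [''])[0]
    elif path.startswith('/embed/'):
        video_id = path[len('/embed/'):]
    elif path == '/embed':
        video_id = ''
    else:
        return None, 'Not a watch or embed url'
    if not video_id:
        return None, 'No video id specified'
    if len(video_id) < VIDEO_ID_LENGTH:
        return None, 'Video id is too short'
    return video_id, None
```

Returning the error as a value rather than raising lets the caller render a readable error page instead of a traceback.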
* Add custom 500 error page. Display the traceback. Center and format error page in general. | James Taylor | 2019-12-20 | 2 | -1/+39
|   Also add a link to GitHub for reporting the exception.
* Add support for /embed urls | James Taylor | 2019-12-20 | 1 | -2/+3
|
* Subscriptions: Display currently selected tag in page title | James Taylor | 2019-12-20 | 1 | -1/+5
|
* Watch: Add border around badges such as unlisted badge | James Taylor | 2019-12-20 | 1 | -1/+3
|   Especially for the light theme
* Fix regression: date extraction broken. Move constants to correct file in yt_data_extract | James Taylor | 2019-12-20 | 2 | -2/+2
* Subscriptions: Display selected tag above videos. | James Taylor | 2019-12-20 | 1 | -0/+6
|   Otherwise, it wasn't clear enough that a tag was selected.
* Merge branch 'modular-data-extract' | James Taylor | 2019-12-19 | 20 | -780/+1760
|\
| |   Commits in this branch are prefixed with "Extraction:".
| |
| |   This branch refactors data extraction. All such functionality has been moved to the yt_data_extract module. Responses from requests are given to the module and it parses them into a consistent, more useful format.
| |
| |   The dependency on youtube-dl has also been dropped and this functionality has been built from scratch for these reasons:
| |   (1) I've noticed youtube-dl breaks more often than invidious (which uses watch page extraction built from scratch) in response to changes from Youtube, so I'm hoping what I wrote will also be less brittle.
| |   (2) Such breakage is inconvenient because I have to manually merge the fixes since I had to make changes to youtube-dl to make it do things such as extracting related videos.
| |   (3) I have no control over error handling and request pooling with youtube-dl, since it does all the requests (these would require intrusive changes I don't want to maintain).
| |   (4) I will now be able to finally display the number of comments and whether comments are disabled without making additional requests.
| * Extraction: Move non-stateful signature decryption functionality into yt_data_extract | James Taylor | 2019-12-19 | 3 | -86/+110
| * Extraction: Move stuff around in files and put underscores in front of internal helper function names | James Taylor | 2019-12-19 | 3 | -38/+37
| |   Move get_captions_url in watch_extraction to bottom next to other exported, public functions
| * Extraction: Move html post processing stuff from yt_data_extract to util | James Taylor | 2019-12-19 | 9 | -52/+50
| |
| * Extraction: Split yt_data_extract.py into multiple files | James Taylor | 2019-12-19 | 5 | -1190/+1188
| |
| * Extraction: Rewrite comment extraction, remove author_id and rename author_channel_id to that, fix bug in extract_items | James Taylor | 2019-12-19 | 3 | -80/+67
| |   author_id (an internal sql-like integer previously required for deleting and editing comments) has been removed by Youtube and is no longer required. Remove it for simplicity.
| |   Rename author_channel_id to author_id for consistency with other extraction attributes.
| |
| |   extract_items returned None for items instead of [] for empty continuation responses. Fixes that.
| * Extraction: Adjust related videos box to fit new time_published information well | James Taylor | 2019-12-19 | 1 | -8/+8
| |   time_published will be put to the right of the view_count in related videos.
| |   Author will now always be above the other stats. This makes no difference in the big search result boxes, since the description snippet there is always very short.
| |   (However, it's important the author isn't inline with the other stats in related video boxes, since those are so narrow and the author name can be very long.)
| * Extraction: Use accessibility data to get timestamp and to get views for recommended videos | James Taylor | 2019-12-18 | 1 | -0/+10
| * Extraction: rename multi_get functions to more descriptive names | James Taylor | 2019-12-18 | 3 | -68/+68
| |
| * Extraction: Rewrite item_extraction for better error handling and readability, rename extracted names for more consistency | James Taylor | 2019-12-18 | 12 | -340/+305
| * Extraction: Fix thumbnail and remove badges on related videos | James Taylor | 2019-12-17 | 2 | -4/+10
| |
| * Extraction: Fix mistake with age-restriction detection | James Taylor | 2019-12-17 | 1 | -1/+1
| |
| * Extraction: Detect limited state and fix false detection as unlisted | James Taylor | 2019-12-17 | 3 | -2/+14
| |
| * Extraction: Make limited state videos work | James Taylor | 2019-12-17 | 1 | -1/+1
| |
| * Extraction: Extract info from microformat to get views for limited state videos, and as a fallback. | James Taylor | 2019-12-17 | 1 | -39/+60
| |   Shorten some function names
| * Extraction: Add fallback playability error extraction from renderers | James Taylor | 2019-12-14 | 1 | -17/+24
| |
| * Extraction: Fix subtitles error when video has no automatic captions but has foreign language captions | James Taylor | 2019-12-14 | 1 | -1/+5
| * Extraction: Fix subtitles not working on certain videos which require more parameters in the captions url | James Taylor | 2019-12-14 | 1 | -5/+15
| * Extraction: Display that video is age-restricted | James Taylor | 2019-12-12 | 1 | -7/+21
| |
| * Extraction: Bypass age-restriction | James Taylor | 2019-12-12 | 2 | -35/+90
| |
| * Extraction: Add general subtitle extraction and translation | James Taylor | 2019-11-29 | 2 | -69/+126
| |
| * Extraction: extract automatic captions | James Taylor | 2019-11-28 | 1 | -2/+32
| |
| * Extraction: extract fields from visible webpage if missing from playerResponse | James Taylor | 2019-11-25 | 1 | -31/+61
| |
| * Extraction: return and display any errors preventing video playback | James Taylor | 2019-11-22 | 3 | -21/+40
| |