path: root/youtube/yt_data_extract
Commit message | Author | Age | Files | Lines
* yt_data_extract: Simplify extract_items so it needs only 1 while loop | James Taylor | 2019-12-26 | 1 | -32/+31
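  The commit does not show the code, so the following is only a minimal sketch of how a nested response could be searched for items with a single while loop and an explicit stack. The function name extract_items_sketch and the keys 'videoRenderer' and 'channelRenderer' are assumptions made for illustration, not the project's actual implementation.

```python
def extract_items_sketch(response, item_keys=('videoRenderer', 'channelRenderer')):
    """Collect every dict that carries one of item_keys, in document order.

    Hypothetical illustration: a single while loop walks nested dicts and
    lists using an explicit stack instead of nested loops or recursion.
    """
    found = []
    stack = [response]
    while stack:
        node = stack.pop()
        if isinstance(node, dict):
            if any(key in node for key in item_keys):
                # This dict is an item; do not descend into it further.
                found.append(node)
            else:
                # Push children reversed so they are popped in original order.
                stack.extend(reversed(list(node.values())))
        elif isinstance(node, list):
            stack.extend(reversed(node))
    return found
```

  Pushing children in reverse keeps the results in the order they appear in the response, which matters when the items are rendered as a page.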
* extract_item_info: Don't extract author, author_id, etc. for channel items | James Taylor | 2019-12-24 | 1 | -7/+8
  Philosophically, a channel doesn't create itself.
* Fix extract_approx_int not working for non-approx ints, make extract_int more robust | James Taylor | 2019-12-24 | 1 | -2/+2
  For example, "354 subscribers" wasn't being extracted correctly by extract_approx_int. Make extract_approx_int and extract_int only extract integers that are words, so e.g. 342 will not be extracted from internetuser342.
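  One way to read "integers that are words" is to accept a digit run only when it is delimited by whitespace or the ends of the string. The sketch below follows that reading; the name extract_int_sketch and the exact pattern are assumptions, not the code from this commit.

```python
import re

# Hypothetical pattern: digits (with optional thousands commas) that form a
# whole whitespace-delimited word.
_INT_WORD = re.compile(r'(?<!\S)\d[\d,]*(?!\S)')

def extract_int_sketch(text, default=None):
    """Return the first standalone integer in text, or default if none."""
    match = _INT_WORD.search(text)
    if match is None:
        return default
    return int(match.group(0).replace(',', ''))
```

  With this pattern, extract_int_sketch("354 subscribers") yields 354, while extract_int_sketch("internetuser342") yields the default because 342 is part of a larger word.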
* Regression: Fix channel extraction 'items' key not present when there's no items. | James Taylor | 2019-12-23 | 1 | -2/+3
  Examples: empty channels, no search results.
* Channel: Change search results to use next and previous page buttons | James Taylor | 2019-12-23 | 1 | -1/+3
  YouTube doesn't give the number of search results, so the previous behavior would give an error if an out-of-range page number was selected.
* Rewrite channel extraction with proper error handling and new extraction names. | James Taylor | 2019-12-21 | 2 | -45/+40
  Extract subscriber_count correctly. Don't just shove English strings into info['stats']. Actually give semantic names for the stats.
* Fix extract_approx_int. Fixes incorrect subscriber count on channels. | James Taylor | 2019-12-21 | 1 | -2/+2
  It wasn't working because decimals such as 15.1M weren't considered, so it was extracting "1M".
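  The failure described above can be reproduced with a pattern that has no decimal part, and avoided by allowing one. This is a hedged sketch only: the K/M/B suffix table, the patterns, and the name extract_approx_int_sketch are assumptions rather than the project's actual code.

```python
import re

# A pattern without a decimal part skips "15." and matches only "1M".
NAIVE = re.compile(r'\d+[KMB]')

# Hypothetical corrected pattern: optional decimal part, optional suffix.
_APPROX = re.compile(r'(?<!\S)(\d+(?:\.\d+)?)([KMB]?)(?!\S)')
_MULTIPLIERS = {'': 1, 'K': 10**3, 'M': 10**6, 'B': 10**9}

def extract_approx_int_sketch(text, default=None):
    """Return an integer for strings such as '15.1M' or '354', else default."""
    match = _APPROX.search(text)
    if match is None:
        return default
    number, suffix = match.groups()
    # round() guards against floating-point error in the multiplication.
    return int(round(float(number) * _MULTIPLIERS[suffix]))
```

  Here NAIVE.search("15.1M").group(0) is "1M", which is the wrong subscriber count the commit describes, whereas extract_approx_int_sketch("15.1M subscribers") gives 15100000.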
* Fix regression: date extraction broken. Move constants to correct file in yt_data_extract | James Taylor | 2019-12-20 | 2 | -2/+2
* Extraction: Move non-stateful signature decryption functionality into yt_data_extract | James Taylor | 2019-12-19 | 2 | -1/+98
* Extraction: Move stuff around in files and put underscores in front of internal helper function names | James Taylor | 2019-12-19 | 3 | -38/+37
  Move get_captions_url in watch_extraction to bottom next to other exported, public functions.
* Extraction: Move html post processing stuff from yt_data_extract to util | James Taylor | 2019-12-19 | 2 | -41/+1
* Extraction: Split yt_data_extract.py into multiple files | James Taylor | 2019-12-19 | 4 | -0/+1188