path: root/youtube/yt_data_extract/common.py
Commit message | Author | Age | Files | Lines
* yt_data_extract: Simplify extract_items so it needs only 1 while loop | James Taylor | 2019-12-26 | 1 | -32/+31
* extract_item_info: Don't extract author, author_id, etc. for channel items | James Taylor | 2019-12-24 | 1 | -7/+8
    Philosophically, a channel doesn't create itself.
* Fix extract_approx_int not working for non-approx ints, make extract_int more robust | James Taylor | 2019-12-24 | 1 | -2/+2
    For example, "354 subscribers" wasn't being extracted correctly by extract_approx_int.
    Make extract_approx_int and extract_int only extract integers that are words. So e.g. 342 will not be extracted from internetuser342.
* Rewrite channel extraction with proper error handling and new extraction names. | James Taylor | 2019-12-21 | 1 | -2/+5
    Extract subscriber_count correctly. Don't just shove english strings into info['stats']. Actually give semantic names for the stats.
* Fix extract_approx_int. Fixes incorrect subscriber count on channels. | James Taylor | 2019-12-21 | 1 | -2/+2
    It wasn't working because decimals such as 15.1M weren't considered, so it was extracting "1M".
* Fix regression: date extraction broken. Move constants to correct file in yt_data_extract | James Taylor | 2019-12-20 | 1 | -1/+2
* Extraction: Move stuff around in files and put underscores in front of internal helper function names | James Taylor | 2019-12-19 | 1 | -9/+8
    Move get_captions_url in watch_extraction to bottom next to other exported, public functions
* Extraction: Move html post processing stuff from yt_data_extract to util | James Taylor | 2019-12-19 | 1 | -39/+0
* Extraction: Split yt_data_extract.py into multiple files | James Taylor | 2019-12-19 | 1 | -0/+455
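
The extract_int and extract_approx_int fixes in the log above describe two behaviours: only integers that stand alone as words are extracted, and decimals with a magnitude suffix such as 15.1M are read as a whole. The following is a minimal sketch of that behaviour, not the code in common.py. The function names, the regular expressions, and the k/M/B suffix table are assumptions made for illustration.

```python
import re
from decimal import Decimal

# Hypothetical suffix table; the real module may use a different set.
_MULTIPLIERS = {'k': 10**3, 'm': 10**6, 'b': 10**9}


def extract_int_sketch(text):
    """Return the first integer that stands alone as a word, or None."""
    # \b on both sides rejects digits glued to letters (internetuser342).
    match = re.search(r'\b(\d[\d,]*)\b', text)
    if match is None:
        return None
    return int(match.group(1).replace(',', ''))


def extract_approx_int_sketch(text):
    """Return an integer from strings like '15.1M' or '354', or None."""
    # The optional (?:\.\d+)? keeps '15.1M' from being read as '1M'.
    match = re.search(r'\b(\d+(?:\.\d+)?)([kmb])?\b',
                      text.replace(',', ''), re.IGNORECASE)
    if match is None:
        return None
    number, suffix = match.groups()
    multiplier = _MULTIPLIERS[suffix.lower()] if suffix else 1
    # Decimal avoids float rounding when scaling values such as 15.1.
    return int(Decimal(number) * multiplier)
```

With these assumptions, extract_approx_int_sketch handles both "354 subscribers" and "15.1M subscribers", and neither function extracts 342 from internetuser342.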