author | James Taylor <user234683@users.noreply.github.com> | 2021-08-31 13:38:28 -0700 |
---|---|---|
committer | Jesús <heckyel@hyperbola.info> | 2021-08-31 16:40:19 -0500 |
commit | 7c79f530a53e9ff4a9fc61d6b7adde6e9c241c62 (patch) | |
tree | fb56107188cda2871799c15571cb98e21cfff286 /youtube/yt_data_extract/common.py | |
parent | 30e59081b14c98b49f718a1bc131ac46d09c84bf (diff) | |
Support more audio and video qualities
Adds support for AV1-encoded videos, which includes any videos
above 1080p. These weren't getting included because they did
not have a quality entry in the format table at the top of
watch_extraction.py. When the table has no entry, get the
quality from the format's quality label instead.
Because YouTube often includes both AV1 and H.264 (AVC) for each
quality, listing all of them would give far too many quality
options, so the code needs to choose one codec per quality.
The choice is not obvious: AV1 is encoded in fewer bytes than
H.264 and is royalty-free, but it has less hardware support,
so it may be harder to play. For instance, on my system,
AV1 does not work at 1080p, but H.264 does. Adds a setting for
which codec to prefer, with H.264 as the default.
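
A rough sketch of the per-quality choice described above, not the code from this commit. The function name, the `quality` and `vcodec` keys, and the `preferred_codec` parameter are all hypothetical stand-ins for whatever the real setting and format dicts look like.

```python
def pick_one_per_quality(formats, preferred_codec='h264'):
    """Keep one format per quality, favoring preferred_codec when available.

    `formats` is assumed to be a list of dicts with hypothetical
    'quality' (int) and 'vcodec' (str) keys.
    """
    chosen = {}
    for fmt in formats:
        current = chosen.get(fmt['quality'])
        # Take the first format seen for a quality, and replace it only
        # if this one uses the preferred codec and the current one does not.
        if current is None or (fmt['vcodec'] == preferred_codec
                               and current['vcodec'] != preferred_codec):
            chosen[fmt['quality']] = fmt
    return [chosen[quality] for quality in sorted(chosen)]


formats = [
    {'quality': 1080, 'vcodec': 'av1'},
    {'quality': 1080, 'vcodec': 'h264'},
    {'quality': 2160, 'vcodec': 'av1'},  # only AV1 exists above 1080p
]
for fmt in pick_one_per_quality(formats):
    print(fmt['quality'], fmt['vcodec'])
```

A quality that exists in only one codec is still kept, so qualities above 1080p remain available even when H.264 is preferred.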
Also adds support for the lower-quality mp4 audio format, which
is now used at 144p to save network bandwidth. As with AV1,
it was not getting included because it did not have an
audio_bitrate entry in the table. Use the format's bitrate
as the quality measure instead.
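
One way the bitrate-based choice could look, as a sketch only. The function name, the `bitrate` key, and the exact 144p cutoff logic are assumptions for illustration and are not taken from the code in this commit.

```python
def pick_audio(audio_formats, video_quality):
    """Pick an audio format by its reported bitrate.

    `audio_formats` is assumed to be a list of dicts with a hypothetical
    'bitrate' key (bits per second). At 144p and below the lowest
    bitrate is chosen to save bandwidth; otherwise the highest.
    """
    if not audio_formats:
        return None
    by_bitrate = sorted(audio_formats, key=lambda fmt: fmt['bitrate'])
    return by_bitrate[0] if video_quality <= 144 else by_bitrate[-1]


audio_formats = [{'bitrate': 130000}, {'bitrate': 49000}]
print(pick_audio(audio_formats, 144))
print(pick_audio(audio_formats, 720))
```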
Signed-off-by: Jesús <heckyel@hyperbola.info>
Diffstat (limited to 'youtube/yt_data_extract/common.py')
-rw-r--r-- | youtube/yt_data_extract/common.py | 7 |
1 files changed, 5 insertions, 2 deletions
diff --git a/youtube/yt_data_extract/common.py b/youtube/yt_data_extract/common.py
index ca999ba..f97597c 100644
--- a/youtube/yt_data_extract/common.py
+++ b/youtube/yt_data_extract/common.py
@@ -166,14 +166,17 @@ def extract_formatted_text(node):
         return [{'text': node['simpleText']}]
     return []
 
-def extract_int(string, default=None):
+def extract_int(string, default=None, whole_word=True):
     if isinstance(string, int):
         return string
     if not isinstance(string, str):
         string = extract_str(string)
     if not string:
         return default
-    match = re.search(r'\b(\d+)\b', string.replace(',', ''))
+    if whole_word:
+        match = re.search(r'\b(\d+)\b', string.replace(',', ''))
+    else:
+        match = re.search(r'(\d+)', string.replace(',', ''))
     if match is None:
         return default
     try:
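
To illustrate why the `whole_word` flag matters for quality labels, here is a simplified stand-in for `extract_int`. It is a sketch, not the function from `common.py`: it skips the `extract_str` fallback for non-string input, and because the body after `try:` is cut off in the hunk above, the final `int()` conversion here is an assumption.

```python
import re


def extract_int_sketch(string, default=None, whole_word=True):
    """Simplified stand-in for extract_int that handles only int and str."""
    if isinstance(string, int):
        return string
    if not isinstance(string, str) or not string:
        return default
    cleaned = string.replace(',', '')
    # \b needs a word/non-word transition, so digits followed directly by
    # a letter (as in '1080p') do not count as a whole word.
    pattern = r'\b(\d+)\b' if whole_word else r'(\d+)'
    match = re.search(pattern, cleaned)
    if match is None:
        return default
    return int(match.group(1))


print(extract_int_sketch('1,234 views'))
print(extract_int_sketch('1080p'))
print(extract_int_sketch('1080p', whole_word=False))
```

With the default `whole_word=True`, a label such as `'1080p'` yields the default value because there is no word boundary between `0` and `p`. Passing `whole_word=False` lets the leading digits be picked up.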