yt-dlp started as a fork of youtube-dl in 2021. youtube-dl's development had stalled, RIAA takedown notices were flying around, and the community needed somewhere to go. What started as a maintenance fork is now a significantly more capable project - faster extraction, better format selection, and active maintainers who actually care about keeping it working.
The command line is fine for quick jobs. The Python API is where it gets interesting.
Installation
One pip command:
```bash
pip install yt-dlp
```
You also want ffmpeg for anything involving format merging, subtitle embedding, or post-processing. Ubuntu: apt install ffmpeg. Mac: brew install ffmpeg. Windows: download from ffmpeg.org and add it to PATH. Skip it and you'll get the best available single stream instead of a properly merged HD video, which matters above 720p.
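If you'd rather fail fast at startup than silently degrade to single-stream downloads, you can check for ffmpeg yourself. A minimal sketch using only the standard library:

```python
import shutil

def have_ffmpeg():
    """Return True if ffmpeg is on PATH; merging and post-processing need it."""
    return shutil.which('ffmpeg') is not None

if not have_ffmpeg():
    print("Warning: ffmpeg not found - downloads above 720p won't be merged")
```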
Basic usage from Python
The core interface is the YoutubeDL class. Pass options as a dict, call download() with a list of URLs:
```python
import yt_dlp

ydl_opts = {
    'format': 'bestvideo+bestaudio/best',
    'outtmpl': '%(title)s.%(ext)s',
}

with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=dQw4w9WgXcQ'])
```
The format option grabs best video and best audio as separate streams and merges them. outtmpl controls the output filename using yt-dlp's template variables. That's the minimum for a working download.
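Output templates can reference any metadata field, which helps when organizing bulk downloads. A config sketch using standard info-dict fields (uploader, title, id):

```python
ydl_opts = {
    'format': 'bestvideo+bestaudio/best',
    # One directory per uploader, video id appended for uniqueness
    'outtmpl': '%(uploader)s/%(title)s [%(id)s].%(ext)s',
    'restrictfilenames': True,  # ASCII-only filenames, spaces become underscores
}
```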
Extracting info without downloading
This is the part that makes yt-dlp genuinely useful for applications rather than just scripts. You can pull all the metadata and available formats without touching the download:
```python
import yt_dlp

def get_video_info(url):
    ydl_opts = {'quiet': True, 'no_warnings': True}
    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
        info = ydl.extract_info(url, download=False)
        return ydl.sanitize_info(info)

info = get_video_info('https://www.youtube.com/watch?v=dQw4w9WgXcQ')
print(info['title'])
print(info['duration'])  # seconds
print(info['view_count'])

# Available formats
for fmt in info.get('formats', []):
    print(f"{fmt['format_id']} - {fmt.get('height', 'audio')}p - {fmt['ext']}")
```
The sanitize_info() call strips non-serializable fields. The raw info dict has extractors and function references that break JSON encoding. Always sanitize before you serialize.
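Once sanitized, the dict is plain JSON-able data, so you can trim it down to whatever your application actually stores. A sketch over a made-up sample dict (the keys are standard info-dict fields):

```python
import json

def summarize(info):
    """Reduce a sanitized info dict to the fields an app typically persists."""
    return {
        'id': info.get('id'),
        'title': info.get('title'),
        'duration': info.get('duration'),
        'view_count': info.get('view_count'),
        'format_count': len(info.get('formats', [])),
    }

# Made-up sample standing in for a real sanitized info dict
sample = {'id': 'dQw4w9WgXcQ', 'title': 'Example', 'duration': 212,
          'view_count': 1000, 'formats': [{'format_id': '18'}]}
print(json.dumps(summarize(sample)))
```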
Format selection
This is where yt-dlp's flexibility actually shows. The format string is a mini-DSL that lets you express exactly what you want:
```python
# Best quality under 1080p
'format': 'bestvideo[height<=1080]+bestaudio/best[height<=1080]'

# MP4 only
'format': 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]'

# Audio only, best quality
'format': 'bestaudio/best'

# Audio only, converted to MP3 (both keys go in the same ydl_opts dict;
# the conversion step needs ffmpeg)
'format': 'bestaudio/best',
'postprocessors': [{
    'key': 'FFmpegExtractAudio',
    'preferredcodec': 'mp3',
    'preferredquality': '192',
}],
```
The slash is a fallback operator: bestvideo+bestaudio/best means merge separate streams if possible, otherwise fall back to the best single stream. Most platforms don't serve pre-merged video above 720p, so you need that merge path.
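When hard filters get unwieldy, there is also a sort-based approach: ranking formats by preference rather than excluding them. A config sketch, assuming the embedded option name format_sort (the API counterpart of the -S command-line flag):

```python
ydl_opts = {
    'format': 'bestvideo+bestaudio/best',
    # Prefer resolutions up to 1080p and mp4/m4a containers, but still
    # fall back to other formats instead of failing outright
    'format_sort': ['res:1080', 'ext:mp4:m4a'],
}
```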
Downloading a specific format ID
Once you've extracted info, you can target a specific format directly instead of using the DSL:
```python
info = get_video_info(url)
formats = info.get('formats', [])

# Find a specific format
mp4_1080 = next(
    (f for f in formats if f.get('height') == 1080 and f['ext'] == 'mp4'),
    None
)

if mp4_1080:
    ydl_opts = {
        'format': mp4_1080['format_id'],
        'outtmpl': 'output.%(ext)s',
    }
    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
        ydl.download([url])
```
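The next() approach takes the first match, but several formats often share a height. If you want the best of the matches instead, pick by total bitrate (tbr, a field yt-dlp reports per format). A sketch over made-up format dicts:

```python
def best_mp4_at_height(formats, height):
    """Pick the highest-bitrate mp4 format at a given height, or None."""
    candidates = [f for f in formats
                  if f.get('height') == height and f.get('ext') == 'mp4']
    return max(candidates, key=lambda f: f.get('tbr') or 0, default=None)

# Made-up sample format list
sample = [
    {'format_id': '136', 'height': 720, 'ext': 'mp4', 'tbr': 1200},
    {'format_id': '398', 'height': 720, 'ext': 'mp4', 'tbr': 1600},
    {'format_id': '247', 'height': 720, 'ext': 'webm', 'tbr': 1400},
]
print(best_mp4_at_height(sample, 720)['format_id'])  # → 398
```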
Progress hooks
For UI or logging integration, progress hooks track download state in real time:
```python
def progress_hook(d):
    if d['status'] == 'downloading':
        pct = d.get('_percent_str', '?')
        speed = d.get('_speed_str', '?')
        print(f"Downloading: {pct} at {speed}")
    elif d['status'] == 'finished':
        print(f"Done: {d['filename']}")

ydl_opts = {
    'format': 'best',
    'progress_hooks': [progress_hook],
}
```
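The underscore-prefixed keys like _percent_str are preformatted display strings. For driving a progress bar you usually want numbers, which you can compute from the byte counts the hook dict also carries (downloaded_bytes, total_bytes, total_bytes_estimate). A sketch with a made-up hook dict:

```python
def percent_done(d):
    """Numeric completion percentage from a progress-hook dict, or None."""
    total = d.get('total_bytes') or d.get('total_bytes_estimate')
    if not total:
        return None
    return 100 * d.get('downloaded_bytes', 0) / total

# Made-up sample of what a hook receives mid-download
sample = {'status': 'downloading', 'downloaded_bytes': 512, 'total_bytes': 2048}
print(f"{percent_done(sample):.1f}%")  # → 25.0%
```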
Cookies for authenticated content
Age-restricted YouTube videos, private content on platforms where you're logged in, and similar all need valid session cookies. yt-dlp can pull them straight from your browser:
```python
ydl_opts = {
    'format': 'best',
    'cookiesfrombrowser': ('chrome',),  # or 'firefox', 'safari', etc.
}
```
Or export cookies to a Netscape-format file from a browser extension and reference it: 'cookiefile': '/path/to/cookies.txt'. The file approach is more portable for server environments where browsers aren't installed.
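For a headless server, the file-based variant looks like this:

```python
ydl_opts = {
    'format': 'best',
    'cookiefile': '/path/to/cookies.txt',  # Netscape-format cookie export
}
```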
Platform support and keeping it updated
yt-dlp ships with over 1,700 extractors. TikTok, Instagram, Facebook, Twitter/X, Vimeo, Dailymotion, Reddit, SoundCloud, Twitch, and hundreds more. You don't need to track which extractor handles which URL - it figures that out automatically from the URL pattern.
Extractor quality varies. YouTube support is excellent and almost never breaks. TikTok and Instagram need more frequent updates because their APIs change constantly - the yt-dlp team typically has a fix within a day or two of a breakage. If a URL that used to work suddenly fails, update first before debugging anything else:
```bash
pip install -U yt-dlp
```
Rate limiting in bulk downloads
Don't hammer platforms. yt-dlp has built-in sleep options:
```python
ydl_opts = {
    'format': 'best',
    'sleep_interval': 2,           # minimum seconds between downloads
    'max_sleep_interval': 8,       # upper bound for the random sleep
    'sleep_interval_requests': 1,  # seconds between internal requests
}
```
For anything downloading dozens of videos sequentially, this prevents rate limit triggers. Platforms notice unusual request velocity. A couple of seconds between downloads looks like a normal user. Thousands of requests in a minute does not.
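Bandwidth and retry behavior are tunable in the same way. A config sketch (ratelimit is in bytes per second):

```python
ydl_opts = {
    'format': 'best',
    'sleep_interval': 2,
    'max_sleep_interval': 8,
    'ratelimit': 1_000_000,  # cap download speed at ~1 MB/s
    'retries': 5,            # retry transient failures before giving up
}
```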
The real reason to build on yt-dlp rather than rolling your own extraction is maintenance. When TikTok changes its API - and it will - someone else fights that battle and pushes a fix. You update a pip package. That alone is worth it.