Releases: jdepoix/youtube-transcript-api

v1.2.2

04 Aug 12:21
63eeec2

What's Changed

New Contributors

Full Changelog: v1.2.1...v1.2.2

v1.2.1

22 Jul 16:18

What's Changed

  • Added the property filter_ip_locations to WebshareProxyConfig. It limits the pool of IPs that Webshare rotates through to those located in specific countries. Choosing locations close to the machine making the requests can reduce latency, and the filter can also be used to work around location-based restrictions.
    from youtube_transcript_api import YouTubeTranscriptApi
    from youtube_transcript_api.proxies import WebshareProxyConfig

    ytt_api = YouTubeTranscriptApi(
        proxy_config=WebshareProxyConfig(
            proxy_username="<proxy-username>",
            proxy_password="<proxy-password>",
            filter_ip_locations=["de", "us"],
        )
    )

    # Webshare will now only rotate through IPs located in Germany or the United States!
    # video_id is the ID of the video whose transcript should be fetched
    ytt_api.fetch(video_id)
    The full list of available locations (and how many IPs are available in each location) can be found here.
  • [Fixes #483] Add __all__ to __init__.py to support mypy --strict usage by @Jer-Pha in #486

New Contributors

Full Changelog: v1.2.0...v1.2.1

v1.2.0

21 Jul 10:43
da6920b

What's Changed

  • [BREAKING] Removed the deprecated methods get_transcript, get_transcripts and list_transcripts. They were deprecated in v1.0.0, but I kept them around to allow for an easier migration. However, these methods have led to many issues being opened by people who initialize a YouTubeTranscriptApi object with a proxy config passed into the constructor, but then call the deprecated static methods on that object. As these methods are static, they cannot access the state set in the constructor, so the proxy config is ignored.
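
To illustrate why the proxy config was silently ignored, here is a minimal sketch using a stand-in class. DemoApi and its string return values are hypothetical and only mimic the shape of the old and new interfaces; this is not the library's code. A static method receives no self, so it has no way to read what the constructor stored:

```python
class DemoApi:
    """Hypothetical stand-in; not the real YouTubeTranscriptApi."""

    def __init__(self, proxy_config=None):
        # State set here lives on the instance only.
        self._proxy_config = proxy_config

    @staticmethod
    def get_transcript(video_id):
        # No `self` is available, so the proxy config is out of reach.
        return f"{video_id} via proxy=None"

    def fetch(self, video_id):
        # Instance method: sees the state set in the constructor.
        return f"{video_id} via proxy={self._proxy_config}"


api = DemoApi(proxy_config="my-proxy")
print(api.get_transcript("abc"))  # the proxy config is ignored
print(api.fetch("abc"))           # the proxy config is used
```

Calling the static method on an instance looks as if it should use the instance's configuration, which is what made this mistake easy to make.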

Migration Guide

If you're still using get_transcript or get_transcripts, you have to change your code as follows:

# old API
transcript = YouTubeTranscriptApi.get_transcript("abc")

# new API
ytt_api = YouTubeTranscriptApi()
transcript = ytt_api.fetch("abc").to_raw_data()

If you're still using list_transcripts, you have to change your code as follows:

# old API
transcript_list = YouTubeTranscriptApi.list_transcripts("abc")

# new API
ytt_api = YouTubeTranscriptApi()
transcript_list = ytt_api.list("abc")

Full Changelog: v1.1.1...v1.2.0

v1.1.1

03 Jul 13:42
d2a409d

What's Changed

  • An IpBlocked exception is now raised when the timedtext endpoint returns status code 429 (#468)
  • Fixed typo in README.md by @alx in #463
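
The 429 handling described above can be pictured roughly as follows. This is a sketch under stated assumptions, not the library's implementation: the IpBlocked class below is a local stand-in for the library's exception of the same name, and check_timedtext_status is a hypothetical helper.

```python
class IpBlocked(Exception):
    """Local stand-in for the library's exception of the same name."""


def check_timedtext_status(status_code):
    # Hypothetical helper: HTTP 429 (Too Many Requests) is treated as
    # a sign that the requesting IP has been blocked.
    if status_code == 429:
        raise IpBlocked("timedtext endpoint returned status code 429")
    return status_code


try:
    check_timedtext_status(429)
except IpBlocked as error:
    print(f"IP blocked: {error}")
```

Callers can then catch this one exception type instead of inspecting status codes themselves.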

New Contributors

  • @alx made their first contribution in #463

Full Changelog: v1.1.0...v1.1.1

v1.1.0

11 Jun 22:29
b716e24

What's Changed

  • Changed how the captions JSON is retrieved: instead of scraping it from the /watch HTML, it is now fetched from the innertube API
  • Added a new exception called PoTokenRequired, which is raised when timedtext URLs are encountered that require a PO token, so that we get feedback from users ASAP if this happens again

Breaking

  • Unfortunately, I haven't been able to implement authentication for the innertube API yet. As I wanted to provide a fix for this issue ASAP, I decided to disable cookie authentication for the time being.

Full Changelog: v1.0.3...v1.1.0

v1.0.3

25 Mar 18:12
b706276

What's Changed

  • Refactored parsing of the JS var containing the transcript data, to make it more robust to changes in the formatting of the returned HTML

Full Changelog: v1.0.2...v1.0.3

v1.0.2

17 Mar 18:17
dc08c3f

What's Changed

  • Added a retry mechanism that retries requests when Webshare proxies are used and RequestBlocked is raised, to trigger an IP rotation in case a user encounters a blocked residential IP
  • Added new error messages when RequestBlocked is raised despite proxies being used, to assist users in figuring out what the issue is
  • Fixed PEP-8 warning by @afourney in #396
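
The idea behind the retry mechanism can be sketched as a plain loop. Everything here is illustrative: RequestBlocked is a local stand-in for the library's exception, fetch_with_retries and flaky_fetch are hypothetical names, and the sketch simply assumes that every new attempt goes out through a freshly rotated proxy IP.

```python
class RequestBlocked(Exception):
    """Local stand-in for the library's exception of the same name."""


def fetch_with_retries(fetch, video_id, retries=3):
    # `fetch` is any callable taking a video ID. Each new attempt is
    # assumed to be routed through a different (rotated) proxy IP.
    last_error = None
    for _ in range(retries + 1):
        try:
            return fetch(video_id)
        except RequestBlocked as error:
            last_error = error
    # Every attempt was blocked, so surface the last error.
    raise last_error


attempts = []


def flaky_fetch(video_id):
    # Simulates two blocked IPs followed by a working one.
    attempts.append(video_id)
    if len(attempts) < 3:
        raise RequestBlocked("blocked IP")
    return f"transcript for {video_id}"


result = fetch_with_retries(flaky_fetch, "abc")
print(result, "after", len(attempts), "attempts")
```

Retrying only makes sense with rotating proxies; with a fixed IP, every attempt would hit the same block.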

New Contributors

Full Changelog: v1.0.1...v1.0.2

v1.0.1

12 Mar 20:30
aad8621

What's Changed

  • Adds a feature to allow proxy configs to prevent the HTTP client from keeping TCP connections open, as keeping TCP connections alive can prevent proxy providers from rotating your IP
    • Adds the prevent_keeping_connections_alive() -> bool method to ProxyConfig objects
    • When initializing YouTubeTranscriptApi, a Connection: close header is added to the HTTP client if a proxy config with prevent_keeping_connections_alive() == True is used
  • Added py.typed by @jkawamoto in #390
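
As a rough sketch of how the flag and the header relate: DemoProxyConfig and build_headers below are hypothetical stand-ins, and only the method name prevent_keeping_connections_alive and the Connection: close header come from the notes above.

```python
class DemoProxyConfig:
    """Hypothetical stand-in for a ProxyConfig implementation."""

    def __init__(self, rotating):
        self._rotating = rotating

    def prevent_keeping_connections_alive(self) -> bool:
        # Rotating proxies need a fresh TCP connection per request,
        # otherwise the provider may keep reusing the same IP.
        return self._rotating


def build_headers(proxy_config=None):
    # Hypothetical helper that decides which headers the client sends.
    headers = {}
    if proxy_config is not None and proxy_config.prevent_keeping_connections_alive():
        # Ask for the connection to be closed after each response.
        headers["Connection"] = "close"
    return headers


print(build_headers(DemoProxyConfig(rotating=True)))
print(build_headers(DemoProxyConfig(rotating=False)))
```

Keeping the decision in the proxy config means each provider can state for itself whether connection reuse is harmful.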

New Contributors

Full Changelog: v1.0.0...v1.0.1

v1.0.0

11 Mar 18:27
bf45008

What's Changed

  • Overhaul of the public API to move away from the static methods get_transcript, get_transcripts and list_transcripts
    • YouTubeTranscriptApi.get_transcript(video_id) is replaced with YouTubeTranscriptApi().fetch(video_id)
    • YouTubeTranscriptApi.list_transcripts(video_id) is replaced with YouTubeTranscriptApi().list(video_id)
    • There is no equivalent for YouTubeTranscriptApi.get_transcripts in the new interface, as it doesn't provide any meaningful utility over just running [ytt_api.fetch(video_id) for video_id in video_ids]
    • By calling .fetch and .list on a YouTubeTranscriptApi instance, we can share an HTTP session between all requests, which allows us to share cookies and reduces redundant requests, thereby saving bandwidth and proxy costs.
    • transcript.fetch() now returns a FetchedTranscript object instead of a list of dictionaries. This allows for adding metadata and utility methods to the returned object. You can still convert a FetchedTranscript object to the previously used format by calling fetched_transcript.to_raw_data().
    • You'll find more details on the updated API in the README. The old static methods can still be used, but have been deprecated and will be removed in a future version!
  • Added new exception types to make the cause of some common errors clearer and allow for catching/handling them
    • RequestBlocked is now raised if the request has been blocked by YouTube due to a blacklisted IP (previously this raised TranscriptsDisabled, #303)
    • AgeRestricted is raised if the video is age restricted and requires cookie authentication (#111)
    • VideoUnplayable is raised if the video is unplayable for an unknown reason. When this happens the error message that YouTube would display on the WebPlayer is returned by the exception, which should make unknown errors more useful. (#219)
  • Added type hierarchy to configure proxies, which can now be passed into the constructor of YouTubeTranscriptApi. All proxy configs are located in the new module youtube_transcript_api.proxies.
    • Generic HTTP/HTTPS/SOCKS proxy can be configured using the GenericProxyConfig class (similarly to how it was done before using the requests dict)
    • Added integration of the proxy provider Webshare, which allows for easily setting up rotating residential proxies using the WebshareProxyConfig
    • You'll find more details on the proxy config classes and how to use them in the README
  • Added the option to pass an HTTP session into the YouTubeTranscriptApi constructor
    • Allows for setting a path to CA_BUNDLE file (#362, #312)
    • Allows for setting custom headers (#316)
    • Allows for sharing HTTP sessions between multiple instances of YouTubeTranscriptApi
  • Added type signatures to all interfaces
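
For code that relied on fetching several videos at once, the list comprehension mentioned above can be wrapped in a small helper. This is a minimal sketch: fetch_many is a hypothetical name, and the helper only assumes that the object passed in offers .fetch(video_id) returning something with .to_raw_data(), as described in these notes.

```python
def fetch_many(ytt_api, video_ids):
    # Reuses one API instance (and therefore one HTTP session)
    # for every video, and returns the old list-of-dicts format.
    return {
        video_id: ytt_api.fetch(video_id).to_raw_data()
        for video_id in video_ids
    }


# Usage (requires the library and network access):
# from youtube_transcript_api import YouTubeTranscriptApi
# transcripts = fetch_many(YouTubeTranscriptApi(), ["abc", "def"])
```

Unlike the removed method, this sketch does nothing special when a single video fails, so the first exception aborts the whole batch.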

Contributors

Due to the rewrite of some interfaces I wasn't able to merge their PRs directly, but special thanks for the work done by @crhowell in #219 and by @andre-c-andersen in #337, as their PRs have been very useful in implementing the new exception types! 😊🙏

Full Changelog: v0.6.3...v1.0.0

v0.6.3

18 Nov 09:52
97522b7

What's Changed

  • Fix grammatical mistakes in README by @Jai0401 in #287
  • Update README.md - cookies extension and instructions for export by @samfisherirl in #339
  • [security] defusedxml.ElementTree instead of xml.etree.ElementTree by @vasiliadi in #352

New Contributors

Full Changelog: v0.6.2...v0.6.3