Itunes fixes #63

Open
wants to merge 5 commits into
base: master
87 changes: 68 additions & 19 deletions src/gpodder/plugins/itunes.py
@@ -1,19 +1,20 @@

 #
-# gpodder.plugins.itunes: Resolve iTunes feed URLs (based on a gist by Yepoleb, 2014-03-09)
-# Copyright (c) 2014, Thomas Perl <m@thp.io>
+# gpodder.plugins.itunes: Resolve iTunes feed URLs
+# (initially based on a gist by Yepoleb, 2014-03-09)
+# Copyright (c) 2014, Thomas Perl <m@thp.io>.
+# Copyright (c) 2025, E.S. Rosenberg (Keeper-of-the-Keys).
 #
 # Permission to use, copy, modify, and/or distribute this software for any
 # purpose with or without fee is hereby granted, provided that the above
 # copyright notice and this permission notice appear in all copies.
 #
-# THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH
-# REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
-# AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT,
-# INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
-# LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR
-# OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
-# PERFORMANCE OF THIS SOFTWARE.
+# THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+# WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+# MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+# SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER
+# RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF
+# CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
+# CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 #


@@ -26,39 +27,54 @@
 import urllib.parse

 logger = logging.getLogger(__name__)
+# As of 2025-05-11 200 is the upper limit according to
+# https://performance-partners.apple.com/search-api
+PAGE_SIZE = 200

 class ITunesFeedException(Exception):
     pass


 @registry.feed_handler.register
 def itunes_feed_handler(channel, max_episodes, config):
-    m = re.match(r'https?://(podcasts|itunes)\.apple\.com/(?:[^/]*/)?podcast/.*id(?P<podcast_id>[0-9]+).*$', channel.url, re.I)
+    expression = (
+        r'https?://(podcasts|itunes)\.apple\.com/(?:[^/]*/)?'
+        r'podcast/.*id(?P<podcast_id>[0-9]+).*$'
+    )
Comment on lines +40 to +43
Member

Even if we didn't do it before, we could move those to module-level pre-compiled regular expression objects, and then just use them here.
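
For illustration only, a minimal sketch of what a module-level pre-compiled pattern could look like. It is not part of this PR: the names `ITUNES_URL_RE` and `extract_podcast_id` are hypothetical, and only the pattern itself is taken from the diff.

```python
import re

# Hypothetical module-level constant (name not from the PR). The pattern is
# the one used in itunes_feed_handler(), compiled once at import time.
ITUNES_URL_RE = re.compile(
    r'https?://(podcasts|itunes)\.apple\.com/(?:[^/]*/)?'
    r'podcast/.*id(?P<podcast_id>[0-9]+).*$',
    re.I,
)


def extract_podcast_id(url):
    """Return the numeric podcast ID as a string, or None if url does not match."""
    m = ITUNES_URL_RE.match(url)
    return m.group('podcast_id') if m else None
```

In the handler, the `re.match(expression, channel.url, re.I)` call would then presumably become `ITUNES_URL_RE.match(channel.url)`.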

Contributor Author

I don't think that belongs in the scope of this PR. It is a good idea, but figuring that out and reorganizing everything would turn this into a monster PR.

+    m = re.match(expression, channel.url, re.I)
     if m is None:
         return None

     logger.debug('Detected iTunes feed.')

-    itunes_lookup_url = 'https://itunes.apple.com/lookup?entity=podcast&id=' + m.group('podcast_id')
+    itunes_lookup_url = (
+        f'https://itunes.apple.com/lookup?entity=podcast&id='
+        f'{m.group("podcast_id")}'
+    )
     try:
         json_data = util.read_json(itunes_lookup_url)

         if len(json_data['results']) != 1:
-            raise ITunesFeedException('Unsupported number of results: ' + str(len(json_data['results'])))
+            raise ITunesFeedException(
+                f'Unsupported number of results: {len(json_data["results"])}'
+            )

         feed_url = util.normalize_feed_url(json_data['results'][0]['feedUrl'])

         if not feed_url:
-            raise ITunesFeedException('Could not resolve real feed URL from iTunes feed.\nDetected URL: ' + json_data['results'][0]['feedUrl'])
+            raise ITunesFeedException(
+                f'Could not resolve real feed URL from iTunes feed.\n'
+                f'Detected URL: {json_data["results"][0]["feedUrl"]}'
+            )

-        logger.info('Resolved iTunes feed URL: {} -> {}'.format(channel.url, feed_url))
+        logger.info(f'Resolved iTunes feed URL: {channel.url} -> {feed_url}')
         channel.url = feed_url

         # Delegate further processing of the feed to the normal podcast parser
         # by returning None (will try the next handler in the resolver chain)
         return None
     except Exception as ex:
-        logger.warn('Cannot resolve iTunes feed: {}'.format(str(ex)))
+        logger.warn(f'Cannot resolve iTunes feed: {ex}')
         raise

 @registry.directory.register_instance
@@ -69,6 +85,39 @@ def __init__(self):
         self.priority = directory.Provider.PRIORITY_SECONDARY_SEARCH

     def on_search(self, query):
-        json_url = 'https://itunes.apple.com/search?media=podcast&term={}'.format(urllib.parse.quote(query))
-
-        return [directory.DirectoryEntry(entry['collectionName'], entry['feedUrl'], entry['artworkUrl100']) for entry in util.read_json(json_url)['results']]
+        offset = 0
+
+        while True:
+            json_url = (
+                f'https://itunes.apple.com/search?media=podcast&term='
+                f'{urllib.parse.quote(query)}&limit={PAGE_SIZE}&offset='
+                f'{offset}'
Comment on lines +92 to +94
Member

IMHO it would be fine if this wasn't split into 3 lines, and just a single f-string on one line.
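
As a rough sketch of the single-line form, assuming the same `PAGE_SIZE` constant as in the diff; the wrapper function `build_search_url` is hypothetical and exists only to make the snippet self-contained (in the PR the expression sits inline in `on_search()`):

```python
import urllib.parse

PAGE_SIZE = 200  # value taken from the diff


def build_search_url(query, offset):
    # Hypothetical helper: same URL as the three-line version, written as one f-string.
    return f'https://itunes.apple.com/search?media=podcast&term={urllib.parse.quote(query)}&limit={PAGE_SIZE}&offset={offset}'
```

The resulting line is well over 79 characters, so whether this form is acceptable likely depends on the project's line-length settings.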

+            )
+            json_data = util.read_json(json_url)
+
+            if json_data['resultCount'] > 0:
+                for entry in json_data['results']:
+                    if 'feedUrl' not in entry:
+                        continue
+
+                    title = entry['collectionName']
+                    url = entry['feedUrl']
+                    image = entry['artworkUrl100']
+
+                    yield directory.DirectoryEntry(title, url, image)
+                    returned_res += 1
+
+                offset += json_data['resultCount']
+            else:
+                # Unlike the podverse stop condition where we detect a resultCount
+                # smaller than the page size for apple we can only stop when 0
+                # results are returned because the API seems to consistently
+                # return more than the page size and does this in an inconsistent
+                # fashion, most often returning 210 results but based on my
+                # observation any number between page size and page size + 10 is
+                # possible.
+                #
+                # With an API that does not obey its own rules the only valid stop
+                # condition is no results.
+
+                break
Comment on lines +98 to +123
Member

@thp Jun 6, 2025

Could be simplified to (note I removed returned_res, because it doesn't seem to be used?):

Suggested change
-            if json_data['resultCount'] > 0:
-                for entry in json_data['results']:
-                    if 'feedUrl' not in entry:
-                        continue
-                    title = entry['collectionName']
-                    url = entry['feedUrl']
-                    image = entry['artworkUrl100']
-                    yield directory.DirectoryEntry(title, url, image)
-                    returned_res += 1
-                offset += json_data['resultCount']
-            else:
-                # Unlike the podverse stop condition where we detect a resultCount
-                # smaller than the page size for apple we can only stop when 0
-                # results are returned because the API seems to consistently
-                # return more than the page size and does this in an inconsistent
-                # fashion, most often returning 210 results but based on my
-                # observation any number between page size and page size + 10 is
-                # possible.
-                #
-                # With an API that does not obey its own rules the only valid stop
-                # condition is no results.
-                break
+            if json_data['resultCount'] <= 0:
+                return
+            for entry in json_data['results']:
+                if 'feedUrl' not in entry:
+                    continue
+                title = entry['collectionName']
+                url = entry['feedUrl']
+                image = entry['artworkUrl100']
+                yield directory.DirectoryEntry(title, url, image)
+            offset += json_data['resultCount']
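
To make the stop condition easier to check in isolation, here is a self-contained sketch of the simplified loop suggested above. It is an illustration, not the PR's code: `iter_feed_entries` and its `fetch_page` parameter are hypothetical stand-ins for `on_search()` and `util.read_json()`, and plain tuples replace `directory.DirectoryEntry`.

```python
def iter_feed_entries(fetch_page):
    """Yield (title, url, image) tuples until a page reports no results.

    fetch_page is a hypothetical callable that takes an offset and returns a
    dict shaped like the search response ('resultCount' and 'results').
    """
    offset = 0
    while True:
        json_data = fetch_page(offset)
        # Stop only on an empty page, since pages may exceed the requested limit.
        if json_data['resultCount'] <= 0:
            return
        for entry in json_data['results']:
            # Entries without a feed URL cannot be subscribed to; skip them.
            if 'feedUrl' not in entry:
                continue
            yield entry['collectionName'], entry['feedUrl'], entry['artworkUrl100']
        # Advance by what was actually returned, not by the page size.
        offset += json_data['resultCount']
```

Because the offset advances by `resultCount` rather than by the page size, the loop stays in step with the API even when it returns more entries than requested. Skipped entries still count toward the offset, which matches the behaviour in the diff.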