Add support for GNOME archive format #2

@nicorikken

I just converted my public mailing list subscriptions, and the overflowing mailbox that came with them, to a set of RSS feeds with mailman-rss, with some help from a bash script and a crontab entry. I like it! 👍

But one of the mailing lists I subscribe to fails with the current version of mailman-rss: https://mail.gnome.org/archives/dia-list/
Apart from all the styling on the page, Mailman-format thread links are still available (https://mail.gnome.org/archives/dia-list/2016-May/thread.html), and so are the archive files, even though they are not linked (https://mail.gnome.org/archives/dia-list/2016-May.txt.gz).

The simplest solution seems to be to just try the Y-M.txt.gz URL when the HTML does not provide the archive links. A differently structured page could perhaps even be detected, but that might not add much benefit.
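
A rough sketch of that URL guessing, written for Python 3 even though the traceback in this issue comes from Python 2.7. The function name `guess_archive_urls` and its parameters are hypothetical and not part of mailman-rss. It assumes archives are named with the year and the full English month name, as in the 2016-May.txt.gz example, and it does not check that the guessed URLs actually exist:

```python
from datetime import date
from urllib.parse import urljoin

# Spelled out explicitly so the result does not depend on the system locale.
MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def guess_archive_urls(list_url, today, months=3):
    """Return guessed monthly archive URLs for a list, newest first.

    Hypothetical helper: builds <list_url>/<year>-<Month>.txt.gz for the
    month of `today` and the months before it, without fetching anything.
    """
    # urljoin only appends to the path if the base ends with a slash.
    base = list_url if list_url.endswith("/") else list_url + "/"
    urls = []
    year, month = today.year, today.month
    for _ in range(months):
        name = "%d-%s.txt.gz" % (year, MONTH_NAMES[month - 1])
        urls.append(urljoin(base, name))
        # Step back one month, rolling over into the previous year.
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    return urls
```

Calling `guess_archive_urls("https://mail.gnome.org/archives/dia-list/", date(2016, 5, 15), months=1)` yields the 2016-May.txt.gz URL mentioned above. Each guessed URL would still need to be requested, with a missing month treated as "no archive" rather than as an error.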

Also, it currently fails pretty badly:

$ ./mailman-rss.py -c 1 https://mail.gnome.org/archives/dia-list/
<rss version="2.0">
<channel>
<title>The dia-list Archives</title>
<description>The dia-list Archives</description>
<link>https://mail.gnome.org/archives/dia-list/</link>
<item>
<author>Stefan Cronje &lt;Stefan.Cronje@qlink.co.za&gt;</author>
<title>Dia High Contrast Enhancements</title>
<pubDate>Thu, 28 Jan 2016 07:04:11 +0000</pubDate>
<guid isPermaLink="false">https://mail.gnome.org/archives/dia-list/HE1PR05MB1801F720129E3CCA49CA209CA6DA0@HE1PR05MB1801.eurprd05.prod.outlook.com</guid>
Traceback (most recent call last):
  File "./mailman-rss.py", line 212, in <module>
    main()
  File "./mailman-rss.py", line 208, in main
    print_rss(archive, mails)
  File "./mailman-rss.py", line 148, in print_rss
    body)
  File "/usr/lib/python2.7/re.py", line 171, in split
    return _compile(pattern, flags).split(string, maxsplit)
TypeError: expected string or buffer
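
The traceback suggests that `body` was not a string (most likely `None`) when `print_rss` passed it to `re.split`. A minimal guard could look like the sketch below, written for Python 3. The helper name `split_body` and the paragraph-break pattern are assumptions for illustration; the pattern mailman-rss.py actually uses is not visible in the traceback:

```python
import re

# Hypothetical separator: one or more blank lines between paragraphs.
PARAGRAPH_BREAK = re.compile(r"\n\s*\n")


def split_body(body):
    """Split a mail body into paragraphs, tolerating a missing body.

    Returns an empty list instead of raising TypeError when `body`
    is None or otherwise not a string.
    """
    if not isinstance(body, str):
        return []
    # Drop empty fragments left over from leading or trailing blank lines.
    return [part for part in PARAGRAPH_BREAK.split(body) if part.strip()]
```

With a guard like this the feed item would simply have an empty description, and the output would stay well-formed instead of being cut off in the middle of an `<item>`.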

Todo

  • Fail gracefully if no archive files are available, stating the issue at hand
  • Try archive URLs (forced/guessed) if no archive files are found
  • Add options to explicitly turn the forced URL on or off to prevent other issues from happening
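
The three items above could fit together roughly as follows. This is only a sketch for Python 3: the flag names `--guess-archives` and `--no-guess-archives`, and the helpers `build_parser` and `choose_archives`, are hypothetical and do not exist in mailman-rss today, and the existing options such as `-c` are left out:

```python
import argparse


def build_parser():
    """Build a parser with a three-state guessing option (on/off/auto)."""
    parser = argparse.ArgumentParser(description="Mailman archive to RSS")
    parser.add_argument("url", help="URL of the list archive page")
    group = parser.add_mutually_exclusive_group()
    # default=None means "auto": guess only when the page links no archives.
    group.add_argument("--guess-archives", dest="guess",
                       action="store_true", default=None,
                       help="always use guessed archive URLs")
    group.add_argument("--no-guess-archives", dest="guess",
                       action="store_false", default=None,
                       help="never use guessed archive URLs")
    return parser


def choose_archives(linked, guessed, guess):
    """Pick which archive URLs to fetch.

    linked  -- archive URLs found in the HTML page (may be empty)
    guessed -- archive URLs built from the Y-M.txt.gz naming scheme
    guess   -- True (forced on), False (forced off) or None (auto)
    """
    if guess is True:
        return guessed
    if linked:
        return linked
    if guess is None:
        return guessed
    # Forced off and nothing linked: stop with a clear message, not a traceback.
    raise SystemExit("No archive files are linked from the page and "
                     "URL guessing is disabled (--no-guess-archives).")
```

`SystemExit` with a string argument prints the message to stderr and exits with status 1, which covers the "fail gracefully" item without writing a half-finished feed.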

I'll probably set some time aside in the near future to scratch my own itch, but I'm hoping to contribute it upstream to this repo to help others.
