Repackaging Digg

I subscribe to Digg’s syndicated feed, although more to keep track of the zeitgeist than for information: I probably click through to only one or two stories a day (unless I’m very, very bored).

Despite the site’s popularity, the official feed is an absolute disaster. It fails to validate, for both syntactic (wacky use of multiple isPermalink attributes) and semantic (lots of meta-information shoved into tags hidden in the http://digg.com/docs/diggrss/ namespace) reasons. (Of course, the Digg web site also fails to validate, for nontrivial reasons.) Worst of all, even if feed readers understood all the custom tags, the feed would still lack much of the most important information available on the web site, including both the original author of the linked story and the URL of the story itself. Digg’s official feed forces you to click over to digg.com first, where you see the same information as in the feed, and then click on the title link to get to the story you wanted in the first place. This kills keyboard navigation on my desktop machine, and the extra page load is a big pain on a slow client like an iPhone.

I finally got sick of this (and needed a break from my main work), so I put together a script that reads both the Digg feed and the Digg web site and builds a much friendlier Atom feed. Each entry links directly to the main story, but includes the Digg page (with comments and such) as the “via” link; most feed readers thus make it easy to visit either one but optimize for reading the main story. I also list the story’s web site as the original author and relegate the Digg submitter to “contributor” status, although unlike the official feed, mine also includes a link to the contributor’s Digg user page. Finally, I include the story thumbnail in the feed as part of the HTML description.
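For the curious, here is a rough sketch of what one such entry might look like when built with Python’s standard library. This is an illustration rather than the script’s actual source: the function name build_entry, its parameters, and the choice of the Digg page URL as the entry id are all assumptions made for the example, and the “HTML description” is modeled here as an Atom summary of type html.

```python
import html
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM)  # serialize Atom as the default namespace


def build_entry(title, story_url, digg_url, site_name,
                submitter, submitter_url, thumb_url, blurb, updated):
    """Build one Atom entry (hypothetical helper; all names are illustrative)."""
    def tag(name):
        return f"{{{ATOM}}}{name}"

    entry = ET.Element(tag("entry"))
    ET.SubElement(entry, tag("title")).text = title
    ET.SubElement(entry, tag("id")).text = digg_url  # assumption: Digg URL as id
    ET.SubElement(entry, tag("updated")).text = updated

    # The main link goes straight to the story; the Digg page is the "via" link.
    ET.SubElement(entry, tag("link"), rel="alternate", href=story_url)
    ET.SubElement(entry, tag("link"), rel="via", href=digg_url)

    # The story's site is credited as author, the Digg submitter as contributor.
    author = ET.SubElement(entry, tag("author"))
    ET.SubElement(author, tag("name")).text = site_name
    contributor = ET.SubElement(entry, tag("contributor"))
    ET.SubElement(contributor, tag("name")).text = submitter
    ET.SubElement(contributor, tag("uri")).text = submitter_url

    # HTML description with the thumbnail; ElementTree escapes the markup
    # once more when the entry is serialized, as type="html" requires.
    summary = ET.SubElement(entry, tag("summary"), type="html")
    summary.text = '<img src="{}" alt="" /> {}'.format(
        html.escape(thumb_url, quote=True), html.escape(blurb))
    return entry
```

Calling ET.tostring(entry, encoding="unicode") on the result gives the serialized entry, ready to be dropped into a feed element.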

Others are welcome to subscribe to this one-site mashup; most feed readers I’ve checked do intelligent caching, so hopefully this won’t take much bandwidth. I’m regenerating the feed every twenty minutes, so there should be only a little lag between my feed and the one from Digg.

I’ve tried to account for some of Digg’s weirdnesses (serious confusion about escaping in both web content and feed; occasional retraction of stories and lack of synchronization between web and feed), but if you see anything strange going on, or even if you just find the feed useful and would like to encourage me to keep it working, let me know. I’m not archiving all generated versions of the feed, so bug reports that include sources from my feed, Digg’s feed, and Digg’s web page are likely to result in patches much more quickly than reports without sources.
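As an illustration of the escaping problem, here is a minimal sketch of one way to normalize text that may have been entity-encoded more than once. It assumes the confusion takes the form of repeated escaping (for example &amp;amp;amp; where &amp;amp; was meant); the function name unescape_fully and the max_rounds cap are made up for the example, and this is not necessarily how the script handles it.

```python
import html


def unescape_fully(text, max_rounds=5):
    """Decode HTML entities repeatedly until the text stops changing.

    Hypothetical helper: max_rounds is an arbitrary safety cap.
    """
    for _ in range(max_rounds):
        decoded = html.unescape(text)
        if decoded == text:
            break
        text = decoded
    return text
```

The result can then be escaped exactly once (for instance with html.escape) before it goes into the generated feed. The trade-off is that text which was deliberately double-escaped, such as a story title that discusses entities, gets flattened too.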