IT|Redux

Office 2.0 News

Friday, February 9th 2007 | Ismael Ghalimi

Since Yahoo! released Pipes, I have been looking for a good application I could build with this cool development tool. After a couple of hours playing with it, I realized that it was nothing more than a synchronous feed processing tool, but a very easy-to-use one, so I decided to build a feed aggregator with it. I took feeds from the blogs of as many Office 2.0 applications as I could, and built an Office 2.0 News page with a matching feed. Essentially, Office 2.0 News serves all the information you need to keep up to date with everything Office 2.0 related.

To build this page, I started from the information provided by the Office 2.0 Database. It currently lists 391 applications, but only 83 feeds are on record. I expect to have a complete database of feeds by the end of next week. From there, I entered all feeds manually using the Fetch source in Pipes. I tried to do it automatically by creating an RSS feed with 35 lines of PHP code from the JSON feed served by Dabble DB, but Pipes does not seem to allow the nesting of a Fetch source within a For Each operator. I then used a Sort operator to sort the feed in reverse chronological order, and a Unique filter to remove redundant entries. This gave me a nice little pipe, coming with both RSS and JSON feeds. Problem is, the Sort operator in Pipes does not seem to work very well, so I had to improvise a bit.
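For readers who have not used Pipes, the Sort and Unique steps described above can be illustrated with a minimal Python sketch. It is not the pipe itself, nor the PHP used on the site: the entry fields (`title`, `link`, `published`), the function name `merge_entries`, and the sample data are all assumptions made for illustration. The sketch also removes duplicates before sorting, whereas the pipe sorted first.

```python
from datetime import datetime, timezone


def merge_entries(feeds):
    """Flatten several feeds, drop entries with a repeated link, newest first."""
    seen = set()
    merged = []
    for entries in feeds:
        for entry in entries:
            # "Unique" step: keep only the first entry seen for each link.
            if entry["link"] in seen:
                continue
            seen.add(entry["link"])
            merged.append(entry)
    # "Sort" step: reverse chronological order.
    merged.sort(key=lambda e: e["published"], reverse=True)
    return merged


def day(d):
    """Helper for the hypothetical sample data below (February 2007, UTC)."""
    return datetime(2007, 2, d, tzinfo=timezone.utc)


# Hypothetical sample feeds; real entries would come from parsed RSS.
feed_a = [
    {"title": "A1", "link": "http://a.example/1", "published": day(7)},
    {"title": "A2", "link": "http://a.example/2", "published": day(9)},
]
feed_b = [
    {"title": "B1", "link": "http://b.example/1", "published": day(8)},
    {"title": "A2 again", "link": "http://a.example/2", "published": day(9)},
]

news = merge_entries([feed_a, feed_b])
titles = [entry["title"] for entry in news]
```

Keying the duplicate check on the link is only one possible choice; entries syndicated under different URLs would need a different key, such as the entry GUID.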

In order to sort my feed, I upgraded my Feed Digest account to the $11.99-a-month Basic account level, which allows me to create 12 digests from 20 different sources, with 50 items per digest. With it, I created a digest, which is nothing more than yet another RSS feed showing the 50 most recent items from the feed I got out of Pipes. I then piped it to FeedBurner in order to create one last RSS feed, whose address would never change. If you want to subscribe to the master Office 2.0 News feed, please use this one, for I might later on decide to use something other than Pipes upstream, or to get rid of Feed Digest in the middle.

Armed with this clean feed, I then wrote 47 lines of PHP code to properly display feed entries on the Office 2.0 News page, the trickiest bit being to perform a reverse lookup from the Office 2.0 Database in order to identify the source of the feed. Because of the way RSS feeds are structured, and the fact that some feed entries link to a FeedBurner page rather than the blog they originate from, the reverse lookup is not working in all cases, which is the reason why some entries currently display the source as being unknown. I expect this bug to be fixed very soon.
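The reverse lookup, and the reason it fails for FeedBurner links, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the PHP running on the page: the `SOURCES` table stands in for an extract of the Office 2.0 Database, and the host names, application names, and function names are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical extract of the database: blog host -> application name.
SOURCES = {
    "blog.app-one.example": "App One",
    "app-two.example": "App Two",
}


def normalize_host(url):
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def find_source(entry_link, sources=SOURCES):
    """Map an entry link back to its application, or 'Unknown' if no host matches."""
    return sources.get(normalize_host(entry_link), "Unknown")


# A link that points at the blog itself resolves correctly.
direct = find_source("http://www.app-two.example/2007/02/new-feature")

# A link rewritten to a FeedBurner redirect no longer carries the blog's host,
# so a host-based lookup cannot identify the source.
redirected = find_source("http://feeds.feedburner.com/~r/AppOne/~3/12345")
```

One possible fix, again an assumption rather than what the site does, would be to carry the originating feed URL along with each entry instead of inferring the source from the entry link.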

This little experiment taught me a couple of things: First, service cascading works, and we should thank standards such as RSS for it. Second, using a third-party service to fetch hundreds of feeds and aggregate thousands of entries while caching the results for you is not a bad idea, especially if you’re running your blog from a single server. Third, Yahoo! Pipes works, but its capabilities are fairly limited so far, and sorting does not work at all, which is a shame really. I very much look forward to the next revision, which hopefully will let me connect from Dabble DB directly in order to automatically import the list of feeds that I want to aggregate. In the meantime, have fun with this stream of unfiltered Office 2.0 News.

Entry filed under: Office 2.0

6 Comments

1. Craig Cmehil  |  February 12th, 2007 at 12:19 am

Quite a bit more sophisticated than my own attempts, but impressive use of the tool.

2. Doron  |  February 12th, 2007 at 2:45 am

That is cool. I have been trying to do something like that in Pipes, but wasn’t able to. I think a pipe with updates from all the blogs of the Office 2.0 services would also be cool, and much easier to do. Maybe use the content analyzer to see if they used words such as Update or New or Feature to only show these posts.

3. Ismael Ghalimi  |  February 12th, 2007 at 12:03 pm

Craig,

Thanks for the kind words.

Best regards
 -Ismael

4. Ismael Ghalimi  |  February 12th, 2007 at 12:34 pm

Doron,

I like your idea. I’ll look into it.

Best regards
 -Ismael

5. Bob Urry  |  February 13th, 2007 at 3:41 am

Hi Ismael,

Very innovative use of the services. I think that I would worry that such a layered approach to using such services could break your output should the provider of one of them choose to change their service, or perhaps go out of business. I think my concern is the level of service continuity, and the notification of changes by the providers.

If you’re creating a system internal to an organization or an outsourcing contract, you can agree on SLAs and so on. Perhaps we need something of a similar arrangement to give confidence that our own creations will continue, or that we have the time to make our own changes.

What do you think?

-Bob

6. Ismael Ghalimi  |  February 13th, 2007 at 4:55 pm

Bob,

I agree with you: service cascading creates many points of failure, and this is precisely an experiment aimed at finding out how to work around the issues such a fragile architecture raises. Part of the solution is in using industry standards. In this particular case, I could get rid of one or two layers in the middle, without losing much functionality, mainly because these services use RSS feeds both in and out.

Best regards
 -Ismael
