Introduction to Yahoo Pipes

Today Yahoo released an intriguing service called "Pipes". Based on the concept of Unix pipes, it allows you to filter and process RSS feed data in a chain - using a very cool web graphical user interface. The result can be spit out in a variety of formats that can then be used in your RSS reader, on your blog, in a Greasemonkey script or even as the building blocks of more complex pipes.

There is still a lot of room for improvement on the tool (the wish list of functionality grows by the second!) but it's an awesome and inspiring start. The only major complaint from people is the lack of easy-to-understand documentation. Don't worry though - I'll present a step-by-step guide to get you going!


The simplest - listing an RSS feed

Ok, we can't get much more basic than this, so it's a good place to start. Taking an RSS feed, and doin' nothin' to it...

  1. Open the "Sources" dropdown list and drag the Fetch module on to the workspace.
    first step diagram
  2. In the "url" textbox enter your feed address. For this example we will use http://www.digg.com/rss/index.xml
    second step diagram
  3. Select the Fetch module. The Debugger at the bottom of the screen will load the RSS feed and show you the returned results. The debugger shows you the output after every module in your pipe chain.

  4. Now, join the output of the Fetch module to the input of the Pipe Output module. When you click and drag a pipe from a module's output node you will see that all input nodes that accept that output will light up. This makes it easy to see where you are allowed to join stuff!

    Now that we have joined our modules, the RSS feed is piped to the output. Click on the Pipe Output module and the final output is displayed in the debugger. This is how you can test your pipes without having to run them from your pipe list.
    second step diagram
    If you join something to the wrong place, you can delete the join by clicking on the output node, and dragging to a blank space.

  5. Click on the "Save" button in the top right of your workspace. Enter a name in the text box in the top left and hit Save.
  6. To test the pipe, go back to your pipe homepage and click on your pipe's name. Click the "Run this Pipe" link.
    second step diagram

The Pipe Preview will load the Digg RSS feed. Phew - there's ya first pipe done!
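
If it helps to think of it in code, here's a rough Python sketch of what this first pipe does: one stage reads the feed, the next passes it straight to the output. The function names fetch and pipe_output and the sample feed are made up for illustration; this is only an analogy, not how Yahoo implements Pipes.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed so the sketch runs offline. A real run would download
# the XML from the feed URL instead (for example with urllib.request.urlopen).
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item><title>First story</title><link>http://example.com/1</link></item>
    <item><title>Second story</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def fetch(rss_text):
    """Play the role of the Fetch module: turn RSS XML into a list of items."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        })
    return items


def pipe_output(items):
    """Play the role of Pipe Output: pass the items through unchanged."""
    return list(items)


# Joining the two modules is just feeding one function's result to the next.
first_pipe = pipe_output(fetch(SAMPLE_RSS))
for entry in first_pipe:
    print(entry["title"], "-", entry["link"])
```

Every extra module you drop on the workspace is, in this picture, one more function slotted in between fetch and pipe_output.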

A more complex example

Now that you've got the basics down, it's just a matter of playing around with the other modules to see what you can do. Here's the plan: we'll make a feed that combines the latest stories from Slashdot with the stories from Digg that have over a certain number of diggs. Then we'll filter out all Digg stories that have "amazing" in the title, and get rid of any Slashdot stories from Zonk.

  1. Add two Fetch modules from the available Sources. To one add the URL http://www.digg.com/rss/index.xml and to the other http://rss.slashdot.org/Slashdot/slashdot. Next add a Union module from the Operators section. Join the two Fetch outputs to the Union inputs. (The same thing can be done by adding extra URLs to one Fetch module, but I did it this way so we can join some more stuff!)

    second pipe, first step diagram

    Clicking on the Union module you will see that both RSS feeds are output in the debugger window.

  2. Now we will customise the results to suit our tastes... Add a Filter module from the Operators list. We want to "Block" items that match "Any" of our criteria.

    Click on the Rules dropdown list - notice it only contains the values "Title", "Body", "Publication Date". Join the output of the Union module to the input of the Filter.

    The dropdown box changes to a label that says "Updating...". Once it is loaded, it will contain a list of all the fields you can filter on from the data feeds! NOTE: This can be a little buggy. If it doesn't appear to properly refresh the lists, try removing the link and re-adding it. This should do the trick.

    Now, let's filter away crap we don't want

  3. First of all, let's get rid of low-dugg stories:

    From the dropdown list select "digg:diggCount" "is less than" and in the textbox type "500".
    Our output will only return Digg stories that have over 499 diggs. It will return all Slashdot stories.

  4. Next click on the plus to add a new Rule. Because the filter is set to "Block", each rule describes what to throw away, so let's make this one "Title" "Contains" "amazing".
  5. For the final rule, let's not get Zonk's posts: "dc:creator" "Contains" "Zonk"

second pipe, second step diagram

Ahhh that's good. Now just join the Filter output to the Pipe Output and we've got our aggregation.
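
Here's a small Python sketch of the logic this pipe is meant to carry out: merge the two feeds, then drop any item that matches at least one rule. The helper names union and block_any and all of the sample items are invented for illustration, and the case-insensitive title match is an assumption; the real Filter module may compare text differently.

```python
def union(*feeds):
    """Union module: concatenate the items of every input feed."""
    merged = []
    for feed in feeds:
        merged.extend(feed)
    return merged


def block_any(items, rules):
    """Filter set to Block / Any: drop an item if any rule matches it."""
    return [item for item in items if not any(rule(item) for rule in rules)]


# Made-up items standing in for the two fetched feeds.
digg = [
    {"title": "Amazing cat video", "digg:diggCount": 900},
    {"title": "Linux kernel released", "digg:diggCount": 1200},
    {"title": "Obscure story", "digg:diggCount": 40},
]
slashdot = [
    {"title": "Games roundup", "dc:creator": "Zonk"},
    {"title": "New telescope online", "dc:creator": "editor2"},
]

rules = [
    # Low-dugg stories: items without a digg count (Slashdot) never match.
    lambda item: "digg:diggCount" in item and item["digg:diggCount"] < 500,
    # Titles mentioning "amazing" (case-insensitive here, an assumption).
    lambda item: "amazing" in item.get("title", "").lower(),
    # Posts written by Zonk.
    lambda item: "Zonk" in item.get("dc:creator", ""),
]

filtered = block_any(union(digg, slashdot), rules)
for item in filtered:
    print(item["title"])
```

Because the filter blocks on "Any", one matching rule is enough to throw an item away, which is why every rule is phrased as a description of the stuff we don't want.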

User Input

Finally, what if we want to change the minimum number of Diggs needed to be included? Well one way is to add a User Input field. Let's do that.

  1. Select Number Input from the User Inputs. Drop it next to our Filter. Fill in the following values (without the notes in parentheses!):
    Name: numDiggs (internal variable name)
    Prompt: Minimum Diggs (this appears on the input screen)
    Position: 0 (Order of input boxes)
    Default: 500 (A default value for the input screen)
    Debug: 500 (A value just for you when you're testing)

  2. Drag the output from the Number Input control to the little input on the diggCount field in the Filter.
    Change the Debug value, and refresh the Pipe Output. You will see the number of Digg stories changes.

second pipe, third step diagram

If you save your pipe and go back to your pipes home page, when you load this pipe you will be prompted to enter the "Minimum Diggs" in a textbox. Awesome!

second pipe, final step diagram
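
In code terms, the Number Input is just a parameter with a default value that gets fed into the filter. Here's a minimal Python sketch of that idea; number_input, filter_by_min_diggs and the sample stories are hypothetical names and data made up for this example.

```python
def number_input(supplied=None, default=500):
    """Number Input: use the supplied value, else fall back to the default."""
    return default if supplied is None else int(supplied)


def filter_by_min_diggs(items, min_diggs):
    """Block Digg items whose count is below min_diggs; keep everything else."""
    return [
        item for item in items
        if "digg:diggCount" not in item or item["digg:diggCount"] >= min_diggs
    ]


# Made-up items for illustration.
stories = [
    {"title": "Popular", "digg:diggCount": 1200},
    {"title": "Middling", "digg:diggCount": 300},
    {"title": "From Slashdot"},
]

# Nothing typed in the prompt: the default of 500 applies.
with_default = filter_by_min_diggs(stories, number_input())
# The user typed 250 in the prompt (it arrives as text, so we convert it).
with_custom = filter_by_min_diggs(stories, number_input("250"))

print([s["title"] for s in with_default])
print([s["title"] for s in with_custom])
```

Lowering the threshold lets more Digg stories through, which is the same effect you see when you change the Debug value and refresh the Pipe Output.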

That's it really. Now you just need to start adding in other modules and figuring out what they do. The only tricky ones are the For-Each modules. But don't worry about them too much now.

The best thing about Yahoo Pipes is you can use one of its many available methods to "Subscribe" to your new mashed up feed. Now all you gotta do is put on your thinking helmets and come up with some killer mash-ups! Magic!