A week ago today I was getting back from a weekend spent among the Portuguese geekdom. I attended my very first Barcamp, in Coimbra. So, in the spirit of the event, I prepared a talk about something that has been on my mind recently.
Instead of repeating the usual presentation about microformats, which is nothing more than an attempt to push adoption of microformats to increase semantics on websites, I decided to look at it from a different angle and show how you can harness the value of semantic content that is available TODAY on our users' other websites and services.
If you're interested, check the presentation below. I gave the talk in Portuguese, but since there were so few slides, I translated them to English.
Feel free to give some feedback in the comments below.
For those of you who didn't attend, I wrote a tiny little script over two nights that acts as a proof of concept. It simply scrapes a given page and tries to gather data about the user's attention profile.
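To give a rough idea of what "gathering data" can look like, here is a minimal sketch in Python (not the PHP of the actual script, and not the algorithm from the slides). It assumes the page marks up its tags with the rel-tag microformat, where the tag name is the last path segment of the link's href, and it simply counts them. The names RelTagParser and attention_profile are made up for this example, and fetching the page is left out to keep it short.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


class RelTagParser(HTMLParser):
    """Collects the tag name of every <a rel="tag"> link in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, e.g. "tag nofollow".
        rel_values = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href") or ""
        if "tag" in rel_values and href:
            # In rel-tag, the tag is the last path segment of the href,
            # not the visible link text.
            path = urlparse(href).path.rstrip("/")
            name = unquote(path.rsplit("/", 1)[-1])
            if name:
                self.tags.append(name.lower())


def attention_profile(html):
    """Returns a Counter mapping each tag found in the page to how often it appears."""
    parser = RelTagParser()
    parser.feed(html)
    parser.close()
    return Counter(parser.tags)


# Made-up sample page, for illustration only.
sample = """
<p><a href="http://example.com/tags/microformats" rel="tag">microformats</a>
<a href="http://example.com/tags/php/" rel="tag">PHP</a>
<a href="http://example.org/tag/Microformats" rel="tag nofollow">mf</a>
<a href="http://example.com/about">about</a></p>
"""
profile = attention_profile(sample)
print(profile.most_common())
```

The tags that show up most often are a first, crude guess at what the user cares about. A real harvester would also have to follow links to the user's other services and cope with messy markup.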
The goal would be to be able to provide content in which the user is interested right off the bat, as soon as they sign up. I described the harvesting algorithm in one of the slides, but here is a small list of the steps:
You can test it for yourself at: http://workshop.andr3.net/tageater/
If you want the code, it's available for download as a zip file. I didn't put it up on SVN or anything because it's just an example. It's written in PHP (5.0) and requires curl, DOM and SimpleXML.
It uses the microformats transformer Optimus, written by Dmitry Baranovskiy.
It includes a very basic cache mechanism, based on the filesystem. Make sure the ./cache/cache folder has the right permissions, if you want to use it.
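For the curious, a filesystem cache of this kind can be sketched in a few lines. This is an illustration in Python under my own assumptions, not the code from the zip file: each URL is hashed into a file name, and the stored copy is reused while it is younger than max_age seconds. The names cache_path, cached_fetch and fake_fetch, and the one-hour default, are all hypothetical.

```python
import hashlib
import os
import tempfile
import time


def cache_path(cache_dir, url):
    """Maps a URL to a file inside cache_dir, using a hash as a safe file name."""
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".html")


def cached_fetch(url, fetch, cache_dir, max_age=3600):
    """Returns the cached body for url if it is fresh, otherwise calls fetch(url) and stores the result."""
    os.makedirs(cache_dir, exist_ok=True)
    path = cache_path(cache_dir, url)
    if os.path.exists(path) and time.time() - os.path.getmtime(path) < max_age:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    body = fetch(url)
    # Writing here is why the cache folder needs the right permissions.
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(body)
    return body


# Stand-in for a real HTTP request, so the example runs offline.
calls = []


def fake_fetch(url):
    calls.append(url)
    return "<html>page for %s</html>" % url


with tempfile.TemporaryDirectory() as cache_dir:
    first = cached_fetch("http://example.com/", fake_fetch, cache_dir)
    second = cached_fetch("http://example.com/", fake_fetch, cache_dir)

print(first == second, len(calls))
```

The second call never reaches fake_fetch, which is the whole point: repeated tests against the same page don't hammer the remote site.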
I'm releasing all this under the MIT License, but remember this is only a proof of concept, or in other words, do not use this in production.
Here are a few tests I ran with it. First, and this is included in the script even though it's deactivated, I pointed it at all the URLs specified by the BarcampPT attendees. On top of that, to avoid skewing the results by grabbing only URLs owned by fellow geeks, I pointed it at 5 of the blogs highlighted on SAPO Blogs at the time.
A very important question came up from the audience right at the end.
If you have some more questions, shoot. :)