[rdfweb-dev] 558KB (KiloBlogs)

Leigh Dodds ldodds at ingenta.com
Mon Jul 14 16:49:57 UTC 2003


Hi,

I wrote a scutter to do this very same job one lunchtime last 
week, but cancelled its run because it was slow (single threaded). 
I haven't found time to re-run it yet (busy with an application 
release at the moment).

I used TagSoup to make the pages well-formed, and then a normal 
SAX handler to listen for link tag call-backs. TagSoup is cool; it 
even manages to clean up the foaf wiki which Tidy fails on.
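
A rough sketch of the link-listening step, for anyone wanting to 
re-create it. This is not the code described above: it swaps in 
Python's standard html.parser, which tolerates badly formed markup, 
in place of TagSoup plus a SAX handler. The names FoafLinkFinder and 
find_foaf_links are made up for illustration, and the test for 
rel="meta" with type="application/rdf+xml" is an assumption about 
what counts as a FOAF autodiscovery link.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FoafLinkFinder(HTMLParser):
    """Collects hrefs of <link> tags that look like FOAF autodiscovery links.

    Hypothetical helper: the matching rule below is an assumption,
    not taken from the original scutter.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.found = []

    def handle_starttag(self, tag, attrs):
        # Called once per opening tag, similar in spirit to a SAX
        # startElement call-back. Tag and attribute names arrive lowercased.
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_rdf = attr.get("type", "").lower() == "application/rdf+xml"
        if "meta" in rels and is_rdf and attr.get("href"):
            # Resolve relative hrefs against the URL of the page.
            self.found.append(urljoin(self.base_url, attr["href"]))


def find_foaf_links(html, base_url):
    """Return absolute URLs of candidate FOAF links found in one page."""
    parser = FoafLinkFinder(base_url)
    parser.feed(html)
    parser.close()
    return parser.found
```

Calling find_foaf_links(page_text, page_url) for each fetched page 
gives the candidate FOAF URLs for that page; fetching the pages and 
walking the URL list is left out.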

I can post the (grotty) code somewhere if someone wants it, but it 
was just an hour's hack, so it would be easy to re-create.

Anyway, in the first 1000 or so URLs, all it came up with was:

http://www.stupidfool.org/ben.foaf
http://www.neilturner.me.uk/foaf.rdf
http://www.zonageek.com/sdelmont.foaf.rdf
http://www.crystalflame.net/foaf.rdf
http://www.begbie.com/foaf.rdf
http://www.richarderiksson.com/xml/foaf.xml
http://bitworking.org/foaf.rdf

That list is there if you want to compare output. I've got 
autodiscovery on my blog:

http://www.ldodds.com/blog

Cheers,

L.


