[rdfweb-dev] 558KB (KiloBlogs)

Danny Ayers danny666 at virgilio.it
Mon Jul 14 16:40:35 UTC 2003


Running Tidy and then XSL over 558k+ pages might take a while... a regexp
does sound more promising (but I'm hopeless at them, sorry). Are you doing
this in Perl?
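
For what it's worth, here is a rough sketch of how a more tolerant regexp
match could look. It is in Python rather than Perl, and it assumes the
autodiscovery tag is a <link> with rel="meta" and
type="application/rdf+xml", which is the usual FOAF convention but is not
spelled out in this thread. The function name find_foaf_links is made up
for illustration. The idea is to match the whole tag first and then pick
the attributes out separately, so that attribute order, case, quoting style
and line breaks inside the tag don't matter.

```python
import re

# Any <link ...> tag, matched case-insensitively. [^>]* also spans line breaks.
LINK_TAG = re.compile(r"<link\b[^>]*>", re.IGNORECASE)

# name=value pairs; the value may be double-quoted, single-quoted, or bare.
ATTR = re.compile(r"""([\w:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))""")


def find_foaf_links(html):
    """Return href values of <link> tags that look like FOAF autodiscovery.

    Assumption: rel contains "meta" and type is "application/rdf+xml".
    """
    found = []
    for tag in LINK_TAG.findall(html):
        attrs = {}
        for name, dq, sq, bare in ATTR.findall(tag):
            # Exactly one of the three value groups is non-empty per match.
            attrs[name.lower()] = dq or sq or bare
        rel = attrs.get("rel", "").lower().split()
        mime = attrs.get("type", "").lower()
        if "meta" in rel and mime == "application/rdf+xml" and "href" in attrs:
            found.append(attrs["href"])
    return found
```

This still won't cope with everything (a <link> inside an HTML comment would
match, and an unquoted href directly followed by "/>" keeps the trailing
slash), but it needs no Tidy pass and should catch most hand-edited
variations.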

There's an autodiscovery link on dannyayers.com, but if you can get a list
of the TypePad betas (e.g. http://danja.typepad.com/fecho/) they should all
have one.

Cheers,
Danny.

> -----Original Message-----
> From: Eric Vitiello [mailto:eric at perceive.net]
> Sent: 14 July 2003 18:25
> To: danny666 at virgilio.it; rdfweb-dev at vapours.rdfweb.org
> Subject: RE: [rdfweb-dev] 558KB (KiloBlogs)
>
>
> > Good man! I look forward to hearing/seeing how it goes.
> >
> > I've a feeling my spider hacks might soon be getting a
> > hammering too ;-)
> >
> > > > FOAF auto-discovery and scutter, anyone? (That sounded as sane as
> > > > "millennium, hand and shrimp"...)
> > >
> > > Working on it.  Should have a FOAF scutter early next week.
>
> ok. The script is currently running, grabbing websites and attempting
> autodiscovery. I'm using a simple regex (not allowing any variations),
> and I'd like to get everyone's input on a good, flexible regex for
> grabbing the autodiscovery <link>.
>
> I've also thought about forcing all the pages that I grab to be
> well-formed XML and then parsing for the link tag with XSL, which would
> definitely make sure we catch them all, but the overhead of cleaning up
> all the pages with Tidy might be too much -- plus there would always be
> some pages that have too many errors. Ideas?
>
> Also, it'd be handy to have a handful of websites known to have an
> autodiscovery tag to test against. So if anyone has the autodiscovery
> link tag on their site, please send me your URL.
>
> Thanks!
>
> --Eric
>
> --
> Eric Vitiello
> Perceive Designs <http://www.perceive.net>
>



