[redland-dev] XML special characters

Dave Beckett dave.beckett at bristol.ac.uk
Wed Mar 19 15:46:08 GMT 2003


>>>Jason Johnston said:
> 
> I'm running into a problem when parsing and re-serializing XML Literals, 
> where special characters such as ampersands and angle brackets are not 
> serialized as entity references as they should be, so the result is 
> invalid XML.
> 
> See the example below.  The file parsed in is identical to the 
> serialized result, except the "&amp; &gt; &lt;" becomes "& > <".
> 
> Any ideas?  Thanks in advance.
> --Jason
> 

It seems to be a bug in the Raptor parser.  It should be &-escaping
the chars when it makes the XML literals in the triples.  The
serializer output is very simple - it just prints the content.
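
To illustrate the escaping meant here (in Python, using only the standard library, not Raptor's own code): character data going into an XML literal needs &, < and > replaced by entity references before it is printed. A minimal sketch; the sample string is the one from Jason's report:

```python
from xml.sax.saxutils import escape

# Character data as it appears after parsing, with the references resolved.
text = "& > <"

# escape() replaces &, < and > with &amp;, &lt; and &gt;.
escaped = escape(text)
print(escaped)
```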

Not sure what the easiest fix is for you here.  I guess you have to
look at each triple ($statement) that comes through from the $parser
stream and fix lonely &, < and > without disturbing any of the
<tag>s.  Tricky.  
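
A rough sketch of that kind of fix-up, in Python rather than Perl and working on the literal's string value only. The function names are made up for illustration and nothing here is part of the Redland API. It is a heuristic: anything shaped like <name ...>, </name> or <name/> is assumed to be markup and left alone, an & that already starts an entity or character reference is left alone, and everything else is escaped. It cannot tell text that originally read "&lt;b&gt;" from a real <b> tag once the parser has unescaped it, and it does not touch attribute values.

```python
import re

# Heuristic: treat <name ...>, </name> and <name/> as markup to preserve.
TAG_RE = re.compile(r"(</?[A-Za-z_][\w:.-]*(?:\s[^<>]*)?/?>)")

# An ampersand that does not start an entity or character reference.
LONELY_AMP_RE = re.compile(r"&(?![A-Za-z_][\w.-]*;|#[0-9]+;|#x[0-9A-Fa-f]+;)")


def escape_text(text):
    """Escape lonely &, < and > in a run of character data."""
    text = LONELY_AMP_RE.sub("&amp;", text)
    return text.replace("<", "&lt;").replace(">", "&gt;")


def fix_xml_literal(literal):
    """Escape stray special characters without disturbing the tags."""
    # With one capturing group, re.split puts the matched tags at the
    # odd indexes and the text between them at the even indexes.
    parts = TAG_RE.split(literal)
    return "".join(
        part if index % 2 else escape_text(part)
        for index, part in enumerate(parts)
    )


print(fix_xml_literal("<b>AT&T</b> & > <"))
```

You would run something like fix_xml_literal() over the value of each XML literal object before handing the statement to the serializer.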

Or you could try to modify the prettier RDF/XML serializer code in
Perl that is half-written and in the current CVS at:

  http://cvs.ilrt.org/cvsweb/redland/librdf/perl/serialize.pl

I was using it for prototyping a better output.  It isn't supported
and is incomplete, but it mostly works.  It might fix your immediate
problem while I fix the parser.

Dave

<snip/>


