[redland-dev] raptor rdf/xml parsing and encoding

Sebastian Trüg strueg at mandriva.com
Fri May 30 09:38:19 BST 2008


On Friday 30 May 2008 01:43:00 Dave Beckett wrote:
> Sebastian Trüg wrote:
> > The raptor API says that all strings (URIs and literals) are utf8.
> > However, when parsing a file with encoding UTF-8 or encoding ISO8859-1
> > containing a literal with a German umlaut, I do not get utf8 in either
> > case.
>
> Can you file a bug and attach that file (or something minimal that
> demonstrates it)?

I did not want to do that before being sure that the problem is not of my own
making.

> > - Does raptor ALWAYS produce utf8 strings?
>
> Yes.
>
> > - Is the following code acceptable:
> >
> > void raptorTriplesHandler( void* userData, const raptor_statement* triple
> > ) {
> >    [...]
> >    switch( triple->object_type ) {
> >    case RAPTOR_IDENTIFIER_TYPE_LITERAL:
> >        fromUtf8( (const char*)triple->object );
> >    [...]
> >    }
> >    [...]
>
> I don't know what that does, but every raptor (& redland) literal string
> and URI string are all UTF-8.  Everywhere you see unsigned char*,
> basically.

Think of "fromUtf8" as some function that takes UTF-8 data. The question is
just whether I handle the raptor_statement parameter correctly. The raptor
documentation is pretty thin, and the examples do not help either.
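
To rule out a mistake on my side, a check along the following lines could tell
whether the bytes arriving in the handler are well-formed UTF-8 at all. This is
only a sketch under my own assumptions: is_valid_utf8 is a hypothetical helper,
not part of raptor, and it only inspects the raw bytes.

```c
#include <stddef.h>

/* Hypothetical helper, not part of raptor.
 * Returns 1 if the first len bytes of s are well-formed UTF-8, else 0.
 * Rejects overlong encodings, UTF-16 surrogates and code points
 * above U+10FFFF. */
static int is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t n;          /* number of continuation bytes expected */
        unsigned long cp;  /* code point decoded so far */
        unsigned long min; /* smallest code point allowed for this length */
        size_t k;

        if (c < 0x80)                { n = 0; cp = c;        min = 0; }
        else if ((c & 0xE0) == 0xC0) { n = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { n = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { n = 3; cp = c & 0x07; min = 0x10000; }
        else return 0; /* stray continuation byte or invalid lead byte */

        if (n > len - i - 1)
            return 0; /* sequence is cut off at the end of the buffer */

        for (k = 1; k <= n; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return 0; /* expected a 10xxxxxx continuation byte */
            cp = (cp << 6) | (unsigned long)(s[i + k] & 0x3F);
        }

        if (cp < min || cp > 0x10FFFFUL || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0; /* overlong form, out of range, or surrogate */

        i += n + 1;
    }
    return 1;
}
```

In the handler this would be called on the literal before it is passed on, for
example is_valid_utf8((const unsigned char*)triple->object,
strlen((const char*)triple->object)). An "ü" should show up as the two bytes
0xC3 0xBC; a lone 0xFC byte would mean the ISO8859-1 data came through
unconverted.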

For example, the documentation says "Representation of RDF triples inside
Raptor. They are a sequence of three raptor_identifier".
But that does not seem to be true, since raptor_statement does not use
raptor_identifier. So I am confused.
Until I am sure that I am using the API correctly, I don't want to dig into
the problem.

Cheers,
Sebastian

