[redland-dev] size of bdb database

Sébastien Pierre sebastien-lists at type-z.org
Fri Apr 7 12:55:21 BST 2006


On 06-04-06 at 23:55, Dave Beckett wrote:

> That's way more testing than I've done on how it scales.  I'm guessing
> the overhead is due to these factors:
> * the bdb backend stores indexes of an entire statement 3 times
>   (+contexts)
> * the entire parts of the statement are stored including URIs, not
> pointers to short identifiers
> * the statement varies in size
>
> The assumption was it was better to have less I/O requests than to do
> lots of read/writes to intern URIs.  So it's a disk space vs time thing.
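
To put rough numbers on that disk space vs time trade-off, here is a
back-of-the-envelope sketch in Python. It is not Redland's on-disk
format: the function names, the 4-byte id size, and the example.org data
are all assumptions made up for illustration. It only compares "every
index holds the full statement text" with "each term stored once, indexes
hold short ids".

```python
# Rough size model only; this does NOT reproduce Redland's storage layout.

def full_copy_bytes(statements, indexes=3):
    """Every index stores the complete text of each statement."""
    return indexes * sum(len(s) + len(p) + len(o) for s, p, o in statements)


def interned_bytes(statements, indexes=3, id_size=4):
    """Each distinct term is stored once; indexes hold fixed-size ids.

    id_size=4 is an assumed identifier width, not a Redland constant.
    """
    terms = {term for statement in statements for term in statement}
    dictionary = sum(len(term) + id_size for term in terms)
    return dictionary + indexes * len(statements) * 3 * id_size


# Hypothetical data: 1000 subjects, each with the same 5 predicates.
statements = [
    (
        f"http://example.org/object/{i}",
        f"http://example.org/schema#attr{j}",
        f"value-{i}-{j}",
    )
    for i in range(1000)
    for j in range(5)
]

if __name__ == "__main__":
    print("full copies:", full_copy_bytes(statements), "bytes")
    print("interned:   ", interned_bytes(statements), "bytes")
```

The interned layout is smaller here because the long subject and
predicate URIs repeat, but every insert then needs a dictionary lookup,
which is the extra I/O mentioned above.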

The same seems to hold for time. I wrote a simple Python class that
serializes its instance attributes to RDF statements
(Model.add_statement) and generated 10, 100, 1000, 10000, etc.
objects. From what I've seen, I can barely create 300 objects per
second, which is very low for the fast machine I was testing on.
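
A minimal sketch of that kind of test, not the actual script: the
Person class, the example.org URIs, and the ListModel stand-in are
hypothetical. ListModel only mimics the add_statement() call so the
harness runs without Redland installed; to measure a real backend you
would pass in a model backed by the bdb, sqlite, or MySQL storage
instead.

```python
import time


class ListModel:
    """Stand-in for a Redland model; it only mimics add_statement()."""

    def __init__(self):
        self.statements = []

    def add_statement(self, statement):
        self.statements.append(statement)


class Person:
    """Hypothetical object whose instance attributes become statements."""

    def __init__(self, ident, name, age):
        self.ident = ident
        self.name = name
        self.age = age

    def to_statements(self):
        # One (subject, predicate, object) triple per instance attribute.
        subject = f"http://example.org/person/{self.ident}"
        for attr, value in vars(self).items():
            yield (subject, f"http://example.org/schema#{attr}", str(value))


def objects_per_second(model, count):
    """Serialize `count` objects into `model` and return the rate."""
    start = time.perf_counter()
    for i in range(count):
        for statement in Person(i, f"name-{i}", 20 + i % 50).to_statements():
            model.add_statement(statement)
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    for count in (10, 100, 1000, 10000):
        rate = objects_per_second(ListModel(), count)
        print(f"{count:>6} objects: {rate:,.0f} objects/s")
```

Running the same loop against each backend gives directly comparable
objects-per-second figures, since only the model changes.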

> It might be interesting to compare with the sqlite backend which does
> intern URIs, and probably works better for this size of data.  I'm
> speculating...

I will re-run the tests with the sqlite backend and see if there is a
difference (I will also do so for the MySQL backend).

Cheers,

  -- Sébastien


