Although my supervisor strongly recommends using Jena for RDF-related work, I really don’t like Java (just a personal preference) and didn’t want to install a JRE/JVM (whatever it is called) on my shared server account, so I went looking for an alternative. After spending some time searching, I found a library called Redland that provides bindings for my current favorite language, PHP, so I decided to use it for my RDF work.
Initially I used a purely relational database approach (PostgreSQL, to be exact) to store the information collected from Flickr, but it didn’t really scale (or I simply suck at designing and maintaining database tables), and my queries were repeatedly killed by the web host. Besides that, as the tables grew larger, the resources consumed to serve a query also grew, so I needed to find another way to store the data. At first I thought of using one of the popular NoSQL solutions, but my supervisor told me to turn the data into RDF instead. So I went with Redland and used PostgreSQL (again) as the storage.
However, for some reason PostgreSQL doesn’t seem to work efficiently as a Redland store: a simple query can take tens of minutes to run once it holds more than 100k RDF statements. The hash storage doesn’t work on my Ubuntu development VM either, so I switched the storage engine to MySQL at @dajobe’s suggestion.
Because the PHP binding is apparently not that popular, there aren’t many tutorials or much information about it online. I wanted to write a collection of scripts to collect data from Flickr for my research project, so I began by looking for an existing script for that task. I didn’t find any good ones in PHP, so I ported this from Perl to PHP using Redland. Then, once I knew how it really worked, I rewrote everything from scratch (I will put the scripts up on Bitbucket when I have time to clean them up). I even wrapped the library in class methods, which I hope to release later (I only learned about another OO wrapper for Redland in PHP when I was almost done writing my scripts).
To use Redland’s PHP functions without an OO wrapper, we always start with a statement that builds a world. I am not very sure what this means (yes, I didn’t read the documentation that thoroughly), but it seems that almost all constructor functions depend on it to create new objects (yes, Redland has an OO feel even though all the function calls are procedural). So, in PHP, the statement looks like:
$world = librdf_new_world();
Then you need to decide where to store your RDF statements. For this note, I will just use the non-persistent memory store. The function call to build a storage object is
$storage = librdf_new_storage($world, 'memory', $name, $options);
$name holds the name of the storage object, and $options often carries the DSN, but in our case it is NULL. Now that we have the storage defined, we proceed to build a model that actually stores RDF statements in it (it is easiest to think of the model as a database library and the storage as an abstraction layer).
$model = librdf_new_model($world, $storage, NULL);
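For a persistent store such as the MySQL one I ended up with, my understanding is that $name becomes the model name inside the database and $options becomes a comma-separated list of key='value' pairs. Here is a rough sketch of that. The helper names mysql_storage_options() and open_mysql_model() are made up for illustration, every connection value is a placeholder, and the option keys (host, database, user, password) are the ones I believe the MySQL store accepts, so check them against the storage documentation for your Redland version.

```php
<?php
// Hypothetical helper: builds the options string for a MySQL-backed storage.
// Values are inserted as-is, with no escaping, so only pass trusted strings.
function mysql_storage_options($host, $database, $user, $password)
{
    return sprintf(
        "host='%s',database='%s',user='%s',password='%s'",
        $host,
        $database,
        $user,
        $password
    );
}

// Hypothetical helper: opens a model named $name on top of the MySQL store.
// Assumes the Redland PHP bindings are loaded and were built with MySQL support.
function open_mysql_model($world, $name, $options)
{
    $storage = librdf_new_storage($world, 'mysql', $name, $options);
    if (!$storage) {
        throw new RuntimeException('Could not open the MySQL storage');
    }
    return librdf_new_model($world, $storage, NULL);
}
```

With something like this in place, switching between the memory store and MySQL should only change how the storage object is built; the rest of the calls in this note take the model and stay the same.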
Statements consist of nodes, so let’s create some. Creating a URI node is as easy as
$foo = librdf_new_node_from_uri_string($world, 'urn:foo');
$bar = librdf_new_node_from_uri_string($world, 'urn:bar');
Creating a literal node is just as simple:
$baz = librdf_new_node_from_literal($world, 'baz', NULL, FALSE);
Putting them into a statement
$statement = librdf_new_statement_from_nodes($world, $foo, $bar, $baz);
which is equivalent to this
<urn:foo> <urn:bar> 'baz' .
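Building a statement does not put it into the model by itself; as far as I can tell it still has to be added with librdf_model_add_statement(). Here is a minimal sketch that repeats the steps so far and then adds the statement. build_example_model() is just a name I picked for illustration, and the return-value check assumes the PHP binding mirrors the C API, where 0 means success.

```php
<?php
// Hypothetical helper: builds an in-memory model holding <urn:foo> <urn:bar> 'baz'.
// Assumes the Redland PHP bindings are loaded.
function build_example_model($world)
{
    $storage = librdf_new_storage($world, 'memory', 'example', NULL);
    $model = librdf_new_model($world, $storage, NULL);

    $foo = librdf_new_node_from_uri_string($world, 'urn:foo');
    $bar = librdf_new_node_from_uri_string($world, 'urn:bar');
    $baz = librdf_new_node_from_literal($world, 'baz', NULL, FALSE);
    $statement = librdf_new_statement_from_nodes($world, $foo, $bar, $baz);

    // Assumption: as in the C API, a non-zero return value signals failure.
    if (librdf_model_add_statement($model, $statement) != 0) {
        throw new RuntimeException('Could not add the statement to the model');
    }

    return $model;
}
```

Calling librdf_model_size() on the returned model should then report one statement, which is a quick way to check that the add actually happened.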
To run a query against the model, simply pass a SPARQL query string as follows
$query = librdf_new_query($world, 'sparql', NULL, <<<SPARQL
SELECT ?object WHERE { <urn:foo> <urn:bar> ?object . }
SPARQL
, NULL);
Running the query and getting the result
$result = librdf_model_query_execute($model, $query);
var_dump(librdf_query_results_to_string2($result, 'json', 'application/json', NULL, NULL));
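Dumping everything as a JSON string is handy for debugging, but most of the time I want the bound values one row at a time. The sketch below walks the result set instead. collect_binding() is a hypothetical helper of my own, not part of Redland. It runs its own query, because I would expect a result object that has already been serialized to a string to be used up. I also have not pinned down the exact string format that librdf_node_to_string() returns for literals, so treat the collected values as display strings.

```php
<?php
// Hypothetical helper: returns the value bound to $name in every result row.
// Assumes the Redland PHP bindings are loaded.
function collect_binding($world, $model, $sparql, $name)
{
    // The last argument is the base URI; NULL means none.
    $query = librdf_new_query($world, 'sparql', NULL, $sparql, NULL);
    $results = librdf_model_query_execute($model, $query);

    $values = array();
    if ($results) {
        // Step through the rows until the result set reports it is finished.
        while (!librdf_query_results_finished($results)) {
            $node = librdf_query_results_get_binding_value_by_name($results, $name);
            if ($node) {
                $values[] = librdf_node_to_string($node);
            }
            librdf_query_results_next($results);
        }
        librdf_free_query_results($results);
    }
    librdf_free_query($query);

    return $values;
}
```

For the model in this note, collect_binding($world, $model, 'SELECT ?object WHERE { <urn:foo> <urn:bar> ?object . }', 'object') should return a single entry for the literal, provided the statement was added to the model first.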