<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>web(cslai) &#187; Semantic Web</title>
	<atom:link href="http://cslai.coolsilon.com/category/semantic-web/feed/" rel="self" type="application/rss+xml" />
	<link>http://cslai.coolsilon.com</link>
	<description>Findings and Notes in Web Development</description>
	<lastBuildDate>Tue, 06 Sep 2011 08:15:31 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Redland for RDF Work</title>
		<link>http://cslai.coolsilon.com/2011/05/13/redland-for-rdf-work/</link>
		<comments>http://cslai.coolsilon.com/2011/05/13/redland-for-rdf-work/#comments</comments>
		<pubDate>Fri, 13 May 2011 06:47:20 +0000</pubDate>
		<dc:creator>Jeffrey04</dc:creator>
				<category><![CDATA[PHP]]></category>
		<category><![CDATA[Semantic Web]]></category>

		<guid isPermaLink="false">http://cslai.coolsilon.com/?p=226</guid>
		<description><![CDATA[My supervisor strongly recommends Jena for RDF-related work, but I really don&#8217;t like Java (just personal preference) and didn&#8217;t want to install a JRE/JVM (whatever it is called) on my shared server account, so I went looking for an alternative. After spending some time searching, I found a library called Redland [...]]]></description>
			<content:encoded><![CDATA[<p>My supervisor strongly recommends Jena for RDF-related work, but I really don&#8217;t like Java (just personal preference) and didn&#8217;t want to install a JRE/JVM (whatever it is called) on my shared server account, so I went looking for an alternative. After spending some time searching, I found a library called <a href="http://librdf.org/">Redland</a>, which provides bindings for my current favorite language, PHP, so I decided to use it for my RDF work.</p>
<p><span id="more-226"></span></p>
<p>I initially used a purely relational approach (PostgreSQL, to be exact) to store the information collected from Flickr, but it didn&#8217;t really scale (or I simply suck at designing and maintaining database tables), and my queries were killed by the web host numerous times. Besides that, as the tables grew larger, the resources consumed to serve a query also grew, so I needed to find another way to store the data. At first I thought of using one of the popular NoSQL solutions, but my supervisor told me to turn the data into RDF instead. So I went with Redland, using PostgreSQL (again) as storage.</p>
<p>However, for some reason the PostgreSQL storage didn&#8217;t seem to work efficiently: a simple query could take tens of minutes to run once the store held more than 100k RDF statements. The <a href="http://stackoverflow.com/questions/5882707/getting-statements-from-redland-hash-storage">hash storage doesn&#8217;t work</a> on my Ubuntu development VM either, so I switched the storage engine to MySQL following <a href="https://twitter.com/#!/dajobe/status/67969726550253568">@dajobe&#8217;s suggestion</a>.</p>
<p>The PHP binding is apparently not that popular, so there isn&#8217;t much information or many tutorials around. I wanted to write a collection of scripts to collect data from <a href="http://flickr.com/">Flickr</a> for my research project, so I began by looking for an existing script for the task. I didn&#8217;t find a good one in PHP, so I ported <a href="https://github.com/straup/p5-Net-Flickr-RDF">this one from Perl</a> to PHP with Redland. Then, once I understood how it really works, I rewrote everything from scratch (I will put the scripts up on Bitbucket when I have time to clean them up). I even wrapped the library in classes, which I hope to release later (I only found out about another <a href="http://blog.literarymachine.net/?p=5">OO wrapper in PHP</a> for Redland when I had almost finished writing my scripts).</p>
<p>To use Redland&#8217;s PHP functions without an OO wrapper, we always start with a statement that builds a world. I am not very sure what this means (yes, I didn&#8217;t read the <a href="http://librdf.org/docs/api/index.html">documentation</a> that thoroughly), but almost all constructor functions depend on it to create new objects (yes, Redland has an OO feel even though all the function calls are in procedural style). In PHP, the statement looks like this:</p>
<pre class="php"><code>$world = librdf_new_world();</code></pre>
<p>Then you need to decide <a href="http://librdf.org/docs/api/redland-storage-modules.html">where to store your RDF statements</a>. For this note, I will just use the non-persistent memory store. The function call to build a storage object is</p>
<pre class="php"><code>$storage = librdf_new_storage($world, 'memory', $name, $options);</code></pre>
<p>where <code>$name</code> holds the name of the storage object, and <code>$options</code> often carries the <abbr title="Data Source Name">DSN</abbr>, but in our case it is NULL. Now that the storage is defined, we proceed to build a model, which actually stores RDF statements in the storage (it is easiest to think of the model as a database library and the storage as an abstraction layer).</p>
<pre class="php"><code>$model = librdf_new_model($world, $storage, NULL);</code></pre>
<p>Statements consist of nodes, so let&#8217;s create some. Creating a URI node is as easy as</p>
<pre class="php"><code>$foo = librdf_new_node_from_uri_string($world, 'urn:foo');
$bar = librdf_new_node_from_uri_string($world, 'urn:bar');</code></pre>
<p>Creating a literal node is just as simple</p>
<pre class="php"><code>$baz = librdf_new_node_from_literal($world, 'baz', NULL, FALSE);</code></pre>
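<p>Putting them into a statement and adding it to the model</p>
<pre class="php"><code>$statement = librdf_new_statement_from_nodes($world, $foo, $bar, $baz);
// Add the statement to the model; otherwise the query below has nothing to match.
librdf_model_add_statement($model, $statement);</code></pre>
<p>which is equivalent to this triple</p>
<pre><code>&lt;urn:foo&gt; &lt;urn:bar&gt; "baz" .</code></pre>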
<p>To query the model, simply build a SPARQL query as follows</p>
<pre class="php"><code>
$query = librdf_new_query(
    $world,
    'sparql',
    NULL,
&lt;&lt;&lt;SPARQL
SELECT  ?subject ?object
WHERE   {
            ?subject &lt;urn:bar&gt; ?object .
        }
SPARQL
);
</code></pre>
<p>Running the query and getting the result</p>
<pre><code>$result = librdf_model_query_execute($model, $query);
var_dump(librdf_query_results_to_string2($result, 'json', 'application/json', NULL, NULL));</code></pre>
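<p>To make it clearer what the pattern in the query above matches, here is a minimal sketch in plain Python. It is not Redland and says nothing about how Redland works internally; the <code>match</code> helper and the tuple-based store are made up for this illustration, with <code>None</code> playing the role of a SPARQL variable.</p>

```python
# A toy in-memory triple store: each statement is a
# (subject, predicate, object) tuple of plain strings.
TRIPLES = [
    ("urn:foo", "urn:bar", "baz"),
    ("urn:foo", "urn:other", "qux"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    pattern = (subject, predicate, obj)
    return [
        triple
        for triple in triples
        if all(want is None or want == got for want, got in zip(pattern, triple))
    ]


# Rough analogue of the query: fix the predicate to urn:bar and
# select the subject and object of every matching statement.
rows = [(s, o) for s, _, o in match(TRIPLES, predicate="urn:bar")]
print(rows)  # [('urn:foo', 'baz')]
```

<p>A real store indexes its statements instead of scanning a list, which is presumably where the choice of storage backend starts to matter.</p>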
<p>Enjoy!</p>
<div style='clear:both'></div>]]></content:encoded>
			<wfw:commentRss>http://cslai.coolsilon.com/2011/05/13/redland-for-rdf-work/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Collaborative Tagging &amp; Folksonomy</title>
		<link>http://cslai.coolsilon.com/2011/02/17/collaborative-tagging-folksonomy/</link>
		<comments>http://cslai.coolsilon.com/2011/02/17/collaborative-tagging-folksonomy/#comments</comments>
		<pubDate>Thu, 17 Feb 2011 03:28:20 +0000</pubDate>
		<dc:creator>Jeffrey04</dc:creator>
				<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://cslai.coolsilon.com/?p=207</guid>
		<description><![CDATA[Folksonomy is a portmanteau of ‘folk’ and ‘taxonomy’ that describes conceptual structures created by users [4, 5]. A folksonomy is the unstructured, collaborative use of tags for content classification and knowledge representation, popularized by Web 2.0 and social applications [1, 5]. Unlike a taxonomy, which is commonly used to organize [...]]]></description>
			<content:encoded><![CDATA[<p>Folksonomy is a portmanteau of ‘folk’ and ‘taxonomy’ that describes conceptual structures created by users <sup>[<a href="#cite-4">4</a>, <a href="#cite-5">5</a>]</sup>. A folksonomy is the unstructured, collaborative use of tags for content classification and knowledge representation, popularized by Web 2.0 and social applications <sup>[<a href="#cite-1">1</a>, <a href="#cite-5">5</a>]</sup>. Unlike a taxonomy, which is commonly used to organize resources into a category hierarchy, a folksonomy is non-hierarchical and non-exclusive <sup>[<a href="#cite-3">3</a>]</sup>. A content hierarchy and a folksonomy can be used together to improve content classification.</p>
<p><span id="more-207"></span></p>
<p>Users of Web 2.0 applications typically organize the content they create with a set of terms or keywords, commonly known as tags <sup>[<a href="#cite-3">3</a>, <a href="#cite-4">4</a>, <a href="#cite-6">6</a>, <a href="#cite-7">7</a>]</sup>. Spending additional time and effort to annotate a resource with tags usually implies that the resource is relevant and/or important to the user <sup>[<a href="#cite-1">1</a>]</sup>. Therefore, besides being used for content classification, tag assignments can also be analyzed to find the implicit relationship between users and the tagged content.</p>
<p>It is also observed that the frequency of term usage is proportional to its importance or connection to the user, i.e. the more a tag is used, the more important it is to the user <sup>[<a href="#cite-1">1</a>, <a href="#cite-8">8</a>]</sup>. The choice of tags when annotating a resource often expresses the user&#8217;s interests and perceptions <sup>[<a href="#cite-8">8</a>]</sup>. Users usually have different ideas and interests when tagging a resource, especially when the application allows the same resource to be tagged by multiple users <sup>[<a href="#cite-2">2</a>]</sup>. Therefore, tagging activity patterns can be studied to find a user&#8217;s domain of interest.</p>
<p>Users with common interests tend to use similar sets of tags when tagging resources of interest <sup>[<a href="#cite-2">2</a>]</sup>. The similarity between users can then be calculated from their tagging patterns for recommendation purposes, as it reveals social relationships between users <sup>[<a href="#cite-4">4</a>, <a href="#cite-7">7</a>]</sup>. This may allow more collaboration between users who share similar interests.</p>
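<p>As a rough illustration of comparing users by their tagging patterns, the sketch below uses Jaccard similarity over the sets of tags each user has used. Jaccard is chosen here only because it is simple; the cited papers may well use other measures, and the users and tags are invented for the example.</p>

```python
def jaccard(tags_a, tags_b):
    """Jaccard similarity: size of the intersection over size of the union."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical users and the tags they have used.
user_tags = {
    "alice": {"rdf", "sparql", "php"},
    "bob": {"rdf", "sparql", "java"},
    "carol": {"cooking", "travel"},
}

print(jaccard(user_tags["alice"], user_tags["bob"]))    # 0.5
print(jaccard(user_tags["alice"], user_tags["carol"]))  # 0.0
```

<p>Alice and Bob share two of the four distinct tags they use between them, so they score 0.5, while Alice and Carol have nothing in common.</p>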
<p>Although a varied selection of tags helps represent a resource better, it may also cause problems, as users tend not to agree with each other <sup>[<a href="#cite-2">2</a>]</sup>. Observations also show that tagging behaviour follows a power-law distribution, where tags are mostly assigned to a small subset of more popular resources <sup>[<a href="#cite-2">2</a>, <a href="#cite-7">7</a>]</sup>. Hence, when a resource gets popular, users with different interests will assign tags with keywords from many different domains. However, when different but semantically related tags are assigned to related resources (the vocabulary mismatch problem), the measured level of relevance may drop drastically even though the tags share a similar meaning <sup>[<a href="#cite-3">3</a>, <a href="#cite-7">7</a>]</sup>.</p>
<p>Correlation between resources is often established when they are assigned common tags <sup>[<a href="#cite-8">8</a>]</sup>. The level of relevance can then be observed by studying the similarity of their sets of assigned tags <sup>[<a href="#cite-8">8</a>]</sup>. The correlation and similarity between the tags assigned to resources may also help users find unseen resources by browsing the tags.</p>
<p>The assigned tags can be used as a tool to enable discovery, sharing and collaboration of web resources <sup>[<a href="#cite-1">1</a>, <a href="#cite-2">2</a>, <a href="#cite-3">3</a>, <a href="#cite-8">8</a>]</sup>. Some applications even allow sharing of tagging information <sup>[<a href="#cite-2">2</a>, <a href="#cite-8">8</a>]</sup>. Hence, besides using tags as a tool for content classification and user interest study, they can be used to make collaboration easier.</p>
<p>A resource may be described by its assigned tags, depending on the tagging behaviour <sup>[<a href="#cite-1">1</a>]</sup>. When a tagging system follows the bag-model, i.e. it allows multiple users to tag the same resource, the frequency with which a tag is assigned indirectly indicates how relevant the tag is to the resource <sup>[<a href="#cite-1">1</a>, <a href="#cite-2">2</a>, <a href="#cite-8">8</a>]</sup>. Weights can also be calculated for each tag to rank its relevance to the resource.</p>
<p>On the other hand, systems that implement the set-model, where each tag is attached to a resource at most once, do not provide enough data to deduce the importance or popularity of an assigned tag for the resource <sup>[<a href="#cite-8">8</a>]</sup>. Due to the unstructured nature of collaborative tagging, the system can easily be abused if users assign tags that do not describe the content <sup>[<a href="#cite-1">1</a>, <a href="#cite-8">8</a>]</sup>. The problem is more apparent in systems implementing the set-model, as there is no way to rank the assigned tags.</p>
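<p>The difference between the two models can be sketched in a few lines of Python. The weighting used here (each tag&#8217;s share of all assignments on the resource) is just one simple possibility chosen for illustration, not a scheme taken from the cited papers, and the assignment data is invented.</p>

```python
from collections import Counter

# Hypothetical bag-model data: (user, tag) assignments for a single resource.
assignments = [
    ("alice", "rdf"),
    ("bob", "rdf"),
    ("carol", "rdf"),
    ("alice", "tutorial"),
    ("bob", "php"),
]


def tag_weights(assignments):
    """Weight each tag by its share of all assignments on the resource."""
    counts = Counter(tag for _, tag in assignments)
    total = sum(counts.values())
    return {tag: count / total for tag, count in counts.items()}


weights = tag_weights(assignments)
# Rank by descending weight, breaking ties alphabetically.
ranked = sorted(weights, key=lambda tag: (-weights[tag], tag))
print(ranked)          # ['rdf', 'php', 'tutorial']
print(weights["rdf"])  # 0.6

# Under the set-model each tag appears at most once,
# so every tag ends up with the same weight and nothing can be ranked.
set_model = tag_weights([("owner", tag) for tag in ("rdf", "tutorial", "php")])
print(len(set(set_model.values())))  # 1
```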
<h2>References</h2>
<ol>
<li id="cite-1">I. Cantador, A. Bellogín, and D. Vallet. Content-based Recommendation in Social Tagging Systems. Methodology, pages 237–240, 2010.</li>
<li id="cite-2">Y. Guo and J. B. Joshi. Topic-based personalized recommendation for collaborative tagging system. Number 10. ACM Press, New York, New York, USA, 2010.</li>
<li id="cite-3">R. Lambiotte and M. Ausloos. Collaborative tagging as a tripartite network, 2005.</li>
<li id="cite-4">P. Mika. Ontologies are us: A unified model of social networks and semantics. Web Semantics Science Services and Agents on the World Wide Web, 5(1):5–15, 2007.</li>
<li id="cite-5">C. Schmitz, A. Hotho, R. Jäschke, and G. Stumme. Mining Association Rules in Folksonomies, pages 261–270. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, 2006.</li>
<li id="cite-6">N. Shadbolt, T. Berners-Lee, and W. Hall. The Semantic Web Revisited. IEEE Intelligent Systems, 21(3):96–101, May 2006.</li>
<li id="cite-7">S. Siersdorfer and S. Sizov. Social recommender systems for web 2.0 folksonomies. ACM Press, New York, New York, USA, 2009.</li>
<li id="cite-8">M. Szomszor, C. Cattuto, H. Alani, K. O&#8217;Hara, A. Baldassarri, V. Loreto, and V. D. P. Servedio. Folksonomies, the Semantic Web, and Movie Recommendation. eprints.ecs.soton.ac.uk, pages 71–84, 2007.</li>
</ol>
<div class="postscript">
I initially wrote this for my report, but thought it might make a good blog post, so I am posting it here.
</div>
<div style='clear:both'></div>]]></content:encoded>
			<wfw:commentRss>http://cslai.coolsilon.com/2011/02/17/collaborative-tagging-folksonomy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Drafting my first progress report</title>
		<link>http://cslai.coolsilon.com/2011/01/09/drafting-my-first-progress-report/</link>
		<comments>http://cslai.coolsilon.com/2011/01/09/drafting-my-first-progress-report/#comments</comments>
		<pubDate>Sun, 09 Jan 2011 06:32:21 +0000</pubDate>
		<dc:creator>Jeffrey04</dc:creator>
				<category><![CDATA[Maintenance]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[JSON]]></category>
		<category><![CDATA[RDF]]></category>

		<guid isPermaLink="false">http://cslai.coolsilon.com/?p=199</guid>
		<description><![CDATA[Just managed to migrate all my blog sites to one centralized multi-site installation, so no more half-baked solution, and hopefully this brings better plugin compatibility. I have not checked with other related services (like Google Webmaster Tools) whether this causes any breakage, though. Well, the main purpose of this blog post is actually a draft of [...]]]></description>
			<content:encoded><![CDATA[<p>Just managed to <a href="http://www.clausconrad.com/blog/migrating-a-bunch-of-wordpress-blogs-to-a-single-wordpress-3-multi-site-installation">migrate</a> all my blog sites to one centralized multi-site installation, so no more <del datetime="2011-01-09T05:36:05+00:00">half-baked</del> solution, and hopefully this brings better plugin compatibility. I have not checked with other related services (like <a href="https://www.google.com/webmasters/tools/">Google Webmaster Tools</a>) whether this causes any breakage, though. Well, the main purpose of this blog post is actually a draft of what I did over the past two months for my postgraduate programme. Yeah, I should have posted more to this blog (I just realized that my last post here was about half a year ago).</p>
<p><span id="more-199"></span></p>
<p>Long story short, I have been working on my research project for the past two months. During that time, I spent a lot of time trying to read and understand some of the related topics in the semantic web, information retrieval and machine learning, as suggested by my supervisor.</p>
<p>I started by revisiting <a href="http://cslai.coolsilon.com/2010/06/04/resource-definition-framework/">RDF</a>, putting more emphasis on how it is usually implemented. Yeah, I know I did some of this when I was preparing my research proposal, but I didn&#8217;t go into detail and skipped the section describing RDF/XML altogether.</p>
<p>Recently there has been some debate on whether JSON should be used as another RDF serialization format. I personally like JSON for its simplicity (especially compared to XML), but I find that RDF/XML actually makes sense (read: good enough), because RDF depends heavily on URIs and it would probably be a bit awkward to see a lot of URIs popping up in JSON. However, I am still very interested to see how this gets sorted out, as the <a href="http://www.w3.org/QA/2010/12/new_rdf_working_group_rdfjson.html">W3C initiated a group</a> to work on it. (Some thought-provoking reading on this by <a href="http://webr3.org/blog/linked-data/opening-linked-data/">nathan@webr3</a>.)</p>
<p>I have also read about RDFS for the first time, and the more I read, the more I find it related to my classes on AI (especially Prolog) during my bachelor&#8217;s degree. However, I am also slightly surprised to see how tolerant it is (leaving too much room for error?). Besides RDFS, I also read about OWL and found even more similarity with Prolog (I really should post a summary of the readings here; there is too much to blog about, but I lack the time and motivation).</p>
<p>What really surprised me was that, for the last task in my previous job, I had actually tried to re-invent the combination of these technologies (RDF, RDFS and OWL) over and over again. RDF may not fit that problem exactly, but I feel it is close enough to solve it.</p>
<p>While looking for articles on the Linked Data principles, I came across this <a href="http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html">TED talk</a> by <a href="http://www.w3.org/People/Berners-Lee/">Sir Tim Berners-Lee</a>. Yeah, I do feel sorry for not publishing enough semantic data online after watching it; the talk was very motivating indeed (but is posting machine-readable personal information without limit actually a good idea? I don&#8217;t feel like injecting a great portion of my life into a social networking site, but that&#8217;s a different story).</p>
<p>I also read a beginner&#8217;s guide to SPARQL, and this was the real wow factor for me. I don&#8217;t really like the SQL-like syntax, but the similarity to Prolog makes everything look cool now. Yeah, I am pretty new to this, so I keep relating everything to Prolog XD.</p>
<p>Then I started to read about information retrieval, and this is when everything suddenly stopped seeming cool. However, it did force me to do a lot of revision on statistics and some basic maths (you would be surprised how much I forgot after working for 2 years). And it wasn&#8217;t encouraging when I began watching video lectures on machine learning, hoping to pick things up in a shorter time span, because I found myself not understanding most of the lectures XD.</p>
<p>Yeah, that basically wraps up my progress for the past 2 months: getting excited about new stuff in the first month, and getting frustrated in the second month at not being able to understand the material.</p>
<div style='clear:both'></div>]]></content:encoded>
			<wfw:commentRss>http://cslai.coolsilon.com/2011/01/09/drafting-my-first-progress-report/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Linked Data Principle</title>
		<link>http://cslai.coolsilon.com/2010/06/10/linked-data-principle/</link>
		<comments>http://cslai.coolsilon.com/2010/06/10/linked-data-principle/#comments</comments>
		<pubDate>Thu, 10 Jun 2010 08:35:40 +0000</pubDate>
		<dc:creator>Jeffrey04</dc:creator>
				<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[URI]]></category>

		<guid isPermaLink="false">http://cslai.coolsilon.com/?p=178</guid>
		<description><![CDATA[The Semantic Web is not just about putting data on the web, but also about making links so that a person as well as a machine can explore the web of data. Links in the web of data connect arbitrary things, as described by RDF, as opposed to links in the web of hypertext, [...]]]></description>
			<content:encoded><![CDATA[<p>The Semantic Web is not just about putting data on the web, but also about making links so that a person as well as a machine can explore the web of data. Links in the web of data connect arbitrary things, as described by RDF, as opposed to links in the web of hypertext, where links connect only web resources. Linking arbitrary things then allows related things to be found while performing a search.</p>
<p><span id="more-178"></span></p>
<p>Linked Data is also a principle of linking data between systems and entities, allowing rich, self-describing inter-relations between the data available on the web across the globe. The Web of Data also marks a shift from publishing data in human-readable HTML documents to machine-readable documents that allow machines to make inferences from the published data.</p>
<p>To link entities together efficiently, Sir Tim Berners-Lee proposed four rules or expectations, as follows,</p>
<ul>
<li>Use <acronym title="Uniform Resource Identifier">URIs</acronym> as names for things</li>
<li>Use <acronym title="Hyper-Text Transfer Protocol">HTTP</acronym> URIs so that people can look up these names</li>
<li>Return useful information using standard technologies / formats like RDF / SPARQL when someone looks up a URI</li>
<li>Allow users to discover more things by including links to other URIs.</li>
</ul>
<p>However, it is important to know that breaking the rules is not necessarily destructive. It only reduces inter-connectivity, which in turn discourages re-use and so makes the resources less valuable.</p>
<p>To use URIs as names for things, the universal URI set of symbols should always be used, so that other parties can process the data and arrive at consistent results. This also reduces the risk of losing meaning in the process.</p>
<p>It is also important to actually serve information on the web for a given URI. This makes data as well as metadata accessible in standard formats such as RDF or OWL. Publishing information about a resource enables others, especially applications and machines, to properly understand it.</p>
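<p>One common way for a client to ask a URI for machine-readable data is to send an <code>Accept</code> header with the lookup. The sketch below only builds such a request with Python&#8217;s standard library and does not send it; the <code>example.org</code> URI and the choice of <code>application/rdf+xml</code> are assumptions for illustration, since what a real server returns depends entirely on how it is set up.</p>

```python
import urllib.request


def rdf_request(uri):
    """Build (but do not send) a GET request that asks for RDF/XML."""
    return urllib.request.Request(uri, headers={"Accept": "application/rdf+xml"})


# Hypothetical URI naming a thing.
request = rdf_request("http://example.org/resource/thing")
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))

# To actually look the URI up, the request would be passed to urlopen:
#     with urllib.request.urlopen(request) as response:
#         body = response.read()
```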
<p>To enable users to discover things, one way is to return both incoming and outgoing links to the user. By definition, a graph G is browsable if, for the URI of any node in G, the information returned when that URI is looked up describes the node and satisfies the following conditions:</p>
<ul>
<li>Returns statements where the node is either a subject or object</li>
<li>Describes all blank nodes attached to the node by one arc.</li>
</ul>
<p>In short, it allows data to be represented as a graph and traversed. It is important for the query service to return the RDF statements that involve the specified node, regardless of whether it is the subject or the object. Note that the subgraph returned is known as a minimum spanning graph (MSG) or an RDF molecule.</p>
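<p>The two conditions can be sketched as a small Python function over a list of triples. This is only one reading of the rules: blank nodes are assumed to be written with a <code>_:</code> prefix, and &#8220;describing&#8221; a blank node is taken to mean returning the statements that have it as subject. The URIs and the <code>describe</code> helper are invented for the example.</p>

```python
def is_blank(node):
    """Blank nodes are written with a '_:' prefix in this sketch."""
    return node.startswith("_:")


def describe(graph, uri):
    """Return the statements a browsable graph would serve for `uri`."""
    # Condition 1: every statement where the node is the subject or the object.
    result = [t for t in graph if uri in (t[0], t[2])]
    # Condition 2: describe blank nodes attached to the node by one arc.
    blanks = {n for s, _, o in result for n in (s, o) if is_blank(n)}
    for triple in graph:
        if triple[0] in blanks and triple not in result:
            result.append(triple)
    return result


STAFF = "http://example.org/staff/85740"  # hypothetical URI
graph = [
    (STAFF, "exterms:address", "_:addr"),
    ("_:addr", "exterms:city", "Bedford"),
    ("_:addr", "exterms:street", "1501 Grant Avenue"),
    ("http://example.org/dept/1", "exterms:employs", STAFF),
    ("http://example.org/staff/1", "exterms:name", "Someone Else"),
]

for statement in describe(graph, STAFF):
    print(statement)
```

<p>Looking up the staff URI returns four statements: the two that mention it directly, plus the two that describe its address blank node. The statement about the unrelated staff member is left out.</p>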
<p>However, if a statement relates multiple entities, the statement has to be repeated for each of those entities. This violates the rule that data should not be stored in more than one place, which exists mainly for consistency. It becomes less of a concern if the statements are generated automatically.</p>
<p>Besides that, there may be a situation where the author of document A claims that it relates to document B, but the author of B thinks otherwise. One of the reasons may be that document A did not exist when B was published.</p>
<p>Duplicated or expired data may also be undesirable at times, for example page visitor statistics within a site introduction document. One proposed way to solve this is to move the statistics statements out of the introduction document into a separate document.</p>
<p>In the end, links open up the web of data not only to human beings, but also to <acronym title="Artificial Intelligence">AI</acronym> processes, allowing them to make inferences about entities. Besides, it also encourages all parties to publish data freely in an open standard format.</p>
<p>Data summarized and greatly simplified from the following sources:</p>
<ul>
<li>Sir Tim Berners-Lee &#8211; <a href="http://www.w3.org/DesignIssues/LinkedData.html">Linked Data &#8211; Design Issues</a></li>
<li><a href="http://linkeddatatools.com/semantic-web-basics">Linked Data</a></li>
</ul>
<div style='clear:both'></div>]]></content:encoded>
			<wfw:commentRss>http://cslai.coolsilon.com/2010/06/10/linked-data-principle/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Resource Description Framework</title>
		<link>http://cslai.coolsilon.com/2010/06/04/resource-definition-framework/</link>
		<comments>http://cslai.coolsilon.com/2010/06/04/resource-definition-framework/#comments</comments>
		<pubDate>Fri, 04 Jun 2010 10:38:18 +0000</pubDate>
		<dc:creator>Jeffrey04</dc:creator>
				<category><![CDATA[Resource Description Framework]]></category>
		<category><![CDATA[RDF]]></category>
		<category><![CDATA[RDF Graph]]></category>
		<category><![CDATA[RDF Triple]]></category>
		<category><![CDATA[Semantic Network]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[URI]]></category>
		<category><![CDATA[URL]]></category>

		<guid isPermaLink="false">http://cslai.coolsilon.com/?p=168</guid>
		<description><![CDATA[As the name implies, the Resource Description Framework, or RDF for short, is a language for representing information about resources on the World Wide Web. The information that can be represented is mostly metadata like title (assuming the resource is a web page), author, last modified date etc. Besides representing resources that are network-accessible, it can be used to [...]]]></description>
			<content:encoded><![CDATA[<p>As the name implies, the Resource Description Framework, or RDF for short, is a language for representing information about resources on the World Wide Web. The information that can be represented is mostly metadata like title (assuming the resource is a web page), author, last modified date etc. Besides representing resources that are network-accessible, it can be used to represent things that cannot be accessed through the network, as long as they can be identified using a URI.</p>
<p><span id="more-168"></span></p>
<p>The main objective of RDF is to produce information that can be processed by applications, by defining a standardized approach to representing resources. Using a standardized language also enables information to be exchanged between applications without loss of meaning. Third-party applications can consume the information created, and because the format is standardized, tools are readily available to manipulate it.</p>
<p>As mentioned earlier, as long as a thing can be identified by a Uniform Resource Identifier (URI), it can be described by RDF. URIs are used to identify not only network-accessible things, but also things that are not network-accessible, such as an arbitrary human being, a corporation, or a book in a library, as well as abstract concepts that do not necessarily exist physically, like creator / author / modified date. URLs (Uniform Resource Locators) are a subset of URIs.</p>
<p>RDF is a simple language that deals only with binary relationships, each involving a subject, a predicate and an object. Given the example &#8220;web(cslai) is Jeffrey04&#8217;s blog&#8221;, we can restructure the statement into subject = <strong>web(cslai)</strong>, predicate = <strong>owner</strong>, object = <strong>Jeffrey04</strong>. Then we can put this into a graph (kinda reminds me of a Semantic Network), as follows (not an RDF graph):</p>
<img src="http://cslai.coolsilon.com/wp-content/tfo-graphviz/005ba1ec678785f105db96331df36ce2.png" class="graphviz" />
<p>From the graph, we can see that the relationship between a subject and object is described by the predicate. Or another way of saying, <strong>subject</strong> ( web(cslai) ) has a property in the form of <strong>predicate</strong> (owner) that has a value of <strong>object</strong> ( Jeffrey04 ). </p>
<p>Before constructing an RDF graph, it is important to know that, as RDF is used to provide information about a resource, there is a set of basic rules to follow when constructing an RDF statement. The subject must always be a URI or a blank node (discussed later) that denotes a resource, the predicate must always be a URI, and the object can be another resource, a blank node, or a constant represented by a character string (a literal).</p>
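<p>These positional rules are easy to express as a check. The sketch below is a toy model in Python, not part of any RDF library: the <code>Node</code> class, its three kinds, and the predicate URI are all made up for illustration.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    kind: str   # "uri", "blank" or "literal"
    value: str


def is_valid_statement(subject, predicate, obj):
    """Check which kinds of node appear in which position."""
    return (
        subject.kind in ("uri", "blank")
        and predicate.kind == "uri"
        and obj.kind in ("uri", "blank", "literal")
    )


blog = Node("uri", "http://cslai.coolsilon.com/")
owner = Node("uri", "http://example.org/terms/owner")  # hypothetical predicate URI
name = Node("literal", "Jeffrey04")

print(is_valid_statement(blog, owner, name))  # True
print(is_valid_statement(name, owner, blog))  # False: a literal cannot be a subject
```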
<p>Besides serializing an RDF graph into an XML file, the statements can also be written as triples. Each statement in a graph is written as a simple triple of subject, predicate and object, in exactly that order. Another point to note is that the graph is the primary way of representing statements, and any other representation is considered secondary.</p>
<p>The basic syntax of a triple requires a URI to be either enclosed in angle brackets or written as a QName (which kinda resembles the XML vocabulary/namespace thingy), and literals to be enclosed in double quotes. For example</p>
<p><code>&lt;http://cslai.coolsilon.com/&gt; csterms:owner "Jeffrey04".</code></p>
<p>A blank node is introduced when the object value is structured data. For example, take the following triple</p>
<p><code>exstaff:85740 exterms:address "1501 Grant Avenue, Bedford, Massachusetts 01730" .</code></p>
<p>The address can be broken up into its parts by giving it a node of its own, as in the following graph</p>
<img src="http://cslai.coolsilon.com/wp-content/tfo-graphviz/01bf64a8f3d55b53cd59a772f2668fea.png" class="graphviz" />
<p>Before discussing the graph, it is worth pointing out that each URI node is denoted by an ellipse, and literals denoted by a box. As seen on the graph, all the nodes are either a subject or an object while arcs are predicates. The respective RDF triples for the above graph are shown as follows:</p>
<pre><code>
exstaff:85740     exterms:address    exaddressid:85740 .
exaddressid:85740 exterms:street     "1501 Grant Avenue" .
exaddressid:85740 exterms:city       "Bedford" .
exaddressid:85740 exterms:state      "Massachusetts" .
exaddressid:85740 exterms:postalCode "01730" .
</code></pre>
<p>As seen in both the graph and the RDF triples, a new node with its own URI is created just to describe the concept of an address. To represent the same information without having to create a new URI, a blank node can be introduced, as follows</p>
<img src="http://cslai.coolsilon.com/wp-content/tfo-graphviz/4006e169b7558e7ad1cb9c673bf152de.png" class="graphviz" />
<p>In RDF triples form</p>
<pre><code>
exstaff:85740 exterms:address    ??? .
???           exterms:street     "1501 Grant Avenue" .
???           exterms:city       "Bedford" .
???           exterms:state      "Massachusetts" .
???           exterms:postalCode "01730" .
</code></pre>
<p>As seen from the graph, the address node changes from a node with a URI into a node without one, which is called a blank node. In the triples it is written as &#8216;???&#8217; instead of the complete URI shown in the earlier example. However, &#8216;???&#8217; becomes ambiguous when a graph contains many blank nodes representing different things, so each blank node can instead be given a local identifier of the form _:name. Reusing the same example, the triples can be rephrased as follows</p>
<pre><code>
exstaff:85740 exterms:address    _:johnaddress .
_:johnaddress exterms:street     "1501 Grant Avenue" .
_:johnaddress exterms:city       "Bedford" .
_:johnaddress exterms:state      "Massachusetts" .
_:johnaddress exterms:postalCode "01730" .
</code></pre>
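<p>To make the blank node idea a bit more concrete, here is a minimal Python sketch of the triples above. It is not how Redland or any real RDF library stores a graph; the tuple representation and the helper names <code>is_blank</code> and <code>describe</code> are my own assumptions for illustration, and literals are simplified to plain strings.</p>

```python
# Hypothetical in-memory graph: each statement is a (subject, predicate, object) tuple.
# Labels starting with "_:" stand for blank nodes; literals are plain strings here.
triples = [
    ("exstaff:85740", "exterms:address", "_:johnaddress"),
    ("_:johnaddress", "exterms:street", "1501 Grant Avenue"),
    ("_:johnaddress", "exterms:city", "Bedford"),
    ("_:johnaddress", "exterms:state", "Massachusetts"),
    ("_:johnaddress", "exterms:postalCode", "01730"),
]


def is_blank(node):
    """A node is treated as blank when it uses the _:name notation."""
    return node.startswith("_:")


def describe(node, graph):
    """Collect predicate -> object pairs for one subject, expanding blank nodes.

    Assumes the blank nodes do not form a cycle, otherwise this would recurse forever.
    """
    result = {}
    for subject, predicate, obj in graph:
        if subject == node:
            result[predicate] = describe(obj, graph) if is_blank(obj) else obj
    return result


address = describe("exstaff:85740", triples)["exterms:address"]
print(address["exterms:city"])  # prints Bedford
```

<p>The blank node label never leaves the graph: the caller only sees the structured address hanging off the staff member, which is roughly the point of not giving it a URI.</p>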
<p>Breaking up the address into smaller structured parts enables external applications to manipulate the information in a more standardized way and produce more predictable results. However, because RDF only deals with binary relationships, an N-ary relationship has to be broken into a list of binary relationships with the help of blank nodes. Somehow, this reminds me of something similar in Prolog, where you can use a kind of variable that the programmer does not need to name explicitly.</p>
<p>Besides the above situation, blank nodes are also often used where there is no other way to properly and accurately identify a resource. For example, take a person with the email address johndoe@example.com. Although mailto:johndoe@example.com is a valid URI, it identifies the mailbox rather than the person, and the mailbox address is also used as an attribute value, so the better way of representing John Doe is a blank node with mailto:johndoe@example.com as the object, as follows:</p>
<p><code>_:john exterms:mailbox &lt;mailto:johndoe@example.com&gt; .</code></p>
<p>Blank nodes like this also help when combining RDF that is scattered around the internet. Assume there is a book whose author uses johndoe@example.com; we can make an inference by comparing the triple above with the following one</p>
<p><code>ex2terms:book78354 exterms:mailbox &lt;mailto:johndoe@example.com&gt; .</code></p>
<p>Because both triples share the same mailbox URI as their object, we can then deduce that the book was written by John Doe, the person with the email address johndoe@example.com.</p>
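<p>A rough sketch of that comparison in Python, assuming the two triples come from two separate graphs stored as plain tuples. The function name <code>linked_by</code> and the graph variables are made up for this example; a real RDF store would do this with a query instead.</p>

```python
# Two hypothetical graphs found in different places on the internet.
graph_a = [("_:john", "exterms:mailbox", "mailto:johndoe@example.com")]
graph_b = [("ex2terms:book78354", "exterms:mailbox", "mailto:johndoe@example.com")]


def linked_by(predicate, first, second):
    """Pair up subjects from two graphs that share the same object for `predicate`."""
    by_object = {}
    for subject, pred, obj in first:
        if pred == predicate:
            by_object.setdefault(obj, []).append(subject)
    return [
        (known, subject, obj)
        for subject, pred, obj in second
        if pred == predicate
        for known in by_object.get(obj, [])
    ]


matches = linked_by("exterms:mailbox", graph_a, graph_b)
print(matches)
```

<p>The single match pairs <code>_:john</code> with <code>ex2terms:book78354</code> through the shared mailbox URI. Whether that link really means "is the author of" is still up to the application to decide.</p>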
<p>However, because an object can by default be any plain literal, an application that consumes the information may run into trouble. For example, looking at the triple below,</p>
<p><code>_:jeff exterms:age "24" .</code></p>
<p>There is no way to tell whether that 24 is a base-10 number or an octal number. Things only get worse if the application is actually expecting a float. Typed literals are introduced to solve this problem by allowing a datatype to be attached to a literal. Back to the previous example, to properly state my age, I should write the triple as follows:</p>
<p><code>_:jeff exterms:age "24"^^xsd:integer .</code> </p>
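<p>To see what the datatype buys a consuming application, here is a small Python sketch that turns the object part of a triple into a native value. The parser table and the function name <code>parse_literal</code> are assumptions of mine, and only a couple of XML Schema datatypes are covered.</p>

```python
# Hypothetical mapping from datatype QNames to Python constructors.
XSD_PARSERS = {
    "xsd:integer": int,
    "xsd:float": float,
    "xsd:string": str,
}


def parse_literal(token):
    """Convert a literal such as '"24"^^xsd:integer' into a Python value.

    A plain literal (no ^^datatype suffix) is returned as a string, because
    nothing in the triple says how it should be interpreted.
    """
    if "^^" in token:
        lexical, datatype = token.rsplit("^^", 1)
        # An unknown datatype raises KeyError instead of being guessed at.
        return XSD_PARSERS[datatype](lexical.strip('"'))
    return token.strip('"')


print(repr(parse_literal('"24"^^xsd:integer')))  # prints 24
print(repr(parse_literal('"24"')))               # prints '24'
```

<p>With the plain literal the application is left holding a string and has to guess. With the typed literal the guesswork is gone.</p>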
<p>Content is summarized / oversimplified from <a href="http://www.w3.org/TR/2004/REC-rdf-primer-20040210/">RDF Primer</a>.</p>
<div style='clear:both'></div>]]></content:encoded>
			<wfw:commentRss>http://cslai.coolsilon.com/2010/06/04/resource-definition-framework/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Semantic Web</title>
		<link>http://cslai.coolsilon.com/2010/06/03/semantic-web/</link>
		<comments>http://cslai.coolsilon.com/2010/06/03/semantic-web/#comments</comments>
		<pubDate>Thu, 03 Jun 2010 08:46:14 +0000</pubDate>
		<dc:creator>Jeffrey04</dc:creator>
				<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Microformats]]></category>
		<category><![CDATA[Research Topic]]></category>
		<category><![CDATA[Semantic HTML]]></category>
		<category><![CDATA[Semantic Network]]></category>

		<guid isPermaLink="false">http://cslai.coolsilon.com/?p=152</guid>
		<description><![CDATA[I am currently preparing to apply for a postgrad programme and am looking for a research topic. At first I wanted to do something related to cloud computing, but after some discussion with people around me, they suggested that I do something on the semantic web. While posting my notes here, I realized that [...]]]></description>
			<content:encoded><![CDATA[<p>I am currently preparing to apply for a postgrad programme and am looking for a research topic. At first I wanted to do something related to cloud computing, but after some discussion with people around me, they suggested that I do something on the semantic web. While posting my notes here, I realized that I had already posted something on semantic networks, which look like the basis of the semantic web, <a href="http://cslai.coolsilon.com/2008/03/20/semantic-network/">here</a> (Post still &#8220;Under construction&#8221; as of writing, will post the diagrams later tonight).</p>
<p><span id="more-152"></span></p>
<h2>Semantic Web</h2>
<p>I thought it would be better to start by stating a problem. Imagine that one day Alice is looking for the price of a movie DVD. What she would do is go to the movie store website and search for the DVD. However, Alice would not be able to get the price if she is not around, as the task cannot be automated easily using a computer. This is because the web page that displays the price is prepared for human beings like us.</p>
<p>The semantic web is therefore a proposed solution to this problem, where information is represented not only for human beings like us to read, but also for machines to understand and manipulate. In short, the definition says that the Semantic Web is an extension of the World Wide Web in which information is given well-defined meaning, enabling people and computers to work in co-operation.</p>
<h2>Representing Information</h2>
<h3>Semantic HTML</h3>
<p>From what I remember, before Google emerged as the most popular search engine, webmasters used to include descriptive meta tags within an HTML document, as follows:</p>
<p><code class="html"><br />
&lt;meta name="keywords" content="good looking, handsome, single"&gt;<br />
&lt;meta name="description" content="About Jeffrey04"&gt;<br />
&lt;meta name="author" content="Jeffrey04"&gt;<br />
</code></p>
<p>That way, when a user searches with the keywords specified above, the document will be returned as one of the search results. However, although metadata can be exposed via meta tags as shown above, it is never enough to enable computers to understand that &#8220;Jeffrey04 is staying in Malaysia&#8221; or &#8220;Jeffrey04 works in Kuala Lumpur&#8221;.</p>
<p>Then CSS got popular as people started encouraging the separation of content and presentation. As a result, more people started using HTML tags like <code class="html">&lt;strong&gt;</code> instead of <code class="html">&lt;b&gt;</code>, because <code class="html">&lt;b&gt;</code> only describes appearance and doesn&#8217;t carry any meaning. Using tags with semantic meaning also helps users who rely on screen readers to understand the material better.</p>
<p>However, oftentimes, especially while HTML5 is still being drafted, we group content into sections enclosed within a <code class="html">&lt;div&gt;</code> tag. For formatting purposes, as well as to make the block of information meaningful to machines, a class name is often given to the block, e.g. <code class="html">&lt;div class="header"&gt;</code>. This use of tag attributes to give meaning to a piece of information led to the development of various microformats.</p>
<h3>Microformats</h3>
<p>Once web developers started using tags that accurately represent the information within a document, numerous efforts were made to mark up documents according to shared standards to ease machine processing. Anyone who has ever done screen-scraping will know the pain of trying to make a script scrape the right information out of an HTML document.</p>
<p>To enable information to be read easily out of an HTML document, a web developer can mark up the information following a specific standard. One such standard is hCalendar, which is used to describe dated events. For example, to mark up an event that is taking place on 6th June 2010, we would do:</p>
<p><code class="html"><br />
&lt;p class="vevent"&gt;John Doe is &lt;span class="summary"&gt;getting married&lt;/span&gt; on 6th June 2010 at the &lt;span class="location"&gt;Community Hall&lt;/span&gt;. The ceremony will be held from &lt;abbr class="dtstart" title="2010-06-06T14:00:00+08:00"&gt;2PM&lt;/abbr&gt; till &lt;abbr class="dtend" title="2010-06-06T16:00:00+08:00"&gt;4PM&lt;/abbr&gt;.&lt;/p&gt;<br />
</code></p>
<p>Therefore, to do screen-scraping, one would just need to search (via either CSS selectors or XPath) for the elements carrying these class names in the block of content above.</p>
<p>(to be continued, next is on the Resource Description Framework, RDF, which is loosely related to my previous post on semantic networks as linked above)</p>
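<p>As a rough illustration of how little code the scraping side needs once the class names are there, here is a Python sketch using only the standard library's <code>html.parser</code>. The class <code>VEventParser</code> is my own invention and only handles the four hCalendar properties used in the example above. It is nowhere near a complete microformats parser.</p>

```python
from html.parser import HTMLParser

# The hCalendar example from above, as a plain string.
MARKUP = (
    '<p class="vevent">John Doe is <span class="summary">getting married</span> '
    'on 6th June 2010 at the <span class="location">Community Hall</span>. '
    'The ceremony will be held from '
    '<abbr class="dtstart" title="2010-06-06T14:00:00+08:00">2PM</abbr> till '
    '<abbr class="dtend" title="2010-06-06T16:00:00+08:00">4PM</abbr>.</p>'
)

FIELDS = {"summary", "location", "dtstart", "dtend"}


class VEventParser(HTMLParser):
    """Collect a few hCalendar properties by looking at class names only."""

    def __init__(self):
        super().__init__()
        self.event = {}
        self._current = None  # property whose text content we are waiting for

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        for field in FIELDS & classes:
            if tag == "abbr" and "title" in attrs:
                # For abbr, the machine-readable value lives in the title attribute.
                self.event[field] = attrs["title"]
            else:
                self._current = field

    def handle_data(self, data):
        if self._current:
            self.event[self._current] = data.strip()
            self._current = None


parser = VEventParser()
parser.feed(MARKUP)
print(parser.event)
```

<p>The parser never looks at the surrounding sentence. It only relies on the class names and the <code>title</code> attributes, so the wording of the page can change freely without breaking it.</p>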
<div style='clear:both'></div>]]></content:encoded>
			<wfw:commentRss>http://cslai.coolsilon.com/2010/06/03/semantic-web/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Semantic Network</title>
		<link>http://cslai.coolsilon.com/2008/03/20/semantic-network/</link>
		<comments>http://cslai.coolsilon.com/2008/03/20/semantic-network/#comments</comments>
		<pubDate>Thu, 20 Mar 2008 08:26:50 +0000</pubDate>
		<dc:creator>Jeffrey04</dc:creator>
				<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Semantic Network]]></category>

		<guid isPermaLink="false">http://cslai.coolsilon.com/2008/03/20/semantic-network/</guid>
		<description><![CDATA[Recently the term &#8220;Semantic Web&#8221; has become so popular that Sitepoint blogs keep posting articles on this topic (1, 2). In my college days, I learned about semantic networks, and I wonder if there is some relationship between the two. I&#8217;m not sure whether I got the concept right, but in this article I would like to [...]]]></description>
			<content:encoded><![CDATA[<p>Recently the term &#8220;<a href="http://www.w3.org/2001/sw/">Semantic Web</a>&#8221; has become so popular that <a href="http://www.sitepoint.com/blogs">Sitepoint blogs</a> keep posting articles on this topic (<a href="http://www.sitepoint.com/blogs/2008/03/14/preparing-your-sites-for-the-data-web/">1</a>, <a href="http://www.sitepoint.com/blogs/2008/03/15/semantic-web-or-why-yahoo-resisted-microsofts-takeover/">2</a>). In my college days, I learned about semantic networks, and I wonder if there is some relationship between the two. I&#8217;m not sure whether I got the concept right, but in this article I would like to revise a bit on semantic networks before going on to the semantic web. Please correct me if I&#8217;m wrong.</p>
<p><span id="more-12"></span></p>
<h2>What is Semantic Network</h2>
<p>As far as I know, a semantic network is a type of knowledge representation. A mind map, IMO, is a type of network diagram that is loosely based on the semantic network. Back to the topic, a semantic network is usually drawn to represent some knowledge through nodes interconnected by labeled arcs. One of the objectives in drawing a semantic network is to do inferencing by following the interconnected nodes. (I will post a diagram to illustrate once I find my e-copy of the lecture slides)</p>
<p>Because my lecturer discussed Prolog before moving on to semantic networks, we discussed semantic networks in conjunction with the Prolog way of representing facts. The steps taken to produce a semantic network for the fact &#8220;Mary is a lecturer&#8221; may be as follows:</p>
<ol>
<li>Change the fact into Prolog fact: <code>lecturer(mary)</code></li>
<li>Rewrite the prolog fact into another form: <code>instance(mary, lecturer)</code></li>
<li>Now draw a semantic network with two nodes &#8211; <em>mary</em> and <em>lecturer</em>.</li>
<li>Then connect the two nodes with an arc labeled <em>instance</em> (diagrams coming soon)</li>
</ol>
<p>Of course, the network produced above is simple because the given fact is simple in the first place. However, when a Prolog fact takes more than two arguments (that is, its arity is greater than two), the semantic network can become reasonably complex, as follows.</p>
<img src="http://cslai.coolsilon.com/wp-content/tfo-graphviz/9f0ba6fa42ca8d836481db5be2c70e84.png" class="graphviz" />
<p>My lecturer discussed the inferencing process with an example that uses multiple related facts. For example, we have two people, Peter and Jane, with heights of 180cm and 177cm respectively. The initial semantic network may look as follows:</p>
<img src="http://cslai.coolsilon.com/wp-content/tfo-graphviz/b6c2ae06a21738d22a8b89f75440f978.png" class="graphviz" />
<p>To do inferencing, a special procedure is added to the network to process the nodes. Without such a procedure, the analysis of the diagram would be limited to the facts that are explicitly drawn.</p>
<img src="http://cslai.coolsilon.com/wp-content/tfo-graphviz/651477149c5a6e2ce938858fa98612ec.png" class="graphviz" />
<p>As you can see, the semantic network has been expanded and an arc has been added to link the previously unrelated facts. Hence, by observing the diagram we can infer that Peter, at 180cm, is taller than Jane, at 177cm.</p>
<p>The semantic network can then be expanded to be as complex as needed, in any way one may think logical. Of course there is more to cover in this topic, like expanding and partitioning the semantic network, using frames to overcome the limitations of semantic networks, etc. But I will just stop here to check whether I have made any mistakes before moving on.</p>
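<p>For those who prefer code to diagrams, here is a minimal Python sketch of the same idea. The network is a list of (node, arc label, node) tuples, and the "special procedure" is a function that adds arcs between heights. The arc label <code>greater_than</code> and the function names are my assumptions, as I am only guessing at how the lecture slides labelled them.</p>

```python
from itertools import permutations

# Hypothetical semantic network: (node, arc label, node).
arcs = [
    ("mary", "instance", "lecturer"),
    ("peter", "height", 180),
    ("jane", "height", 177),
]


def add_greater_than(network):
    """The 'special procedure': link every pair of distinct heights with a greater_than arc."""
    heights = sorted({obj for _, label, obj in network if label == "height"})
    extra = [(a, "greater_than", b) for a, b in permutations(heights, 2) if a > b]
    return network + extra


def taller(first, second, network):
    """Infer 'first is taller than second' by following height and greater_than arcs."""
    height = {node: obj for node, label, obj in network if label == "height"}
    return (height[first], "greater_than", height[second]) in network


expanded = add_greater_than(arcs)
print(taller("peter", "jane", expanded))  # prints True
print(taller("peter", "jane", arcs))      # prints False, the arc has not been added yet
```

<p>The second call shows the point made above: without the procedure the two height facts stay unrelated, so nothing can be inferred from the network alone.</p>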
<p>Further reading: <a href="http://www.jfsowa.com/pubs/semnet.htm">Semantic Network</a> by John F. Sowa </p>
<div style='clear:both'></div>]]></content:encoded>
			<wfw:commentRss>http://cslai.coolsilon.com/2008/03/20/semantic-network/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

