Notes on codes, projects and everything
I have been trying my best to stick to the well-known UNIX philosophy, “Do one thing and do it well”, so I break my projects down into many small tasks and rely on existing tools whenever possible. One of the existing tools that I use a lot is GNU sort. The sort utility generally works fine without any configuration, or at least it did until I ran into the problem that led to this post.
Back when I was still working on my postgraduate research, I used RDF, which was the preferred format for representing data in the world of the Semantic Web. I eventually dropped the degree and stopped following the development of the related technology and standards, until I volunteered to update the import script for popit while I was looking for my next job/project.
Implementing an Information Retrieval system is a fun thing to do. However, doing it efficiently is not (at least for me). My first few attempts didn’t really end well (they mostly used just Go with some bash tricks here and there, with or without a database). Then I jumped back to Python, which I am more familiar with, and was very surprised by all the options available. So I started with the Pandas and Scikit-learn combo.
To do node selection for DOM operations, one typically uses CSS selectors, as (probably) popularized by jQuery. However, there is an alternative known as XPath that is just as powerful, if not more so. XPath may be able to do a lot more than just select nodes (which I have no time to find out for now), but I will focus only on node selection in this blog post.
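As a rough illustration of node selection with XPath, here is a minimal sketch using Python's standard `xml.etree.ElementTree`, which supports only a limited subset of XPath. The sample markup and the `post` class name are made up for this example. The expression plays roughly the role of the CSS selector `div.post > a`, with one caveat: `@class='post'` compares the whole attribute value, whereas the CSS class selector matches a single class token.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this sketch.
DOC = """
<html>
  <body>
    <div class="post"><a href="/first">First</a></div>
    <div class="sidebar"><a href="/about">About</a></div>
    <div class="post"><a href="/second">Second</a></div>
  </body>
</html>
"""

root = ET.fromstring(DOC)

# Select every <a> that is a direct child of a <div> whose class
# attribute is exactly "post", anywhere below the root element.
links = root.findall(".//div[@class='post']/a")

# Collect the href attribute of each selected node, in document order.
hrefs = [a.get("href") for a in links]
print(hrefs)
```

A fuller XPath engine would also allow things like selecting by text content or walking back up to a parent, which the standard library subset does not fully cover.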
I like how Kohana 3 organizes its classes, and I thought the same approach could be applied to my experimental Zend Framework project. Basically, this means that I can name the controller class according to the PEAR naming convention and deduce the location of the file just by parsing the class name.
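The idea of deducing a file location from a PEAR-style class name can be sketched as below. This is only an illustration in Python, not the loader used in the project: the function name `class_to_path`, the `application` base directory, and the sample class name are all assumptions made for this example.

```python
def class_to_path(class_name: str, base_dir: str = "application", ext: str = ".php") -> str:
    """Map a PEAR-style class name to a relative file path.

    Each underscore in the class name marks a directory boundary, so
    "Controller_Admin_User" lives under Controller/Admin/User.php.
    """
    parts = class_name.split("_")
    # base_dir is a hypothetical root folder, not a framework default.
    return "/".join([base_dir, *parts]) + ext


print(class_to_path("Controller_Admin_User"))
```

An autoloader built on this mapping would only need to check whether the resulting path exists and include it.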
I was asked to evaluate fuzzy c-means to find out whether it is a good clustering algorithm for my MPhil project. So I spent a whole afternoon reading through some tutorials to get a basic understanding. Then I thought, why not implement it in Clojure, since it doesn’t look too complicated (I was so wrong…).