Notes on codes, projects and everything
Implementing a Information Retrieval system is a fun thing to do. However, doing it efficiently is not (at least to me). So my first few attempts didn’t really end well (mostly uses just Go/golang with some bash tricks here and there, with or without a database). Then I jumped back to Python, which I am more familiar with and was very surprised with all the options available. So I started with Pandas and Scikit-learn combo.
Just happened to see this post a few months ago, and the author created another cloud that uses almost the same technique to ‘visualize’ a list of countries. The author uses PHP to generate the cloud originally and I thought I may be able to do in javascript. After some quick coding I managed to produce something similar to the first example, source code after the jump.
Folksonomy is a neologism of two words, ’folk’ and ’taxonomy’ which describes conceptual structures created by users [4, 5]. A folksonomy is a set of unstructured collaborative usage of tags for content classification and knowledge representation that is popularized by Web 2.0 and social applications [1, 5]. Unlike taxonomy that is commonly used to organize resources to form a category hierarchy, folksonomy is non-hierarchical and non-exclusive [3]. Both content hierarchy and folksonomy can be used together to better content classification.
Recently I switched my search code to Annoy because the input dataset is huge (7.5mil records with 20k dictionary count). It wasn’t without issues though, however I would probably talk about it next time. In order to figure out what each parameters meant, I spent some time watching through the talk given by the author @fulhack.
Should have done this earlier, I was just being lazy to go through all the steps to publish it properly. So here it is, the full source is published to bitbucket. Feel free to fork the project if you are interested. I have not attach a licence to it but it will most probably be BSD licence. I have also uploaded the latest 0.0.2 release to bitbucket and would update the download link posted previously soon.
While my static pages and site theme is still under construction, I went to CodeIgniter.com to see how to work with it and played the tutorial screencasts. The reason behind in considering CodeIgniter (CI) to be used in my next project is because I don’t feel like re-inventing the wheel. However to port my current project to use CI may cause some problems as there are differences in how we implement MVC structure.