Notes on codes, projects and everything
Traversing a tree structure often involves writing a recursive function. However, Python isn’t the best language for this purpose, so I started flattening the tree into a key-value dictionary structure. Logically it is still a tree, but it is physically stored as a dictionary, which makes it easier to traverse with a simple loop.
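To make the idea concrete, here is a minimal sketch of one way to flatten a tree and walk it with a loop. The node layout (`value` and `children` keys), the integer ids, and the function names `flatten` and `traverse` are assumptions made for illustration, not necessarily what the post itself uses.

```python
from collections import deque

# Hypothetical nested tree: each node is a dict with "value" and "children".
nested = {
    "value": "root",
    "children": [
        {"value": "a", "children": [{"value": "a1", "children": []}]},
        {"value": "b", "children": []},
    ],
}


def flatten(tree):
    """Store every node in one dict, keyed by an integer id (assumed scheme)."""
    flat = {}
    queue = deque([(0, tree)])
    next_id = 1
    while queue:
        node_id, node = queue.popleft()
        child_ids = []
        for child in node["children"]:
            child_ids.append(next_id)
            queue.append((next_id, child))
            next_id += 1
        # Children are now referenced by key instead of being nested.
        flat[node_id] = {"value": node["value"], "children": child_ids}
    return flat


def traverse(flat, root_id=0):
    """Depth-first walk using an explicit stack instead of recursion."""
    order = []
    stack = [root_id]
    while stack:
        node = flat[stack.pop()]
        order.append(node["value"])
        # Push children in reverse so the leftmost child is visited first.
        stack.extend(reversed(node["children"]))
    return order


flat = flatten(nested)
print(traverse(flat))  # ['root', 'a', 'a1', 'b']
```

Because every node is reachable by key, the traversal depth is bounded by the size of the stack list rather than by Python’s recursion limit.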
In the previous post, I re-implemented Annoy in 2D with some linear algebra. Then I spent some time going through tutorials on vectors, and expanded the script to handle data in 3D and beyond. So instead of finding the gradient of the perpendicular line through the midpoint of two points, I construct a plane, and use the distance between it and each point to build the tree.
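The plane construction can be sketched roughly as follows. This is only an illustration under assumptions: the function names `splitting_plane` and `signed_distance` are hypothetical, the two reference points are assumed to be distinct, and the sample points are made up. It is not the code from the post or from Annoy itself.

```python
import math


def splitting_plane(p, q):
    """Return (normal, offset) of the plane equidistant from p and q.

    The plane is the set of x where normal . x == offset.
    """
    normal = [b - a for a, b in zip(p, q)]
    midpoint = [(a + b) / 2 for a, b in zip(p, q)]
    offset = sum(n * m for n, m in zip(normal, midpoint))
    return normal, offset


def signed_distance(point, normal, offset):
    """Signed distance from point to the plane; the sign tells the side."""
    length = math.sqrt(sum(n * n for n in normal))  # assumes p != q
    return (sum(n * x for n, x in zip(normal, point)) - offset) / length


# Two reference points and a few made-up data points in 3D.
p, q = (0.0, 0.0, 0.0), (2.0, 0.0, 0.0)
normal, offset = splitting_plane(p, q)
points = [(0.5, 1.0, 3.0), (1.5, -2.0, 0.0), (1.0, 4.0, 4.0)]

# Points on the negative side go to one branch, the rest to the other.
left = [x for x in points if signed_distance(x, normal, offset) < 0]
right = [x for x in points if signed_distance(x, normal, offset) >= 0]
```

Nothing here depends on the number of dimensions, so the same two functions work for 2D, 3D, or longer vectors; applying the split repeatedly to each side is what produces the tree.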
Recently I switched my search code to Annoy because the input dataset is huge (7.5 million records with a dictionary count of 20k). It wasn’t without issues, though I will probably talk about those next time. In order to figure out what each parameter meant, I spent some time watching the talk given by the author, @fulhack.
Implementing an information retrieval system is a fun thing to do. However, doing it efficiently is not (at least to me). So my first few attempts didn’t really end well (mostly using just Go with some bash tricks here and there, with or without a database). Then I jumped back to Python, which I am more familiar with, and was very surprised by all the options available. So I started with a Pandas and Scikit-learn combo.
I was wondering whether it is possible to avoid exposing PDO and PDOStatement objects to the users of my database library (mainly just me). While working on my project I noticed that there is an almost fixed pattern whenever I work with the database. With this in mind, I added some new functions to the library, and decided to make a quick release for this.
I have recently made my Adium useless by moving all my IM accounts to my beloved Nokia N9. While moving my buddy lists from all the major instant messaging services, I did a quick check on each contact to see our past interactions. It was sort of surprising to see that I don’t actually chat with them as frequently as I thought, so why do I “need” my Adium open all the time?
After publishing the previous note on setting up my development environment, I found myself spending more time in the CLI (usually via SSH from the host). I also found that I didn’t need all the GUI apps in a standard Ubuntu desktop environment, so I went ahead and set up a new environment based on Ubuntu Quantal server edition beta-1. For some reason my network stopped working, and I didn’t really want to spend time finding out the cause, so I reinstalled everything today using the final installer, as well as the updated VirtualBox 4.2.6.
The Semantic Web is not just about putting data on the web, but also about making links that allow a person as well as a machine to explore the web of data. Links made in the web of data connect arbitrary things together, as described by RDF, as opposed to links in the web of hypertext, which connect only web resources. Linking arbitrary things then allows related things to be found while performing a search.
It is very difficult to like the way vim handles plugins by default, so I was really thrilled to find out about pathogen when a geek I followed tweeted about it. It took me some time to actually reorganize my current configuration into this new format. Then I thought, why not reorganize my .vimrc as well, as my current version looks a bit cryptic after a while.