Notes on codes, projects and everything
Traversing a tree structure often involves writing a recursive function. However, Python isn’t the best language for this purpose. Therefore I started flattening the tree into a key-value dictonary structure. Logically it is still a tree, but it is physically stored as a dictionary. Therefore it is now easier to write a simple loop to traverse it.
In the previous post, I re-implemented Annoy in 2D with some linear algebra maths. Then I spent some time going through some tutorial on vectors, and expanded the script to handle data in 3D and more. So instead of finding gradient, the perpendicular line in the middle of two points, I construct a plane, and find the distance between it and points to construct the tree.
Recently I switched my search code to Annoy because the input dataset is huge (7.5mil records with 20k dictionary count). It wasn’t without issues though, however I would probably talk about it next time. In order to figure out what each parameters meant, I spent some time watching through the talk given by the author @fulhack.
Implementing a Information Retrieval system is a fun thing to do. However, doing it efficiently is not (at least to me). So my first few attempts didn’t really end well (mostly uses just Go/golang with some bash tricks here and there, with or without a database). Then I jumped back to Python, which I am more familiar with and was very surprised with all the options available. So I started with Pandas and Scikit-learn combo.
The making of this plugin was completely a random act of hand-itchiness. A friend of mine (@cornguo) published a fun app online. There is a name for this kind of app, but I can’t recall at the moment. It typically displays some buttons (usually in a grid), and clicking them causes some sound to be played. The interesting part in cornguo’s app is that there’s a text-input field where the name of the buttons can be typed-in for replaying.
Everyone knows folksonomy is (or was) cool and useful, however, when it is applied in real life, then problem arises. The idea of blogging this came while I am struggling to get my literature review report done (been doing it for months, I am being so ridiculous, I know). As a matter of fact, as I am dying to get it done, there are a couple of things that I found to be blog-worthy. So, I will be publishing a couple of brief overview to some of the topics involved in the coming days in a really casual (read: lazy, and full of personal speculations) way to this very humble little blog of mine.
I have just re-started to find myself a job as my work in mybloggercon almost come to an end (after helping them to set up an April Fool Prank). I have sent some enquiry letters to apply for a job in web-development field mostly involves PHP. I prefer PHP over ASP.NET because I can have greater flexibilities in developing in PHP as what I experienced when I was developing my final year project.
This update took me quite a bit more time than I initially expected. Anyway, I have done some refactoring work to the original code, and thought it would be nice to document the changes. Overall, most of the changes involved the refactoring of function names. I am not sure if this would stick, but I am quite satisfied for now.