Notes on codes, projects and everything
In the previous post, I re-implemented Annoy in 2D with some linear algebra maths. Then I spent some time going through some tutorial on vectors, and expanded the script to handle data in 3D and more. So instead of finding gradient, the perpendicular line in the middle of two points, I construct a plane, and find the distance between it and points to construct the tree.
Recently I switched my search code to Annoy because the input dataset is huge (7.5mil records with 20k dictionary count). It wasn’t without issues though, however I would probably talk about it next time. In order to figure out what each parameters meant, I spent some time watching through the talk given by the author @fulhack.
Been trying my best to stick to the well-known UNIX Philosophy – “Do one thing and do it well”, so I have been breaking down my projects into numerous pieces of small tasks and rely on existing tools whenever possible. One of the existing tool that I use a lot is the GNU sort tool. Generally sort utility is really doing fine and dandy without having to configure anything, at least not until I realize the problem that leads to this post.
Sometimes I really doubt about the advantage of recycling old stuff to fund for new units beyond goodwill. Sure you get to convince yourself that you are saving the environment by doing so, and it also saves money in the long run. However, I didn’t realize how much it generates it may be after trying to work out an answer for a fictional IQ question.
I like how Kohana 3 organizes the classes, and I thought the same thing may be applied to my Zend Framework experimental project. Basically what this means is that I can name the controller class according to PEAR naming convention, and deduce the location of the file by just parsing the class name.
Just managed to migrate all my blog sites to one centralized multi-site, so no more
half-baked solution and hopefully this brings better plugin compatibility. I have not check with other related services (like Google Webmaster Tools) whether this cause any breakage though. Well, the main purpose of this blog post is actually a draft of what I did for the past two months for my postgraduate programme. Yea, I should have posted more stuff to this blog (just realized that my last post here is already like half a year ago).
Should have done this earlier, I was just being lazy to go through all the steps to publish it properly. So here it is, the full source is published to bitbucket. Feel free to fork the project if you are interested. I have not attach a licence to it but it will most probably be BSD licence. I have also uploaded the latest 0.0.2 release to bitbucket and would update the download link posted previously soon.