Notes on codes, projects and everything
The Nand2Tetris part I at coursera is very much my first completed course. It was so fun to actually work through the material and it feels amazing to know how simple it is to actually build a computer from scratch. While it is simple, it doesn’t mean the course itself is easy though. I was struggling to get the CPU wired up properly that I spent two to three days just to get it working.
One of my recent tasks involving crawling a lot of geo-tagged data from a given service. The most recent one is crawling files containing a point cloud for a given location. So I began by observing the behavior in the browser. After exporting the list of HTTP requests involved in loading the application, I noticed there are a lot of requests fetching resources with a common rXXX
pattern.
While working on a text classification task, I spent quite some time preparing the training set for a given document collection. The project is supposed to be a pure golang implementation, so after some quick searching I found some libraries that are either a wrapper to libsvm, or a re-implementation. So I happily started to prepare my training set in the libsvm format.
Previously, I started practising recursions by implementing a type check on lat (list of atoms), and ismember
(whether an atom is a member of a given lat). Then in the third chapter, named “Cons the Magnificent”, more list manipulation methods are being introduced.
Often times one would have to write code to evaluate logical statements. For example, given statement p and q, what is p implies q? As there’s no operator for implication in PHP, one would have to rewrite the statement that consists only in NOT (!
), AND (&&
) and OR (||
) operators. When there are a huge load of these statements, code can be difficult to read.
So apparently Annoy is now splitting points by using the centroids of 2 means clustering. It is claimed that it provides better results for ANN search, however, how does this impact regression? Purely out of curiosity, I plugged a new point splitting function and generated a new set of points.
(more…)