Notes on codes, projects and everything
One of my recent tasks involving crawling a lot of geo-tagged data from a given service. The most recent one is crawling files containing a point cloud for a given location. So I began by observing the behavior in the browser. After exporting the list of HTTP requests involved in loading the application, I noticed there are a lot of requests fetching resources with a common
I have recently made my Adium useless by moving all my IM accounts to my beloved Nokia N9. While moving my buddy lists of all major instant messaging services, I did a quick check on each of the contact to see past interaction. It is sort of surprising to see I don’t actually chat with them as frequent as I thought, so why do I “need” my Adium opened all the time?
So my cheat with dask worked fine and dandy, until I started inspecting the output (which was to be used as an input for another script). While the script seemed to work fine, however when I started to parse each line I was hit with some funny syntax errors. After some quick inspection I found some of the lines was not printed completely.
I wanted to try using virtuoso as the storage engine for Redland but unfortunately there is no librdf-storage-virtuoso package for Ubuntu. After getting some help from @dajobe, I attempted to build the packages myself. Although it takes quite some time to build packages, but not too difficult it seems.
So I first heard about Panda probably a year ago when I was in my previous job. It looked nice, but I didn’t really get the chance to use it. So practically it is a library that makes data looks like a mix of relational database table and excel sheet. It is easy to do query with it, and provides a way to process it fast if you know how to do it properly (no, I don’t, so I cheated).
Recently I switched my search code to Annoy because the input dataset is huge (7.5mil records with 20k dictionary count). It wasn’t without issues though, however I would probably talk about it next time. In order to figure out what each parameters meant, I spent some time watching through the talk given by the author @fulhack.