Notes on codes, projects and everything
With most of my stuff more or less set, I guess it is time to start documenting the steps before I forget. So I heard a lot of good things about docker for quite some time, but haven’t really have the time to do it due to laziness (plus my relatively n00b-ness in the field of dev-ops). Just a few months ago, I decided to finally migrate away from webfaction (thanks for all the superb support) to a VPS so I can run more things on it.
I just failed a programming assessment test miserably yesterday and thought I should at least document it down. However, the problem with this is that the questions are copyrighted, so I guess I would write it from another point of view. So the main reason I failed was because I chose the wrong strategy to the problem, thinking it should be solution but as I put in time to that I ended up creating more problems.
I wanted to try using virtuoso as the storage engine for Redland but unfortunately there is no librdf-storage-virtuoso package for Ubuntu. After getting some help from @dajobe, I attempted to build the packages myself. Although it takes quite some time to build packages, but not too difficult it seems.
This post is purely based on my own speculation as there’s no experiment on real-life data to actually back the arguments. I am currently trying to document down a plan for my experiment(s) on recommender system (this reminds me that I have not release the Flickr data collection tool :/) and my supervisor advised to write a paragraph or two on some of the key things. Since he is not going to read it, so I might as well just post it here as a note.
Often times, I am dealing with JSONL files, though panda’s DataFrame is great (and blaze to certain extend), however it is offering too much for the job. Most of the received data is in the form of structured text and I do all sorts of work with them. For example checking for consistency, doing replace based on values of other columns, stripping whitespace etc.
I was asked to evaluate fuzzy c-means to find out whether it is a good clustering algorithm for my MPhil project. So I spent the whole afternoon reading through some tutorial to get some basic understanding. Then I thought why not implement it in Clojure because it doesn’t look too complicated (I was so wrong…).