Notes on codes, projects and everything
Sometimes I really doubt about the advantage of recycling old stuff to fund for new units beyond goodwill. Sure you get to convince yourself that you are saving the environment by doing so, and it also saves money in the long run. However, I didn’t realize how much it generates it may be after trying to work out an answer for a fictional IQ question.
So I first heard about Panda probably a year ago when I was in my previous job. It looked nice, but I didn’t really get the chance to use it. So practically it is a library that makes data looks like a mix of relational database table and excel sheet. It is easy to do query with it, and provides a way to process it fast if you know how to do it properly (no, I don’t, so I cheated).
With most of my stuff more or less set, I guess it is time to start documenting the steps before I forget. So I heard a lot of good things about docker for quite some time, but haven’t really have the time to do it due to laziness (plus my relatively n00b-ness in the field of dev-ops). Just a few months ago, I decided to finally migrate away from webfaction (thanks for all the superb support) to a VPS so I can run more things on it.
While working on a text classification task, I spent quite some time preparing the training set for a given document collection. The project is supposed to be a pure golang implementation, so after some quick searching I found some libraries that are either a wrapper to libsvm, or a re-implementation. So I happily started to prepare my training set in the libsvm format.
This is the second part of the golang learning rant log. Previously on (note (code cslai)) I managed to make each line in the CSV into a hash map. So today I am going to make it into JSON Lines.
I was invited to try Go (the programming language, not that board game) a few months ago, however I didn’t complete back then. The main reason was because it felt raw, compared to other languages that I know a fair bit better (for example Ruby). There was no much syntatic sugar around, and getting some work done with it feels “dirty”.
A new day, and a new post on job application. So this time instead of asking a snippet, I was actually asked to deliver some sort of a full application. Not sure why this was required, but I had fun creating them nonetheless. Though I would say I am not really a fan of creating visual stuff though (oh the crappy animation nearly killed me).
Another day, another programming assessment test. This time I was asked to generate some random data, then examine them to get their data type. Practically it is not a very difficult thing to do and I could probably complete it in fewer lines. I am pretty sure there are better ways to do this, as usual though.
I just failed a programming assessment test miserably yesterday and thought I should at least document it down. However, the problem with this is that the questions are copyrighted, so I guess I would write it from another point of view. So the main reason I failed was because I chose the wrong strategy to the problem, thinking it should be solution but as I put in time to that I ended up creating more problems.
The Nand2Tetris part I at coursera is very much my first completed course. It was so fun to actually work through the material and it feels amazing to know how simple it is to actually build a computer from scratch. While it is simple, it doesn’t mean the course itself is easy though. I was struggling to get the CPU wired up properly that I spent two to three days just to get it working.
While the previous file structure works well, I decided to tune some details before deploying the latest WordPress release. Besides that, I also started a new theme development project after my last theme which was developed more than 2 years ago. Thankfully, everything seems to work so far.
Folksonomy is a neologism of two words, ’folk’ and ’taxonomy’ which describes conceptual structures created by users [4, 5]. A folksonomy is a set of unstructured collaborative usage of tags for content classification and knowledge representation that is popularized by Web 2.0 and social applications [1, 5]. Unlike taxonomy that is commonly used to organize resources to form a category hierarchy, folksonomy is non-hierarchical and non-exclusive [3]. Both content hierarchy and folksonomy can be used together to better content classification.
A friend of mine recently posted a screenshot containing a code snippet for a fairly straight forward problem. So after reading the solution I suddenly had the itch to propose another solution that I initially thought would be better (SPOILER: Turns out it isn’t). Then mysteriously I stuck myself to my seat and started coding an alternative solution to it instead of playing Diablo 3 just now.
Sometimes, letting a piece of code evolving by itself without much planning does not usually end well. However I was quite pleased with a by-product of it and I am currently formalizing it. So the by-product is some sort of DSL for a rule engine that I implemented to process records. It started as some lambda functions in Python but eventually becomes something else.
Implementing a Information Retrieval system is a fun thing to do. However, doing it efficiently is not (at least to me). So my first few attempts didn’t really end well (mostly uses just Go/golang with some bash tricks here and there, with or without a database). Then I jumped back to Python, which I am more familiar with and was very surprised with all the options available. So I started with Pandas and Scikit-learn combo.