Notes on codes, projects and everything
While JSON is a fine data-interchange format, it does have some limitations. It is well known for its simplicity: even a non-programmer can easily compose a JSON file (but humanity will surprise you IRL). As a result, it is found almost everywhere, from numerous web APIs to geospatial data (GeoJSON) and even the semantic web (RDF/JSON).
Previously, I started practising recursion by implementing a type check for lat (list of atoms) and ismember (whether an atom is a member of a given lat). Then, in the third chapter, named “Cons the Magnificent”, more list manipulation methods are introduced.
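As a rough illustration of the two recursive checks mentioned above, here is a minimal Python sketch. It is not the code from the posts: the names `is_atom`, `is_lat` and `is_member` are hypothetical, and it assumes an atom is simply any value that is not a list.

```python
def is_atom(value):
    # Assumption: anything that is not a Python list counts as an atom.
    return not isinstance(value, list)


def is_lat(items):
    # An empty list is a lat; otherwise the head must be an atom
    # and the rest of the list must itself be a lat.
    if not items:
        return True
    head, *tail = items
    return is_atom(head) and is_lat(tail)


def is_member(atom, lat):
    # Nothing is a member of the empty list; otherwise check the head,
    # then recurse on the tail.
    if not lat:
        return False
    head, *tail = lat
    return head == atom or is_member(atom, tail)
```

Both functions follow the same shape: handle the empty list first, then combine a check on the head with a recursive call on the tail.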
In the last part, I implemented a couple of primitive functions so that they can be applied in the following chapters. The second chapter of the book is titled “Do it again, and again, and again…”. The title already hints that readers will deal with repetition throughout the chapter.
I saw this article from A List Apart, which is about JavaScript’s prototypal object orientation. The article mentioned Douglas Crockford, and I was immediately reminded of my struggle to understand the language itself. Back then I also used to refer to his site for a lot of notes on JavaScript. So I went back for a quick read, and found this article that discusses the similarity between JavaScript and Lisp.
So my cheat with dask worked fine and dandy, until I started inspecting the output (which was to be used as an input for another script). While the script seemed to work fine, when I started to parse each line I was hit with some funny syntax errors. After some quick inspection I found that some of the lines were not printed completely.
Oftentimes I deal with JSONL files. Though pandas’ DataFrame is great (and blaze, to a certain extent), it offers far more than the job needs. Most of the data I receive is structured text, and I do all sorts of work with it: for example, checking for consistency, replacing values based on other columns, stripping whitespace, etc.
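To make the kind of lightweight processing described above concrete, here is a minimal sketch that uses only the standard library. The function name `clean_records` and the sample records are hypothetical, and whitespace stripping stands in for whatever cleaning a real job would need.

```python
import io
import json


def clean_records(lines):
    """Yield one cleaned dict per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        # Strip surrounding whitespace from string values only.
        yield {
            key: value.strip() if isinstance(value, str) else value
            for key, value in record.items()
        }


# Hypothetical sample data standing in for a real JSONL file.
sample = io.StringIO(
    '{"name": "  Alice ", "age": 30}\n'
    "\n"
    '{"name": "Bob  ", "age": 25}\n'
)
cleaned = list(clean_records(sample))
```

Because the function is a generator over lines, it can be pointed at an open file object just as easily, and records are processed one at a time rather than loaded all at once.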
I came across a video on YouTube on Pi Day. Coincidentally, it was about estimating the value of Pi, produced by Matt Parker aka standupmaths. While I am not that interested in knowing the best way to estimate Pi, I am quite interested in the algorithm he showed in the video. Specifically, I want to find out how easy it is to implement in Python.
Sometimes I really doubt the advantage, beyond goodwill, of recycling old stuff to fund new units. Sure, you get to convince yourself that you are saving the environment by doing so, and it also saves money in the long run. However, I didn’t realize how much it might generate until I tried to work out an answer to a fictional IQ question.
While working on a text classification task, I spent quite some time preparing the training set for a given document collection. The project is supposed to be a pure golang implementation, so after some quick searching I found some libraries that are either wrappers around libsvm or re-implementations of it. So I happily started to prepare my training set in the libsvm format.
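For readers unfamiliar with the libsvm format, each line holds a label followed by `index:value` pairs, with indices in ascending order and zero-valued features usually omitted. The sketch below is in Python purely for illustration (the project itself is in Go), and the helper name `to_libsvm_line` and the sample features are hypothetical.

```python
def to_libsvm_line(label, features):
    """Format one training example as a libsvm line.

    `features` maps a 1-based feature index to a numeric value.
    """
    pairs = [
        f"{index}:{value:g}"
        for index, value in sorted(features.items())
        if value != 0  # sparse format: zero-valued features are left out
    ]
    return " ".join([str(label), *pairs])


# Hypothetical example: label 1 with three features, one of them zero.
line = to_libsvm_line(1, {3: 0.5, 1: 2.0, 7: 0.0})
```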
The Semantic Web always sounds like some magic power that a group of people keep yelling about. Chances are, anyone who is into web development has heard of it one way or another. However, despite the supposedly wide awareness of it, are we using it? Or rather, am I publishing enough data to the Semantic Web? OK, I am not, but why?
While following the Statistical Learning course, I came across the part on doing regression with boosting. Reading through the material made me wonder whether the same method could be adapted to Erik Bernhardsson’s annoy algorithm.
This update took me quite a bit more time than I initially expected. Anyway, I have done some refactoring work on the original code, and thought it would be nice to document the changes. Overall, most of the changes involved renaming functions. I am not sure if this will stick, but I am quite satisfied for now.
I am currently preparing to apply for a postgrad programme and am looking for a research topic. At first I wanted to do something related to cloud computing, but after some discussion with people around me, they suggested I do something on the semantic web. While posting my notes here, I realized that I had posted something on semantic networks, which look like the basis of the semantic web, here (post still “Under construction” as of writing; I will post the diagrams later tonight).
I didn’t realize that I had been working for 3 weeks until Labour Day, which was a public holiday. Many things happened in these few weeks and I am still struggling to catch up with them. My superior and colleagues have been very helpful and offered me some useful tutorials and books. I was instructed to build an event scheduler application using CodeIgniter in the first week, and then to work on a side project that extends a form using DOM methods and properties.