Notes on codes, projects and everything
While JSON is a fine data-interchange format, however it does have some limitations. It is well-known for its simplicity, that even a non-programmer can easily compose a JSON file
(but humanity will surprise you IRL). Therefore, it is found almost everywhere, from numerous web APIs, to geospatial data (GeoJSON), and even semantic web (RDF/JSON).
Previously, I started practising recursions by implementing a type check on lat (list of atoms), and
ismember (whether an atom is a member of a given lat). Then in the third chapter, named “Cons the Magnificent”, more list manipulation methods are being introduced.
In the last part, I implemented a couple of primitive functions so that they can be applied in the following chapters. The second chapter of the book, is titled “Do it again, and again, and again…”. The title already hints that readers will deal with repetitions throughout the chapter.
So my cheat with dask worked fine and dandy, until I started inspecting the output (which was to be used as an input for another script). While the script seemed to work fine, however when I started to parse each line I was hit with some funny syntax errors. After some quick inspection I found some of the lines was not printed completely.
Often times, I am dealing with JSONL files, though panda’s DataFrame is great (and blaze to certain extend), however it is offering too much for the job. Most of the received data is in the form of structured text and I do all sorts of work with them. For example checking for consistency, doing replace based on values of other columns, stripping whitespace etc.
I came across a video on Youtube on Pi day. Coincidently it was about estimating the value of Pi produced by Matt Parker aka standupmaths. While I am not quite interested in knowing the best way to estimate Pi, I am quite interested in the algorithm he showed in the video however. Specifically, I am interested to find out how easy it is to implement in Python.
Sometimes I really doubt about the advantage of recycling old stuff to fund for new units beyond goodwill. Sure you get to convince yourself that you are saving the environment by doing so, and it also saves money in the long run. However, I didn’t realize how much it generates it may be after trying to work out an answer for a fictional IQ question.
While working on a text classification task, I spent quite some time preparing the training set for a given document collection. The project is supposed to be a pure golang implementation, so after some quick searching I found some libraries that are either a wrapper to libsvm, or a re-implementation. So I happily started to prepare my training set in the libsvm format.
As the name implies, Resource Definition Framework, or RDF in short, is a language to represent information about resources in world wide web. Information that can be represented is mostly metadata like title (assuming the resource is a web-page), author, last modified date etc. Besides representing resource that is network-accessible, it can be used to represent things that cannot be accessed through the network, as long as it can be identified using a URI.
I am currently doing some organization to my blogs. Few weeks ago, after spending months struggling to work in Ubuntu 7.10, I learned about symbolic links. Then I thought this would be good for my project file management. Therefore I started to re-organize my project file structure to utilize symbolic links. One of the projects that uses symbolic link is the current wordpress theme.
Recently the term “Semantic Web” becomes extremely popular that Sitepoint blogs keep posting articles on this topic (1, 2). In my college days, I learned about Semantic Network and I wonder if there is some relationship between them. I’m not sure whether I get the concept correctly but in this article I would like to revise a bit on semantic network before going to semantic web. Please correct me if I’m wrong.
It is useful to have the terminal around whenever I code. However, while real screen estate is finite, having a terminal further limiting the amount of information that can be displayed at the same time. The problem with the terminal is that I don’t really need it all the time, so I usually find it buried under a group of windows.