Notes on codes, projects and everything
While JSON is a fine data-interchange format, however it does have some limitations. It is well-known for its simplicity, that even a non-programmer can easily compose a JSON file
(but humanity will surprise you IRL). Therefore, it is found almost everywhere, from numerous web APIs, to geospatial data (GeoJSON), and even semantic web (RDF/JSON).
Previously, I started practising recursions by implementing a type check on lat (list of atoms), and
ismember (whether an atom is a member of a given lat). Then in the third chapter, named “Cons the Magnificent”, more list manipulation methods are being introduced.
In the last part, I implemented a couple of primitive functions so that they can be applied in the following chapters. The second chapter of the book, is titled “Do it again, and again, and again…”. The title already hints that readers will deal with repetitions throughout the chapter.
So my cheat with dask worked fine and dandy, until I started inspecting the output (which was to be used as an input for another script). While the script seemed to work fine, however when I started to parse each line I was hit with some funny syntax errors. After some quick inspection I found some of the lines was not printed completely.
Often times, I am dealing with JSONL files, though panda’s DataFrame is great (and blaze to certain extend), however it is offering too much for the job. Most of the received data is in the form of structured text and I do all sorts of work with them. For example checking for consistency, doing replace based on values of other columns, stripping whitespace etc.
I came across a video on Youtube on Pi day. Coincidently it was about estimating the value of Pi produced by Matt Parker aka standupmaths. While I am not quite interested in knowing the best way to estimate Pi, I am quite interested in the algorithm he showed in the video however. Specifically, I am interested to find out how easy it is to implement in Python.
Sometimes I really doubt about the advantage of recycling old stuff to fund for new units beyond goodwill. Sure you get to convince yourself that you are saving the environment by doing so, and it also saves money in the long run. However, I didn’t realize how much it generates it may be after trying to work out an answer for a fictional IQ question.
While working on a text classification task, I spent quite some time preparing the training set for a given document collection. The project is supposed to be a pure golang implementation, so after some quick searching I found some libraries that are either a wrapper to libsvm, or a re-implementation. So I happily started to prepare my training set in the libsvm format.
As the name implies, Resource Definition Framework, or RDF in short, is a language to represent information about resources in world wide web. Information that can be represented is mostly metadata like title (assuming the resource is a web-page), author, last modified date etc. Besides representing resource that is network-accessible, it can be used to represent things that cannot be accessed through the network, as long as it can be identified using a URI.
I’m not sure why Windows Media Player 11 doesn’t like mp4/aac format but not being able to transfer my tracks in mp4 through MTP to my N81 really piss me off. At first I installed Songbird 1.0.0 to try synchronizing the tracks through the MTP addon but it doesn’t even play *.mp4 tracks out of the box under Windows XP SP3 (although it claimed otherwise). Weird thing happens when I was trying to synchronize anyway by having the files renamed to *.mp4 extension and it succeeded but without any meta-tag information.
Writing a usable form and database library has always been a painful experience. So why bother re-inventing the wheel when there are so many to choose from already? I am writing one mostly for learning purpose. After numerous attempts, I finally get my form and database library in shape. It is nowhere complete, but nor it is perfect, but it is currently the implementation that is closest to my original design. I will keep working on it so it can be used in my personal projects in the future.
Been trying my best to stick to the well-known UNIX Philosophy – “Do one thing and do it well”, so I have been breaking down my projects into numerous pieces of small tasks and rely on existing tools whenever possible. One of the existing tool that I use a lot is the GNU sort tool. Generally sort utility is really doing fine and dandy without having to configure anything, at least not until I realize the problem that leads to this post.