Notes on codes, projects and everything
While JSON is a fine data-interchange format, however it does have some limitations. It is well-known for its simplicity, that even a non-programmer can easily compose a JSON file (but humanity will surprise you IRL). Therefore, it is found almost everywhere, from numerous web APIs, to geospatial data (GeoJSON), and even semantic web (RDF/JSON).
Previously, I started practising recursions by implementing a type check on lat (list of atoms), and ismember
(whether an atom is a member of a given lat). Then in the third chapter, named “Cons the Magnificent”, more list manipulation methods are being introduced.
Recently I find some of my pet projects share a common pattern, they all are based on some kind of grids. So I find myself writing similar piece of code over and over again. While re-inventing wheels is quite fun, especially when you learn new way of getting things done with every iteration, it is actually quite tedious after a while.
In the last part, I implemented a couple of primitive functions so that they can be applied in the following chapters. The second chapter of the book, is titled “Do it again, and again, and again…”. The title already hints that readers will deal with repetitions throughout the chapter.
I saw this article from alistapart, which is about Javascript’s prototypal object orientation. So the article mentioned Douglas Crawford, and I was immediately reminded about my struggle in understanding the language itself. Back then I used to also refer to his site for a lot of notes in Javascript. So I went back to have a quick read, and found this article that discusses the similarity between Javascript and Lisp.
Sometimes, letting a piece of code evolving by itself without much planning does not usually end well. However I was quite pleased with a by-product of it and I am currently formalizing it. So the by-product is some sort of DSL for a rule engine that I implemented to process records. It started as some lambda functions in Python but eventually becomes something else.
So my cheat with dask worked fine and dandy, until I started inspecting the output (which was to be used as an input for another script). While the script seemed to work fine, however when I started to parse each line I was hit with some funny syntax errors. After some quick inspection I found some of the lines was not printed completely.
Often times, I am dealing with JSONL files, though panda’s DataFrame is great (and blaze to certain extend), however it is offering too much for the job. Most of the received data is in the form of structured text and I do all sorts of work with them. For example checking for consistency, doing replace based on values of other columns, stripping whitespace etc.
I came across a video on Youtube on Pi day. Coincidently it was about estimating the value of Pi produced by Matt Parker aka standupmaths. While I am not quite interested in knowing the best way to estimate Pi, I am quite interested in the algorithm he showed in the video however. Specifically, I am interested to find out how easy it is to implement in Python.
Traversing a tree structure often involves writing a recursive function. However, Python isn’t the best language for this purpose. Therefore I started flattening the tree into a key-value dictonary structure. Logically it is still a tree, but it is physically stored as a dictionary. Therefore it is now easier to write a simple loop to traverse it.
I’m not sure why Windows Media Player 11 doesn’t like mp4/aac format but not being able to transfer my tracks in mp4 through MTP to my N81 really piss me off. At first I installed Songbird 1.0.0 to try synchronizing the tracks through the MTP addon but it doesn’t even play *.mp4 tracks out of the box under Windows XP SP3 (although it claimed otherwise). Weird thing happens when I was trying to synchronize anyway by having the files renamed to *.mp4 extension and it succeeded but without any meta-tag information.
I didn’t realize that I have been working for 3 weeks until the Labour Day which was a public holiday. Many things happened in these few weeks and I am still struggling to catch up with it. My superior and colleagues have been very helpful and offered me some helpful tutorials and books. I was instructed to build a event scheduler application using codeigniter in the first week and then work on a side project that extends a form using DOM methods and properties.
Back then when I was attending a job interview, I was asked to write a Fizz Buzz program to prove that my coding ability. There was only a pen and a piece of paper, so basically means there’s no way I can refer to the documentation for the API syntax. Fortunately I somehow managed to remember and not screw up.
Just recently I volunteered to do a pre-101 kinda workshop for people wanting to learn programming. I had done this a few times in the past, but in different settings and goals in mind. The whole structure predates the sessions but I can’t remember when I first created them.
(more…)Recently I switched my search code to Annoy because the input dataset is huge (7.5mil records with 20k dictionary count). It wasn’t without issues though, however I would probably talk about it next time. In order to figure out what each parameters meant, I spent some time watching through the talk given by the author @fulhack.