Notes on codes, projects and everything
The Internet Censorship Dashboard is a project that aggregates data fetched from the OONI API, to provide an overview of the current state of Internet Censorship experienced by users mainly in Southeast Asia. The current form was built a couple of years ago, and recently got funded to get it updated to work better with new APIs.(more…)
Back then, when I was still working on my postgraduate degree research, I used RDF, which was the preferred format in the world of Semantic Web to represent data. I eventually dropped the degree, and stopped following the development of the related technology and standards. Until I volunteered to update the import script for popit when I was looking for the next job/project.(more…)
In recent years, I start to make my development environment decouple from the tools delivered by the package manager used by the operating system. The tools (compiler, interpreters, libraries etc) are usually best left unmodified so other system packages that rely on them keeps working as intended. Also another reason for the setup is I wanted to follow the latest release as much as possible, which cannot be done unless I enroll myself to a rolling release distro.(more…)
Just recently I volunteered to do a pre-101 kinda workshop for people wanting to learn programming. I had done this a few times in the past, but in different settings and goals in mind. The whole structure predates the sessions but I can’t remember when I first created them.(more…)
I don’t quite remember when did I first heard about Category Theory, but the term stuck in my head for quite a while. Eventually I attempted to start looking for tutorials on the topic, but it is hard to find one that I actually understand. Most of them are either leaning too much to the Mathematics side, or too much to the Programming side.(more…)
This is the year I kept digging my old undergraduate notes on Statistics for work. First was my brief attempt wearing the Data Scientist performing ANOVA test to see if there’s correlation between pairs of variables. Then just recently I was tasked to analyze a survey result for a social audit project.(more…)
So apparently Annoy is now splitting points by using the centroids of 2 means clustering. It is claimed that it provides better results for ANN search, however, how does this impact regression? Purely out of curiosity, I plugged a new point splitting function and generated a new set of points.(more…)
After a year and half, a lot of things changed, and annoy also changed the splitting strategy too. However, I always wanted to do a proper follow up to the original post, where I compared boosting to Annoy. I still remember the reason I started that (flawed) experiment was because I found boosting easy.(more…)
Writing a usable form and database library has always been a painful experience. So why bother re-inventing the wheel when there are so many to choose from already? I am writing one mostly for learning purpose. After numerous attempts, I finally get my form and database library in shape. It is nowhere complete, but nor it is perfect, but it is currently the implementation that is closest to my original design. I will keep working on it so it can be used in my personal projects in the future.
I’m not sure why Windows Media Player 11 doesn’t like mp4/aac format but not being able to transfer my tracks in mp4 through MTP to my N81 really piss me off. At first I installed Songbird 1.0.0 to try synchronizing the tracks through the MTP addon but it doesn’t even play *.mp4 tracks out of the box under Windows XP SP3 (although it claimed otherwise). Weird thing happens when I was trying to synchronize anyway by having the files renamed to *.mp4 extension and it succeeded but without any meta-tag information.
I have been following this excellent guide written by Benjamin Thomas to set up my virtual machine for development purpose. However, when I am starting to configure a Ubuntu Quantal alpha machine, parts of the guide became inapplicable. Hence, this post is written as a small revision to the previously mentioned guide.
One of my recent tasks involving crawling a lot of geo-tagged data from a given service. The most recent one is crawling files containing a point cloud for a given location. So I began by observing the behavior in the browser. After exporting the list of HTTP requests involved in loading the application, I noticed there are a lot of requests fetching resources with a common