Notes on codes, projects and everything
While my static pages and site theme is still under construction, I went to CodeIgniter.com to see how to work with it and played the tutorial screencasts. The reason behind in considering CodeIgniter (CI) to be used in my next project is because I don’t feel like re-inventing the wheel. However to port my current project to use CI may cause some problems as there are differences in how we implement MVC structure.
Often times, I am dealing with JSONL files, though panda’s DataFrame is great (and blaze to certain extend), however it is offering too much for the job. Most of the received data is in the form of structured text and I do all sorts of work with them. For example checking for consistency, doing replace based on values of other columns, stripping whitespace etc.
Writing a usable form and database library has always been a painful experience. So why bother re-inventing the wheel when there are so many to choose from already? I am writing one mostly for learning purpose. After numerous attempts, I finally get my form and database library in shape. It is nowhere complete, but nor it is perfect, but it is currently the implementation that is closest to my original design. I will keep working on it so it can be used in my personal projects in the future.
Implementing a Information Retrieval system is a fun thing to do. However, doing it efficiently is not (at least to me). So my first few attempts didn’t really end well (mostly uses just Go/golang with some bash tricks here and there, with or without a database). Then I jumped back to Python, which I am more familiar with and was very surprised with all the options available. So I started with Pandas and Scikit-learn combo.
I am starting to like programming in Javascript, especially after understanding more about how Javascript handles scopes. Surprisingly this take me like a year to figure out how scope is managed. The result of the findings is that I start doing a lot of experiments and discovered more interesting stuff through them.
So my cheat with dask worked fine and dandy, until I started inspecting the output (which was to be used as an input for another script). While the script seemed to work fine, however when I started to parse each line I was hit with some funny syntax errors. After some quick inspection I found some of the lines was not printed completely.