Notes on codes, projects and everything
I am Lai Choon Siang, and I currently live in Subang Jaya, Malaysia. Most people I meet online know me by my Internet alias “Jeffrey04”. In my free time, I like reading books, traveling, taking photographs and blogging. I currently maintain two blogs: the first consists mostly of life journals in Mandarin, and the second, the blog you are reading now, is where I keep my findings and notes on web development. Besides that, I also help operate a Chinese blogger community website that promotes blogging in Mandarin among Malaysians.
I started learning about computers at the age of 10. The first program I learned was Delta Drawing and, from what I read recently, it is actually a kind of simplified LOGO language. I also learned a bit of LOGO before Windows 95 became popular and I quit the class that taught LOGO. Then I spent some time learning the HyperText Markup Language (HTML) around the age of 12, started writing some simple websites, and applied CSS in later years. After getting used to installing scripts, I built some bulletin boards using YaBB, YaBB SE, SMF, phpBB and Discuz without knowing much about programming.
In secondary school, I joined the Malaysia Red Crescent Society and served as Afternoon Session Representative, then Assistant Secretary, followed by Secretary and finally President. Besides that, I also joined the Chinese Language Society and served as a form representative. Through these societies, I took part in camping activities and marching competitions, and attended courses such as basic first aid. I also participated in an educational slide-making competition for physics when I was in Form 5.
Then I entered Tunku Abdul Rahman College to take my Diploma in Computer Science and Management Mathematics. Throughout the two years of study, I learned various programming languages such as C, basic Java, HTML and CSS, Classic VB and SQL, as well as other subjects like Accounting, SDLC and Mathematics (Statistics, Algebra, Calculus, Introductory Discrete Maths, etc.). Besides that, I also spent some time learning PHP after picking up the key concepts of programming in college, and built a simple blog. However, I later migrated the blog to WordPress because of a lack of time and my limited knowledge of spam prevention.
After obtaining my Diploma, I continued my studies in the Advanced Diploma cum Bachelor’s Degree (in conjunction with Campbell University, North Carolina, USA) in Computer Science and Management Mathematics on a partial scholarship offered by the college. Throughout the two years of study, the new programming languages and technologies I learned were procedural C# with ASP.NET, object-oriented VB.NET, Oracle PL/SQL, Microsoft T-SQL and SWI-Prolog. I also took theory subjects such as basic Database Design and Maintenance, basic Operating System principles, Windows 2003 Administration and Maintenance, Artificial Intelligence and OOAD, as well as Mathematics subjects such as Quality Control, Applied Statistics, Theory of Interest, Mathematics of Life Insurance and Financial Market, Discrete Mathematics, Operational Research, Cryptography, etc. In addition, I took some liberal arts units in Introduction to Psychology, Music Appreciation and Introduction to Short Story.
The topic of my final year seminar was Solving Linear and Non-linear Equations using Numerical Analysis. For my final year project, I built a web 2.0-ish tourism information website titled “Trekkr” in co-operation with Regina Foo. The project was built using PHP5 and MySQL5 with a fully object-oriented MVC architecture. We didn’t use any framework at the time as we wanted to keep the project simple (and the MVC architectural design is posted here). The website was uploaded to a real-life web server offered by my friend Clemence for testing in a production environment, and was taken down after development was completed. The web server was Apache, running on Linux.
I have now graduated from college and am looking for a job in the web development field. Unless I am developing projects in something other than PHP5, I would be on LAMP for my web development projects. I used Eclipse PDT to develop my final year project, but I am shifting to OpenKomodo for my future PHP5 projects. The operating system I use most of the time is Ubuntu Linux, as it doesn’t cost me much money to work on it.
Another half a day was spent figuring out how to package my daemon properly. Fortunately, with help from friends over at the #harmattan IRC channel as well as cckwes, I finally got the deb package generated properly. As a quick reminder of what my daemon does: it is just a quick hack that toggles the ‘allow background connections’ setting on and off depending on which kind of data network the user is connected to. Apparently I am not the only one who is looking for this, as a feature request was filed a long, long time ago.
Implementing an Information Retrieval system is a fun thing to do. However, doing it efficiently is not (at least to me). So my first few attempts didn’t really end well (they mostly used just Go/golang with some bash tricks here and there, with or without a database). Then I jumped back to Python, which I am more familiar with, and was very surprised by all the options available. So I started with the Pandas and Scikit-learn combo.
Folksonomy is a neologism formed from two words, ’folk’ and ’taxonomy’, which describes conceptual structures created by users [4, 5]. A folksonomy is the unstructured, collaborative use of tags for content classification and knowledge representation, popularized by Web 2.0 and social applications [1, 5]. Unlike a taxonomy, which is commonly used to organize resources into a category hierarchy, a folksonomy is non-hierarchical and non-exclusive. A content hierarchy and a folksonomy can be used together to improve content classification.
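To make the "non-hierarchical and non-exclusive" part concrete, here is a minimal sketch of how tag assignments could be stored. The users, resources and tags are all made up for illustration, and the two-index layout is just one assumed way to model it, not something prescribed by the cited work.

```python
from collections import defaultdict

# Hypothetical tag assignments as (user, resource, tag) triples.
# Every name here is invented for illustration.
taggings = [
    ("alice", "photo-1", "sunset"),
    ("alice", "photo-1", "beach"),
    ("bob", "photo-1", "holiday"),
    ("bob", "photo-2", "beach"),
]

# Flat index from tag to resources: no tag is the parent of another tag,
# which is the non-hierarchical part.
resources_by_tag = defaultdict(set)
# Reverse index: one resource can carry any number of tags,
# which is the non-exclusive part.
tags_by_resource = defaultdict(set)

for user, resource, tag in taggings:
    resources_by_tag[tag].add(resource)
    tags_by_resource[resource].add(tag)

print(sorted(tags_by_resource["photo-1"]))
print(sorted(resources_by_tag["beach"]))
```

In a taxonomy, "photo-1" would have to sit in exactly one branch of a tree; here it simply shows up under every tag that any user attached to it.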
While JSON is a fine data-interchange format, it does have some limitations. It is well known for its simplicity: even a non-programmer can easily compose a JSON file (but humanity will surprise you IRL). Therefore, it is found almost everywhere, from numerous web APIs, to geospatial data (GeoJSON), and even the semantic web (RDF/JSON).
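As a small illustration of both points (easy to write by hand, and easy to get slightly wrong), here is a sketch using only Python's standard `json` module. The GeoJSON-style feature, its coordinates and its name are placeholders made up for this example, and `is_valid_json` is a hypothetical helper rather than part of any library.

```python
import json

# A hand-written, GeoJSON-style feature. The coordinates and the name
# are placeholders, not real data.
text = """
{
  "type": "Feature",
  "geometry": {"type": "Point", "coordinates": [101.5, 3.0]},
  "properties": {"name": "example place"}
}
"""

feature = json.loads(text)
lon, lat = feature["geometry"]["coordinates"]
print(feature["properties"]["name"], lon, lat)


def is_valid_json(s):
    """Hypothetical helper: report whether a string parses as JSON."""
    try:
        json.loads(s)
    except json.JSONDecodeError:
        return False
    return True


# A trailing comma is a common slip when writing JSON by hand,
# and the parser rejects it.
print(is_valid_json('{"a": 1}'))
print(is_valid_json('{"a": 1,}'))
```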
Recently I switched my search code to Annoy because the input dataset is huge (7.5 million records with a dictionary count of 20k). It wasn’t without issues, though I will probably talk about them next time. In order to figure out what each parameter means, I spent some time watching the talk given by the author, @fulhack.