Notes on codes, projects and everything
Trekkr is by far the largest program I have built as a student. It is my final year project, co-developed with Regina. It took us about half a year to carry out the research, learn some new concepts and develop the system. The final product is still nowhere near perfect, but the major functionality is done.
We carried out extensive research to decide which architecture to apply. We learned about 3-tier design in college, but we did not really like it and wanted another architecture to compare it with. Then we heard a lot about MVC and spent some time researching and learning it. We didn’t use any of the popular frameworks because we wanted to learn more by developing the website ourselves.
After evaluating the two architectures, we found that they can be combined, hence we came up with the design explained in this article.
*Quoted from my own final documentation.
This is one of the most important modules in the system, and it is where users interact with each other the most. Here a user posts information to a specific location to complement that location’s profile. To start, the user uploads an image to the system and creates the location profile as instructed. Once the profile is created, other users can contribute content or define tags for the profile.
However, this module doesn’t allow a piece of content to be edited directly. Instead, the system creates a duplicate entry for the content to be edited and lets the user edit the duplicate. This is because each piece of content is bound to a rating given by other users, and the system cannot guarantee that the content still deserves that rating after modification. Besides that, by implementing the edit system in this manner, all users can edit each other’s entries without worrying about the impact on the original copy’s rating. As long as a piece of content is well written and promoted by members, it will be promoted to the front page.
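The edit-by-duplicate idea can be sketched roughly as below. This is only an illustration in Python, not the code Trekkr actually runs; the `Content` class, its fields and the `edit` function are all names I made up for the example.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional

_next_id = count(1)  # hypothetical ID generator, stands in for a database key


@dataclass
class Content:
    author: str
    body: str
    rating: int = 0                      # rating given by other users
    derived_from: Optional[int] = None   # id of the entry this one was copied from
    id: int = field(default_factory=lambda: next(_next_id))


def edit(original: Content, editor: str, new_body: str) -> Content:
    """Create a duplicate carrying the edited text.

    The original entry and its rating are left untouched; the duplicate
    starts with no rating and has to earn its own.
    """
    return Content(author=editor, body=new_body, derived_from=original.id)


original = Content(author="alice", body="A quiet beach.", rating=5)
revised = edit(original, editor="bob", new_body="A quiet beach, busier on weekends.")
```

After the call, `original` still holds its text and its rating, while `revised` is a separate entry that other users can rate on its own merits.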
The site administrator does not play an important role in this module, as the power is distributed to the users. The users are expected to maintain the quality of the location profile, which is why they are allowed to rate a single location profile multiple times.
Besides that, content is promoted internally within each location profile. Since many users can contribute content, users who just want a quick look at a place may find it confusing. Therefore two distinct views are created: the normal view and the draft view. The normal view displays the most popular content, that is, the content with the highest rating, in each sub-section. However, a user who wishes to know more may go to the draft view to see all content contributed by other users. Guests, or users without an account in the system, are restricted to the normal view of the location profile.
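To make the two views concrete, here is a small Python sketch. The dictionary keys (`section`, `rating`, `body`) and the function names are assumptions for illustration only, and ties are simply resolved in favour of the entry that was contributed first.

```python
from collections import defaultdict


def draft_view(entries):
    """Draft view: every contributed entry, grouped by sub-section."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry["section"]].append(entry)
    return dict(grouped)


def normal_view(entries):
    """Normal view: only the highest-rated entry of each sub-section."""
    return {
        section: max(items, key=lambda e: e["rating"])
        for section, items in draft_view(entries).items()
    }


def view_for(entries, has_account, wants_draft=False):
    """Guests always get the normal view, whatever they ask for."""
    if has_account and wants_draft:
        return draft_view(entries)
    return normal_view(entries)


entries = [
    {"section": "food", "rating": 3, "body": "Try the satay."},
    {"section": "food", "rating": 7, "body": "Chicken rice balls are a must."},
    {"section": "history", "rating": 2, "body": "Founded around 1400."},
]
guest_page = view_for(entries, has_account=False, wants_draft=True)
```

Here `guest_page` contains one entry per sub-section even though the draft view was requested, because the visitor has no account.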
The travelogue module enables users to post their travelogues. The version submitted in this project allows the user to attach a photo. Besides posting, the system also lets the original author of a travelogue edit its content. Other users who try to edit someone else’s travelogue get an error screen informing them that they are not allowed to edit the entry. However, if a user posts a travelogue that contains inappropriate content, the travelogue may be deleted by the site administrator. Tags are used to describe the content of the travelogue; for example, a user who posts a story about a journey to Malacca may tag the entry as “Malaysia, Malacca, A Famosa” etc.
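The permission rules above boil down to two checks, sketched here in Python. The `NotAllowed` exception stands in for the error screen, and the function names and dictionary layout are hypothetical. The sketch also assumes that only the administrator deletes entries, which is the only deletion case described here.

```python
class NotAllowed(Exception):
    """Stands in for the error screen shown when an action is refused."""


def edit_travelogue(entry, user, new_body):
    """Only the original author may change the content."""
    if user != entry["author"]:
        raise NotAllowed("you are not allowed to edit this travelogue")
    entry["body"] = new_body
    return entry


def delete_travelogue(entries, entry_id, user_is_admin):
    """Deletion (e.g. for inappropriate content) is left to the administrator."""
    if not user_is_admin:
        raise NotAllowed("only the site administrator may delete a travelogue")
    entries.pop(entry_id, None)


entries = {
    1: {"author": "alice", "body": "My trip to Malacca.",
        "tags": ["Malaysia", "Malacca"]},
}
edit_travelogue(entries[1], "alice", "My weekend trip to Malacca.")
```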
The searching module searches for a travelogue entry or a location profile using tags. A tag, as mentioned earlier, is a piece of metadata used in web 2.0 applications to describe a piece of information. Therefore the developer finds it reasonable to search by tag in order to get to the desired information. The search interface is available either through a web form provided by the system or through a specific format of website address. This gives users flexibility, as they do not have to load the web form whenever they want to search for information.
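A rough Python sketch of searching through the address alone follows. The `/search/tag1,tag2` address format is made up for this example (it is not necessarily the format Trekkr uses), and so is the choice to return only items that carry every requested tag.

```python
from urllib.parse import unquote, urlparse

SEARCH_PREFIX = "/search/"  # hypothetical address format: /search/tag1,tag2


def tags_from_url(url):
    """Pull a list of normalised tags out of a search address."""
    path = urlparse(url).path
    if not path.startswith(SEARCH_PREFIX):
        return []
    raw = unquote(path[len(SEARCH_PREFIX):])
    return [tag.strip().lower() for tag in raw.split(",") if tag.strip()]


def search(tag_index, tags):
    """Return the ids of items tagged with all of the given tags.

    tag_index maps a lower-case tag to the set of item ids carrying it.
    """
    if not tags:
        return set()
    matches = [tag_index.get(tag, set()) for tag in tags]
    return set.intersection(*matches)


tag_index = {
    "malaysia": {1, 2, 3},
    "malacca": {2, 3},
    "a famosa": {3},
}
tags = tags_from_url("http://example.com/search/Malaysia,A%20Famosa")
results = search(tag_index, tags)
```

A web form would end up calling the same `search` function, with the tags taken from the submitted field instead of the address.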
The demo site can be accessed here.
After delaying for quite some time, I think I should start the project before I get bored with it. The project will be hosted on this current domain (coolsilon.com), at least for now, and will probably move to another domain if needed. The site will be either a blog aggregator or just a simple article submission site that works kinda like digg / reddit; however, to be promoted to the front page, a submission would have to impress the opposite group.
My supervisor strongly recommended using JENA for RDF-related work, but I really don’t like Java (just personal preference) and wouldn’t want to install the JRE/JVM (whatever it is called) on my shared server account, so I went to look for an alternative. After spending some time searching, I found a library called Redland that provides bindings for my current favorite language, PHP, so I decided to use it for my RDF work.
JavaScript is getting so foreign to me these days, but mostly in a better direction. I recently got myself to learn React through work, and the JSX extension makes web development bearable again. On the other hand, I picked up a little bit of Vue.js but really hated all the magic involved (no, I don’t enjoy putting code into quotes).
Implementing an information retrieval system is a fun thing to do. However, doing it efficiently is not (at least to me). My first few attempts didn’t really end well (mostly using just Go/golang with some bash tricks here and there, with or without a database). Then I jumped back to Python, which I am more familiar with, and was very surprised by all the options available. So I started with the Pandas and Scikit-learn combo.
So apparently Annoy now splits points using the centroids of 2-means clustering. It is claimed that this provides better results for ANN search; however, how does this impact regression? Purely out of curiosity, I plugged in a new point-splitting function and generated a new set of points.