(note (code cslai))

Cold-start and Ramp-Up problems

Everyone knows folksonomy is (or was) cool and useful; however, problems arise when it is applied in real life. The idea of blogging about this came while I was struggling to get my literature review report done (I have been at it for months, which is ridiculous, I know). As I am dying to get it done, I found a couple of things that are blog-worthy. So, over the coming days, I will be publishing brief overviews of some of the topics involved, in a really casual (read: lazy, and full of personal speculation) way, on this very humble little blog of mine.

Let’s start with a brief overview of folksonomy, which is typically a system (usually a web 2.0 application like Flickr, YouTube or del.icio.us) that allows users to annotate published content with keywords, also known as tags. The tagging may be done by all users or just by the publisher of the particular content, depending on the policy applied by the system. These tags are mainly used for organization, but they may also help in discovering unseen content. Well, I could really write a whole article just about folksonomy, but let’s move on to the topic of the day.
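
To make that a bit more concrete, a folksonomy can be pictured as a pile of (user, item, tag) records. Here is a minimal sketch in Python. The names (`Folksonomy`, `annotate`, `items_for`) are made up for illustration, and this is certainly not how any of the sites mentioned above actually store their tags.

```python
from collections import defaultdict


class Folksonomy:
    """Toy store of (user, item, tag) annotations; all names are hypothetical."""

    def __init__(self):
        self.triples = set()                   # raw (user, item, tag) records
        self.tags_by_item = defaultdict(set)   # item -> tags attached to it
        self.items_by_tag = defaultdict(set)   # tag -> items carrying it

    def annotate(self, user, item, tag):
        tag = tag.strip().lower()              # normalise so "Cat" == "cat"
        self.triples.add((user, item, tag))
        self.tags_by_item[item].add(tag)
        self.items_by_tag[tag].add(item)

    def items_for(self, tag):
        """Organisation / discovery: every item anyone tagged with `tag`."""
        return sorted(self.items_by_tag.get(tag.strip().lower(), set()))


f = Folksonomy()
f.annotate("alice", "video-1", "Cat")
f.annotate("bob", "video-2", "cat")
f.annotate("bob", "video-2", "funny")
print(f.items_for("cat"))  # ['video-1', 'video-2']
```

Whether everyone may call `annotate` or only the publisher is exactly the policy decision mentioned above; the sketch does not enforce either.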

Firstly, one may ask, WTF is a ramp-up problem? Imagine a system that grows as large as YouTube, where hours of video clips get uploaded every minute (or even every second, I don’t quite remember, but first heard here). Expecting publishers to individually tag their videos with relevant keywords is not really practical, considering that a number of them don’t even bother to provide a description. No big deal then, one may say. However, this poses a problem to users like us: if there is not enough information provided, how are we supposed to find out what a video is about without wasting our time watching it, when it has a good chance of becoming yet another yet-another-friggin-cat-video-that-waste-n-minutes-of-my-life while I am really looking for something else?

If we can’t, how can YouTube decide whether to put that video in a search result? Of course, it may be possible for a site like YouTube to generate a transcript using their voice recognition technology. However, if the web application is developed by just a small startup, how are they going to find out what is in multimedia content such as video clips?

What about cold-start problems then? A cold-start problem happens when a new user registers himself (or herself) with a system and the system doesn’t know anything about him due to the lack of information. It is a problem because the system won’t be able to recommend him relevant content published by other users until he starts publishing and annotating his own content, or starts ‘liking’ (rating) content published by others.

Wait a minute, didn’t I just read that the cold-start problem also covers the scenario described under the ramp-up problem above, you may ask. It’s true that there are people who lump the two together as the cold-start problem, but I prefer to keep them separate. Although they are really similar in nature, they are still two different cases, and keeping them apart makes explanation easier without the risk of confusing the audience.

So how do people respond to these problems? Some systems try to work out implicit annotations from what they have in hand; the result won’t be as accurate, but it is still better than nothing most of the time. One of the real solutions to the problem is to encourage users to tag by setting some challenging goals, a.k.a. making it fun. To make the process fun, some use a badge (or achievement) system like Stack Overflow, and some turn tagging into a game. The other solution would be to implement an auto-tagging feature, but that often involves a lot of computing magic and may not be really feasible.
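
As a rough illustration of working out implicit annotations from what is already in hand, here is a small Python sketch that guesses tags from a video’s title and description by counting words. The function name `implicit_tags`, the tiny stopword list and the "count the title twice" weighting are all my own assumptions, not something any real system is known to do, and the result is exactly the "not as accurate but better than nothing" kind.

```python
import re
from collections import Counter

# Tiny, hand-picked stopword list; a real system would need a much longer one.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "my", "is", "it",
             "for", "on", "with", "this"}


def implicit_tags(title, description="", limit=3):
    """Guess tags from whatever text the publisher did provide."""
    # Count the title twice: a crude assumption that title words matter more.
    text = f"{title} {title} {description}".lower()
    words = re.findall(r"[a-z0-9]+", text)
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    # Sort by count (descending), then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:limit]]


print(implicit_tags("My cat plays the piano", "Watch the cat hit every key"))
# ['cat', 'piano', 'plays']
```

If the publisher provides neither a title nor a description, this returns an empty list, which is the ramp-up problem all over again.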

On the other hand, for the cold-start problem, a questionnaire can sometimes help in building a profile for the user to kick-start his participation. However, an uninterested user can leave the questionnaire unanswered, or answer it randomly. Another way of building a useful profile would be to require a new user to rate or annotate a set of content until enough information is captured. Unless the process involved is fun, I don’t really see it as something that is doable. Demographic information can be used as a workaround to this problem, by using it to complement the user’s profile. So, a new user would be assumed to share the same taste as the whole population in the beginning, until they develop their own taste from the system’s POV.
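
The "same taste as the whole population" fallback can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `recommend` function, the `MIN_RATINGS` threshold, the 1 to 5 rating scale, and especially the very naive "listen to people who liked something I liked" rule that takes over once the user has rated enough.

```python
from collections import defaultdict

MIN_RATINGS = 3  # assumed threshold for "enough information"; tune to taste


def recommend(user, ratings, limit=2):
    """ratings: dict of user -> {item: score from 1 to 5}. Returns item ids."""
    seen = ratings.get(user, {})
    others = [(u, r) for u, r in ratings.items() if u != user]
    if len(seen) >= MIN_RATINGS:
        # Enough data: only listen to users who liked something this user liked.
        liked = {item for item, score in seen.items() if score >= 4}
        others = [(u, r) for u, r in others
                  if any(r.get(item, 0) >= 4 for item in liked)]
    # Otherwise (cold start) every other user counts: the newcomer is
    # assumed to share the taste of the whole population.
    totals, counts = defaultdict(float), defaultdict(int)
    for _, their in others:
        for item, score in their.items():
            if item not in seen:
                totals[item] += score
                counts[item] += 1
    # Rank unseen items by average score, ties broken by item id.
    ranked = sorted(totals, key=lambda i: (-totals[i] / counts[i], i))
    return ranked[:limit]


ratings = {
    "ann": {"a": 5, "b": 4, "c": 1},
    "ben": {"a": 5, "d": 5, "c": 2},
    "cat": {"b": 2, "e": 4},
    "new": {},
}
print(recommend("new", ratings))  # ['a', 'd']
print(recommend("ann", ratings))  # ['d']
```

The brand-new user gets whatever the population rates highest on average, while "ann", who has rated three items, only hears from "ben", the one user who liked something she liked.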

I guess that’s it for this very brief note on cold-start and ramp-up problems. I will try to publish more in the next couple of days while I am wrapping up my literature review report. My supervisor is gonna get so pissed off with me.

