Everyone knows folksonomy is (or was) cool and useful. However, when it is applied in real life, problems arise. The idea of blogging about this came while I was struggling to get my literature review report done (I have been at it for months, I am being so ridiculous, I know). As a matter of fact, while I am dying to get it done, I found a couple of things to be blog-worthy. So, over the coming days I will be publishing brief overviews of some of the topics involved, in a really casual (read: lazy, and full of personal speculation) way, on this very humble little blog of mine.
Let’s start with a brief overview of folksonomy, which is typically a system (usually a web 2.0 application like flickr, youtube or del.icio.us) that allows users to annotate published content with keywords, also known as tags. Tagging may be open to all users or restricted to the publisher of the content, depending on the policy applied by the system. Tags are mainly used for organization, but they may also help users discover content they would otherwise never see. Well, I could really write a whole article just about folksonomy, but let’s move on to the topic of the day.
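To make that a bit more concrete, here is a toy sketch of what a tag store could look like. This is just my own illustration, not how flickr or del.icio.us actually do it: the `Folksonomy` class and its method names are made up, and everything lives in memory.

```python
from collections import defaultdict


class Folksonomy:
    """Toy in-memory tag store (hypothetical, for illustration only)."""

    def __init__(self):
        self.annotations = set()               # (user, item, tag) triples
        self.items_by_tag = defaultdict(set)   # tag -> item ids
        self.tags_by_item = defaultdict(set)   # item id -> tags

    def annotate(self, user, item, tag):
        """Record that `user` tagged `item` with `tag`."""
        tag = tag.strip().lower()  # normalise so "Cat" and "cat" match
        self.annotations.add((user, item, tag))
        self.items_by_tag[tag].add(item)
        self.tags_by_item[item].add(tag)

    def find(self, tag):
        """Organization: every item carrying this tag."""
        return sorted(self.items_by_tag.get(tag.strip().lower(), set()))

    def related(self, item):
        """Discovery: other items sharing at least one tag with `item`."""
        found = set()
        for tag in self.tags_by_item.get(item, set()):
            found |= self.items_by_tag[tag]
        found.discard(item)
        return sorted(found)
```

`find` covers the organization side, and `related` is the cheap version of discovery: anything that shares a tag with what you are looking at. A policy of "only the publisher may tag" would just be an extra check inside `annotate`.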
Firstly, one may ask, WTF is a ramp-up problem? Imagine a system that has grown as large as youtube, where hours of video clips get uploaded every minute (or even every second, I don’t quite remember, but first heard here). Expecting publishers to individually tag their videos with relevant keywords is not really practical, considering that a number of them don’t even bother to provide a description. No big deal then, one may say. However, this poses a problem to users like us: if there isn’t enough information provided, how are we supposed to find out what a video is about without wasting our time watching it, when it has a good chance of being yet-another-friggin-cat-video-that-wastes-n-minutes-of-my-life while I am really looking for something else?
If we can’t, how can youtube decide whether to put that video in a search result? Of course, a site like youtube may be able to generate a transcript using its voice recognition technology. However, if the web application is developed by just a small startup, how are they going to find out what is inside multimedia content such as video clips?
What about cold-start problems then? A cold-start problem happens when a new user registers with a system and the system doesn’t know anything about him (or her) yet. It is a problem because the system can’t recommend relevant content published by other users until the new user starts publishing and annotating his own content, or starts ‘liking’ (rating) content published by others.
Wait a minute, didn’t I just read that the cold-start problem also applies to the scenario described under the ramp-up problem above, you may ask. It’s true that some people generalize the two as the cold-start problem, but I prefer to keep them separate. They are really similar in nature, but they are still two different cases, and separating them makes the explanation easier without the risk of confusing the audience.
So how do people respond to these problems? Some systems try to work out implicit annotations from what they have in hand. The result won’t be as accurate as real tags, but it is still better than nothing most of the time. One of the real solutions is to encourage users to tag by setting some challenging goals, aka making it fun. To make the process fun, some use a badge (or achievement) system like Stackoverflow, and some turn tagging into a game. The other solution would be an auto-tagging feature, but that often involves a lot of computing magic and may not be really feasible.
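For the "implicit annotations from what they have in hand" route, the laziest thing I can think of is pulling the most frequent words out of whatever title and description the publisher did bother to write. The sketch below is only that, a naive word count. It is nowhere near real auto-tagging, and `suggest_tags`, the stopword list and the default of three tags are all my own assumptions.

```python
import re
from collections import Counter

# Tiny hand-picked stopword list, purely for illustration.
STOPWORDS = {"and", "the", "are", "this", "that", "for", "with", "you"}


def suggest_tags(title, description="", limit=3):
    """Guess tags from the most frequent words in the title and description."""
    words = re.findall(r"[a-z0-9]+", f"{title} {description}".lower())
    # Drop stopwords and very short words such as "my", "is", "a".
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    # Most frequent first; ties broken alphabetically to keep output stable.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [word for word, _ in ranked[:limit]]
```

Calling `suggest_tags("My cat plays piano", "The cat is playing the piano again")` gives `['cat', 'piano', 'again']`, which shows both the point and the weakness: "again" is a useless tag, and a video with no title or description at all still gets nothing.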
On the other hand, for the cold-start problem, a questionnaire can sometimes help build a profile for the user to kick-start his participation. However, an uninterested user can leave the questionnaire unanswered, or answer it randomly. Another way of building a useful profile would be to require a new user to rate or annotate a set of content until enough information is captured. Unless the process is fun, I don’t really see that as something doable. Demographic information can be used as a workaround, complementing the user’s profile. So, a new user would be assumed to share the same taste as the whole population in the beginning, until they develop their own taste from the system’s POV.
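The "assume the new user likes what everybody likes" fallback is easy enough to sketch. Everything below is my own assumption for illustration: the `MIN_RATINGS` threshold, the function names, and the very crude nearest-neighbour step that kicks in once a user has rated enough. A demographic version would do the same thing, but average over users in the same group instead of the whole population.

```python
from collections import defaultdict

MIN_RATINGS = 3  # hypothetical cut-off for "we know enough about this user"


def rank_by_mean(ratings):
    """Rank items by average score across the whole population."""
    scores = defaultdict(list)
    for user_ratings in ratings.values():
        for item, score in user_ratings.items():
            scores[item].append(score)
    return sorted(scores, key=lambda i: (-sum(scores[i]) / len(scores[i]), i))


def closest_user(user, ratings):
    """Other user with the smallest mean rating gap on co-rated items."""
    own = ratings[user]
    best, best_gap = None, None
    for other, theirs in ratings.items():
        shared = own.keys() & theirs.keys()
        if other == user or not shared:
            continue
        gap = sum(abs(own[i] - theirs[i]) for i in shared) / len(shared)
        if best_gap is None or gap < best_gap:
            best, best_gap = other, gap
    return best


def recommend(user, ratings, limit=3):
    """ratings maps user -> {item: score}; returns unseen items to suggest."""
    own = ratings.get(user, {})
    if len(own) >= MIN_RATINGS:
        neighbour = closest_user(user, ratings)
        if neighbour is not None:
            theirs = ratings[neighbour]
            unseen = [i for i in theirs if i not in own]
            unseen.sort(key=lambda i: (-theirs[i], i))
            return unseen[:limit]
    # Cold start: fall back to what the whole population rates highly.
    return [i for i in rank_by_mean(ratings) if i not in own][:limit]
```

A brand new user who is not in `ratings` at all simply gets the population favourites, and only starts getting personal suggestions after rating `MIN_RATINGS` items, which is exactly the "until they develop their own taste from the system’s POV" bit.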
I guess that’s it for this very brief note on cold-start and ramp-up problems. I will try to publish more in the next couple of days while I am wrapping up my literature review report. My supervisor is gonna get so pissed off with me.
Some references (in no particular order):
- Chumki Basu, Haym Hirsh, and William Cohen. Recommendation as classification: Using social and content-based information in recommendation. In Proceedings of the National Conference on Artificial Intelligence, pages 714–720. John Wiley & Sons Ltd, 1998. ISBN 0262510987. URL http://scholar.google.com/scholar?hl=en&btnG=Search&q=intitle:Recommendation+as+Classification:+Using+Social+and+Content-Based+Information+in+Recommendation#0.
- R Farzan and P Brusilovsky. Social navigation support in a course recommendation system. In V Wade, H Ashman, and B Smyth, editors, Adaptive Hypermedia and Adaptive Web-Based Systems, 4th International Conference, AH 2006, volume 4018 of Lecture Notes in Computer Science, pages 91–100. Springer Berlin Heidelberg, 2006. ISBN 9783540346968. doi: 10.1007/11768012. URL http://www.springerlink.com/content/v4221425gm747700/.
- Rong Hu and Pearl Pu. Using Personality Information in Collaborative Filtering for New Users. Proceedings of the 2nd ACM RecSys'10 Workshop on Recommender Systems and the Social Web, 2010. URL http://www.dcs.warwick.ac.uk/~ssanand/RSWeb_files/Proceedings_RSWEB-10.pdf#page=23.
- Stefan Siersdorfer and Sergej Sizov. Social recommender systems for web 2.0 folksonomies. ACM Press, New York, New York, USA, 2009. ISBN 9781605584867. doi: 10.1145/1557914.1557959. URL http://portal.acm.org/citation.cfm?doid=1557914.1557959.
- Martin Szomszor, Ciro Cattuto, Harith Alani, Kieron O'Hara, Andrea Baldassarri, Vittorio Loreto, and Vito D P Servedio. Folksonomies, the Semantic Web, and Movie Recommendation. eprints.ecs.soton.ac.uk, pages 71–84, 2007. URL http://eprints.ecs.soton.ac.uk/14007/.
- Andrew I Schein, Alexandrin Popescul, Lyle H Ungar, and David M Pennock. Methods and metrics for cold-start recommendations. Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR '02, page 253, 2002. doi: 10.1145/564376.564421. URL http://portal.acm.org/citation.cfm?doid=564376.564421.
- Douglas Eck, Paul Lamere, Thierry Bertin-Mahieux, and Stephen Green. Automatic Generation of Social Tags for Music Recommendation. Learning, 20:1–8, 2007. URL http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.67.7891&rep=rep1&type=pdf.
- H Berger, M Denk, M Dittenbach, A Pesenhofer, and D Merkl. Photo-Based User Profiling for Tourism Recommender Systems. In Giuseppe Psaila, editor, E-Commerce and Web Technologies, volume 4655/2007, pages 46–55. Springer, 2007. doi: 10.1007/978-3-540-74563-1_5. URL http://www.springerlink.com/index/jv44060013345257.pdf.
- Hao Ma, Haixuan Yang, Michael R Lyu, and Irwin King. Sorec: social recommendation using probabilistic matrix factorization. In Proceeding of the 17th ACM conference on Information and knowledge management, pages 931–940. ACM, 2008. ISBN 9781595939913. doi: 10.1145/1458082.1458205. URL http://portal.acm.org/citation.cfm?id=1458205.
- Upendra Shardanand and Pattie Maes. Social information filtering: algorithms for automating word of mouth. In I R Katz, R Mack, L Marks, M B Rosson, and J Nielsen, editors, Proceedings of the ACM Conference on Human Factors in Computing Systems, volume 1 of Proceedings of ACM CHI’95 Conference on Human Factors in Computing Systems, pages 210–217. ACM Press/Addison-Wesley Publishing Co., 1995. ISBN 0201847051. doi: 10.1145/223904.223931. URL http://portal.acm.org/citation.cfm?id=223931&coll=portal&d…N=53154003&ret=1.