Personalized Social Search Based on the User’s Social Network
David Carmel et al. -
IBM Research Lab in Haifa, Israel (CIKM-2009)
Personalizing the search process means taking the searcher’s personal
attributes and preferences into account, so that the system can, for example,
perform personalized query expansion, re-rank the result list, or filter the
presented information. However, few studies consider the preferences of the
people related to the user in a social network, which can be used to enrich
the user’s own preferences. In the paper “Personalized Social Search Based on
the User’s Social Network”, Carmel et al. present a study that leverages both
of these information sources to improve social search, that is, search over
“social” data (e.g., documents or people) gathered from Web 2.0 applications.
In particular, they re-rank search results by considering the searcher’s
relationships with other users in his or her social network (SN).
First, Carmel et al. build user profiles from data collected in IBM Lotus
Connections, a social software application, using SaND, an aggregation tool
for information discovery and analysis over social data. They consider two
kinds of relations between entities: direct relations and indirect relations
(two entities are indirectly related if both are directly related to the same
entity). Moreover, to compare the influence of different SNs, they experiment
with three types of SN for personalization, plus a topic-based profile. Each
network maps a user to a list of related users weighted by relationship
strength.
- Familiarity SN: direct familiarity relation and indirect familiarity relation
- Similarity SN: similarity between two individuals according to common activity; for example, co-usage of the same tag and co-tagging of the same document
- Overall SN: contains all related people according to the full relationship model
- Topic-based: the user’s top related terms serve as a topic-based profile (a term profile, not an SN)
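To make the two relation types and the "user to weighted list of related users" mapping concrete, here is a minimal Python sketch. It is not the SaND relationship model: the graph layout, the function names, and the choice to weight a related user by the number of shared entities are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict

def indirect_relations(direct):
    """Return entity -> set of indirectly related entities.

    `direct` maps each entity to the set of entities it is directly
    related to (assumed symmetric). Two entities are indirectly related
    if both are directly related to the same entity.
    """
    indirect = defaultdict(set)
    for hub, neighbours in direct.items():
        for a in neighbours:
            for b in neighbours:
                if a != b:
                    indirect[a].add(b)
    return dict(indirect)

def related_users(user, direct, users):
    """Rank other users by how many entities they share with `user`.

    The shared-entity count is a hypothetical stand-in for the
    relationship strength; the real weighting is not described here.
    """
    counts = Counter()
    for entity in direct.get(user, ()):
        for other in direct.get(entity, ()):
            if other != user and other in users:
                counts[other] += 1
    return counts.most_common()

# Toy data: alice and bob share a document and a tag, carol only the document.
direct = {
    "alice": {"doc1", "tagA"},
    "bob": {"doc1", "tagA"},
    "carol": {"doc1"},
    "doc1": {"alice", "bob", "carol"},
    "tagA": {"alice", "bob"},
}
users = {"alice", "bob", "carol"}
ranking = related_users("alice", direct, users)  # bob ranks above carol
```

In this toy graph, bob ends up ahead of carol in alice's list because he shares two entities with her instead of one, which mirrors the co-tagging intuition behind the Similarity SN.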
For each user u, the profile includes the ranked list of users related to u
and the ranked list of related terms. The re-ranking mechanism computes three
scores to rank entities: the non-personalized SaND score, the score derived
from related users, and the score derived from related tags. The relative
impact of the three scores is controlled by two parameters.
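One plausible way to combine three scores with two parameters is a nested linear interpolation, sketched below in Python. This summary does not give the exact formula, so the combination itself, the parameter names alpha and beta, and the assumption that all three scores are already on a comparable scale are illustrative guesses rather than the authors' definition.

```python
def personalized_score(sand_score, user_score, tag_score, alpha=0.5, beta=0.5):
    """Blend the three scores with two weights (assumed form).

    alpha balances the non-personalized score against the personalized part;
    beta balances the related-users score against the related-tags score.
    """
    personalized = beta * user_score + (1 - beta) * tag_score
    return alpha * sand_score + (1 - alpha) * personalized

def rerank(candidates, alpha=0.5, beta=0.5):
    """Sort (entity, sand_score, user_score, tag_score) tuples by blended score."""
    scored = [
        (entity, personalized_score(s, u, t, alpha, beta))
        for entity, s, u, t in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical scores: d1 is strong without personalization,
# d2 is strongly tied to the searcher's related users and tags.
candidates = [("d1", 0.9, 0.1, 0.1), ("d2", 0.5, 0.9, 0.8)]
baseline = rerank(candidates, alpha=1.0)      # personalization switched off
personal = rerank(candidates, alpha=0.0)      # personalization only
```

With alpha set to 1.0 the ranking falls back to the non-personalized order, and lowering alpha lets the user's network and tags pull d2 to the top.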
For evaluation, they run an offline study and a user study. The offline study
shows that the non-personalized method performs worst, while the Similarity-SN
significantly outperforms the other two networks. In the user study, they
recruit 240 participants with 577 personal queries. The quality of the
personalized search results is measured by NDCG@10 and P@10. Here too, the
non-personalized method performs worst. Among the personalized methods,
however, the Overall-SN, which is the weakest of them in the offline study,
performs best. The difference between the two studies may be due to the
different kinds of queries, or to the offline approach, which discriminates
against authored or commented documents and is biased toward tagged documents.
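For readers unfamiliar with the two metrics, the following Python sketch shows one common way to compute P@10 and NDCG@10 from a list of graded relevance judgments in ranked order. The linear-gain, log2-discount variant of DCG and the use of the judged list itself to build the ideal ranking are assumptions; the paper may use a different variant.

```python
import math

def precision_at_k(relevances, k=10):
    """Fraction of the top-k results judged relevant (grade > 0)."""
    return sum(1 for r in relevances[:k] if r > 0) / k

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain with linear gain and log2 discount."""
    return sum(r / math.log2(i + 2) for i, r in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    """DCG normalized by the DCG of the same judgments sorted ideally."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical judgments for one query, in the order the system ranked them.
judged = [2, 0, 1, 0, 0, 2, 0, 0, 1, 0]
p10 = precision_at_k(judged)
ndcg10 = ndcg_at_k(judged)
```

P@10 only counts how many of the top ten results are relevant, whereas NDCG@10 also rewards placing the most relevant results near the top, which is why the two are usually reported together.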