Wednesday, August 1, 2012

Summary of "Credibility Improves Topical Blog Post Retrieval - Weerkamp et al."

The authors introduce 11 indicators of credibility to improve the effectiveness of topical blog retrieval. Their indicators operate at the blog level and at the post level. Besides some syntactic indicators, they also present the timeliness of posts, the regularity of blogs, and the consistency of blogs.

The timeliness of a post is defined as the temporal distance between a blog post and a news post on the same topic. In this paper, topics seem to be defined by term occurrences. Nonetheless, it is very interesting to incorporate traditional media.
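
To make the idea of a temporal distance concrete, here is a minimal sketch. It is my own illustration, not the formula from the paper: the function name `timeliness_hours`, the choice of hours as the unit, and the use of the closest news item are all assumptions.

```python
from datetime import datetime


def timeliness_hours(post_time, news_times):
    """Smallest absolute gap, in hours, between a blog post and any
    news item on the same topic (hypothetical helper, not from the paper)."""
    if not news_times:
        return None  # no news item on this topic, so no distance is defined
    gap_seconds = min(abs((post_time - t).total_seconds()) for t in news_times)
    return gap_seconds / 3600.0


post = datetime(2006, 1, 2, 12, 0)
news = [datetime(2006, 1, 1, 12, 0), datetime(2006, 1, 2, 18, 0)]
print(timeliness_hours(post, news))
```

A smaller value means the post appeared closer in time to the news coverage. How the news items for a topic are selected is left open here.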

The posting/publishing behaviour of a blogger is called regularity. Here, the authors assume that a credible blog posts at very regular intervals. In contrast, related research often treats highly regular posting as an indicator of splogs (spam blogs).
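
One simple way to quantify regularity is the spread of the gaps between consecutive posts. The sketch below is an assumption on my side and not necessarily the authors' exact measure; the name `posting_irregularity` and the choice of days as the unit are hypothetical.

```python
from datetime import datetime
from statistics import pstdev


def posting_irregularity(post_times):
    """Population standard deviation of the gaps (in days) between
    consecutive posts. 0.0 means perfectly regular posting.
    Hypothetical helper, not taken from the paper."""
    times = sorted(post_times)
    if len(times) < 3:
        return None  # fewer than two gaps, so spread is not meaningful
    gaps = [(b - a).total_seconds() / 86400.0 for a, b in zip(times, times[1:])]
    return pstdev(gaps)


daily = [datetime(2006, 1, d) for d in (1, 2, 3, 4)]
bursty = [datetime(2006, 1, d) for d in (1, 2, 5)]
print(posting_irregularity(daily), posting_irregularity(bursty))
```

Under the authors' assumption a low value would count in favour of a blog, while splog detection would read the same low value as suspicious.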

The topical consistency of a blog represents its topical fluctuation. The authors define consistency similarly to query clarity, which reminds me a bit of the tf/idf score. In contrast to related work, the authors do not use the natural ordering of posts.
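
Query clarity is usually computed as the KL divergence between a language model and the collection language model. A minimal sketch of a clarity-style score for a blog follows. It assumes unsmoothed maximum-likelihood unigram models and a collection that includes the blog's own text. The function name `clarity` and the toy token lists are my own, not from the paper.

```python
import math
from collections import Counter


def clarity(blog_tokens, collection_tokens):
    """KL divergence (in bits) between the blog's unigram model and the
    collection's unigram model. Higher means the blog's vocabulary is more
    focused relative to the collection. Hypothetical helper."""
    blog = Counter(blog_tokens)
    coll = Counter(collection_tokens)
    n_blog = sum(blog.values())
    n_coll = sum(coll.values())
    score = 0.0
    for word, count in blog.items():
        p_blog = count / n_blog
        p_coll = coll[word] / n_coll
        if p_coll == 0.0:
            continue  # term unseen in the collection: skipped in this sketch
        score += p_blog * math.log2(p_blog / p_coll)
    return score


collection = ["wine", "wine", "cars", "cars"]
print(clarity(["wine", "wine"], collection))   # focused blog
print(clarity(["wine", "cars"], collection))   # mirrors the collection
```

The tf/idf resemblance shows up in the per-term product: a term contributes a lot when it is frequent in the blog and rare in the collection. The score ignores the order of posts, which matches the remark above.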

Nevertheless, the authors show that their indicators significantly improve topical blog retrieval on the Blog06 data set.
Take a look at the paper.
