Nifty is an evolution of the Memetracker project.
This time they implemented an incremental approach to their meme clustering.
What I specially like about their paper is the massive test data set of over 20 terabyte.
This data set consists of 6.1 billion blog post that they collected over 4 years.
One dry run of their clustering takes hereby less than 5 days.
For detail information please refer to the paper.
No comments:
Post a Comment