Monday, January 7, 2013

"News Information Flow Tracking, Yay!" - Suen et al.

Nifty is an evolution of the Memetracker project.
This time they implemented an incremental approach to their meme clustering.
What I especially like about their paper is the massive test data set of over 20 terabytes.
This data set consists of 6.1 billion blog posts that they collected over 4 years.
A single run of their clustering over this data takes less than 5 days.
For detailed information, please refer to the paper.

"Influence and Correlation in Social Networks" - Anagnostopoulos et al.

The authors describe a statistical test for actions in social networks.
This test helps to distinguish correlation from causality for information propagation in social networks.
The authors' test first estimates the probability that one user influences another.
Afterwards, they shuffle the action times of the users and run the test again.
If the estimate barely changes after shuffling, the timing of the actions carries no signal, which points to correlation rather than influence.
For a detailed look at the statistical concept, please see the paper.
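
To make the shuffle idea more concrete, here is a minimal sketch in Python. It is not the authors' estimator: instead of the model fitted in the paper, it assumes a much simpler statistic, namely the fraction of active users who had at least one friend acting before them. The names `influenced_fraction` and `shuffle_test`, the `rounds` and `seed` parameters, and the toy data are all made up for illustration.

```python
import random


def influenced_fraction(friends, times):
    """Fraction of active users with at least one friend who acted strictly earlier.

    friends: dict mapping user -> set of friends (assumed input format)
    times:   dict mapping active user -> time of the action
    """
    if not times:
        return 0.0
    hits = 0
    for user, t in times.items():
        if any(f in times and times[f] < t for f in friends.get(user, ())):
            hits += 1
    return hits / len(times)


def shuffle_test(friends, times, rounds=200, seed=0):
    """Compare the observed statistic with its mean over shuffled action times."""
    rng = random.Random(seed)
    observed = influenced_fraction(friends, times)
    users = list(times)
    stamps = [times[u] for u in users]
    scores = []
    for _ in range(rounds):
        # Permute the time stamps among the same set of active users.
        rng.shuffle(stamps)
        scores.append(influenced_fraction(friends, dict(zip(users, stamps))))
    return observed, sum(scores) / rounds


# Toy example: a chain a - b - c - d where the action spreads along the chain.
friends = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
times = {"a": 1, "b": 2, "c": 3, "d": 4}
observed, shuffled = shuffle_test(friends, times)
print(observed, shuffled)
```

If the observed value is clearly higher than the shuffled average, the order of the actions matters, which hints at influence. If both values are about the same, the friends may simply be similar to each other.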