Friday, August 23, 2013

"Counting Little Words in Big Data" - C. Chung and J. Pennebaker

I just read a really nice paper about using word counts to derive psychological measures.
Their method, called language style matching (LSM), produced nice correlations with diverse phenomena.
For example, they show a relation between LSM and relationship stability (via dating protocols), income distribution (via Craigslist) and wiki site rankings (via discussion threads).
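
To make the idea a bit more concrete, here is a minimal sketch of how an LSM score between two texts could be computed. As far as I understand the LSM literature, each function-word category is scored as 1 - |p1 - p2| / (p1 + p2 + 0.0001), where p1 and p2 are the percentages of words in that category in the two texts, and the scores are then averaged. The tiny word lists and the names `FUNCTION_WORDS`, `category_rates` and `lsm` are my own placeholders for illustration, not the dictionaries or code the authors used.

```python
import re

# Hypothetical, heavily abbreviated function-word categories.
# Assumption: real analyses rely on much larger dictionaries.
FUNCTION_WORDS = {
    "pronouns": {"i", "you", "we", "he", "she", "they", "it"},
    "articles": {"a", "an", "the"},
    "prepositions": {"in", "on", "of", "to", "with", "for", "at"},
    "conjunctions": {"and", "but", "or", "because"},
    "negations": {"no", "not", "never"},
}


def category_rates(text):
    """Return the percentage of words in each function-word category."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words) or 1  # avoid division by zero for empty texts
    return {
        cat: 100.0 * sum(w in vocab for w in words) / total
        for cat, vocab in FUNCTION_WORDS.items()
    }


def lsm(text_a, text_b):
    """Average the per-category style matching scores of two texts."""
    rates_a = category_rates(text_a)
    rates_b = category_rates(text_b)
    scores = [
        # 0.0001 keeps the denominator non-zero when both rates are 0
        1 - abs(rates_a[c] - rates_b[c]) / (rates_a[c] + rates_b[c] + 0.0001)
        for c in FUNCTION_WORDS
    ]
    return sum(scores) / len(scores)
```

Calling `lsm(text_a, text_b)` on two messages gives a value between 0 and 1, where values closer to 1 mean the two writers use function words at more similar rates.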

Further, they refer to some nice examples of how to use huge collections of data to derive new knowledge. One of them gives you a real-time impression of the current mood of Twitter users and serves as a historical view of our linguistic features.

