A group of academic researchers has obtained the complete server logs for the Everquest 2 MMORPG. That is four years of data for over 400,000 players, and the resulting dataset is nearly 60TB. That’s right, terabytes. Combined with some demographic surveys, the logs offer interesting datamining potential.
This is also interesting because apparently the standard tools don’t quite scale to the task of analyzing this data:
Regardless of format, many one-pass, exhaustive algorithms simply choke on a dataset this large, which is forcing his group to use some incremental analysis methods or to work with subsets of the data.
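To make "incremental analysis" a little more concrete, here is a minimal sketch of one such method: Welford's online algorithm, which keeps a running mean and variance while reading records one at a time, so memory use stays constant no matter how large the input is. This is only my illustration of the general idea, not the method the researchers actually used. The `RunningStats` and `summarize` names and the session-length numbers are hypothetical.

```python
from typing import Iterable


class RunningStats:
    """Running mean and sample variance via Welford's online algorithm."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        """Fold one observation into the running statistics."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        # Uses the old mean (in delta) and the updated mean together.
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance; NaN when fewer than two values were seen."""
        if self.count < 2:
            return float("nan")
        return self._m2 / (self.count - 1)


def summarize(values: Iterable[float]) -> RunningStats:
    """Consume any iterable (for example, a generator reading a log file)."""
    stats = RunningStats()
    for value in values:
        stats.update(value)
    return stats


if __name__ == "__main__":
    # Hypothetical session lengths in hours, fed lazily through a generator
    # to stand in for records streamed from disk.
    sessions = (float(h) for h in [2, 4, 4, 4, 5, 5, 7, 9])
    stats = summarize(sessions)
    print(stats.count, stats.mean, stats.variance)
```

Because `summarize` accepts any iterable, the same code could sit behind a generator that yields values from one log file after another, and the data never has to fit in memory at once.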
Some items in the results that I found interesting:
- Gender turned out to be a negative influence on interactions: even after their low numbers were taken into account, female players avoided interacting with each other.
- Older women turned out to be some of the most committed players but significantly under-reported the amount of time they spent in the game by three hours per week (men under-reported as well, but only by one hour).
One other tidbit from the article that I would like to know more about:
Jaideep Srivastava is a computer scientist doing work on machine learning and data mining—in the past, he has studied shopping cart abandonment at Amazon.com, a virtual event without a real-world parallel.
I’ll return to the topic of datamining in the future.
(Tip of the hat to Slashdot for the link.)
(This also reminds me a bit of the researchers who were looking at World of Warcraft as a model of infectious diseases.)