This article reveals something that might strike fear in our anonymously blogging hearts: a new program that uses algorithms to match the style of anonymous writing samples to their original authors. The work evolved out of the "Gender Guesser" or "Gender Genie" project, created in 2003 by a team of researchers from the Illinois Institute of Technology and Bar-Ilan University in Israel.
DHS uses technology to unmask Anonymous hacktivists
by Adrian Lee
Two years ago, the Department of Homeland Security created a priority tasking: how to tie Internet writing to Anonymous hacktivists in a way that would stand up in a court of law. According to one DHS official who was not authorized to comment publicly, the project grew out of the "Gender Guesser" program created by a group of researchers in 2003.
"We figured if an algorithm could guess the gender of a writer, then it could eventually match the style of an anonymously written text to the actual author."
Using writing samples from hundreds of known authors and volunteers within the DHS, researchers at the Illinois Institute of Technology and Bar-Ilan University in Israel reunited and began working on a program that would expand upon their previous work.
Although the algorithm is in its early stages, it correctly matches writing samples of at least 500 words to a sample of known writing at least 85% of the time.
Dr. Adam Levi, the lead researcher, explains the high match rate: "People can't hide who they are. They unconsciously give away elements that are unique to their writing: turns of phrase, word order, and even punctuation. All of these elements are unconscious, much like we found when working on the 'Gender Guesser' program a decade ago."
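To make the idea of matching on unconscious stylistic habits concrete, here is a minimal Python sketch. It is not the program described in the article, whose methods are not published here. It assumes a simple, generic stylometric approach: count how often a text uses a few common function words and punctuation marks, then compare texts by cosine similarity. The word list, the punctuation list, and the function names (style_vector, cosine_similarity, best_match) are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical feature set (an assumption, not taken from the article):
# a few common function words plus a few punctuation marks.
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that", "it", "is", "but"]
PUNCTUATION = [",", ";", ":", "!", "?", "-"]


def style_vector(text):
    """Return relative frequencies of function words and punctuation marks."""
    words = re.findall(r"[a-z']+", text.lower())
    word_counts = Counter(words)
    total_words = max(len(words), 1)   # avoid division by zero on empty text
    total_chars = max(len(text), 1)
    word_part = [word_counts[w] / total_words for w in FUNCTION_WORDS]
    punct_part = [text.count(p) / total_chars for p in PUNCTUATION]
    return word_part + punct_part


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match(anonymous_text, known_samples):
    """Return the name of the known author whose style vector is closest."""
    target = style_vector(anonymous_text)
    return max(
        known_samples,
        key=lambda name: cosine_similarity(target, style_vector(known_samples[name])),
    )
```

Calling best_match with an anonymous text and a dictionary that maps author names to known writing samples returns the name with the most similar style vector. A toy feature set like this would be unreliable on real text, and it says nothing about how the accuracy figure quoted above was obtained.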
Another official at DHS hopes that the algorithm will begin to work on smaller samples.
"We hope to eventually be able to match extremely small writing samples, such as those found on Twitter, to assist law enforcement in apprehending and prosecuting those who hide behind anonymity on the web."
The rest of the article is here