Big Data may be an overhyped buzzword today, with recent provocative headlines ranging from Big Data advancing human genome research to find specific markers for cancer or genetic diseases, to Big Data analytics raising big privacy questions, to the problem of our digital footprints haunting us forever. However, one recent headline gave me pause: it asked the hypothetical question of whether Big Data could have prevented the recent Fort Hood shootings.
Many government entities are increasingly using Big Data analytics for decision-making and for solving some of their toughest problems. One such Big Data program was started with seed funding from the DOD's Defense Advanced Research Projects Agency (DARPA): a project that analyses communications from Veterans who have opted in.
This program, the Durkheim Project, is named after Émile Durkheim, the sociologist whose 1897 work "Suicide" provided an early text analysis of suicide risk. Its purpose is to predict the risk of harmful behaviour by Veterans. According to the Department of Veterans Affairs, an alarming 22 Veterans take their own lives every day. The project analyses data from Veterans' social media and mobile communications, monitoring in real time for text content and behavioural patterns that are statistically correlated with tendencies towards harmful behaviour, such as suicide.
In a recent Defense One article, "Could Big Data Have Prevented the Fort Hood Shooting?", Patrick Tucker states: "Had Army Spec. Ivan Lopez been enrolled in the Durkheim Program, which uses an algorithm that mines social media posts for indicators of suicidal behaviour, it might have picked up clues that a clinician could have missed in time for an intervention." One of the founders of the Durkheim Project, Chris Poulin, is quoted in the article as saying, "Given the highly agitated state of the shooter, we may have been able to get him help before he acted, had he been in our system." Indeed, Poulin and his team of researchers developed an algorithm, based on a repository of text that is updated and analysed by artificial intelligence systems, that accurately predicts suicide 70% of the time, an improvement over average clinical diagnosis, which is about 50% accurate.
While the Durkheim Project may make a major difference in clinical diagnosis, and thus may be seen as a panacea for predicting harmful behaviour among the population that has opted in, a few problems with this approach may exist. How can the results of this study be extended to Veterans who show tendencies towards harmful behaviour but are not enrolled? Some of the phrases cited as most closely associated with suicide include "worthlessness" and "frightening." Is it as simple as providing health care providers with the key words that, coupled with behavioural patterns, statistically correlate with tendencies towards harmful behaviour?
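To make that closing question concrete, here is a minimal, purely illustrative sketch of what naive keyword-based flagging would look like. The phrases, weights, and threshold below are assumptions invented for illustration; they do not represent the Durkheim Project's actual machine-learning model, which is trained on a large clinical text corpus rather than a fixed word list.

```python
# Illustrative sketch only: a naive keyword-weighting approach to flagging
# text for clinician review. Phrase weights and the threshold are
# hypothetical values chosen for this example, NOT the Durkheim Project's
# actual model.

# "worthlessness" and "frightening" are phrases the article cites as
# associated with suicide risk; "hopeless" is an assumed extra example.
RISK_PHRASES = {
    "worthlessness": 2.0,
    "frightening": 1.5,
    "hopeless": 2.0,
}

FLAG_THRESHOLD = 2.0  # arbitrary cutoff for illustration


def risk_score(text: str) -> float:
    """Sum the weights of risk-associated phrases found in the text."""
    lowered = text.lower()
    return sum(weight for phrase, weight in RISK_PHRASES.items()
               if phrase in lowered)


def should_flag(text: str) -> bool:
    """Flag a post for human review when its score crosses the threshold."""
    return risk_score(text) >= FLAG_THRESHOLD
```

The limits of this sketch make the article's point: raw keyword matching ignores context, negation, and behavioural patterns over time, which is why statistical models trained on real clinical text outperform a handed-over word list.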