MIT's AI-based system combs through data and presents suspicious activity to human analysts. Credit: Kalyan Veeramachaneni/MIT CSAIL
Neither humans nor AI has proved overwhelmingly successful at maintaining cybersecurity alone, so why not see what happens when you combine the two? That's exactly the premise of a new project from MIT, and it's achieved some pretty impressive results.
Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and machine-learning startup PatternEx have developed a new platform called AI2 that can detect 85 percent of attacks. It also reduces the number of "false positives" -- nonthreats mistakenly identified as threats -- by a factor of five, the researchers said.
The system was tested on 3.6 billion pieces of data generated by millions of users over a period of three months. The researchers presented a paper summarizing the project earlier this month at the IEEE International Conference on Big Data Security.
"You can think about the system as a virtual analyst," said CSAIL research scientist Kalyan Veeramachaneni, who developed AI2 with Ignacio Arnaldo, a chief data scientist at PatternEx and a former CSAIL postdoc. "It continuously generates new models that it can refine in as little as a few hours, meaning it can improve its detection rates significantly and rapidly."
Even as fears abound regarding the job-replacing potential of artificial intelligence, it's becoming increasingly apparent that combining AI with human insight can deliver much better results than either side could produce alone. Just last week, for example, Spare5 released a new platform that applies a combination of human insight and machine learning to help companies make sense of unstructured data.
In the world of cybersecurity, human-driven techniques typically rely on rules created by human experts and therefore miss any attacks that don't match those rules. Machine-learning approaches, on the other hand, rely on anomaly detection, which tends to trigger false positives that breed distrust of the system and still must be investigated by humans.
Creating cybersecurity systems that merge human and computer-based approaches isn't easy, though, partly because of the challenge of manually labeling cybersecurity data for the algorithms. For many tasks, such as visual recognition, labeling is just a matter of enlisting a few human volunteers on a crowdsourcing site like Amazon Mechanical Turk, but not many workers have the skills needed to apply labels like "DDoS" or "exfiltration attacks," Veeramachaneni said. "You need security experts."
Experts, meanwhile, tend to be short on time. Recognizing that constraint, AI2 uses machine learning first to find the most important potential problems; only then does it show the top events to analysts for labeling. On day one of its training, AI2 picks the 200 "most abnormal" events using unsupervised machine learning and gives them to human analysts, MIT explained. The analysts then confirm which events are actual attacks, and the system incorporates that feedback into its models for the next set of data.
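The loop described above -- score events unsupervised, surface only the top-k to an analyst, then fold the labels into a supervised model -- can be sketched in miniature. This is an illustrative toy, not AI2's actual implementation: the single numeric feature, the z-score outlier measure, the simulated analyst oracle, and the threshold "model" are all stand-ins for the richer log features and models the real system uses.

```python
import random
import statistics

random.seed(0)

# Hypothetical toy data: each "event" is one numeric feature
# (say, login attempts per hour), tagged with ground truth that
# stands in for the analyst's knowledge.
normal  = [random.gauss(10, 2) for _ in range(500)]
attacks = [random.gauss(25, 3) for _ in range(10)]
events  = [(x, False) for x in normal] + [(x, True) for x in attacks]
random.shuffle(events)

# Step 1: unsupervised outlier scoring -- distance from the mean
# in standard deviations (a stand-in for AI2's unsupervised models).
values = [x for x, _ in events]
mu, sigma = statistics.mean(values), statistics.stdev(values)
scored = sorted(events, key=lambda e: abs(e[0] - mu) / sigma, reverse=True)

# Step 2: show only the k "most abnormal" events to the analyst,
# who labels each one (here, the ground-truth tag plays the analyst).
k = 20
labeled = scored[:k]

# Step 3: fold the analyst's labels into a supervised model -- here
# just a decision threshold placed between the labeled benign and
# labeled attack events.
attack_vals = [x for x, is_attack in labeled if is_attack]
benign_vals = [x for x, is_attack in labeled if not is_attack]
threshold = (min(attack_vals) + max(benign_vals)) / 2

# Step 4: the refined model screens the full event stream; the next
# day's top-k would be drawn from what it still finds abnormal.
detected = sum(x >= threshold for x, is_attack in events if is_attack)
print(f"detected {detected} of {len(attacks)} attacks")
```

The key economy is in step 2: the analyst labels only 20 events out of 510, yet the supervised model built from those labels generalizes to the whole stream -- the same leverage AI2 gets from showing experts only 200 events per day.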