Photo - (From left) Ekaterina Pshehotskaya, Technology Development Director at InfoWatch, with Tamara Sokolova, a linguist with InfoWatch, at the conference in Kuala Lumpur, Malaysia.
Russian DLP specialist InfoWatch, a spinoff from security solutions firm Kaspersky Lab, presented a complimentary career workshop called 'What I Really Do As Linguist in Data Leakage Prevention (DLP)' as part of the International Conference on Cyber-Crime Investigation and Cyber Security (ICCICS2014), held on November 17-19, 2014 at the Asia Pacific University of Technology and Innovation (APU) in Kuala Lumpur, Malaysia.
"Nowadays we can see a global increase of data leaks in every field of business," said Pshehotskaya, InfoWatch's technology development director, during her conference presentation. "Leakage and theft can pass right to the direct or financial losses, for example, loss of customers, loss of money, reputational loss, negative references. And it can pass to the indirect losses - data transfer to direct competitors and some enticements."
Before their appearances on the conference stage, Pshehotskaya and InfoWatch linguist Tamara Sokolova outlined to Computerworld Malaysia how some of the latest advances in content analysis, derived from linguistic technologies, are being successfully used in the technical sphere for data loss prevention (DLP).
ICCICS2014 marks InfoWatch's third appearance at an international scientific conference in 2014. Earlier this year, InfoWatch technical experts presented their technologies at the International Conference on Computing Technology and Information Management (ICCTIM2014) in Dubai and at the International Conference on Digital Security and Forensics (DigitalSec2014) in the Czech Republic.
Could you give a quick introduction to how InfoWatch content analysis technologies have proved useful in cybercrime investigations?
Tamara Sokolova: InfoWatch has a set of different technologies aimed at preventing cyber crimes connected with data confidentiality. The most common case is when a malicious insider leaks internal documents marked as confidential. InfoWatch efficiently prevents such incidents with the help of digital fingerprint technology.
More advanced malefactors may delete these confidentiality marks from documents and try to leak the confidential data as is. To prevent such incidents, InfoWatch has implemented special data classification technologies, which are able to detect confidential data in unstructured data arrays.
Companies in the oil and gas segment possess such strictly confidential data as minefield charts, upstream data, etc. Financial institutions value financial insider information, cash collection data, etc. All these data are confidential and as such are covered by InfoWatch content analysis technologies for efficient protection against leak or loss.
One of the most sensitive types of data is personal data. InfoWatch analysis technologies have different modules that detect personal data in outgoing data flows: ID numbers, credit card numbers, phone numbers, e-mail addresses, etc. For Malaysia, for example, we implemented detection of MyKad, NRIC, etc.
Another case, not directly connected with cyber crime, is when company employees use expletives in business correspondence, which is strictly prohibited by corporate policy. InfoWatch analysis technologies successfully detect such violations.
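To make the idea of scanning outgoing text for personal data concrete, here is a minimal Python sketch. It is not InfoWatch's implementation: the patterns, the function names (`luhn_valid`, `find_personal_data`) and the choice of a Luhn checksum to filter card-like numbers are all assumptions made for illustration, and the MyKad pattern only checks the hyphenated 6-2-4 digit layout, not whether the date or place code is valid.

```python
import re

# Illustrative patterns only; a production detector would be far stricter.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
MYKAD_RE = re.compile(r"\b\d{6}-\d{2}-\d{4}\b")       # YYMMDD-PB-NNNN layout
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")     # 13 to 19 digits


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:        # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_personal_data(text: str) -> list[tuple[str, str]]:
    """Return (kind, value) pairs for e-mails, MyKad-like IDs and card numbers."""
    hits = []
    for m in EMAIL_RE.finditer(text):
        hits.append(("email", m.group()))
    for m in MYKAD_RE.finditer(text):
        hits.append(("mykad", m.group()))
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if luhn_valid(digits):  # drop digit runs that merely look like cards
            hits.append(("card", digits))
    return hits


if __name__ == "__main__":
    sample = "Contact a.b@example.com, card 4111 1111 1111 1111, ID 900101-14-5678."
    for kind, value in find_personal_data(sample):
        print(kind, value)
```

The checksum step is there to cut false positives: a long run of digits such as an order number matches the card pattern, but is unlikely to pass the Luhn test.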
Are you able to detail any cases where this approach has helped recently?
Tamara Sokolova: Early this year, InfoWatch implemented a project at a large Indian transport and logistics company. The customer used the InfoWatch solution to monitor HTTPS traffic (Gmail and Hotmail), block the printing and copying of sensitive data, and track outgoing data via shadow copies. The initial project ran for five months, to the great satisfaction of the CIO, and about 20 major data leaks were prevented. The customer went on to commission an expansion to other group units.
What new developments are you hoping to introduce in Malaysia?
Ekaterina Pshehotskaya: In my presentation, I will introduce the InfoWatch approach to processing unstructured data: a self-learning system, the InfoWatch Autolinguist module, which fully automates document categorization, term extraction and weight assignment.
Tamara Sokolova: I will introduce an advanced technology for clustering small data volumes, which simplifies the setup of data analysis technologies. Data clustering means dividing unstructured text arrays into subject groups: at the input we have various documents with different topics and content, and at the output we get documents classified into several categories, for example, financial, HR, tendering, logistics, etc.
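As a rough illustration of what "dividing unstructured text arrays into subject groups" can look like in practice, the Python sketch below groups a handful of short documents by word overlap. It uses a generic single-pass clustering on word-count vectors with cosine similarity, which is a stand-in chosen for brevity and not the InfoWatch technology described in the interview. The function names and the `threshold` value are assumptions.

```python
import math
import re
from collections import Counter


def vectorize(text: str) -> Counter:
    """Turn a document into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster(docs: list[str], threshold: float = 0.3) -> list[list[int]]:
    """Assign each document to the most similar existing cluster,
    or start a new cluster if none reaches the similarity threshold."""
    clusters = []  # each entry: {"centroid": Counter, "members": [doc indices]}
    for i, doc in enumerate(docs):
        vec = vectorize(doc)
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c["centroid"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [i]})
        else:
            best["centroid"].update(vec)  # fold the new document into the cluster
            best["members"].append(i)
    return [c["members"] for c in clusters]


if __name__ == "__main__":
    docs = [
        "invoice payment budget quarterly invoice",
        "hiring interview candidate salary offer",
        "budget payment invoice audit",
        "candidate interview hiring onboarding",
    ]
    print(cluster(docs))
```

With these four toy documents, the two finance-related texts end up in one group and the two HR-related texts in another. Real document sets would need stop-word handling, term weighting and a tuned threshold.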