According to research firm IDC, the world is expected to create 180 zettabytes of data in 2025, up from less than 10 zettabytes in 2015. This data deluge will not benefit a company much unless it can turn that data into actionable insights. I spoke to Gideon Mann, head of data science at Bloomberg, to find out more about how data science can give financial institutions a leg up on competitors.
Bank IT Asia: Even though the term data science has been used a lot lately, not everyone truly understands it. How different is data science from big data and analytics?
Gideon Mann: Data science is really the merger of statistics and programming. Big data is typically more of a programming technique, while analytics is a set of statistical techniques. Data science is really about applying machine learning methods to large volumes of data.
At Bloomberg, data science is non-conventional and focuses on three technology areas -- natural language processing (NLP), information retrieval and search, and core machine learning.
What are the benefits that data science can offer to the financial sector which traditional analytics can't?
When I think about traditional analytics in the financial sector, I take that to mean financial mathematics. Financial mathematics is really best used to look at structured data -- data that is fairly robust, not as noisy and where you have some idea of which models it may fit. One example is the Black-Scholes model for determining fair prices of options.
In contrast, I think data science and machine learning can have the most impact when you have new data sources that are unstructured, that are noisy and not that robust. Marrying this unstructured data with structured data brings additional benefits and gives you an advantage over traditional analytics.
Finally, in cases where you have only a few parameters to work with in traditional financial mathematics, you can apply data science to the same problem once you collect enough data. You can increase the parameter space and get a potentially more effective model.
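To make the Black-Scholes reference above concrete, here is a minimal Python sketch of the standard closed-form price for a European call option. It is an illustration, not anything Bloomberg is stated to use: the function names and the input values (spot, strike, interest rate, volatility, time to expiry) are assumptions chosen for the example.

```python
import math


def norm_cdf(x: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot: float, strike: float, rate: float,
                       vol: float, years: float) -> float:
    """Black-Scholes price of a European call on a non-dividend-paying asset.

    rate and vol are annualised (e.g. 0.05 for 5%); years is time to expiry.
    """
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (
        vol * math.sqrt(years)
    )
    d2 = d1 - vol * math.sqrt(years)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)


# Hypothetical inputs, chosen only for illustration.
price = black_scholes_call(spot=100.0, strike=100.0, rate=0.05, vol=0.2, years=1.0)
print(round(price, 2))  # about 10.45
```

The whole model has just a handful of parameters, which is the kind of compact, structured setting Mann describes as the home ground of financial mathematics.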
What should financial institutions do if they want to fully benefit from data science?
First, you need to have a team that knows data science and machine learning. Many people say they can do both things, but they have to take classes and be educated. If you know probability, statistics, linear algebra, optimisation and calculus, then you have the background to understand machine learning. But you do have to take that coursework.
A second consideration is around technology support. Data science and machine learning methods are very computationally intensive. You need enough machines, infrastructure that is built out to hold all the data and algorithms, and you have to be prepared to support all the pieces that you need.
Lastly, data science methods require a certain amount of labeled data -- data which a human has sat down and annotated. What this means is that for every input, there is a corresponding output that you want. Without investing in the labeling process, you usually cannot reap the benefits of data science. You have to think about your annotation strategy in addition to your data strategy.
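As a rough illustration of what labeled data looks like in practice, the Python sketch below pairs each input with the output a human annotator assigned, then fits a deliberately simple word-count classifier to those pairs. The headlines, the labels and the function names are all hypothetical, invented for this example, and a real system would use far more data and a proper machine learning library.

```python
from collections import Counter, defaultdict

# Hypothetical annotated examples: each input (a headline) is paired with
# the output (a sentiment label) that a human annotator chose for it.
labeled_data = [
    ("profits surge after strong quarter", "positive"),
    ("shares rally on upbeat forecast", "positive"),
    ("company warns of heavy losses", "negative"),
    ("shares slump after profit warning", "negative"),
]


def train(examples):
    """Count how often each word appears under each label."""
    counts = defaultdict(Counter)
    for text, label in examples:
        counts[label].update(text.split())
    return counts


def predict(counts, text):
    """Pick the label whose training examples share the most words with text."""
    words = text.split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(scores, key=scores.get)


model = train(labeled_data)
prediction = predict(model, "shares rally after strong quarter")
print(prediction)  # positive
```

Without the label in each pair there would be nothing for `train` to count against, which is why the annotation effort has to be planned alongside the data collection itself.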