Another emerging problem in data science is that a data analysis system is very difficult to maintain over time, given its complexity. As the people who developed the analysis algorithms move on to other jobs or retire, an organization may have difficulty finding others who understand how the code works, Markl said.
Another challenge will be visualization, said Pak Chung Wong, chief scientist at the Department of Energy's Pacific Northwest National Laboratory. Visualization has long been a proven technique to help humans pinpoint trends and unusual events buried in large amounts of data, such as log files.
Standard visualization techniques may not work well with petabyte- and exabyte-sized datasets, Wong warned. Such datasets may be arranged in hierarchies that can go 60 levels deep. "How can you represent that?" he asked.