British Gas to use Apache Cassandra for latest big data project to connect customers' boilers

Margi Murphy | Dec. 12, 2014
British Gas will use Apache Cassandra to crunch data collected from customers' boilers - to predict when they might fail - from January.

The new database will support British Gas' Connected Homes unit, an innovative startup run on lean principles that operates Hive, a smart thermostat that lets users see their home energy usage through a dashboard on their smartphone.

Its parent company, British Gas, has been an Apache house for some time, with a stack that includes the Hadoop data-processing framework.

The popularity of the Hive app, which currently competes with Google-backed Nest in the Apple App Store, has resulted in an unprecedented stream of data, which is currently pooled from 300,000 customers and processed by a team of 16.

To support this scale, the team will use a combination of Apache Spark and Cassandra, giving British Gas "the ability to process the data quickly and give people an increasingly near real-time experience," head of data analytics Jim Annings told ComputerworldUK.

He added: "At the moment, we tell the customer how they used 20 percent of their energy recently. In the future, we see this as telling the customer: 'We see you have just walked out the door and your energy usage is high, maybe you should check that you haven't left the oven on'."

But British Gas plans to further connect its customers' homes, and will begin to install sensors on a handful of customers' boilers from January, in an experimental phase of its 'Connected Boilers' project. During this pilot the data team will use Spark and Cassandra to develop algorithms that can predict when a boiler is about to fail, so the utility can send a notification to the customer and the engineering teams.

Annings explained that this is only possible with a database that has the potential to stream in real time, unlike Hadoop.

He said: "We're dealing largely with time series data, and Spark is 10 to 100 times quicker as it is operating on data in-memory. Cassandra delivers what we need today, and if you look at the Internet of Things space, that is what is really useful right now."

While Connected Homes is a separate, lean organisation within British Gas, its 200-year-old parent is undergoing a digital transformation that includes telephony, a new CRM system and the refining of its legacy data.

 
