For real-time data, a key goal is the lowest possible business latency, i.e. the time from the arrival of new data from the exchange to the point at which the trading application makes a decision based on it. The competition for low latency hinges on the fact that if two companies are running similar trading strategies, the company that delivers new data to its algorithms first will be in a stronger position.
The measure here is how much faster than the competition this can be done, not the absolute time taken from the exchange. The benefits can manifest themselves in numerous ways: for example, if one firm is significantly faster than its rivals, the head start allows more thorough pre-trade risk checks to be undertaken.
At present, many firms face a trade-off between pre-trade checks and the fastest possible execution of trading strategies. It is system limitations that create this trade-off, rather than any structural challenge. Typically, the key to balancing it is to reduce the number of steps through which the data is routed, the time taken for each step, and the time spent converting data between different formats between steps. Firms should demand the levels of performance that make this possible.
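A simple back-of-the-envelope model makes the point about conversion steps concrete. The sketch below is purely illustrative (the hop names and microsecond costs are hypothetical): total business latency is modelled as the sum of per-hop processing, plus a conversion penalty paid between each adjacent pair of hops that use different data formats.

```python
# Illustrative latency model (hypothetical costs): total business latency is
# per-hop processing plus a conversion penalty between hops whose data
# formats differ. Hop names and numbers are assumptions, not measurements.

def pipeline_latency_us(hop_costs, conversion_cost_us=0.0):
    """hop_costs: list of (format_name, processing_cost_us) pairs.
    A conversion is paid between adjacent hops with different formats."""
    total = sum(cost for _, cost in hop_costs)
    for (fmt_a, _), (fmt_b, _) in zip(hop_costs, hop_costs[1:]):
        if fmt_a != fmt_b:
            total += conversion_cost_us
    return total

# Four hops, each with its own format: three conversions are paid.
mixed = [("wire", 5.0), ("cep", 8.0), ("risk", 6.0), ("algo", 4.0)]
# The same hops sharing a single database format: no conversions.
unified = [("db", 5.0), ("db", 8.0), ("db", 6.0), ("db", 4.0)]

print(pipeline_latency_us(mixed, conversion_cost_us=10.0))    # 53.0
print(pipeline_latency_us(unified, conversion_cost_us=10.0))  # 23.0
```

With these made-up numbers the format conversions dominate the pipeline, which is why a single format throughout pays off even when per-hop processing is unchanged.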
Once the exchange data has been received, further points to consider in managing it are:
* Store intra-day and real-time data in-memory to provide the fastest possible access
* Run analytics and queries directly on the data as it is received over feed-handlers
* Eliminate the cost in time and memory of marshalling data between different formats by using a single database format throughout - for streaming queries, CEP, intra-day storage and history
* Use 'publish and subscribe' mechanisms to offload processing from the main server to chained servers, thus making maximum use of cores and machines
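The first and last points above can be sketched together: an in-memory intraday store that also fans each tick out to subscribers, so chained servers and analytics see data as it arrives rather than querying it later. This is a minimal illustration in Python (the `TickStore` class, its methods and the symbol name are all hypothetical, not any vendor's API).

```python
# Minimal sketch (hypothetical API): intra-day ticks are held in RAM and
# published to subscribers as they arrive, so analytics such as VWAP can
# run directly on the live data.
from collections import defaultdict

class TickStore:
    def __init__(self):
        self.table = defaultdict(list)   # in-memory intra-day storage
        self.subs = defaultdict(list)    # symbol -> subscriber callbacks

    def subscribe(self, sym, callback):
        self.subs[sym].append(callback)

    def upsert(self, sym, price, size):
        tick = (sym, price, size)
        self.table[sym].append(tick)     # store in RAM
        for cb in self.subs[sym]:        # fan out to chained consumers
            cb(tick)

    def vwap(self, sym):
        rows = self.table[sym]
        notional = sum(p * s for _, p, s in rows)
        volume = sum(s for _, _, s in rows)
        return notional / volume if volume else None

store = TickStore()
store.subscribe("VOD.L", lambda t: print("chained server got", t))
store.upsert("VOD.L", 102.0, 300)
store.upsert("VOD.L", 102.5, 100)
print(store.vwap("VOD.L"))  # 102.125
```

In a production system the subscribers would be separate processes on other cores or machines; the structure, however, is the same: one writer, many chained readers.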
For historical data, the main problem is the sheer size of the data, which can run into the petabytes. All this data must be stored in a way that permits efficient access for all of the various applications that consume it. Unlike the real-time database, a historical database cannot realistically be held in RAM; it must instead reside on disk.
Several strategies can be used to maximise performance on large historical databases.
Data should be stored on different media depending on the speed of access required: very fast access will be needed for data from the previous few days, whereas older data can sit on more traditional drives that take longer to retrieve from. Importantly, this not only provides cost savings but can also enhance performance. Data compression can also be used to reduce storage requirements, and this can help to reduce network latency.
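Such a tiering policy is easy to express as a rule over the age of each daily partition. The sketch below is a hypothetical policy, not a recommendation: the tier names and the age thresholds are assumptions chosen purely for illustration.

```python
# Hypothetical tiering policy: place each day's partition on storage media
# according to its age. Tier names and thresholds are illustrative only.
from datetime import date, timedelta

def tier_for(partition_date, today):
    age_days = (today - partition_date).days
    if age_days <= 2:
        return "ram-cache"        # hottest few days: fastest access
    if age_days <= 30:
        return "ssd"              # recent history: fast disk
    return "hdd-compressed"       # older data: compressed, slower drives

today = date(2011, 6, 1)
for days_old in (0, 7, 90):
    part = today - timedelta(days=days_old)
    print(part, tier_for(part, today))
```

A nightly job would apply the rule to migrate partitions between tiers; queries then hit the fast media for the recent dates they touch most often.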