Toomre Capital Markets LLC

Real-Time Capital Markets -- Analytics, Visualization, Event Processing, and Intelligence

Algorithmic Trading: Stream Processing at Breaking Speeds

Algorithmic trading is a hot topic in the global financial community, especially with the coming complete implementation of Regulation National Market System ("Reg NMS") in the United States and the Markets in Financial Instruments Directive ("MiFID") in Europe. The recent Toomre Capital Markets posts "White Paper: Market Risk and Algorithmic Trading" and "Pondering Broker/Dealer Liquidity Management" have generated considerable web traffic to this site from North America, Europe (particularly London and Germany) and the Asia-Pacific region (particularly Australia). Some inquiries have asked why Toomre Capital Markets LLC works closely with StreamBase Systems, which makes very fast, enterprise-class stream processing engine software. Others have been confused about whether the StreamBase software is an algorithmic trading application like the Apama ESP Platform from Progress Software or Velocity from Vhayu Technologies.

The StreamBase software is not an algorithmic trading application per se. Rather, it is very fast streaming technology that many leading financial firms and large asset managers use (or employ firms like TCM) to develop their own custom applications in areas like smart order routing, market data applications, algorithmic trading, transaction cost analysis (both pre-trade and post-trade analytics), and real-time compliance programs. In short, the StreamBase software offers:

  • High Data Volume / Low Latency Processing
  • Stream-Based Programming with StreamSQL
  • Unification of Real-Time Data Streams with Historic Data
  • Graphical Integrated Development Environment
  • Enterprise-class Infrastructure

On January 14, 2007, Computerworld (Australia) published an article that explains why, for instance, Bridgewater Associates, an investment manager with over $160 billion in assets under management including more than $20 billion in the firm's hedge fund product, Pure Alpha, uses stream processing software. The article, "Stream Processing at Breaking Speeds" by Robert L. Mitchell, begins:

    At investment management firm Bridgewater Associates, access to real-time data is measured in market ticks. Data feeds containing quote and trade activity are expected to stream in at 124,000 messages per second this year, so even subsecond delays in the arrival of data can affect trading decisions and put the U.S.-based organization at a disadvantage.

    Monitoring high volumes of data that have very low latency requirements is beyond the capabilities of transactional databases, which must write each transaction to disk, so financial services firms traditionally build their own custom applications to keep up. "There is a lot of effort required to build a framework that could perform and deal with lots of data concurrently," says Ed Thieberger, head of training technology at Bridgewater.

    Recently, however, Bridgewater and other financial services firms have found an alternative in stream processing tools. Stream processing software goes by a variety of names, including streaming databases and event stream processing. The technology includes an engine that monitors data as it flows into and out of databases and other applications and can easily tap into external data feeds or internal message queues. All the data the engine gathers is held in memory to speed processing.

    With data volumes increasing, organizations are running out of options for real-time processing. Financial services firms have little choice but to pursue stream processing because data quantities are starting to outstrip the capabilities of even custom-developed tools.

    "At these volumes, traditional techniques won't scale," says Mike Stonebraker, co-founder and chief technology officer at Massachusetts-based StreamBase Systems. Bridgewater's custom C++ program could handle 18,000 messages per second -- more than the 900 a relational database could support, but far short of the data volumes it faces [in 2006 and beyond]. In contrast, the StreamBase engine can handle more than 140,000 messages per second.

    The key to this type of software is that the users set up rules-based or time-based parameters that tell the stream processing engine what to look for. The software then continually monitors all of the data streams that are defined in that streaming query. One streaming application can also feed another with one or more resulting data streams. Hence, a possible algorithmic trading system can include multiple streaming queries.
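
The chaining described above can be illustrated outside of StreamBase. The following is a minimal Python sketch, not StreamSQL and not the StreamBase API: the Tick record, the large_trades rule, and the one-second window are hypothetical names and values chosen only for illustration. It shows a rules-based query whose output stream feeds a time-based query.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class Tick(NamedTuple):
    ts: float      # seconds since the start of the session (assumed unit)
    symbol: str
    price: float
    size: int


def large_trades(ticks: Iterable[Tick], min_size: int) -> Iterator[Tick]:
    """Rules-based query: pass through only ticks of at least min_size shares."""
    for t in ticks:
        if t.size >= min_size:
            yield t


def volume_per_window(ticks: Iterable[Tick], window: float) -> Iterator[Tuple[float, int]]:
    """Time-based query: emit (window_start, total_size) per tumbling window."""
    current_start = None
    total = 0
    for t in ticks:
        start = (t.ts // window) * window
        if current_start is None:
            current_start = start
        elif start != current_start:
            # The window has closed: emit its total and start a new one.
            yield current_start, total
            current_start, total = start, 0
        total += t.size
    if current_start is not None:
        yield current_start, total  # flush the last, still-open window


ticks = [
    Tick(0.1, "IBM", 100.0, 500),
    Tick(0.4, "IBM", 100.1, 50),   # dropped by the rule below
    Tick(0.9, "IBM", 100.2, 700),
    Tick(1.2, "IBM", 100.1, 900),
]

# One query feeds the next, as in a multi-query streaming application.
windows = list(volume_per_window(large_trades(ticks, min_size=100), window=1.0))
print(windows)  # [(0.0, 1200), (1.0, 900)]
```

Because both stages are generators, each tick is processed as it arrives and only the current window total is held in memory.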

    One streaming query might clean up and time-align multiple market data feeds from, say, Reuters, Wombat, and Comstock to ensure that the resulting consolidated data stream is both complete (missing data issue) and most timely (while eliminating lagging duplicate data received from the other feeds). "At Bridgewater, Thieberger uses StreamBase's streaming technology to watch for delays in data feeds coming in from providers of market data. If one feed falls behind, StreamBase immediately issues an alert and splices in the missing data from another source. 'The tool is very well suited to represent all of the rules we want to implement that lead to decisions about how we are trading,' Thieberger says."
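
A feed-consolidation query of this kind might look roughly like the Python sketch below. It assumes, purely for illustration, that every feed carries a shared sequence number per symbol so duplicates can be recognized; the Quote fields, the feed names, and the max_lag threshold are hypothetical. The first copy of each message wins, later copies are dropped, and a feed that has been silent for longer than max_lag triggers one alert.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class Quote(NamedTuple):
    feed: str
    symbol: str
    seq: int        # assumed sequence number shared across feeds
    arrival: float  # local arrival time in seconds


def consolidate(quotes: List[Quote], max_lag: float) -> Tuple[List[Quote], List[str]]:
    """Merge feeds sorted by arrival time; return (consolidated, alerts)."""
    seen: Set[Tuple[str, int]] = set()
    last_arrival: Dict[str, float] = {}
    alerted: Set[str] = set()
    out: List[Quote] = []
    alerts: List[str] = []
    for q in quotes:
        last_arrival[q.feed] = q.arrival
        alerted.discard(q.feed)          # this feed is delivering again
        key = (q.symbol, q.seq)
        if key not in seen:              # first arrival wins
            seen.add(key)
            out.append(q)                # fills gaps left by slower feeds
        for feed, t in last_arrival.items():
            if q.arrival - t > max_lag and feed not in alerted:
                alerted.add(feed)
                alerts.append(f"{feed} is {q.arrival - t:.2f}s behind")
    return out, alerts


quotes = [
    Quote("feed_a", "IBM", 1, 0.00),
    Quote("feed_b", "IBM", 1, 0.01),   # lagging duplicate, dropped
    Quote("feed_a", "IBM", 2, 0.10),
    Quote("feed_b", "IBM", 2, 0.12),   # lagging duplicate, dropped
    Quote("feed_b", "IBM", 3, 0.80),   # feed_a has gone quiet
]
consolidated, alerts = consolidate(quotes, max_lag=0.5)
print([(q.feed, q.seq) for q in consolidated])
print(alerts)
```

In this run the consolidated stream takes sequence numbers 1 and 2 from feed_a and 3 from feed_b, and a single alert is raised for feed_a.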

    Another streaming query might combine an investment firm's historic transaction data with a data stream containing the day's completed transaction reports to determine to which broker/dealer a trade should be directed. Doing so helps minimize transaction costs, such as higher commission fees at certain broker/dealers, while still meeting volume agreements and commission requirements for research and other investment services.
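
As a rough illustration of such a routing decision, the Python sketch below applies one possible policy, which is an assumption and not a description of any firm's actual logic: brokers that are still short of their volume agreement are preferred, and among the candidates the lowest commission wins. The Broker fields and the sample figures are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Broker:
    name: str
    commission_per_share: float
    committed_shares: int       # volume agreement (hypothetical figure)
    executed_shares: int = 0    # kept current from the completed-trade stream

    @property
    def shortfall(self) -> int:
        """Shares still owed to this broker under the volume agreement."""
        return max(0, self.committed_shares - self.executed_shares)


def route(brokers: List[Broker], shares: int) -> str:
    """Pick a broker for an order and record the volume sent to it."""
    behind = [b for b in brokers if b.shortfall > 0]
    candidates = behind or brokers   # fall back to all once commitments are met
    choice = min(candidates, key=lambda b: b.commission_per_share)
    choice.executed_shares += shares
    return choice.name


brokers = [
    Broker("broker_x", commission_per_share=0.01, committed_shares=1000),
    Broker("broker_y", commission_per_share=0.02, committed_shares=5000),
]
routes = [route(brokers, 1000) for _ in range(3)]
print(routes)  # ['broker_x', 'broker_y', 'broker_y']
```

The cheaper broker receives the first order; once its commitment is satisfied, flow shifts to the broker that is still behind on its agreement.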

    A third streaming query might act as an Execution Management Service ("EMS"). This would continually take in data on open portfolio manager orders, submitted orders (not yet completed) as well as one or more feeds of completed trade transaction data. The resulting stream would contain real-time open order data including the amount that still needs to be executed.
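
The bookkeeping behind such an open-order stream can be sketched in a few lines of Python. The class, method, and field names below are hypothetical; the sketch simply assumes three input streams (portfolio manager orders, submissions, and fills) and emits an updated snapshot after each event.

```python
from collections import defaultdict


class OpenOrderBook:
    """Tracks, per symbol, how much of the portfolio manager's order is left."""

    def __init__(self):
        self.ordered = defaultdict(int)    # shares requested by portfolio managers
        self.submitted = defaultdict(int)  # shares sent to market, not yet filled
        self.filled = defaultdict(int)     # shares completed

    def on_pm_order(self, symbol, shares):
        self.ordered[symbol] += shares
        return self.snapshot(symbol)

    def on_submit(self, symbol, shares):
        self.submitted[symbol] += shares
        return self.snapshot(symbol)

    def on_fill(self, symbol, shares):
        self.submitted[symbol] -= shares   # filled shares are no longer working
        self.filled[symbol] += shares
        return self.snapshot(symbol)

    def snapshot(self, symbol):
        remaining = self.ordered[symbol] - self.filled[symbol]
        working = self.submitted[symbol]
        return {
            "symbol": symbol,
            "remaining": remaining,          # still to be executed
            "working": working,              # submitted but not completed
            "unsent": remaining - working,   # not yet sent to any venue
        }


book = OpenOrderBook()
book.on_pm_order("IBM", 10000)
book.on_submit("IBM", 2000)
state = book.on_fill("IBM", 1500)
print(state)
# {'symbol': 'IBM', 'remaining': 8500, 'working': 500, 'unsent': 8000}
```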

    This open order data stream then could be combined with the consolidated market price data feed to calculate various pre-trade analytic statistics (like expected implementation shortfall). These real-time statistics vary as time, individual security prices (and localized volatility), and open order amounts change. The resulting stream could be used as input either to additional streaming queries or to some other application (like a trader's open orders screen display, which might only be updated every second).
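
The join between open orders and prices can be sketched as follows. This Python example computes only one simple ingredient of a pre-trade estimate, namely the paper slippage of the unexecuted remainder against the decision price; a real expected implementation shortfall model would also need a market impact and volatility term, which is deliberately left out. The function names, the tuple layout, and the sample prices are assumptions.

```python
def unrealized_shortfall_bps(side, decision_price, mid):
    """Slippage versus the decision price in basis points.

    side is +1 for a buy and -1 for a sell, so a positive result is a cost.
    """
    return side * (mid - decision_price) / decision_price * 1e4


def pretrade_stream(open_orders, mids):
    """Join a price stream with open orders.

    open_orders maps symbol -> (side, decision_price, remaining_shares).
    mids is an iterable of (symbol, mid_price) updates.
    Yields (symbol, slippage_bps, slippage_in_currency) per relevant update.
    """
    for symbol, mid in mids:
        if symbol not in open_orders:
            continue  # no open order, nothing to report
        side, decision_price, remaining = open_orders[symbol]
        bps = unrealized_shortfall_bps(side, decision_price, mid)
        cost = side * (mid - decision_price) * remaining
        yield symbol, bps, cost


open_orders = {"IBM": (+1, 100.0, 8500)}
mids = [("IBM", 100.10), ("MSFT", 30.00), ("IBM", 99.95)]
stats = list(pretrade_stream(open_orders, mids))
for symbol, bps, cost in stats:
    print(symbol, round(bps, 1), round(cost, 2))
```

For the buy order above, the first IBM update shows a cost of about 10 basis points and the second a gain of about 5; the MSFT update is ignored because no order is open in that symbol.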

    Yet another streaming query might contain the quantitative logic for one or more custom algorithmic trading models like Volume Weighted Average Price ("VWAP"), Time Weighted Average Price ("TWAP") or Implementation Shortfall ("IS"). Frequently, the open order size for an individual security is larger than the quantity that can be immediately executed against the prevailing "best price" bid or offer. Hence, the order must be broken up into multiple smaller transactions. The trading algorithms, which vary from being surprisingly simple to devilishly complex, use pre-determined rules and parameters to decide how 'best' to set the smaller transaction parameters (like timing, venue, price and amount) and then automatically generate the appropriate electronic trade instructions as a stream of instructions.
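
At the "surprisingly simple" end of that range, a TWAP-style schedule just splits the parent order evenly over time, and the VWAP benchmark is a size-weighted average of trade prices. The Python sketch below shows both; the function names and sample numbers are illustrative only, and a production algorithm would also choose venue, limit price, and timing.

```python
def twap_slices(total_shares, n_slices):
    """Split a parent order into n near-equal child orders.

    Any remainder is spread one share at a time over the earliest slices,
    so the child orders always sum to the parent order.
    """
    base, extra = divmod(total_shares, n_slices)
    return [base + (1 if i < extra else 0) for i in range(n_slices)]


def vwap(trades):
    """Volume weighted average price of (price, size) trades."""
    volume = sum(size for _, size in trades)
    return sum(price * size for price, size in trades) / volume


slices = twap_slices(10000, 3)
print(slices)  # [3334, 3333, 3333]

benchmark = vwap([(100.0, 200), (101.0, 100)])
print(round(benchmark, 4))  # 100.3333
```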

    The other portion of transaction cost analysis ("TCA"), real-time post-trade analytics, might be in another streaming query. This would use portfolio manager order data parameters, the pre-trade analytics data, and streaming completed transaction data to compute how well the transactions were completed. Resulting real-time statistics might include how well the whole trade was completed relative to the portfolio manager's decision price or how much the predicted pre-trade and actual post-trade transaction costs differ.
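
Both statistics can be maintained incrementally as fills stream in. The Python sketch below is one way to do so under simple assumptions: cost is measured as the average fill price against the decision price in basis points, and the pre-trade prediction is supplied as a single number. The class name and sample values are hypothetical.

```python
class PostTradeTCA:
    """Running post-trade statistics for one order."""

    def __init__(self, side, decision_price, predicted_bps):
        self.side = side                      # +1 buy, -1 sell
        self.decision_price = decision_price
        self.predicted_bps = predicted_bps    # from the pre-trade analytics
        self.shares = 0
        self.notional = 0.0

    def on_fill(self, price, shares):
        """Update running totals; return (realized_bps, realized - predicted)."""
        self.shares += shares
        self.notional += price * shares
        avg_price = self.notional / self.shares
        realized = (self.side * (avg_price - self.decision_price)
                    / self.decision_price * 1e4)
        return realized, realized - self.predicted_bps


tca = PostTradeTCA(side=+1, decision_price=100.0, predicted_bps=8.0)
first = tca.on_fill(100.05, 1000)
second = tca.on_fill(100.15, 1000)
print([round(x, 2) for x in first])
print([round(x, 2) for x in second])
```

After the second fill the average price is 100.10, so the realized cost is about 10 basis points, roughly 2 basis points worse than the pre-trade prediction. Only two running totals are kept per order; individual fills need not be stored.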

    Other streaming queries also might be added to a financial institution's customized algorithmic trading system. Governance, risk and compliance ("GRC") rules and algorithms often are added. Hence, an institutional investor in the era of Reg NMS and MiFID might add a streaming query that checks in real-time whether the new 'best execution' rules were violated. As more of the transaction volume is conducted by market internalizers and in dark pools, such a query helps to ensure that there is not a "trade-through" of some better available price. Other streaming queries might provide real-time statistics on economic capital used by trader, desk or department, while others might produce individual and aggregated real-time market-risk statistics for client portfolios.
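
A heavily simplified version of such a trade-through check is sketched below in Python. It assumes a single merged event stream of best bid/offer updates and executions, and it flags a buy executed above the best offer or a sell executed below the best bid. The event layout is hypothetical, and the sketch ignores the exceptions, protected-quote definitions, and timing tolerances that the actual regulations contain.

```python
def flag_trade_throughs(events):
    """Return executions that traded through the latest known best price.

    events is an iterable of tuples in arrival order:
      ("quote", symbol, best_bid, best_offer)
      ("trade", symbol, side, price)   with side +1 = buy, -1 = sell
    """
    best = {}      # symbol -> (best_bid, best_offer), latest value only
    flagged = []
    for event in events:
        if event[0] == "quote":
            _, symbol, bid, offer = event
            best[symbol] = (bid, offer)
        else:
            _, symbol, side, price = event
            if symbol not in best:
                continue  # no quote seen yet, cannot judge
            bid, offer = best[symbol]
            if (side > 0 and price > offer) or (side < 0 and price < bid):
                flagged.append((symbol, side, price))
    return flagged


events = [
    ("quote", "IBM", 99.98, 100.02),
    ("trade", "IBM", +1, 100.02),   # buy at the offer: fine
    ("quote", "IBM", 99.97, 100.00),
    ("trade", "IBM", +1, 100.03),   # buy above the offer: flagged
    ("trade", "IBM", -1, 99.95),    # sell below the bid: flagged
]
flagged = flag_trade_throughs(events)
print(flagged)  # [('IBM', 1, 100.03), ('IBM', -1, 99.95)]
```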

    As the above description indicates, the value of a stream processing solution like that from StreamBase Systems is limited mainly by the user's creativity. "To keep latency low, stream processing systems place data that must be retained in memory and discard everything else. Nothing is stored on disk. Streaming databases say, 'Let's not try to store everything. Let's just watch everything as it flies by and keep running totals,' such as the total number of transactions per second, says Eric Rogge, an analyst at Ventana Research."

    Ed Thieberger, head of training technology at Bridgewater Associates, "measures the success of stream processing both in reduced development costs and faster time to market. 'We haven't had to build a framework that does what StreamBase does,' he says. In addition, once StreamBase is pointed at the data streams to be measured, business analysts can construct queries using a drag-and-drop user interface rather than rely on programmers, Thieberger says."
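
The "running totals" idea quoted above, watching every value once and keeping only a few numbers, can be made concrete with Welford's online algorithm for a running mean and variance. The Python sketch below is a generic illustration of constant-memory streaming statistics, not a description of how StreamBase stores state; the class name and sample prices are hypothetical.

```python
class RunningStats:
    """Count, mean and variance of a stream using Welford's online algorithm.

    Only three numbers are retained, no matter how many values pass through.
    """

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0   # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance; 0.0 until at least two values have been seen."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


stats = RunningStats()
for price in (100.0, 101.0, 102.0):
    stats.update(price)   # each price is seen once and then discarded
print(stats.n, stats.mean, stats.variance)  # 3 101.0 1.0
```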

    Toomre Capital Markets LLC works closely with StreamBase Systems and is active in the algorithmic trading sector. Please contact TCM as indicated below if you or your organization would like assistance with algorithmic trading or the StreamBase software. TCM also is available for consulting engagements. Comments and thoughts are welcome.