
QUANT TRADING DATA MANAGEMENT BY THE NUMBERS

“Data is the oxygen of our information society.”
- Ana Botín, Executive Chair, Banco Santander*
QUANT TRADING DATA IS DIFFERENT

Quantitative trading data is different. Market data is fast, voluminous, and complicated, and the tech landscape continues to evolve, merge, and morph quickly. Regulation reigns.

“Real-time data is the lifeblood of trading. Without it, traders operate in the dark.”
- Michael Bloomberg¹

Thanks to pioneering work in streaming analytics, time-series data management and, now, generative AI, data management on Wall Street has come a long way from the first digitized trading rooms of the 80s. The challenge, of course, is time. The time between a market micro-movement and the action that must be taken. The time to build new trading strategies that out-alpha your competitors. The time it takes to build systems that implement new ideas while meeting ever-increasing compliance demands.

The other challenge is change. There are new types of data to unlock, new opportunities thanks to the rise of generative AI and vector databases, and new demands to run applications as managed services in the cloud, but safely, securely, and with the same low latency as on-premises, bare-metal deployments.

From low-latency handling to massive volumes and throughput to on-the-fly time-series comparison, trading data is just different. But, hey, this is Wall Street – let’s look at how trading data is different, by the numbers.

QUANTITATIVE TRADING DATA

[Figure: the value of quantitative trading data plotted against time, from microseconds to quarters.]

Quantitative trading data must yield insights across a full range of decision-making latency – from near real-time to strategic, historical insights.

¹ 19 Inspirational Quotes About Data, The Pipeline


10 MILLISECONDS MATTER
Low-latency decision-making with data is essential for quantitative trading, as it
is for any automated environment. On Wall Street, milliseconds matter, but so
does the right strategic context to make a good decision, not just a fast decision.

Quant trading demands a data fabric that provides insight into high-frequency
data and historical context at the same time.

[Figures: value versus time, from microseconds to quarters. For quant trading, milliseconds matter... and historical context matters too, at the same time.]


1 BILLION TICKS PER DAY FOR 2,700 STOCKS IN A REGIONAL MARKET¹

It’s not just fast data, it’s tick data, and tick data is different. Each change in market price, order, execution, and position matters.

Streaming data lies at the heart of quantitative trading systems, so the first job of a data platform is to process fast data in a way that organizes it by time for analysis, query, and replay.

That means three things: high frequency, low latency, and a time-series structure for storage.
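A minimal sketch of that time-series structure in q, the language of the kdb+ database referenced later in this piece; the table, column names, and values are illustrative assumptions, not a prescribed schema:

/ a hypothetical tick table: rows are appended in arrival order,
/ so the data is naturally stored as a time series
trade:([] time:`timestamp$(); sym:`symbol$(); price:`float$(); size:`long$())

/ append a single tick; .z.p is the current timestamp
`trade insert (.z.p; `AAPL; 189.42; 100)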

HIGH FREQUENCY

High-volume ingestion is the first step to effective tick data management. And it’s not just a billion ticks a day: financial data bursts in important ways, so ingest rates of 100,000 ticks a second are common.

Quant trading data management uses in-memory processing, time-series data management, and streaming analytics to ingest tick data at these extreme rates.
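As a hedged sketch of that ingest pattern, again in q with invented names and values, a micro-batch append passes whole column vectors in one call rather than inserting tick by tick:

/ (re)declare the empty tick table so this sketch stands on its own
trade:([] time:`timestamp$(); sym:`symbol$(); price:`float$(); size:`long$())

/ micro-batch append: whole column vectors amortize per-tick overhead,
/ which is how ingest rates of 100,000+ ticks a second become sustainable
`trade insert (.z.p+til 3; `AAPL`MSFT`AAPL; 189.50 402.60 189.55; 200 150 300)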

LOW LATENCY

Depending on the data and use case, acceptable latencies can range from low single-digit milliseconds to seconds. Low-latency absorption of data is achieved by databases that use in-memory and micro-batching techniques.

A solid quant trading data platform handles streaming data from any financial system, from market data like ICE, Bloomberg, and NYSE to order management systems, execution management systems, and more.

TIME SERIES

Finally, quantitative trading data should be organized in temporal order for aggregation, comparison, query, analysis, and replay. For intraday trading decisions, every tick and state change must be stored, indexed, and queryable.

Time-series quant trading databases are structured in a way that stores data in temporal order so queries can be executed quickly for low-latency decision-making.
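One hedged illustration of what temporal ordering buys at query time is q’s built-in as-of join, which pairs each trade with the prevailing quote at that moment; the tables and values below are invented for the example:

/ illustrative trade and quote tables
trade:([] sym:`AAPL`AAPL`MSFT;
  time:2024.01.02D09:30:01 2024.01.02D09:31:15 2024.01.02D09:30:45;
  price:189.1 189.4 402.5)
quote:([] sym:`AAPL`AAPL`MSFT;
  time:2024.01.02D09:30:00 2024.01.02D09:31:00 2024.01.02D09:30:30;
  bid:189.0 189.3 402.4;
  ask:189.2 189.5 402.6)

/ as-of join: for every trade, attach the most recent quote at or before the trade time
aj[`sym`time; trade; quote]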

Quantitative trading data is different. Design systems for high-volume, low-latency, time-series data management and you’ll be off to the quantitative trading races.



FROM 3 TO 32 PROJECTS PER YEAR

Conventional data technologies aren’t designed for time-series, streaming, or vector data. This mismatch impedes progress and creates a drag on IT productivity, experimentation, and innovation.
In one study, a top-tier bank measured the code required to manage tick and trading data with a conventional store (SQL and NoSQL based) and a store designed for streaming, time-series, and vector data management.

They found the time needed (initial development and ongoing support) to require 5 versus 15 FTEs, and 45 days versus six months. The biggest areas of time and code inefficiency came from:

1. High-frequency data collection
2. Aggregation, cleansing, and normalizing of data
3. Performance tuning and scale-out
4. AI/ML model operationalization and recoding of Python
5. Cloud integration and deployment

They found these tasks were all significantly faster with a fit-for-purpose data platform with time-series, streaming, and vector support. The resulting increased cadence in delivery was due to re-use and fit with the data architecture designed for streaming data.

FORCING TRADING DATA INTO A CONVENTIONAL DATABASE IS LIKE POUNDING A SQUARE PEG INTO A ROUND HOLE.

[Figure: task-by-task comparison. Red rows are non-value-add tasks like connecting to unstructured and streaming data, error correction, data quality, and tuning; green rows are value-add business logic tasks. Project output increased from 3 to 32 projects per year, sprint time fell from six months to 1.5 months, and headcount per project dropped from 15 to 5 FTEs.]
¹ Top-tier investment bank, 2018-2021 comparative productivity benchmark of streaming analytics versus a conventional SQL database, measured by project rate. Details of the study are internal; redacted results are approved to share. Applications tested averaged 4 venue connections and 12 use cases of trade data processing.



1 LINE OF CODE

As Steve Jobs said, the fastest, most efficient, easiest-to-debug line of code is the line you never write. Domain-specific programming languages are popular for quantitative trading because, for time-series data, they’re designed to eliminate looping structures, which makes them up to 100 times more efficient.

“The fastest, most efficient, most error-free, easiest to maintain code is the code you never write.”
- Steve Jobs

(Image: Mark Palmer, with Midjourney)

“Loading, joining, and analyzing data was not only faster and required less memory, it could also be done with far less code.”¹

¹ Nick Psaris, Q Tips: Fast, Scalable and Maintainable Kdb+

Domain-specific languages fit the domain in which they’re used. In the case of KX, hundreds of lines of code are often replaced with a single operator.
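For instance, a per-symbol volume-weighted average price reduces to the built-in wavg operator with no explicit loop; this is a hedged sketch with invented data rather than production code:

/ illustrative in-memory trade table
t:([] sym:`AAPL`AAPL`MSFT`AAPL`MSFT; price:189.1 189.2 402.5 189.3 402.7; size:100 200 150 300 250)

/ volume-weighted average price per symbol: one operator (wavg) replaces the loop
select vwap: size wavg price by sym from t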



7 INNOVATIVE TRADING APPS & 7 IDEAS YOU CAN STEAL

In the capital markets, real-time data isn’t oil, it’s oxygen. We’ll explore seven innovative case studies and the secrets of their success.

QUANT RESEARCH & DIGITAL TWIN: A real-time view of the market for as-if comparisons

EXECUTION ANALYTICS: Analyze trade prices to determine favorable pricing and cost

ALGORITHMIC TRADING: Trade execution in real time with historical context

CONTINUOUS SURVEILLANCE: Continuously anticipate anomalies and proactively maintain compliance

REAL-TIME TICKER PLANT: A real-time view of trillions of market data and order events a day

GENERATIVE AI QUANT RESEARCH: Explore edge cases and uncover novel strategies

MACHINE LEARNING AS A SERVICE: Self-service access to high-frequency and historical data via Python

To explore these, download the Seven Innovative Trading Apps and Seven Best Practices You Can Steal eBook.
1 MILLISECOND VS. 5 MINUTES, WITH 5 TIMES MORE DATA

Conventional data technologies aren’t designed for streaming data, or will require careful and sophisticated programming to force-fit quantitative data inside. Choose carefully.

Time-series databases designed for streaming data are an essential tool for any quantitative trading desk. They’re built to handle streaming data, organize it by time, and support high-speed ingest.

CONVENTIONAL DATABASES ARE OF LIMITED USE FOR QUANT TRADING

Data technologies like SQL databases, Snowflake, or NoSQL stores weren’t built to manage streaming time-series data. While it’s possible to store quant trading data in such a store, it’s sub-optimal and may lead to systems that are unable to answer trading questions in a reasonable timeframe. This is because conventional databases are designed to manage rows and columns of data, graphs, or purely unstructured data.

CONVENTIONAL DATABASES ARE A BAD FIT FOR QUANT TRADING

Quantitative trading requires data structured in temporal order to provide not only the ingest rates required, but also query performance. Because five minutes isn’t fast enough.
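As a hedged sketch of the query side, a single q expression can roll raw ticks into five-minute bars precisely because the data is already held in temporal order; the table and values are invented for illustration:

/ a tiny illustrative tick table
trade:([] time:2024.01.02D09:30:00+0D00:00:05 0D00:02:10 0D00:06:45 0D00:08:20;
  sym:4#`AAPL; price:189.10 189.40 189.20 189.35; size:100 200 150 300)

/ 5-minute OHLC bars and volume, bucketed with xbar on the time column
select open:first price, high:max price, low:min price, close:last price, volume:sum size
  by sym, bucket:5 xbar time.minute from trade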

The nature of data is changing: structured (1970s), multi-dimensional (1990s), unstructured (2000s), streaming (2010s), vector/GenAI (2020s).



LEARN MORE AT KX.COM
