Time Series Database
A time series database (TSDB) is a software system that is optimized for handling time series data, arrays of numbers indexed by time (a datetime or a datetime range). In some fields these time series are called profiles, curves, or traces. A time series of stock prices might be called a price curve. A time series of energy consumption might be called a load profile. A log of temperature values over time might be called a temperature trace.

Despite the disparate names, many of the same mathematical operations, queries, or database transactions are useful for analysing all of them. A database that implements these operations correctly, reliably, and efficiently must be specialized for time series data.

1 Overview

TSDBs are databases that are optimized for time series data. Software with complex logic or business rules and high transaction volume for time series data may not be practical with traditional relational database management systems. Flat file databases are not a viable option either, if the data and transaction volume reaches a maximum threshold determined by the capacity of individual servers (processing power and storage capacity). Queries for historical data, replete with time ranges, roll ups, and arbitrary time zone conversions, are difficult in a relational database. Compositions of those rules are even more difficult. This problem is compounded by the free nature of relational systems themselves: many relational systems are not modelled correctly with respect to time series data. TSDBs, on the other hand, impose a model, and this allows them to provide more features for doing so.

Ideally, these repositories are natively implemented using specialized database algorithms. However, it is possible to store time series as binary large objects (BLOBs) in a relational database or by using a VLDB approach coupled with a pure star schema. Efficiency is often improved if time is treated as a discrete quantity rather than as a continuous mathematical dimension. Database joins across multiple time series data sets are only practical when the time tag associated with each data entry spans the same set of discrete times for all data sets across which the join is performed.

[...] fashion. These series may be organized hierarchically and optionally have companion metadata available with them. The server often supports a number of basic calculations that work on a series as a whole, such as multiplying, adding, or otherwise combining various time series into a new time series. They can also filter on arbitrary patterns defined by the day of the week, low value filters, high value filters, or even have the values of one series filter another. Some TSDBs also build in additional statistical functions that are targeted to time series data.

For example, consider the following hypothetical time series or profile expression:

SELECT nymex/gold_price * nymex/gold_volume

To analyze this, the TSDB would join the two series nymex/gold_price and nymex/gold_volume based on the overlapping areas of time for each, multiply the values where they intersect, and then output a single composite time series.

More complex expressions are allowed. TSDBs often allow users to manage a repository of filters or masks that specify in some way a pattern based on the day of a week and a set of holidays. In this way, one can readily assemble time series data. Assuming such a filter exists, one might hypothetically write

SELECT onpeak( cellphoneusage )

which would extract out the time series of cellphoneusage that only intersects that of 'onpeak'. Some systems might generalize the filter to be a time series itself.

This syntactical simplicity drives the appeal of the TSDB. For example, a simple utility bill might be implemented using a query such as:

SELECT MAX( onpeak( powerusagekw ) ) * demand_charge;
SELECT SUM( onpeak( powerusagekwh ) ) * energy_charge;

TSDBs also generally have conversions to and from specific time zones implemented at the server level.
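The hypothetical expressions above (the product of two series, the onpeak filter, and the two utility bill queries) can be sketched in plain Python. This is a minimal illustration and not how any particular TSDB works: representing a series as a dict keyed by datetime, the function names multiply, onpeak, and utility_bill, and the definition of on-peak as non-holiday weekdays from 08:00 to 20:00 are all assumptions made for the example.

```python
from datetime import date, datetime

# Assumption: a series is a plain dict mapping datetime -> float.


def multiply(a, b):
    """Join two series on their shared timestamps and multiply the values."""
    shared = sorted(a.keys() & b.keys())  # overlapping points in time only
    return {t: a[t] * b[t] for t in shared}


def onpeak(series, holidays=frozenset(), start_hour=8, end_hour=20):
    """Keep points on non-holiday weekdays with start_hour <= hour < end_hour.

    The peak window and the holiday set are hypothetical parameters.
    """
    return {
        t: v
        for t, v in series.items()
        if t.weekday() < 5  # Monday..Friday
        and t.date() not in holidays
        and start_hour <= t.hour < end_hour
    }


def utility_bill(power_kw, energy_kwh, demand_charge, energy_charge):
    """Mirror the two queries: MAX(onpeak(kW)) and SUM(onpeak(kWh))."""
    demand = max(onpeak(power_kw).values(), default=0.0) * demand_charge
    energy = sum(onpeak(energy_kwh).values()) * energy_charge
    return demand, energy


if __name__ == "__main__":
    mon_9 = datetime(2024, 1, 1, 9)    # Monday morning, inside the window
    mon_10 = datetime(2024, 1, 1, 10)
    mon_22 = datetime(2024, 1, 1, 22)  # Monday night, outside the window

    gold_price = {mon_9: 2.0, mon_10: 3.0}
    gold_volume = {mon_10: 10.0, mon_22: 5.0}
    print(multiply(gold_price, gold_volume))

    usage = {mon_9: 4.0, mon_10: 6.0, mon_22: 9.0}
    print(onpeak(usage))
    print(onpeak(usage, holidays={date(2024, 1, 1)}))
    print(utility_bill(usage, usage, demand_charge=10.0, energy_charge=0.5))
```

Joining on exact timestamp equality only works because both series share the same discrete time points, which is the condition the overview gives for practical joins across time series.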
4 Example TSDB Systems
The following open source and commercial systems are believed to provide specialised support for time series data. Some NoSQL systems may also claim support for this type of data, but those not explicitly designed with that use case in mind are not listed here.
4.1 Open Source
Druid
InfluxDB
SiteWhere
tsdb
4.2 Proprietary
eXtremeDB Financial Edition
Geras