NumPy implementation of Steinarsson’s Largest-Triangle-Three-Buckets (LTTB) algorithm for downsampling time series–like data while retaining the overall shape and variability in the data.
LTTB is well suited to filtering time series data for visual representation, since it reduces the number of visually redundant data points, resulting in smaller file sizes and faster rendering of plots.
Note that it is not a technique for statistical aggregation, cf. regression models or non-parametric curve fitting / smoothing.
This implementation is based on the original JavaScript code at https://github.com/sveinn-steinarsson/flot-downsample and Sveinn Steinarsson’s 2013 MSc thesis Downsampling Time Series for Visual Representation.
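To make the idea concrete, here is a rough plain-Python sketch of the selection rule described in the thesis: keep the first and last points, split the remaining points into n_out - 2 buckets, and from each bucket keep the point that forms the largest triangle with the previously kept point and the average of the next bucket. It is written for readability only and is not this package's code; the name lttb_sketch is hypothetical.

```python
def lttb_sketch(points, n_out):
    """Illustrative LTTB: downsample a list of (x, y) pairs to n_out points."""
    n = len(points)
    if n_out >= n:
        return list(points)
    if n_out < 3:
        raise ValueError("n_out must be at least 3")

    def edge(k):
        # Start index of bucket k; buckets cover the interior points 1 .. n-2.
        return k * (n - 2) // (n_out - 2) + 1

    sampled = [points[0]]  # the first point is always kept
    a = 0                  # index of the most recently kept point
    for i in range(n_out - 2):
        # Average of the next bucket (for the last bucket: the final point).
        nxt = points[edge(i + 1):min(edge(i + 2), n)]
        avg_x = sum(p[0] for p in nxt) / len(nxt)
        avg_y = sum(p[1] for p in nxt) / len(nxt)

        ax, ay = points[a]
        best_area, best_idx = -1.0, edge(i)
        for j in range(edge(i), edge(i + 1)):
            px, py = points[j]
            # Area of the triangle (kept point, candidate, next-bucket average).
            area = abs((ax - avg_x) * (py - ay) - (ax - px) * (avg_y - ay)) / 2
            if area > best_area:
                best_area, best_idx = area, j

        sampled.append(points[best_idx])
        a = best_idx

    sampled.append(points[-1])  # the last point is always kept
    return sampled


# A flat series with a single spike: the spike should survive downsampling.
points = [(x, 0.0) for x in range(100)]
points[50] = (50, 10.0)
small = lttb_sketch(points, 10)
```

Because the spike is the only point in its bucket that forms a triangle with non-zero area, it is the one selected, which is the behaviour that makes LTTB useful for plots.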
Licence: MIT
Install the lttb package into your (virtual) environment:
$ pip install lttb
The function lttb.downsample() can then be used in your Python code:
import numpy as np
import lttb
# Generate an example data set of 100 random points:
# - column 0 represents time values (strictly increasing)
# - column 1 represents the metric of interest: CPU usage, stock price, etc.
data = np.array([range(100), np.random.random(100)]).T
# Downsample it to 20 points:
small_data = lttb.downsample(data, n_out=20)
assert small_data.shape == (20, 2)
A test data set is provided in the source repo in tests/timeseries.csv. It was downloaded from http://flot.base.is/ and converted from JSON to CSV.
This is what it looks like, downsampled to 100 points:
By default, downsample() checks that the input data satisfies the following constraints:
These checks can be skipped (e.g. if you know that your data will always meet these conditions), or additional checks can be added (e.g. that the time values must be evenly spaced), by passing in a different list of validation functions, e.g.:
# No input validation:
small_data = lttb.downsample(data, n_out=20, validators=[])
# Stricter check on x values:
from lttb.validators import has_two_columns, x_is_regular
small_data = lttb.downsample(data, n_out=20, validators=[has_two_columns, x_is_regular])
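A custom check could be written along the following lines. This is only a sketch: it assumes that a validator is a plain callable that receives the input array and raises ValueError when the data is unacceptable, and the name y_is_finite is hypothetical (it is not part of lttb).

```python
import numpy as np


def y_is_finite(data):
    """Hypothetical validator: reject NaN or infinite values in column 1."""
    if not np.all(np.isfinite(data[:, 1])):
        raise ValueError("y values must all be finite")


# Assumed usage, alongside one of the package's own validators:
#   from lttb.validators import has_two_columns
#   small_data = lttb.downsample(
#       data, n_out=20, validators=[has_two_columns, y_is_finite]
#   )
```

Keeping each check in its own small function means the list passed to validators can be assembled per data source, which fits the pick-and-choose approach described above.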
[...] pyproject.toml and packaged with Flit.
downsample() raises ValueError if input data contains NaN values. This can be disabled by removing contains_no_nans() from the list of validators.
setup.py was fixed so that this package can be installed in Python 2 again.
[...] setuptools_scm rather than bumpversion.
If you find a bug or have an idea for improving this package, please describe it in a message to the mailing list.
Patches are welcome. Feel free to send them to the mailing list using git send-email, or you can send me a link to your repo if it is publicly accessible. If you prefer the pull request workflow, you can also send me a PR at https://codeberg.org/javiljoen/lttb-numpy.
Please ensure that the tests and linting checks listed in the Makefile all pass, and that any new features are covered by tests.
Create a Python virtual environment, e.g. using python3 -m venv. In that venv, install the dependencies and development tools:
pip install -e '.[test,dev]'
The linters and tests can then be run with the commands in the Makefile:
make lint
make test