tsfresh automates feature engineering for time series. You hand it a DataFrame of sequences — sensor readings, logs, financial ticks — and it calculates hundreds of statistical features per series, then optionally runs hypothesis tests to prune the irrelevant ones before you touch a model.
Why I starred it
Feature engineering on time series is mostly miserable. You pick a handful of aggregates by hand — mean, std, maybe a lag — train, get mediocre results, and spend days iterating on transformations that should have been systematic. tsfresh makes that systematic.
What got my attention wasn't the feature count but the pairing of extraction with statistically principled filtering. Most "auto-feature" libraries just dump everything and leave you to sort it out. tsfresh uses the FRESH algorithm (FeatuRe Extraction based on Scalable Hypothesis tests) to score each feature's relevance to the target with an individual significance test, then corrects the p-values for multiple comparisons. That's a real methodological choice, not just a feature list.
How it works
The core lives in tsfresh/feature_extraction/feature_calculators.py. Every calculator is a plain Python function decorated with set_property:
```python
import numpy as np

@set_property("fctype", "simple")
def abs_energy(x):
    # a "simple" calculator returns one scalar per series
    return np.dot(x, x)

@set_property("fctype", "combiner")
def agg_autocorrelation(x, param):
    # a "combiner" returns a list of (key, value) pairs, one per lag in param
    ...
```
The fctype property is the core contract. "simple" functions return a scalar; "combiner" functions return (key, value) pairs for parameterized features computed in a single pass — e.g., autocorrelation at multiple lags without recomputing the base. The extraction engine in extraction.py discovers all calculators by introspecting the module for functions with an fctype attribute, which means adding a new feature is just writing a decorated function. No registration, no config files.
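The contract is small enough to sketch without tsfresh itself. Below is a stand-in set_property, one calculator of each fctype, and a toy discovery function; everything here is illustrative, not the library's code:

```python
import numpy as np

def set_property(key, value):
    # stand-in for tsfresh's decorator: attach an attribute to the function
    def decorator(func):
        setattr(func, key, value)
        return func
    return decorator

@set_property("fctype", "simple")
def mean_value(x):
    # "simple": one scalar per series
    return float(np.mean(x))

@set_property("fctype", "combiner")
def value_at_lags(x, param):
    # "combiner": several (key, value) pairs from one pass over the series
    return [("lag_{}".format(p["lag"]), x[p["lag"]]) for p in param]

def discover(namespace):
    # the engine's trick: keep every callable that carries an fctype attribute
    return {name: f for name, f in namespace.items()
            if callable(f) and hasattr(f, "fctype")}
```

Because discovery is attribute-based, a new calculator is live as soon as the decorated function exists in the module.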
Feature settings live in settings.py as plain dicts. ComprehensiveFCParameters is the full set (700+ features with their parameter grids). MinimalFCParameters is the subset tagged @set_property("minimal", True), useful when you want a quick pass. You can also reverse-engineer settings from an existing feature DataFrame using from_columns(), which parses the double-underscore column name convention (<kind>__<feature>__<param>_<value>, e.g. value__quantile__q_0.6) back into calculator configs.
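A toy parser makes the naming convention concrete. This is not from_columns() itself, and it assumes single-token parameter names (real parameter names like f_agg contain underscores, which the actual implementation handles):

```python
def _coerce(raw):
    # try int, then float, else return the (de-quoted) string
    raw = raw.strip('"')
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw

def parse_column(col):
    # split "<kind>__<feature>__<param>_<value>" back into its parts
    parts = col.split("__")
    kind, feature = parts[0], parts[1]
    params = {}
    for chunk in parts[2:]:
        key, _, raw = chunk.rpartition("_")
        params[key] = _coerce(raw)
    return kind, feature, params
```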
Parallelization runs through tsfresh/utilities/distribution.py. DistributorBaseClass defines a map_reduce interface; concrete subclasses are MultiprocessingDistributor (multiprocessing.Pool), MapDistributor (single-threaded), and ApplyDistributor (for custom executors). Dask is supported if you install it. The top-level extract_features() accepts a distributor= kwarg, so you can swap compute backends without changing anything else.
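The map_reduce contract is easy to mimic. Here's a hedged, single-threaded sketch in the spirit of MapDistributor; the method name matches the interface described above, but the details are invented:

```python
class SerialDistributor:
    # toy stand-in: same map_reduce idea, no parallelism
    def map_reduce(self, map_function, data, chunk_size=1):
        # split the work into chunks of series
        chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
        results = []
        for chunk in chunks:
            # map step: the function handles one chunk and returns partial results
            results.extend(map_function(chunk))
        # reduce step: everything flattened back into one list
        return results
```

Swapping in a parallel backend only changes how the chunks are dispatched; the caller's code stays the same.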
The filtering step is where the stats paper becomes code. tsfresh/feature_selection/significance_tests.py implements four test functions depending on the feature/target type combination:
- binary target + binary feature → Fisher's exact test
- binary target + real feature → Mann-Whitney U
- real target + binary feature → Kolmogorov-Smirnov
- real target + real feature → Kendall's tau
Each returns a p-value. selection.py applies Benjamini-Hochberg correction across all features to control the false discovery rate. It's a clean separation: hypothesis testing is completely isolated from feature computation.
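The Benjamini-Hochberg step is a few lines of numpy if you want to see the mechanics. This is a generic FDR sketch, not tsfresh's selection code:

```python
import numpy as np

def benjamini_hochberg(p_values, fdr=0.05):
    # step-up procedure: find the largest rank k with p_(k) <= k/m * fdr
    # and keep every hypothesis at or below that rank
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    thresholds = np.arange(1, m + 1) / m * fdr
    below = p[order] <= thresholds
    keep = np.zeros(m, dtype=bool)
    if below.any():
        cutoff = int(np.nonzero(below)[0].max())  # largest qualifying rank
        keep[order[:cutoff + 1]] = True
    return keep  # boolean mask in the original feature order
```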
Using it
The typical workflow:
```python
from tsfresh import extract_features, select_features
from tsfresh.utilities.dataframe_functions import impute

# timeseries: DataFrame with 'id', 'time', 'value' columns
X = extract_features(
    timeseries,
    column_id="id",
    column_sort="time",
    n_jobs=4,
)
impute(X)  # in place; handles NaN from features that don't apply to all series

# y: Series of target values indexed by id
X_filtered = select_features(X, y)
```
On a machine with 4 cores, a batch of ~1000 time series of length 100 completes in a few minutes with ComprehensiveFCParameters. The progress bar from tqdm is on by default; disable with disable_progressbar=True.
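For reference, the long input format looks like this; the data below is made up, and abs_energy is recomputed by hand with a groupby just to show the one-scalar-per-id shape of the output:

```python
import numpy as np
import pandas as pd

# two made-up series, three observations each, in the long format tsfresh expects
timeseries = pd.DataFrame({
    "id":    ["a", "a", "a", "b", "b", "b"],
    "time":  [0, 1, 2, 0, 1, 2],
    "value": [1.0, 2.0, 3.0, 2.0, 2.0, 2.0],
})

# abs_energy (sum of squared values) computed by hand, one scalar per id,
# the same per-series shape extract_features produces
energy = timeseries.groupby("id")["value"].apply(lambda v: float(np.dot(v, v)))
```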
For sklearn pipelines there's a RelevantFeatureAugmenter transformer in tsfresh/transformers/ that wraps the full extract+filter pipeline in a fit/transform interface.
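The fit/transform shape of that wrapper can be sketched generically. The class below is hypothetical, with the tsfresh calls stubbed out as injectable callables; the real RelevantFeatureAugmenter differs in names and details:

```python
class ExtractThenFilter:
    # hypothetical wrapper showing the fit/transform pattern; extract_fn and
    # select_fn stand in for extraction and filtering steps
    def __init__(self, extract_fn, select_fn):
        self.extract_fn = extract_fn
        self.select_fn = select_fn
        self.kept_columns_ = None

    def fit(self, timeseries, y):
        # extract everything once, remember which columns survive filtering
        X = self.extract_fn(timeseries)
        self.kept_columns_ = list(self.select_fn(X, y).columns)
        return self

    def transform(self, timeseries):
        # recompute features, keep only the columns chosen during fit
        X = self.extract_fn(timeseries)
        return X[self.kept_columns_]
```

The point of the pattern: filtering decisions are made once on training data, then frozen, so transform never peeks at new targets.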
Rough edges
The dependency footprint is heavy: stumpy, pywt, statsmodels, scipy, numpy, pandas, and optionally matrixprofile. Install time is significant. The matrixprofile calculators are gated behind an optional dependency and actually require Python 3.8 specifically — the README says to create a separate conda env for them. That's a real limitation if you're on a newer Python and want those features.
Extraction on large datasets with ComprehensiveFCParameters is slow even with parallelism. The parameterized combiners help (one pass for multiple lags), but you're still computing 700+ features. The practical workflow is usually to run comprehensive extraction once, save the filtered feature set, and use from_columns() to reproduce only those features in production.
The docs are good for an academic-origin project. The ReadTheDocs site is the primary reference; the inline docstrings are thorough but assume familiarity with the underlying statistics. If you don't know what the Augmented Dickey-Fuller test is, you'll encounter features you can't interpret.
Bottom line
If you work with time series classification or regression and do feature engineering manually, tsfresh is worth running at least once to see what you're missing. It's most valuable in exploratory phases where you don't know which features matter — the hypothesis testing filter is what turns a dump of 700 numbers into something useful.
