Python methods for numerical differentiation of noisy data, including multi-objective optimization routines for automated parameter selection.
PyNumDiff is a Python package that implements various methods for computing numerical derivatives of noisy data, which can be a critical step in developing dynamic models or designing controllers. There are seven different families of methods implemented in this repository:
- convolutional smoothing followed by finite difference calculation
- polynomial-fit-based methods
- iterated finite differencing
- total variation regularization of a finite difference derivative
- Kalman (RTS) smoothing
- basis-function-based methods
- local approximation with linear models
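As an illustration of the first family (convolutional smoothing followed by a finite difference calculation), here is a minimal self-contained sketch in plain NumPy. The function name and the Gaussian kernel choice are ours, for illustration only; this is not PyNumDiff's implementation.

```python
import numpy as np

def gaussian_smooth_finite_difference(x, dt, sigma=2.0):
    """Smooth x by convolving with a Gaussian kernel, then finite-difference."""
    radius = int(4 * sigma)
    k = np.arange(-radius, radius + 1)
    kernel = np.exp(-k**2 / (2 * sigma**2))
    kernel /= kernel.sum()  # normalize so smoothing preserves the mean
    # reflect-pad so the 'valid' convolution returns an output the same length as x
    x_padded = np.pad(x, radius, mode="reflect")
    x_hat = np.convolve(x_padded, kernel, mode="valid")
    dxdt_hat = np.gradient(x_hat, dt)  # centered finite difference in the interior
    return x_hat, dxdt_hat

# demo on a noisy sine wave
rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 400)
dt = t[1] - t[0]
x_noisy = np.sin(t) + 0.05 * rng.standard_normal(t.size)
x_hat, dxdt_hat = gaussian_smooth_finite_difference(x_noisy, dt)
```

Differentiating the raw signal amplifies the noise by roughly `1/dt`; smoothing first keeps the derivative estimate close to the true `cos(t)`.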
Most of these methods have multiple parameters, so we take a principled approach and propose a multi-objective optimization framework that chooses parameters by minimizing a loss function balancing the faithfulness and smoothness of the derivative estimate. For more details, refer to this paper.
Dependencies are listed in `pyproject.toml`. They include the usual suspects like `numpy` and `scipy`, but also optionally `cvxpy`.
The code is compatible with Python >= 3.10. Install from PyPI with `pip install pynumdiff`, from source with `pip install git+https://github.com/florisvb/PyNumDiff`, or from a local download with `pip install .`. Call `pip install pynumdiff[advanced]` to automatically install optional dependencies from the advanced list, like CVXPY.
For more details, read our Sphinx documentation. The basic pattern of all differentiation methods is:

```python
somethingdiff(x, dt, **kwargs)
```
where `x` is the data, `dt` is the step size, and various keyword arguments control the behavior. Some methods support variable step size, in which case the second parameter is renamed `_t` and can receive either a constant step size or an array of values denoting sample locations.
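To illustrate the two call signatures, here is a sketch using `np.gradient` as a stand-in, since it likewise accepts either a scalar spacing or an array of sample locations; the variable names are ours.

```python
import numpy as np

# Uniformly sampled data: a scalar step size and an array of sample
# locations are interchangeable.
t = np.linspace(0, 1, 101)
dt = t[1] - t[0]
x = t**2

d_scalar = np.gradient(x, dt)  # pass the constant step size
d_array = np.gradient(x, t)    # pass the sample locations instead

# Non-uniformly sampled data: only the array form applies.
t_nonuniform = np.linspace(0, 1, 101)**2   # strictly increasing, uneven spacing
x_nonuniform = t_nonuniform**2
d_nonuniform = np.gradient(x_nonuniform, t_nonuniform)  # approximately 2*t_nonuniform
```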
You can provide the parameters yourself:

```python
from pynumdiff.submodule import method

x_hat, dxdt_hat = method(x, dt, param1=val1, param2=val2, ...)
```
Or you can find parameters by calling the multi-objective optimization algorithm from the `optimize` module:

```python
import numpy as np
from pynumdiff.optimize import optimize

# estimate cutoff_frequency by (a) counting the number of true peaks per second
# in the data or (b) looking at the power spectrum and choosing a cutoff
tvgamma = np.exp(-1.6*np.log(cutoff_frequency) - 0.71*np.log(dt) - 5.1)  # see https://ieeexplore.ieee.org/abstract/document/9241009

params, val = optimize(somethingdiff, x, dt,
                       tvgamma=tvgamma,  # smoothness hyperparameter; defaults to None if dxdt_truth is given
                       dxdt_truth=None,  # give ground-truth data if available, in which case tvgamma goes unused
                       search_space_updates={'param1': [vals], 'param2': [vals], ...})
print('Optimal parameters: ', params)
x_hat, dxdt_hat = somethingdiff(x, dt, **params)
```
If no `search_space_updates` is given, a default search space is used; see the top of `_optimize.py`.
The following heuristic works well for choosing `tvgamma`, where `cutoff_frequency` is the highest frequency content of the signal in your data and `dt` is the timestep: `tvgamma = np.exp(-1.6*np.log(cutoff_frequency) - 0.71*np.log(dt) - 5.1)`. Larger values of `tvgamma` produce smoother derivatives. The value of `tvgamma` is largely universal across methods, making it easy to compare results between methods. Be aware that the optimization is a fairly heavy process.
Much more extensive usage is demonstrated in Jupyter notebooks:
- Differentiation with different methods: 1_basic_tutorial.ipynb
- Parameter Optimization with known ground truth (for demonstration purposes only): 2a_optimizing_parameters_with_dxdt_known.ipynb
- Parameter Optimization with unknown ground truth: 2b_optimizing_parameters_with_dxdt_unknown.ipynb
- Automatic method suggestion: 3_automatic_method_suggestion.ipynb
- `.github/workflows` contains the `.yaml` that configures our GitHub Actions continuous integration (CI) runs.
- `docs/` contains `make` files and `.rst` files to govern the way `sphinx` builds documentation, either locally by navigating to this folder and calling `make html` or in the cloud by `readthedocs.io`.
- `examples/` contains Jupyter notebooks that demonstrate some usage of the library.
- `pynumdiff/` contains the source code. For a full list of modules and further navigation help, see the readme in this subfolder.
- `.editorconfig` ensures tabs are displayed as 4 characters wide.
- `.gitignore` ensures files generated by local `pip install`s, Jupyter notebook runs, caches from code runs, virtual environments, and more are not picked up by `git` and accidentally added to the repo.
- `.pylintrc` configures `pylint`, a tool for autochecking code quality.
- `.readthedocs.yaml` configures `readthedocs` and is necessary for documentation to get auto-rebuilt.
- `CITATION.cff` is citation information for the Journal of Open Source Software (JOSS) paper associated with this project.
- `LICENSE.txt` allows free usage of this project.
- `README.md` is the text you're reading, hello.
- `linting.py` is a script to run `pylint`.
- `pyproject.toml` governs how this package is set up and installed, including dependencies.
See the `CITATION.cff` file as well as the following references.
```bibtex
@article{PyNumDiff2022,
  doi       = {10.21105/joss.04078},
  url       = {https://doi.org/10.21105/joss.04078},
  year      = {2022},
  publisher = {The Open Journal},
  volume    = {7},
  number    = {71},
  pages     = {4078},
  author    = {Floris van Breugel and Yuying Liu and Bingni W. Brunton and J. Nathan Kutz},
  title     = {PyNumDiff: A Python package for numerical differentiation of noisy time-series data},
  journal   = {Journal of Open Source Software}
}

@article{ParamOptimizationDerivatives2020,
  doi     = {10.1109/ACCESS.2020.3034077},
  author  = {F. {van Breugel} and J. {Nathan Kutz} and B. W. {Brunton}},
  journal = {IEEE Access},
  title   = {Numerical differentiation of noisy data: A unifying multi-objective optimization framework},
  year    = {2020}
}
```
We are using GitHub Actions for continuous integration testing.
Run tests locally by navigating to the repo in a terminal and calling

```shell
pytest -s
```

Add the flag `--plot` to see plots of the methods against test functions. Add the flag `--bounds` to print
This project uses the MIT license. It is 100% open source; feel free to use the code however you like.