Numerical Differentiation
Numerical differentiation deals with approximating the derivative of a mathematical function or discrete data. It is fundamentally derived from Taylor series expansions.
Derivation from Taylor Series
The mathematical foundation for numerical differentiation is the Taylor series. By expanding $f(x_{i+1})$ around $x_i$ with step size $h = x_{i+1} - x_i$, we get:

$f(x_{i+1}) = f(x_i) + f'(x_i)h + \frac{f''(x_i)}{2!}h^2 + \frac{f'''(x_i)}{3!}h^3 + \cdots$

Solving for $f'(x_i)$ yields the finite divided difference formulas. The truncated terms represent the theoretical truncation error of the approximation.
Finite Divided Differences for the First Derivative
The simplest approximations of the first derivative are based on truncating the Taylor series after the first derivative term.
Common Finite Differences
- Forward difference: $f'(x_i) \approx \dfrac{f(x_{i+1}) - f(x_i)}{h}$ (truncation error $O(h)$)
- Backward difference: $f'(x_i) \approx \dfrac{f(x_i) - f(x_{i-1})}{h}$ (truncation error $O(h)$)
- Centered difference: $f'(x_i) \approx \dfrac{f(x_{i+1}) - f(x_{i-1})}{2h}$ (truncation error $O(h^2)$)
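The three formulas above can be sketched directly; this is a minimal Python illustration (the function names are ours, not from the source):

```python
import math

def forward_diff(f, x, h):
    # Forward difference: O(h) truncation error
    return (f(x + h) - f(x)) / h

def backward_diff(f, x, h):
    # Backward difference: O(h) truncation error
    return (f(x) - f(x - h)) / h

def centered_diff(f, x, h):
    # Centered difference: O(h^2) truncation error
    return (f(x + h) - f(x - h)) / (2 * h)
```

For example, approximating $\frac{d}{dx}\sin x$ at $x = 1$ with $h = 10^{-3}$, the centered estimate is several orders of magnitude closer to $\cos 1$ than the one-sided estimates.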
Note
The centered difference formula is more accurate, possessing a truncation error of $O(h^2)$, compared to $O(h)$ for the standard forward and backward differences. This is because the $f''(x_i)$ term in the Taylor series cancels exactly when the backward expansion is subtracted from the forward expansion.
Formulas for Higher-Order Derivatives
By manipulating Taylor series expansions for multiple points (e.g., $x_{i-2}$, $x_{i-1}$, $x_i$, $x_{i+1}$, $x_{i+2}$), we can derive formulas for higher-order derivatives. The centered difference formulas for the second and third derivatives are:
Higher-Order Centered Differences
- Second Derivative: $f''(x_i) \approx \dfrac{f(x_{i+1}) - 2f(x_i) + f(x_{i-1})}{h^2}$ with error $O(h^2)$
- Third Derivative: $f'''(x_i) \approx \dfrac{f(x_{i+2}) - 2f(x_{i+1}) + 2f(x_{i-1}) - f(x_{i-2})}{2h^3}$ with error $O(h^2)$
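A quick sketch of both centered formulas in Python (helper names are illustrative); for a cubic, both are exact up to round-off, which makes them easy to sanity-check:

```python
def second_deriv(f, x, h):
    # Centered second derivative, error O(h^2)
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

def third_deriv(f, x, h):
    # Centered third derivative, error O(h^2)
    return (f(x + 2*h) - 2*f(x + h) + 2*f(x - h) - f(x - 2*h)) / (2 * h**3)
```

Applied to $f(x) = x^3$ at $x = 2$, these return essentially $12$ and $6$, the exact values of $f''$ and $f'''$.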
High-Accuracy Differentiation Formulas
Higher-accuracy formulas can be generated by including more terms from the Taylor series expansion. For example, a more accurate forward difference formula requires the points $x_i$, $x_{i+1}$, and $x_{i+2}$:

$f'(x_i) \approx \dfrac{-f(x_{i+2}) + 4f(x_{i+1}) - 3f(x_i)}{2h}$

achieving $O(h^2)$ accuracy without using a centered span.
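As a minimal sketch (function name is ours), the three-point forward formula is easy to compare against the plain two-point forward difference:

```python
import math

def forward_diff_h2(f, x, h):
    # Three-point forward difference with O(h^2) truncation error;
    # uses only points at and ahead of x (no centered span)
    return (-f(x + 2*h) + 4*f(x + h) - 3*f(x)) / (2 * h)

def forward_diff_h1(f, x, h):
    # Plain two-point forward difference, O(h), for comparison
    return (f(x + h) - f(x)) / h
```

At $x = 1$ with $h = 10^{-3}$ on $\sin x$, the three-point version is roughly three orders of magnitude more accurate.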
Richardson Extrapolation
Richardson extrapolation is an elegant method to improve the accuracy of a derivative estimate by combining two less accurate estimates computed with different step sizes.
Generalized Richardson Extrapolation
If $D(h)$ is an approximation with leading error of order $O(h^n)$, we can combine estimates using step sizes $h$ and $h/t$ (where $t > 1$, typically $t = 2$) to eliminate the leading error term and obtain a higher-order estimate $D$:

$D = \dfrac{t^n D(h/t) - D(h)}{t^n - 1}$
For centered differences ($n = 2$) and halving the step size ($t = 2$), this reduces to the classic formula:

$D = \dfrac{4}{3} D\left(\dfrac{h}{2}\right) - \dfrac{1}{3} D(h)$

which increases the accuracy from $O(h^2)$ to $O(h^4)$.
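The classic $t = 2$, $n = 2$ case can be sketched in a few lines of Python (helper names are illustrative):

```python
import math

def centered_diff(f, x, h):
    # O(h^2) centered difference, used as the base estimate D(h)
    return (f(x + h) - f(x - h)) / (2 * h)

def richardson(f, x, h):
    # D = (4*D(h/2) - D(h)) / 3 : cancels the O(h^2) term, leaving O(h^4)
    return (4 * centered_diff(f, x, h / 2) - centered_diff(f, x, h)) / 3
```

With $f = \sin$, $x = 1$, and a fairly coarse $h = 0.1$, the extrapolated estimate is already orders of magnitude closer to $\cos 1$ than either centered estimate alone.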
Formulas for Unequally Spaced Data
In practice, experimental data points are often not evenly spaced. In this case, standard finite difference formulas cannot be applied. Instead, a Lagrange interpolating polynomial is fit through the adjacent points and then differentiated.
Unequally Spaced Differences
For three data points $(x_0, f(x_0))$, $(x_1, f(x_1))$, $(x_2, f(x_2))$, fit the second-order Lagrange polynomial and differentiate it. The derivative at any point $x$ is:

$f'(x) \approx f(x_0)\dfrac{2x - x_1 - x_2}{(x_0 - x_1)(x_0 - x_2)} + f(x_1)\dfrac{2x - x_0 - x_2}{(x_1 - x_0)(x_1 - x_2)} + f(x_2)\dfrac{2x - x_0 - x_1}{(x_2 - x_0)(x_2 - x_1)}$

When the points are equally spaced ($x_1 - x_0 = x_2 - x_1 = h$) and the derivative is evaluated at $x = x_1$, this formula correctly collapses to the standard centered difference.
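A direct Python transcription of the differentiated Lagrange polynomial (the function name is ours):

```python
def lagrange_deriv(pts, x):
    # pts: three (x, y) pairs, not necessarily equally spaced
    (x0, y0), (x1, y1), (x2, y2) = pts
    return (y0 * (2*x - x1 - x2) / ((x0 - x1) * (x0 - x2))
          + y1 * (2*x - x0 - x2) / ((x1 - x0) * (x1 - x2))
          + y2 * (2*x - x0 - x1) / ((x2 - x0) * (x2 - x1)))
```

Because the interpolant is quadratic, the result is exact for data sampled from any quadratic, regardless of spacing; with equally spaced points, evaluating at the middle point reproduces the centered difference.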
Errors in Numerical Differentiation and Condition Number
Numerical differentiation is inherently unstable and represents an ill-conditioned problem. While reducing the step size initially decreases the truncation error, it significantly magnifies the round-off error.
Condition Number and Loss of Significance
The numerical calculation of a derivative requires subtracting two nearly equal function values, $f(x + h) - f(x)$, and dividing by a very small number, $h$. This process is highly susceptible to loss of significance (subtractive cancellation).
The condition number for numerical differentiation scales proportionally to $1/h$. As $h \to 0$, the condition number approaches infinity, meaning the problem becomes infinitely sensitive to the finite precision limits of floating-point arithmetic. Thus, total error forms a "U-shaped" curve: decreasing $h$ beyond an optimal point causes the round-off error to dominate, drastically degrading accuracy.
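The U-shaped trade-off is easy to observe numerically; this small Python experiment evaluates the centered-difference error at a coarse, a moderate, and a tiny step size:

```python
import math

def centered_diff(f, x, h):
    return (f(x + h) - f(x - h)) / (2 * h)

# Error of d/dx sin(x) at x = 1 for three step sizes:
errors = {h: abs(centered_diff(math.sin, 1.0, h) - math.cos(1.0))
          for h in (1e-1, 1e-5, 1e-12)}
# h = 1e-1 : truncation error dominates
# h = 1e-12: round-off (cancellation) error dominates
# h = 1e-5 : near the bottom of the U-shaped total-error curve
```

The moderate step size beats both extremes: making $h$ smaller past the optimum makes the answer worse, not better.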
Caution
Because numerical differentiation amplifies error by a factor proportional to $1/h$, differentiating raw, scattered data directly often leads to useless results. The noise completely masks the true derivative.
Data Smoothing Before Differentiation
Because of the ill-conditioned nature of numerical differentiation, raw experimental data must almost always be smoothed before derivatives are taken.
Procedure
- Visual Inspection: Plot the data to identify the level of noise and potential outliers.
- Smoothing or Regression: Apply a low-pass filter (like a moving average) or fit a smooth curve (like a low-order polynomial regression or a smoothing spline) to the data.
- Differentiation: Differentiate the fitted curve analytically, or apply numerical differentiation to the smoothed data points.
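The procedure above can be sketched in pure Python on synthetic data (the noise level, window width, and variable names are all illustrative assumptions):

```python
import random

random.seed(0)
h = 0.02
ts = [i * h for i in range(101)]                      # t in [0, 2]
ys = [t**2 + random.gauss(0.0, 0.01) for t in ts]     # noisy samples of t^2

# Step 2: smooth with a centered moving average (a crude low-pass filter)
w = 5                                                  # window half-width
smooth = [sum(ys[i - w:i + w + 1]) / (2 * w + 1)
          for i in range(w, len(ys) - w)]

# Step 3: centered differences on the smoothed interior points
dy = [(smooth[i + 1] - smooth[i - 1]) / (2 * h)
      for i in range(1, len(smooth) - 1)]
# dy[k] estimates f'(t) = 2t at t = ts[w + 1 + k]
```

Differentiating `ys` directly would amplify the noise by $1/h$; smoothing first keeps the estimates near the true derivative $2t$. Fitting a low-order polynomial and differentiating it analytically is usually even better than a moving average.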
Partial Derivatives
For functions of multiple variables, partial derivatives are approximated by holding all other variables constant and applying the standard finite difference formulas to the variable of interest.
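A minimal sketch of this idea in Python, perturbing one coordinate at a time with a centered difference (the helper name is ours):

```python
def partial_diff(f, point, i, h=1e-6):
    # Centered difference in coordinate i, holding all others fixed
    up = list(point); up[i] += h
    down = list(point); down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)

# Example: f(x, y) = x**2 * y, so df/dx = 2xy and df/dy = x**2
f = lambda x, y: x**2 * y
```

At $(2, 3)$ this recovers $\partial f/\partial x = 12$ and $\partial f/\partial y = 4$ to high accuracy.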
Advanced Differentiation Techniques
Beyond finite differences, modern engineering and machine learning heavily rely on computational differentiation techniques to avoid round-off errors and the need for analytical derivations.
Complex Step Differentiation
If $f$ is an analytic function, evaluating it at a complex argument can, remarkably, compute the real derivative without subtractive cancellation (round-off) errors. The formula uses a very small imaginary step $h$:

$f'(x) \approx \dfrac{\operatorname{Im}\left[f(x + ih)\right]}{h}$

Since there is no subtraction in the numerator, $h$ can be chosen far below machine epsilon (e.g., $h = 10^{-20}$) to achieve near-exact precision without catastrophic cancellation.
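The trick is a one-liner in Python using the standard `cmath` module (the wrapper name is ours):

```python
import cmath
import math

def complex_step(f, x, h=1e-20):
    # f'(x) ~ Im[f(x + i*h)] / h : the numerator involves no subtraction,
    # so h can be tiny without catastrophic cancellation
    return f(complex(x, h)).imag / h
```

Note that `f` must accept complex arguments and be analytic; `cmath.sin` qualifies. At $h = 10^{-20}$ a real forward difference returns exactly zero (since `1.0 + 1e-20 == 1.0` in double precision), while the complex step still recovers $\cos 1$ to machine precision.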
Automatic Differentiation (AD)
AD computes exact derivatives by systematically applying the chain rule to the elementary operations (addition, multiplication, trigonometric functions) that make up a computer program. It is neither symbolic differentiation (which can produce massive expressions) nor numerical differentiation (which suffers from truncation and round-off error). Reverse-mode AD evaluates exact gradients at a cost proportional to a single forward evaluation of the function, which is the foundational technology behind modern deep learning frameworks.
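Forward-mode AD can be demonstrated in a few lines with dual numbers; this sketch (class and function names are ours) overloads addition and multiplication so the chain rule is applied mechanically at each elementary operation:

```python
class Dual:
    """Dual number a + b*eps with eps**2 = 0; .dot carries the derivative."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def _lift(self, other):
        # Promote plain numbers to constants (derivative 0)
        return other if isinstance(other, Dual) else Dual(other)
    def __add__(self, other):
        o = self._lift(other)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __mul__(self, other):
        o = self._lift(other)
        # Product rule applied mechanically: (uv)' = u'v + uv'
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)
    __rmul__ = __mul__

def deriv(f, x):
    # Seed dx/dx = 1, run the program on dual numbers, read off f'(x)
    return f(Dual(x, 1.0)).dot
```

For a polynomial like $3x^2 + 2x + 1$, `deriv` returns $6x + 2$ exactly: no step size, no truncation error. Real AD frameworks extend this idea to every elementary function and, in reverse mode, to efficient gradients of many-variable functions.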
Key Takeaways
- Finite difference formulas are directly derived from the Taylor series expansion.
- Derivatives can be approximated using forward, backward, or centered finite divided differences. Truncation errors are derived from the unused Taylor series terms.
- For unequally spaced data, the derivative is derived from differentiating a Lagrange interpolating polynomial over adjacent points.
- Centered differences are generally more accurate ($O(h^2)$) than forward or backward differences ($O(h)$) due to the cancellation of the $f''(x_i)$ terms in the Taylor series.
- Formulas for higher-order derivatives (2nd, 3rd) and high-accuracy formulas involve more neighboring data points.
- Richardson extrapolation combines two estimates of lower accuracy to produce one of higher accuracy, with the generalized formula eliminating the leading error term.
- Numerical differentiation is an inherently ill-conditioned problem because the condition number scales with $1/h$. It suffers from severe loss of significance (subtractive cancellation) when $h$ becomes too small.
- Because of noise amplification, data smoothing or regression must almost always precede the numerical differentiation of experimental data.
- Complex step differentiation avoids subtractive cancellation entirely, allowing extremely small step sizes.
- Automatic Differentiation (AD) provides exact analytical derivatives numerically by programmatically applying the chain rule to atomic operations.