Sample points for linear interpolation¶
We want to approximate some function $f: [a, b] \to \mathbf{R}$ over a finite interval $[a, b]$ using linear interpolation. Linear interpolation requires us to select a set of sample points $x = (x_0, x_1, \ldots, x_n)$. Assume $x_0 = a$ and $x_n = b$.
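Given the sample points $x$ and values $y = f(x)$, the PWL interpolant itself is cheap to evaluate. As a minimal sketch (the sample values here are placeholders, not from the examples below), NumPy's np.interp evaluates the PWL function through $(x, y)$ at arbitrary query points:
import numpy as np
x = np.array([0.0, 0.5, 1.0])   # placeholder sample points
y = np.array([0.0, 0.7, 1.0])   # placeholder sample values
np.interp([0.25, 0.75], x, y)   # array([0.35, 0.85])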
Suppose we do not have a closed form for $f$, but we can sample $f$ at any set of points $x$. Evaluating $f$ may be expensive, so we probably only want to try a small number of candidate vectors $x$.
This scenario can arise, for example, if
- you're trying to fit data with a piecewise linear function, but are initially unsure which sample points will lead to the "best" approximation, or
- you're writing code that would ideally plot functions "accurately" with minimal user input.
We won't rigorously define what we mean by "best" or "accurately". Instead, we'll rely on the "eyeball norm".
Example function¶
Suppose we want a piecewise linear approximation to $f(x) = x^\alpha$ over the interval $[0,1]$ with something like $\alpha = .23$.
If $n=10$ is fixed, which set of points $x = (x_0, x_1, \ldots, x_n)$ gives us the "best" approximation?
We'll start with evenly spaced sample points.
import numpy as np
import matplotlib.pyplot as plt
def f(x):
    y = x**.23
    return y
n = 10
x = np.linspace(0, 1, n+1)
y = f(x)
x
array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])
Below, we compare the resulting piecewise linear (PWL) function against the "true" $f$ (which we plot using a much finer sampling).
The PWL function seems to be a reasonable approximation on the interval $[0.1, 1]$, but is less accurate around the interval $[0, 0.1]$.
Can we obtain a better PWL approximation with the same number of sample points?
def plot_helper(f, x, y, title=''):
    x_fine = np.linspace(x[0], x[-1], 10000)
    y_fine = f(x_fine)
    fig, ax = plt.subplots(dpi=150)
    ax.plot(x, y, '-o', label='PWL', alpha=.8)
    ax.plot(x_fine, y_fine, label='true', alpha=.5)
    ax.legend()
    ax.set_aspect('equal')  # important!
    if title:
        ax.set_title(title)
plot_helper(f, x, y)
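To put a rough number on the eyeball norm, we can measure the maximum pointwise gap between the PWL function and the true $f$ on a fine grid (a quick sketch; np.interp evaluates the PWL function, as above):
x_fine = np.linspace(0, 1, 10000)
# sup-norm error of the PWL approximation against the true f
np.max(np.abs(f(x_fine) - np.interp(x_fine, x, y)))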
Idea¶
We'll refer to this first PWL function as $f^1$, or by the pair of vectors $(x^1, y^1)$.
One problem with the $f^1$ approximation seen above is the long line segment from $(x^1_0, y^1_0)$ to $(x^1_1, y^1_1)$. One desirable property of our interpolation, then, might be equal arc lengths of $f$ between adjacent sample points. Unfortunately, we don't have direct access to the curve $f$.
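We can check this segment-length imbalance directly (a quick sketch, using the same length formula that appears in resample below):
# Euclidean length of each segment of f^1
s = np.sqrt(np.diff(x)**2 + np.diff(y)**2)
s / s.sum()  # the first segment carries a disproportionate share of the total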
Instead, let's use $f^1$ as an approximation of $f$, and resample points $x^2$ such that the arc lengths of $f^1$ over the segments given by $x^2$ are all equal.
Intuitively, for the example above, this should give us a denser sampling over the interval $[0, 0.1]$.
We can then iterate this process to hopefully converge on a PWL approximation of $f$ which has equal arc lengths over each segment.
Implementation¶
The function resample(x, y) below takes in a PWL function given by the pair of vectors $f^1 = (x^1, y^1)$ and returns a resampling $x^2$ such that the segments given by $x^2$ have equal arc length with respect to $f^1$.
Once we've computed $x^2$, we can evaluate $y^2_i = f(x^2_i)$ to get our new PWL function $f^2 = (x^2, y^2)$.
Note that resample(x, y) only has knowledge of the PWL function $f^1$; it has no other information about the true $f$. We only use $f$ to compute the new $y^2$. We then iteratively apply resample(x, y) to get a sequence $(x^k, y^k)$.
Also note that throughout the iteration, the endpoints do not change:
$$ \begin{gather*} [a, b] = \left[x^k_0, x^k_n \right] \\ [f(a), f(b)] = \left[y^k_0 , y^k_n\right]. \end{gather*} $$
def resample(x, y):
    # compute the lengths of the segments of f^1
    s = np.sqrt(np.diff(x)**2 + np.diff(y)**2)
    # c[i] gives the total length of f^1 over the interval
    # (x[0], x[i])
    c = np.array([0] + list(np.cumsum(s)))
    # Evenly sample the unit interval [0, 1] with n + 1 points,
    # where U[0] = 0 and U[n] = 1, then take only the
    # interior points (U[1], ..., U[n-1]).
    U = np.linspace(0, 1, len(x))[1:-1]
    # total length of the curve f^1
    L = c[-1]
    out = [x[0]]
    for u in U:
        # target arc length along f^1 for this sample point
        v = L*u
        # index of the segment of f^1 containing arc length v
        i = np.searchsorted(c, v)
        # interpolate linearly within that segment
        theta = (v - c[i-1])/(c[i] - c[i-1])
        x2 = x[i]*theta + x[i-1]*(1-theta)
        out += [x2]
    out += [x[-1]]
    return np.array(out)
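As a sanity check (a sketch with a hypothetical helper, not part of the algorithm itself): the cumulative arc length of $f^1$ is linear in $x$ within each segment, so np.interp evaluates it exactly at the resampled points, and its increments should come out equal.
def check_arc_lengths(x, y, x_new):
    # cumulative arc length of the PWL function (x, y) at its knots
    s = np.sqrt(np.diff(x)**2 + np.diff(y)**2)
    c = np.concatenate([[0], np.cumsum(s)])
    # arc length is linear in x within each segment, so np.interp is exact
    return np.diff(np.interp(x_new, x, c))
check_arc_lengths(x, y, resample(x, y))  # increments agree up to floating point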
We can see the result of resample in the figure below. The blue curve corresponds to $f^1 = (x^1, y^1)$, and the green lines give the locations of the resampled points $x^2$. The new sampling is indeed denser around $0$.
x_new = resample(x, y)
fig, ax = plt.subplots(dpi=150)
ax.plot(x, y, '-o', label='old')
ax.vlines(x_new, 0, 1, alpha=.3, color='green', label='new')
ax.set_aspect('equal')
ax.legend(loc='lower right')
for k in range(13):
    if (k in [1, 2]) or k % 3 == 0:
        plot_helper(f, x, y, f'Step: {k}')
    x = resample(x, y)
    y = f(x)
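To make "converge" slightly less hand-wavy, here's a rough sketch that restarts from the evenly spaced samples and tracks the ratio of the longest segment of $f^k$ to the shortest; a ratio near 1 means the segment lengths have equalized:
x = np.linspace(0, 1, n + 1)
y = f(x)
for k in range(13):
    s = np.sqrt(np.diff(x)**2 + np.diff(y)**2)
    print(k, s.max() / s.min())
    x = resample(x, y)
    y = f(x)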
Logistic function¶
Here we'll try a function with a different shape, whose curve endpoints are something other than the $(0, 0)$ and $(1, 1)$ we had in the example above.
def f(x):
    y = 5/(1 + np.exp(-10*x)) - 1
    return y
n = 10
x = np.linspace(-2, 3, n+1)
y = f(x)
for k in range(5):
    plot_helper(f, x, y, f'Step: {k}')
    x = resample(x, y)
    y = f(x)
Interestingly, step 1 seems (to my eye) to be the best approximation of the function, but our iteration sails right past it. This is probably a good point at which to re-examine your definition of "best" approximation and ask whether this iterative algorithm does a good job of achieving it. For all its virtues, the eyeball norm does tend to be a bit subjective.
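One crude, more objective stand-in for the eyeball norm is the sup-norm error of each iterate against the true $f$ on a fine grid. Here's a sketch that reruns the logistic iteration from scratch and prints that error at each step; whether it agrees with the eyeball's preference for step 1 is exactly the question:
x = np.linspace(-2, 3, n + 1)
y = f(x)
x_fine = np.linspace(-2, 3, 10000)
y_fine = f(x_fine)
for k in range(5):
    # maximum pointwise gap between the PWL iterate and the true f
    print(k, np.max(np.abs(y_fine - np.interp(x_fine, x, y))))
    x = resample(x, y)
    y = f(x)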
Conclusion¶
It probably isn't hard to come up with pathological examples of functions where this iteration fails to converge and gives garbage output, but for simple, well-behaved functions like the ones above, it seems to be fairly robust.
Also, there are probably conditions under which you can prove that the iteration converges. Monotone and continuous? Is the iteration operator a contraction?
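One way to frame it: the update is a fixed-point iteration $x^{k+1} = T(x^k)$, where $T$ resamples to equal arc lengths. Since at a fixed point the sample points coincide with the knots of the PWL function, a fixed point $x^*$ is exactly a sampling whose chords through $f$ all have equal length:
$$ \sqrt{\left(x^*_{i+1} - x^*_i\right)^2 + \left(f(x^*_{i+1}) - f(x^*_i)\right)^2} = \frac{L^*}{n}, \qquad i = 0, \ldots, n-1, $$
where $L^*$ is the total length of the fixed-point PWL function. If $T$ is a contraction in some metric on sample vectors, convergence would follow, but I haven't checked this.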
I wonder if something like this (or something smarter) is used in plotting libraries to produce nice-looking plots. Even if this iteration breaks over the entire function domain, it might still be useful on well-behaved subsets.