Bézier splines

The SVG specification provides for the inclusion of quadratic and cubic Bézier curves as general curve shapes. These curves are a variety of spline – piecewise polynomial curves – with a relatively intelligible specification of their control points. The general form of each portion of such a curve is given by a polynomial, with vector-valued coefficients, in a scalar parameter whose variation within some range causes the polynomial's value to traverse the given portion of the curve. However, the vectors needed as coefficients to such a polynomial aren't very intuitively related to where the curve is going and what it's doing in the given portion of the curve: in practice, one wishes to specify the latter in terms of some points on the curve and some indication of which way it's going as it passes through those points. The Bézier formalism lets one specify a curve in the latter terms, without having to work out the polynomial coefficients this implies.

Each portion of the curve, formally a function f from its parameter to positions in a vector space, is generated by the curve parameter, x, varying between two adjacent integers, n and 1+n; the curve's position is specified at each integer value for x, so we are given f(n) and f(1+n) and, for n<x<1+n, f is a polynomial. Normally, one writes a polynomial in the form a +b.x +c.x.x +…, but we're only interested in some interval n<x<1+n and the relationship between a, b, c, … and the values our polynomial takes at the ends of this interval are somewhat messy. It thus makes sense to look for other ways to express our polynomial.

Homogeneous polynomials

If we think of the simple linear case, we can interpolate between the values at n and 1+n by multiplying the latter by x−n and the former by 1+n−x and adding; one multiplier is 0 and the other 1 at each end of the interval, so we get the right answers; and each term is linear, hence so is the sum. For higher degree cases, we can use powers and products of these simple linear polynomials, x−n and 1+n−x, analogously: it turns out that we can write any polynomial of degree m, in the interval n<x<1+n, as a sum of terms, each of which has m factors, some of x−n and the rest of 1+n−x. For example, we would write 1 +3.x +7.x.x in the interval 2<x<3 as 73.(x−2).(x−2) +101.(x−2).(3−x) +35.(3−x).(3−x) – which makes it very evident that it takes the values 73 at 3 and 35 at 2.

When each term, in a polynomial in two variables, has equal sum of powers of the two variables, the polynomial is described as homogeneous. In our case, the two variables are themselves linear polynomials in a common variable, but we can still use the same terminology. When we write a polynomial in this homogeneous form, using powers of the ordinate's differences from the end-points of an interval, the function's values and successive derivatives may be straightforwardly computed from the coefficients. Casteljau (and Bézier) chose to include a combinatorial factor, chose(m, i) = m!/i!/(m−i)!, in the coefficients, as this happens to lead to some nice properties that I'll explore below.

Casteljau's (or Bézier's) polynomials

Define:

and observe that

B(c)'(t)
= sum(: c(i).chose(m, i).i.power(i−1, t).power(m−i, 1−t) −c(i).chose(m, i).power(i, t).(m−i).power(m−i−1, 1−t) ←i :1+m)
= m.sum(: c(i).chose(m−1, i−1).power(i−1, t).power(m−i, 1−t) ←i, i>0 :1+m) −m.sum(: c(i).chose(m−1, i).power(i, t).power(m−1−i, 1−t) ←i :m)
= m.B((: c(j+1) ←j :m), t) −m.B((:c:m), t)

So B(c)' = m.(B(: c(j+1) ←j :m) −B(:c:m)) = m.B(: c(j+1)−c(j) ←j :m) and differentiation is mapped to taking finite differences and scaling by the polynomial's order. In particular, if we use a list ({vectors}: c |m) to describe the curve f = (: B(c, x−n) ←x :) in the interval n≤x≤1+n, f'(n)/m = B'(c, 0)/m = c(1) −c(0) and f'(1+n)/m = c(m)−c(m−1). Thus not only are the end-points of c the end-points of the curve (segment) but the next points inwards are on the curve's tangents at these points, at a position along the curve corresponding to parameter change 1/m.

Naturally, we can differentiate again: this time, we get B(c)'' = m.(B(: c(j+1) ←j :m)' −B(:c:m)') = m.(m−1).(B(: c(j+2) ←j :m−1) −B(: c(j+1) ←j :m−1) +B(:c:m−1)), whence f''(n)/m/(m−1) = B''(c, 0)/m/(m−1) = c(2) −2.c(1) +c(0) and f''(1+n)/m/(m−1) = c(m) −2.c(m−1) +c(m−2). Thus if we look at the m−1 vectors between consecutive coefficients from c and think of each as a nominal f' value, at parameter values 1/(m−1) apart through the interval, then the change between the f' values described by the first two is just the second derivative at the start of the interval times the nominal difference in parameter value for which they provide f' values; likewise for the last two such vectors and f''(1+n).

This continues simply: the i-th derivative at n is given by the first 1+i entries in c and that at 1+n by the last 1+i entries in c; if we compute a nominal i-th derivative in the same way from the first 1+i entries after c(0) and the last 1+i entries before c(m), associating these with a parameter value 1/(1+m−i) from the relevant end, inside the interval, then the rate of change this implies for the i-th derivative does indeed yield the (1+i)th derivative at the relevant end. The i-th derivative at n is, indeed,

with a similar formula giving the i-th derivative at 1+n. In particular, the m-th derivative is sum(: power(m−j, −1).c(j)/j!/(m−j)! ←j :1+m) and constant throughout the interval.

The result of this is that the entries in c all lie near the curve (which makes it practical to intuit the right points to select to get any given curve). The polygon formed by connecting them in order, and connecting c(m) back to c(0), is known as the Bézier hull of the curve.

Transitions between intervals

Because each c(i) is effectively associated with a parameter value n+i/m, it is natural to describe the entire curve (not just the interval from n to 1+n) in terms of a list, q, of vectors at nominal parameter values 1/m apart; c is then simply the sub-list (: q(i+n.m) ←i :1+m). The entries at indexes that are multiples of m, q(j.m), really are the compound curve's positions at their nominal parameter values, j; the others are reasonably close.

The differences between adjacent entries in q give nominal gradients; but adjacent segments don't necessarily agree about the gradient at their common end-point; nothing forces q(1+m.n)−q(m.n) to be equal to q(m.n)−q(m.n−1). Dividing each by m, the latter gives the derivative of the curve as it approaches parameter value n from below; the former gives the derivative as parameter value approaches n from above. Only if they are equal is the derivative (formally defined, let alone) continuous at n. Likewise, each segment gives a value for each higher derivative on its side of each end of its parameter interval; only if it agrees with the segment on the other side of that end-point is the given derivative continuous at the end-point.

However, given the simplicity of the relationship between the vectors in q and the relevant derivatives, it's fairly straightforward to ensure continuity of derivatives across the boundaries between segments. Of course, if we insist on continuity of all derivatives, or indeed of the first m, we'll get a simple polynomial curve – described differently on each interval, but none the less always the same polynomial. To have the flexibility to take our curve to assorted places we want it to visit, we have to sacrifice continuity of at least the m-th derivative; but we can ensure continuity of the first i derivatives for any i < m.


Valid CSSValid HTML 4.01 Written by Eddy.