Conic Sections

A conic section is a curve obtained as the intersection of a cone and a plane.

Characterization

The given domain in which this is to happen is three-dimensional. A cone is, for these purposes, understood to be double-ended, in the sense that it doesn't stop at its vertex (like a dunce's cap does); after the radius of the cone has shrunk to zero at the vertex, it grows again on the other side. Choose co-ordinates centred on the cone's vertex, using the axis of rotation of the cone as z-axis; then the cone is the set of points with [x,y,z] in {[r.Cos(a), r.Sin(a), k.r]: r is real, a is an angle} for some non-zero constant real k; this cone can equally be characterized as x² +y² = (z/k)².
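As a quick numerical sanity check that the two descriptions of the cone agree, here is a minimal Python sketch. The names cone_point and on_cone are my own, chosen purely for illustration, and angles are assumed to be in radians.

```python
import math

def cone_point(r, a, k):
    """Point [x, y, z] = [r.Cos(a), r.Sin(a), k.r]; r may be negative (other half of the cone)."""
    return (r * math.cos(a), r * math.sin(a), k * r)

def on_cone(x, y, z, k, tol=1e-9):
    """True when x² + y² = (z/k)² holds to within the (assumed) tolerance tol."""
    return abs(x * x + y * y - (z / k) ** 2) <= tol
```

Any point built by cone_point, for any real r and angle a, should pass on_cone with the same k; a point such as [1, 0, 0] should fail.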

A general plane is characterized by u.x +v.y +w.z = h for some constants u, v, w and h, with at least one of u, v and w non-zero, so its intersection with the cone is given by u.r.Cos(a) +v.r.Sin(a) +w.k.r = h. However, we're going to want to know what the intersection looks like as a sub-set of the plane, so we'll want a description of it in terms of some co-ordinates that make sense in the plane, chosen so as to describe the intersection reasonably simply. Also, while we specified the z-axis to be the cone's axis of rotational symmetry, we still have freedom to choose our x and y axes within the plane, through the cone's apex, perpendicular to the z-axis. So choose, as x-axis, a line perpendicular both to the z-axis and to the plane's normal, hence parallel to some line in the plane. (In the case where the plane itself is perpendicular to the z-axis, this is no constraint and the y-axis shall also be perpendicular to the plane's normal, but there's no harm in that.) We can then use x as one co-ordinate in our plane, that increases by the distance moved when one moves in the plane parallel to the x-axis. This implies u = 0, since we can change x within the plane without changing y or z.

We get the same cone if we replace k with −k, so we can presume that k is positive. For v.y +w.z = h to specify a plane, at least one of v and w must be non-zero; and applying a common non-zero scaling to all of v, w and h makes no difference to which plane is specified, so we can re-scale to make v = Sin(c), w = Cos(c) for some angle c. We still have the freedom to reverse the directions of our co-ordinates, which we can use to ensure that c lies between zero and a quarter turn (inclusive), with h non-negative. The co-ordinate y.Cos(c) −z.Sin(c) will then, give or take a suitable constant offset, constitute a suitable co-ordinate in our plane to, along with x, provide us with an orthonormal system. The final step in our preliminaries is then to find a suitable constant offset, by choosing one that makes the intersection symmetric about the origin in our plane.

The cone meets the plane x = 0 in the pair of lines z = ±k.y, which meet our plane Sin(c).y +Cos(c).z = h when (Sin(c) ±k.Cos(c)).y = h. When Tan(c) = k, the ± = − case is 0.y = h; if h = 0, the line z = −k.y is the entirety of the intersection, with the plane actually tangent to the cone in this line (in the terms of what follows, this is really a degenerate case of the parabola and actually only really wants one half of this line to be thought of as intersection); otherwise, z = −k.y is parallel to (so does not meet) the plane k.y +z = h/Cos(c), which meets z = +k.y at y = h/Cos(c)/k/2, z = h/Cos(c)/2.

Otherwise, when Tan(c) is neither ±k (the Tan(c) = −k case being ruled out by our choices that made c be in the first quadrant and k positive), the plane meets the cone's two lines at y = h/(Sin(c) ±k.Cos(c)), Cos(c).z = h.(1 −Sin(c)/(Sin(c) ±k.Cos(c))) = ±k.h.Cos(c)/(Sin(c) ±k.Cos(c)). If Cos(c) = 0 – i.e. c = turn/4, the case where our plane is parallel to the z-axis as well as the x-axis – we simply have y = h and z = ±k.h; the mid-point between the two points is [y,z] = [h,0]. Otherwise, z = ±k.h/(Sin(c) ±k.Cos(c)) and our two points are

[y,z] = h.[1, ±k]/(Sin(c) ±k.Cos(c))

with mid-point

h.[1/(Sin(c) +k.Cos(c)) +1/(Sin(c) −k.Cos(c)), k/(Sin(c) +k.Cos(c)) −k/(Sin(c) −k.Cos(c))]/2
  = h.[Sin(c), −k.k.Cos(c)]/(Sin(c).Sin(c) −k.k.Cos(c).Cos(c))

To simplify this (and subsequent work) let Q = Sin(c).Sin(c) −k.k.Cos(c).Cos(c), making the mid-point h.[Sin(c), −k.k.Cos(c)]/Q. In the plane, we want an offset to include in a co-ordinate that's otherwise y.Cos(c) −z.Sin(c); since I want symmetry in the origin, I pick the offset that makes the co-ordinate zero at the mid-point just obtained. To that end, define a constant H by:

−H = Sin(c).Cos(c).(1+k.k).h/Q
  = (1+k.k).h.Tan(c)/(Tan(c).Tan(c) −k.k)

where the first form is clearly the value of y.Cos(c) −z.Sin(c) at our mid-point and the second holds when Cos(c) is non-zero.

So the co-ordinate I'll use as complement to x in the plane is Y = y.Cos(c) −z.Sin(c) +H. We can re-construct our original co-ordinates y and z from Y and h = Sin(c).y +Cos(c).z, treating h for these purposes as if it were a variable; we trivially obtain

y = (Y−H).Cos(c) +h.Sin(c)
z = h.Cos(c) −(Y−H).Sin(c)

and so can substitute these into x² +y² = (z/k)² to obtain:

(k.x)² = z² −(k.y)²
  = (h.Cos(c) −(Y−H).Sin(c))² −(k.(Y−H).Cos(c) +k.h.Sin(c))²
  = h.h.Cos(c).Cos(c) −2.(Y−H).h.Cos(c).Sin(c) +((Y−H).Sin(c))² −k.k.(h.h.Sin(c).Sin(c) +2.(Y−H).h.Sin(c).Cos(c) +((Y−H).Cos(c))²)
  = h.h.(Cos(c).Cos(c) −k.k.Sin(c).Sin(c)) −2.(Y−H).h.Cos(c).Sin(c).(1+k.k) +Q.(Y−H)²

which is simply linear in Y when k = Tan(c); otherwise, it is

(k.x)² = Q.((Y−H) −h.Cos(c).Sin(c).(1+k.k)/Q)² −(h.Cos(c).Sin(c).(1+k.k))²/Q +h.h.(Cos(c).Cos(c) −k.k.Sin(c).Sin(c))
  = Q.(Y +Sin(c).Cos(c).(1+k.k).h/Q −h.Cos(c).Sin(c).(1+k.k)/Q)² +h.h.((Cos(c).Cos(c) −k.k.Sin(c).Sin(c)).Q −Cos(c).Cos(c).Sin(c).Sin(c).(1+k.k).(1+k.k))/Q
  = Q.Y² +h.h.(Cos(c).Cos(c).Sin(c).Sin(c) −k.k.Cos(c).Cos(c).Cos(c).Cos(c) −k.k.Sin(c).Sin(c).Sin(c).Sin(c) +k.k.k.k.Sin(c).Sin(c).Cos(c).Cos(c) −Cos(c).Cos(c).Sin(c).Sin(c).(1+k.k).(1+k.k))/Q
  = Q.Y² +h.h.(Cos(c).Cos(c).Sin(c).Sin(c).(1 −(1+k.k).(1+k.k) +k.k.k.k) −(Cos(c).Cos(c).Cos(c).Cos(c) +Sin(c).Sin(c).Sin(c).Sin(c)).k.k)/Q
  = Q.Y² −h.h.(2.k.k.Cos(c).Cos(c).Sin(c).Sin(c) +(Cos(c).Cos(c) −Cos(c).Cos(c).Sin(c).Sin(c) +Sin(c).Sin(c) −Cos(c).Cos(c).Sin(c).Sin(c)).k.k)/Q
  = Q.Y² −h.h.k.k/Q

which we can rewrite as:

(Q.Y)² = (k.h)² +Q.(k.x)²
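Since the algebra leading to this equation is long, it may be worth checking it numerically. The following is a minimal Python sketch, assuming Q is non-zero and angles are in radians; the function names and the use of the cone's angle a to pick out points of the intersection are my own choices for illustration.

```python
import math

def intersection_point(a, c, k, h):
    """Return (x, Y, Q) for the point where the cone's line at angle a meets the plane.

    Assumes Q != 0 and that Sin(c).Sin(a) +k.Cos(c) is non-zero
    (i.e. that line of the cone is not parallel to the plane).
    """
    S, C = math.sin(c), math.cos(c)
    Q = S * S - k * k * C * C
    H = -S * C * (1 + k * k) * h / Q        # offset that centres Y on the mid-point
    r = h / (S * math.sin(a) + k * C)       # solves Sin(c).y +Cos(c).z = h on the cone
    x, y, z = r * math.cos(a), r * math.sin(a), k * r
    Y = y * C - z * S + H
    return x, Y, Q

def residual(a, c, k, h):
    """(Q.Y)² −(k.h)² −Q.(k.x)², which should vanish on the intersection."""
    x, Y, Q = intersection_point(a, c, k, h)
    return (Q * Y) ** 2 - (k * h) ** 2 - Q * (k * x) ** 2
```

Evaluating residual for a handful of angles a, with c chosen on either side of the angle whose tangent is k, should give values that are zero up to rounding error.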

Given that c is in the first quadrant, so neither Sin(c) nor Cos(c) is negative, the common co-efficient Q has the same sign as Tan(c).Tan(c) −k.k and hence as Tan(c) −k. The further analysis divides into three cases, according as Tan(c) is less than, equal to or greater than k; i.e. Q is negative, zero or positive.

Note that, given only the geometry of the intersection, as seen in the plane, absent knowledge of the cone or the co-ordinates used above, we can only discover the ratios among the coefficients in the equation, k.h/Q and k/√|Q| (Q being negative in one of the cases below), whence we can obtain h/√|Q| and k/h; we cannot determine the absolute values of h, k and Q only from the curve we see in the plane, although any one of them (or c) would suffice to imply the others, given these ratios. The Q = 0 case yields only one constant, k/h, which still doesn't suffice to determine h, k or c, although any one of them does (with the help of Tan(c) = k) then suffice to tell us the rest. All the same, the general shape of the intersection does suffice to tell us which of the three cases we're dealing with, i.e. the sign of Q.

Hyperbola

When Tan(c) > k, the equation characterizing the intersection equates the square of a multiple of Y with the sum, with positive coefficients, of the squares of x and of the constant h; this cannot possibly be smaller than its term in h alone. Consequently, Y is bounded away from zero; there is a range of values Y cannot take, for all that it can (for suitable x) take arbitrarily large values outside that range.

For large x and Y, the terms in these shall dominate that in h and the intersection's equation is reasonably well approximated by ±k.x.√Q = Q.Y, which describes a pair of straight lines; if we introduce co-ordinates u, v defined by

h.k.u = Q.Y +k.x.√Q
h.k.v = Q.Y −k.x.√Q

the (unapproximated) equation reduces to u.v = 1; thus the intersection is just the result of applying a linear transformation to the familiar y = 1/x hyperbola.
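To see the reduction to u.v = 1 at work, here is a small Python sketch, assuming Q > 0. The helper hyperbola_Y, which picks the positive-Y branch of the curve for a given x, is my own addition for the sake of having points to test.

```python
import math

def hyperbola_Y(x, Q, k, h):
    """Positive-branch Y solving (Q.Y)² = (k.h)² +Q.(k.x)²; needs Q > 0."""
    return math.sqrt((k * h) ** 2 + Q * (k * x) ** 2) / Q

def hyperbola_uv(x, Y, Q, k, h):
    """Co-ordinates u, v with h.k.u = Q.Y +k.x.√Q and h.k.v = Q.Y −k.x.√Q."""
    root = math.sqrt(Q)
    u = (Q * Y + k * x * root) / (h * k)
    v = (Q * Y - k * x * root) / (h * k)
    return u, v
```

For any x, feeding x and hyperbola_Y(x, Q, k, h) to hyperbola_uv should give a pair whose product is 1, up to rounding.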

Parabola

When Tan(c) = k we have Q = 0, so the term in Y² has zero co-efficient and the equation above is simply:

(k.x)² = h.h.(Cos(c).Cos(c) −k.k.Sin(c).Sin(c)) −2.(Y−H).h.Cos(c).Sin(c).(1+k.k)

in which we can substitute Sin(c) = k.Cos(c) and take H = 0 (the earlier formula for H divides by Q, so cannot be used here; the offset is ours to choose) to obtain:

(k.x)² = h.h.Cos(c).Cos(c).(1 −k.k.k.k) −2.Y.h.k.(1+k.k).Cos(c).Cos(c)
  = (h −h.k.k −2.Y.k).h.Cos(c).Cos(c).(1+k.k)
  = (h −h.k.k −2.Y.k).h.(Cos(c).Cos(c) +Sin(c).Sin(c))
  = (h −h.k.k −2.Y.k).h

which we can re-arrange as

2.h.k.Y = h.h.(1 −k.k) −(k.x)²

which, aside from some simple scaling and a translation, is simply the familiar quadratic y = x². Substituting k = Tan(c) in this and re-arranging, we get:

2.Y = h.(1 −Tan(c).Tan(c))/Tan(c) −x².Tan(c)/h
  = h.(Cos(c).Cos(c) −Sin(c).Sin(c))/Sin(c)/Cos(c) −x².Tan(c)/h
  = 2.h/Tan(2.c) −x².Tan(c)/h
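The final form can be compared against points computed directly from the cone and the plane. Here is a minimal Python sketch, assuming k = Tan(c) with c strictly between zero and a quarter turn, H = 0 and angles in radians; the function names are mine, for illustration only.

```python
import math

def parabola_Y(x, c, h):
    """Y from 2.Y = 2.h/Tan(2.c) −x².Tan(c)/h."""
    return h / math.tan(2 * c) - x * x * math.tan(c) / (2 * h)

def parabola_point(a, c, h):
    """(x, Y) where the cone's line at angle a meets the plane, taking k = Tan(c), H = 0.

    Assumes Sin(a) != −1, since that line of the cone is parallel to the plane.
    """
    k = math.tan(c)
    S, C = math.sin(c), math.cos(c)
    r = h / (S * math.sin(a) + k * C)
    x, y, z = r * math.cos(a), r * math.sin(a), k * r
    return x, y * C - z * S
```

For each angle a, the Y returned by parabola_point should agree with parabola_Y evaluated at the same x.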

Ellipse

When Tan(c) < k we have Q < 0 and our equation can be re-arranged as

(k.h)² = (Q.Y)² −Q.(k.x)²

in which each term is a positive co-efficient times a square; a simple re-scaling of each of x and Y will reduce this to x² +y² = 1, which describes the unit circle; a re-scaled circle is known as an ellipse.
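Reading the re-scaling off the equation, the ellipse reaches x = ±h/√(−Q) when Y = 0 and Y = ±k.h/(−Q) when x = 0. A minimal Python sketch of this, assuming Tan(c) < k and angles in radians (the function name is my own):

```python
import math

def ellipse_semi_axes(c, k, h):
    """Semi-axes (along x, along Y) of the intersection; requires Q < 0."""
    S, C = math.sin(c), math.cos(c)
    Q = S * S - k * k * C * C
    if Q >= 0:
        raise ValueError("not an ellipse: need Tan(c) < k")
    return h / math.sqrt(-Q), k * h / (-Q)
```

For instance, with c = 0 the plane is z = h, which meets the cone in a circle of radius h/k; both semi-axes then come out as h/k, as they should.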

A classical reprise

The other way conic sections are characterized in classical mathematics is in terms of some two-dimensional geometry. Given a straight line (classically called the directrix) and a point, called the focus, not on that line, any point in the plane has a distance from the focus and a distance from the nearest point on the straight line; we can compute the ratio between these two distances so we can define a function on the plane that maps each point to this ratio. We can then ask for the sets of points in the plane with particular values for this function; it turns out that these sets of points are exactly the same conic section curves we've just seen.

Use the line as x-axis and choose, as origin, the point on it closest to the focus; orient the y-axis (which, by now, inevitably passes through the focus) so as to make the focus be the point [x,y] = [0,L], where L is the distance from the x-axis to the focus. The ratio of distance from focus divided by distance from x-axis is then simply √(x² +(y−L)²)/y; to characterize the set of points on which this takes some fixed value, say a, we must study the equation

0 = x.x +(y−L).(y−L) −a.a.y.y
  = x.x +(1−a.a).y.y −2.L.y +L.L

When a = ±1 this is the simple quadratic y = (x.x+L.L)/2/L, hence a parabola as above for Tan(c) = k. Otherwise, we have

x.x = (a.a−1).y.y +2.L.y −L.L
  = (a.a−1).(y +L/(a.a−1))² −L.L/(a.a −1) −L.L
  = (a.a−1).(y +L/(a.a−1))² −a.a.L.L/(a.a −1)

which we can re-write as

a.a.L.L +(a.a −1).x.x = (L +(a.a−1).y)², or as
1 +(1 −1/a/a).(x/L)² = (1/a +(a−1/a).y/L)²

and I hope it is, in light of the above, clear that a.a > 1 yields a hyperbola while a.a < 1 yields an ellipse. The case a = 0 is a circle, albeit degenerately – it's the circle of radius zero centred on the focus.

Comparing this with 1 +Q.(x/h)² = (Q.Y/k/h)², Q/k/h corresponds to (a.a−1)/a/L and Q/h/h to (1−1/a/a)/L/L, hence a.L corresponds to h/k and a.a−1 corresponds to Q/k/k. Looking at 1+Q/k/k = 1 +Sin(c).Sin(c)/k/k −Cos(c).Cos(c) = Sin(c).Sin(c).(1 +1/k/k) we can identify a (give or take a sign) with Sin(c).√(1 +1/k/k) hence L with h/Sin(c)/√(k.k+1). We can infer a and L from the intersection's geometry in the plane, but not c (or Q), k and h.
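The correspondence can be checked numerically by mapping a point [x, Y] of the cone's intersection to the co-ordinate y of the focus-and-line description, then measuring the ratio of distances directly. The sketch below is Python and assumes Sin(c) and Q are non-zero; the map from Y to y is my own reading of the comparison above, obtained by matching Q.Y/k/h with 1/a +(a−1/a).y/L and taking one choice of sign.

```python
import math

def focus_directrix(c, k, h):
    """Ratio a and focus distance L, from a = Sin(c).√(1 +1/k/k), L = h/Sin(c)/√(k.k+1)."""
    a = math.sin(c) * math.sqrt(1 + 1 / (k * k))
    L = h / (math.sin(c) * math.sqrt(k * k + 1))
    return a, L

def distance_ratio(x, Y, c, k, h):
    """Distance from the focus [0, L] divided by distance from the line y = 0."""
    S, C = math.sin(c), math.cos(c)
    Q = S * S - k * k * C * C
    a, L = focus_directrix(c, k, h)
    # Assumed identification: Q.Y/(k.h) = 1/a +(a −1/a).y/L, solved for y.
    y = L * (Q * Y / (k * h) - 1 / a) / (a - 1 / a)
    return math.hypot(x, y - L) / abs(y)
```

For points satisfying (Q.Y)² = (k.h)² +Q.(k.x)², distance_ratio should return the same a that focus_directrix gives, whether that is below 1 (ellipse) or above it (hyperbola).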

Skew cones

Technically, the cone discussed above is a right circular cone – that is, one which meets the planes at right-angles to its centre-line in circles. More generally, we can define a cone by any curve in a plane and any point not in the plane; the straight lines through the given point that intersect the given curve then form a surface which we can think of as a generalized cone – the classic pyramid is a square cone in this sense, while the tetrahedron is a triangular cone. The given point is called the vertex or apex of the cone; if the normal from it to the plane of the original figure meets the plane at the figure's centroid, we can describe it as a right cone on the given figure; otherwise, it is a skew cone. Let us now consider a circular skew cone – that is, the surface obtained by connecting the points of a circle to a single point neither in the plane of the circle, nor in that plane's normal through the circle's centre.

By applying a shear transformation relative to the figure's plane – that is, for some co-vector q which annihilates all displacements in this plane and some vector v parallel to the plane, the transformation is (: x +v.q(x−p) ←x :) for an arbitrary point p in the plane (it doesn't matter which; changing choice of p only adds a displacement in the plane to x−p; and q annihilates this change) – we can transform any skew cone to a right cone. The intersection of a plane and a skew circular cone can thus be obtained by straightening the cone in this way, intersecting with the sheared plane (which is itself a plane) and then reversing the shear to recover the original plane. The action of the shear on the intersection is a linear map (within the plane) which depends on the relationship between the direction (v, in the characterization just given) of the shear and the normal to the intersecting plane; however, at worst, it combines a shear within the plane and a scaling of one direction without change to the perpendicular direction. When the plane contains a direction perpendicular to the direction of the shear, the above analysis will make this direction the x-axis and it shall be preserved by the shear; the effect of reversing the shear will only be to scale the Y co-ordinate.
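As a concrete illustration of such a shear, here is a minimal Python sketch. It makes assumptions of my own: the circle lies in the plane z = 0 and is centred on the origin, the co-vector q is represented by its components [0, 0, 1], and the apex has a non-zero z co-ordinate. The function names are likewise mine.

```python
def shear(point, v, q, p):
    """Apply x -> x +v.q(x −p), with the co-vector q given by its components."""
    s = sum(qi * (xi - pi) for qi, xi, pi in zip(q, point, p))
    return tuple(xi + vi * s for xi, vi in zip(point, v))

def righting_shear(apex):
    """Return (v, q, p) for a shear moving apex onto the z-axis.

    Assumes the circle is in the plane z = 0, centred on the origin,
    and that the apex is not in that plane.
    """
    ax, ay, az = apex
    v = (-ax / az, -ay / az, 0.0)   # parallel to the plane z = 0
    q = (0.0, 0.0, 1.0)             # annihilates displacements within the plane
    p = (0.0, 0.0, 0.0)             # any point of the plane would do
    return v, q, p
```

Applying the shear from righting_shear to the apex itself should land it on the z-axis at the same height, while every point with z = 0 (so, in particular, the circle) stays where it is.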

Viewing a tilted disk

When an astronomer looks through a telescope at a distant spiral galaxy, the galaxy is reasonably well characterized as a flat circular disk; the shape it makes on our sky (and hence in the astronomer's photograph) is the result of intersecting a skew cone – with the galaxy's perimeter as figure and the astronomer as apex – with the telescope's focal plane, which is (unlike the plane of the figure itself) perpendicular to the line from the cone's apex to the figure's centroid; this ensures that it contains a direction perpendicular to the direction of the shear needed to right the cone. From the shape of this intersection, we can hope to infer the angle between the galactic plane and our line of sight to the galaxy.

The intersection of the righted cone and sheared focal plane will be an ellipse and unshearing shall merely stretch it in the Y direction; the result is still an ellipse. We can measure its major and minor axes, whose halves correspond to k.h/(−Q) in the Y direction and h/√(−Q) in the x direction (Q being negative for an ellipse), although the former shall have been stretched (or squashed) by unshearing.

Written by Eddy.