Poisson Brackets

This page uses an antique notation and is in need of maintenance.

Dirac, in The Principles of Quantum Mechanics, begins the fourth chapter, The Quantum Conditions, with a section on an operator called the Poisson Bracket. This derives from the Hamiltonian/Lagrangian formalism for describing a dynamical system in terms of two families of parameters, typically position and momentum, with a one-to-one pairing of each member of one family with a member of the other. The formalism describes the system in terms of a function, H (the Hamiltonian), which depends on these parameters and tells us the dynamics of the system: H's derivative with respect to a parameter tells us the rate of variation of that parameter's pair, give or take a sign.

Hamilton's packaging of Lagrange's formalism lets us treat each family of parameters as a vector quantity and gives us a duality between the two kinds of vector quantity. For details, let the space of vectors describing one of the families be Q, that describing the other P, with H(q, p) a member of some linear space S (archetypically {scalars}, probably ordered). Hamilton gives us (if we constructed H right) that the rate of change of q will be q' = D1(H,q,p), the rate at which H varies under small perturbations in p; and that the rate of change of p will be p' = −D0(H,q,p), the negation of H's derivative with respect to q. This gives us P = {linear (Q|:S)} and Q = {linear (P|:S)}, give or take some natural isomorphisms, whence we have a natural contraction between Q and P, combining q in Q with p in P to give q·p in S.
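As a concrete instance of q' = D1(H,q,p) and p' = −D0(H,q,p), here is a minimal sketch for a one-dimensional harmonic oscillator, with Q, P and S all taken to be the real numbers. The choice of Hamiltonian, the constants m and k, the step size h and the helper name flow are all assumptions made for illustration; the derivatives are only estimated, by central differences.

```python
def H(q, p, m=1.0, k=1.0):
    """Hypothetical example Hamiltonian: p**2/(2m) + k*q**2/2."""
    return p * p / (2.0 * m) + k * q * q / 2.0

def D0(f, q, p, h=1e-6):
    """Central-difference estimate of the derivative of f with respect to q."""
    return (f(q + h, p) - f(q - h, p)) / (2.0 * h)

def D1(f, q, p, h=1e-6):
    """Central-difference estimate of the derivative of f with respect to p."""
    return (f(q, p + h) - f(q, p - h)) / (2.0 * h)

def flow(f, q, p):
    """Hamilton's equations: (q', p') = (D1(f), -D0(f)) at the point (q, p)."""
    return D1(f, q, p), -D0(f, q, p)

q_dot, p_dot = flow(H, 0.3, 0.5)
print(q_dot, p_dot)  # close to p/m = 0.5 and -k*q = -0.3
```

For this H the two rates are just p/m and −k·q, which is the familiar pair of equations for a mass on a spring.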

The Poisson bracket involves multiplying things together in two orders, u×v and v×u, and performing subtraction on the result. The relevant aspects of the multiplication involved are that:

it's bilinear

for any linear a which accepts u, a(u×v) = (a(u))×v, and any z for which v(z) shows v acting as a linear map, (u×v)(z) = u×(v(z)). In particular, for any scalar a, a.(u×v) = (a.u)×v and (u×v).a = u×(v.a)
whenever x+u makes sense, (x+u)×v = x×v +u×v
whenever v+y makes sense, u×(v+y) = u×v +u×y

indeed, the tensor product operation, which I'll duly write as ×, is defined so as to make it the idealization of these truths. For given linear spaces U, V, let U⊗V be the span of {u×v: u in U, v in V} and notice that it in principle remembers the order of the factors: but it allows some mixing up of the information content along the way (we can rearrange sums in various ways using bases of our choice, for instance). That gives rise to permutation operators which express natural isomorphisms such as that between U⊗V and V⊗U induced by u×v → v×u. These in turn are cousins to some isomorphisms which are so natural that we normally ignore them, such as that between (U⊗V)⊗W and U⊗(V⊗W): these don't change the order of the entries in the sequence U, V, W.

Other bilinear operators can be expressed in terms of the tensor product by use of tracing operators or contraction: I use a unified notation for trace/permute operators (allowing some mixing which we'll need below). For example, trace acts on {linear (V|:V)} = V⊗dual(V) as τ([!,!],[V,dual(V)]) characterized by ({linear (V|:V)}| u×w → w(u) :{scalars}) with w in dual(V), u in V hence w(u) a scalar (and the ! could as readily have been any other repeated symbol, so long as it wasn't a natural number). For the permutation u×v×w → v×w×u, with u in U, v in V and w in W, I'll write τ([0,2,1],[U,V,W]), and note that I number positions in lists […, 1, 0], so τ([2,1,0]) would have given the appropriate identity.
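The position convention is easy to get backwards, so here is a small sketch of the permuting part of τ acting on a tuple of factors. It assumes one reading of the notation, namely that each entry of the list gives the destination position of the corresponding factor, counted from the right; that reading reproduces both examples in the text, but the function name tau and its signature are illustrative only, and the tracing case (repeated symbols such as !) is not handled.

```python
def tau(destinations, factors):
    """Permute the factors of a tensor product u×v×…

    destinations[i] is the position to which factors[i] moves, with positions
    counted from the right: the last slot is 0, the one before it 1, and so on.
    """
    n = len(factors)
    if sorted(destinations) != list(range(n)):
        raise ValueError("destinations must be a permutation of 0..n-1")
    result = [None] * n
    for factor, position in zip(factors, destinations):
        result[n - 1 - position] = factor  # position 0 is the rightmost slot
    return tuple(result)

print(tau([0, 2, 1], ("u", "v", "w")))  # ('v', 'w', 'u')
print(tau([2, 1, 0], ("u", "v", "w")))  # ('u', 'v', 'w'), the identity
```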

it respects the order of factors

So while a general bilinear operator, such as τ builds, can involve a permutation, the multiplication we want to be using is one which doesn't alter the order of the pieces multiplied. When we talk about u×v −v×u as members of one space, we are using a natural isomorphism between U⊗V and V⊗U, but we're not using u×v → v×u, we're using a non-permuting natural isomorphism, which means we can express U and V each as the tensor product of a list of spaces for which the combined list is the same whether we take U's or V's first. In fact we may be doing some contraction at the same time, as part of a more general multiplication, so long as the uncontracted bits aren't permuted relative to one another.

As an example, suppose we have linear spaces G, T dual to one another and some other linear space S. With [T,S,S,S,G] as U's list and [T,S,G] as V's, we could use a multiplication which contracts out […, G]×[T, …] to […, …] making both u×v and v×u fall in the tensor product space of [T,S,S,S,S,G] and allowing us to do subtraction. (Note: I think this example is the general case, subject to the tensor product folding any instance of {scalars}, in its list of spaces, out of sight.) The differences needn't be zero, but are defined.
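The bookkeeping in this example can be checked mechanically. The sketch below tracks only the list of spaces each product lives in, not the tensors themselves; the function name multiply and the DUAL table are hypothetical helpers introduced for this illustration.

```python
DUAL = {"G": "T", "T": "G"}  # G and T are dual to one another

def multiply(left, right):
    """List of spaces for left×right, contracting the last space of left
    with the first space of right when those two are dual to one another."""
    if left and right and DUAL.get(left[-1]) == right[0]:
        return left[:-1] + right[1:]
    return left + right

U = ["T", "S", "S", "S", "G"]
V = ["T", "S", "G"]
print(multiply(U, V))  # ['T', 'S', 'S', 'S', 'S', 'G']
print(multiply(V, U))  # ['T', 'S', 'S', 'S', 'S', 'G']
```

Both orders land in the same list of spaces, which is what makes u×v −v×u meaningful here, even though the two products themselves need not be equal.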

Finally:

its relationship with differential operators is the product rule

that is D(u×v) = D(u)×v +τ([1,2,0], u×D(v)) in which the τ is just there to make a U⊗{gradients}⊗V into a {gradients}⊗U⊗V, like the other two terms present: it's a necessary shuffling to allow us the addition. Crucially, given that we'll be involving differential operators, all the τ operators are mapped to zero by any Leibniz operator, including any differential operator; that is, the τ operators are constant. This lets us do the entire discussion as if the only multiplication of interest were the tensor product, allowing the presence of some trace operator to finish off the job.

Given functions u = (Q| q→ (P| p→ u(q,p) :U) :{(P|:U)}) and v likewise with v(q,p) in V, we can likewise construct u×v = (Q| q→ (P| p→ u(q,p)×v(q,p) :U⊗V) :). Now, there are situations in which U⊗V and V⊗U are the same, e.g. when either is {scalars}, when they are equal or, indeed, when V is U⊗U. The crucial property is that we can discuss u×v −v×u.

Now, with (Q| u :{(P|:U)}) we have, as for (Q| H :{(P|:S)}), D0(u) defined by: D0(u,q,p) is a linear map from Q to U which maps a small perturbation in q to the small perturbation it would produce in u(q,p). Similarly, D1(u,q,p) tells us how u depends on p. Consider D0(u,q,p), which is linear (Q|:U), and D1(v,q,p), which is linear (P|:V). The latter is linear ({linear (Q|:S)}| :V), so lies in V⊗dual(S⊗dual(Q)) or, effectively, V⊗dual(S)⊗Q, while the former is in U⊗dual(Q). We can thus use the action of Q on its dual (effectively the contraction between P and Q) to define a contracting multiplication of D0(u,q,p) and D1(v,q,p), yielding an answer in dual(S)⊗U⊗V, give or take the position of the dual(S) relative to U and V, but without affecting the relative order of U and V.

This lets us define an operator taking u with v to D0(u)*D1(v) where * is an operator which contracts out the dual(Q) of D0 with the dual(P) of D1 and yields its answer as a dual(S)⊗U⊗V. In principle, it can fold in some tracing as part of our multiplication of u with v. Swapping v with u and using our natural equivalence of U⊗V and V⊗U, we can examine the difference, D0(u)*D1(v) −D0(v)*D1(u). The result of combining u and v in this way is known as their Poisson Bracket, written [u,v], an orthodoxy I'll keep despite potential confusion with my [item, item, …] notation for lists, which I'll be using for permutations.

It is immediate from this that [u,v] = −[v,u]. If either u or v is constant, [u,v] will of course be 0. It is also easy enough to show that [u1+u0,v] = [u1,v]+[u0,v] and [u,v1+v0] = [u,v1]+[u,v0].
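For the simplest case, where U, V and S are all {scalars} and Q, P are one-dimensional, these properties can be checked exactly on polynomials. The sketch below is an illustration under those assumptions only: it represents a polynomial in (q, p) as a dictionary from exponent pairs to integer coefficients, and the helper names (clean, add, mul, bracket) are introduced here rather than taken from the text.

```python
from collections import defaultdict

# A polynomial in (q, p) is a dict {(i, j): c}, meaning the sum of c * q**i * p**j.

def clean(poly):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {key: c for key, c in poly.items() if c != 0}

def add(a, b, sign=1):
    """Return a + sign * b."""
    out = defaultdict(int)
    for key, c in a.items():
        out[key] += c
    for key, c in b.items():
        out[key] += sign * c
    return clean(out)

def mul(a, b):
    """Ordinary (commuting) product of two polynomials."""
    out = defaultdict(int)
    for (i, j), c in a.items():
        for (k, l), d in b.items():
            out[(i + k, j + l)] += c * d
    return clean(out)

def D0(a):
    """Derivative with respect to q."""
    return clean({(i - 1, j): i * c for (i, j), c in a.items() if i > 0})

def D1(a):
    """Derivative with respect to p."""
    return clean({(i, j - 1): j * c for (i, j), c in a.items() if j > 0})

def bracket(u, v):
    """[u, v] = D0(u)*D1(v) - D0(v)*D1(u), for scalar-valued u and v."""
    return add(mul(D0(u), D1(v)), mul(D0(v), D1(u)), sign=-1)

q, p, one = {(1, 0): 1}, {(0, 1): 1}, {(0, 0): 1}
u = {(2, 1): 1, (0, 1): 3}    # q**2 * p + 3*p
v = {(1, 2): 1, (1, 0): -1}   # q * p**2 - q

print(bracket(q, p) == one)                                        # [q, p] = 1
print(bracket(u, v) == add({}, bracket(v, u), sign=-1))            # antisymmetry
print(bracket(one, u) == {})                                       # constants give 0
print(bracket(add(u, q), v) == add(bracket(u, v), bracket(q, v)))  # additivity
```

Because the arithmetic is done on integers, the comparisons are exact rather than approximate.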

Now consider [u×v, w]. For this, use D(u×v) = D(u)×v + τ([1,2,0], [U,dual(Q),V], u×D(v)), with D either D0 or D1. It's for compatibility with this that my definition of the Poisson bracket has an implicit permutation which makes [u,v] a dual(S)⊗U⊗V. Fortunately, the contraction(s) in * will get rid of these permutations. We get

[u×v, w]
  = D0(u×v)*D1(w) −D0(w)*D1(u×v)
  = (D0(u)×v)*D1(w) +τ([1,2,0], u×D0(v))*D1(w) −D0(w)*D1(u)×v −D0(w)*τ([1,2,0], u×D1(v))

and the permutations pulling the D-aspect leftwards are, in each case, eaten by the permutation in the *, so that the resulting dual(S) aspect is always the leftmost item: presuming only that this shuffling is done uniformly throughout, we can write

  = (D0(u)×v)*D1(w) +u×D0(v)*D1(w) −D0(w)*D1(u)×v −D0(w)*(u×D1(v))
  = [u,w]×v +u×[v,w]
    +(D0(u)×v)*D1(w) −D0(u)*D1(w)×v +u×D0(w)*D1(v) −D0(w)*(u×D1(v))

[u, v×w]
  = D0(u)*D1(v×w) −D0(v×w)*D1(u)
  = D0(u)*D1(v)×w +D0(u)*(v×D1(w)) −(D0(v)×w)*D1(u) −v×D0(w)*D1(u)
  = [u,v]×w +v×[u,w]
    +D0(u)*(v×D1(w)) −v×D0(u)*D1(w) +D0(v)*D1(u)×w −(D0(v)×w)*D1(u)

These we can apply in either order to examine
[u0×u1, v0×v1]
  = (D0(u0)×u1)*D1(v0×v1) +u0×D0(u1)*D1(v0×v1)
    −D0(v0×v1)*D1(u0)×u1 −D0(v0×v1)*(u0×D1(u1))
  = (D0(u0)×u1)*(D1(v0)×v1 +v0×D1(v1)) +u0×D0(u1)*(D1(v0)×v1 +v0×D1(v1))
    −(D0(v0)×v1 +v0×D0(v1))*D1(u0)×u1 −(D0(v0)×v1 +v0×D0(v1))*(u0×D1(u1))
  = (D0(u0)×u1)*D1(v0)×v1 +u0×D0(u1)*(v0×D1(v1))
    +u0×D0(u1)*D1(v0)×v1 +(D0(u0)×u1)*(v0×D1(v1))
    −(D0(v0)×v1)*D1(u0)×u1 −v0×D0(v1)*(u0×D1(u1))
    −(D0(v0)×v1)*(u0×D1(u1)) −v0×D0(v1)*D1(u0)×u1

[u0×u1, v0×v1]
  = D0(u0×u1)*D1(v0)×v1 +D0(u0×u1)*(v0×D1(v1))
    −(D0(v0)×v1)*D1(u0×u1) −v0×D0(v1)*D1(u0×u1)
  = (D0(u0)×u1)*D1(v0)×v1 +u0×D0(u1)*(v0×D1(v1))
    +u0×D0(u1)*D1(v0)×v1 +(D0(u0)×u1)*(v0×D1(v1))
    −(D0(v0)×v1)*D1(u0)×u1 −v0×D0(v1)*(u0×D1(u1))
    −(D0(v0)×v1)*(u0×D1(u1)) −v0×D0(v1)*D1(u0)×u1
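When U, V, W and S are all {scalars}, × and * both reduce to ordinary commuting multiplication, so the leftover terms in the expansions of [u×v, w] and [u, v×w] cancel in pairs; for example, (D0(u)×v)*D1(w) −D0(u)*D1(w)×v = 0. What remains is the familiar Leibniz rule. The block below states that special case in conventional notation; the braces for the bracket and the coordinates q^i, p_i are assumptions of this sketch, not notation used elsewhere on this page.

```latex
\{u, v\} = \sum_i \left(
    \frac{\partial u}{\partial q^i}\,\frac{\partial v}{\partial p_i}
  - \frac{\partial v}{\partial q^i}\,\frac{\partial u}{\partial p_i}
\right),
\qquad
\{u v, w\} = \{u, w\}\, v + u\, \{v, w\},
\qquad
\{u, v w\} = \{u, v\}\, w + v\, \{u, w\}.
```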
[u, [v,w]] +[v, [w,u]] +[w, [u,v]]
  = D0(u)*D1(D0(v)*D1(w) −D0(w)*D1(v)) −D0(D0(v)*D1(w) −D0(w)*D1(v))*D1(u)
    +D0(v)*D1(D0(w)*D1(u) −D0(u)*D1(w)) −D0(D0(w)*D1(u) −D0(u)*D1(w))*D1(v)
    +D0(w)*D1(D0(u)*D1(v) −D0(v)*D1(u)) −D0(D0(u)*D1(v) −D0(v)*D1(u))*D1(w)

and the machinery of * will look like × to D, so we get the product rule as D1(D0(v)*D1(w)) = D1(D0(v))*D1(w) +D0(v)*D1(D1(w)), but we need to take some care about which D1 each * relates to. So I'll use E as an alias for D: then D0 and D1 will be contracted by one * (the one inside E(…)) while E0 and E1 are contracted by the other (the outer *). Expanding the result out yields lots of terms (in this order if the expansion is done straightforwardly):

  = E0(u)*E1D0(v)*D1(w) +E0(u)*D0(v)*E1D1(w) −E0(u)*E1D0(w)*D1(v) −E0(u)*D0(w)*E1D1(v)
    +E0D0(w)*D1(v)*E1(u) +D0(w)*E0D1(v)*E1(u) −E0D0(v)*D1(w)*E1(u) −D0(v)*E0D1(w)*E1(u)
    +E0(v)*E1D0(w)*D1(u) +E0(v)*D0(w)*E1D1(u) −E0(v)*E1D0(u)*D1(w) −E0(v)*D0(u)*E1D1(w)
    +E0D0(u)*D1(w)*E1(v) +D0(u)*E0D1(w)*E1(v) −E0D0(w)*D1(u)*E1(v) −D0(w)*E0D1(u)*E1(v)
    +E0(w)*E1D0(u)*D1(v) +E0(w)*D0(u)*E1D1(v) −E0(w)*E1D0(v)*D1(u) −E0(w)*D0(v)*E1D1(u)
    +E0D0(v)*D1(u)*E1(w) +D0(v)*E0D1(u)*E1(w) −E0D0(u)*D1(v)*E1(w) −D0(u)*E0D1(v)*E1(w)

Which I can now re-order into four sets of three +ve/−ve pairs as follows:

  = E0(w)*E1D0(u)*D1(v) −D0(w)*E0D1(u)*E1(v)
    +E0(u)*E1D0(v)*D1(w) −D0(u)*E0D1(v)*E1(w)
    +E0(v)*E1D0(w)*D1(u) −D0(v)*E0D1(w)*E1(u)
    +D0(v)*E0D1(u)*E1(w) −E0(v)*E1D0(u)*D1(w)
    +D0(w)*E0D1(v)*E1(u) −E0(w)*E1D0(v)*D1(u)
    +D0(u)*E0D1(w)*E1(v) −E0(u)*E1D0(w)*D1(v)
    +E0(v)*D0(w)*E1D1(u) −E0(w)*D0(v)*E1D1(u)
    +E0(w)*D0(u)*E1D1(v) −E0(u)*D0(w)*E1D1(v)
    +E0(u)*D0(v)*E1D1(w) −E0(v)*D0(u)*E1D1(w)
    +E0D0(u)*D1(w)*E1(v) −E0D0(u)*D1(v)*E1(w)
    +E0D0(v)*D1(u)*E1(w) −E0D0(v)*D1(w)*E1(u)
    +E0D0(w)*D1(v)*E1(u) −E0D0(w)*D1(u)*E1(v)

Within each term we can independently choose which pair of differential operators to alias as E and which as D, so long as each * continues contracting the correct pair (the aliasing being nothing more than that): I chose to swap the names in the −ve terms, because I've looked ahead and seen what I'm about to write. This will give us D0E1 and similar DE operators, but the notional independence of our parameters means that D0E1 = E1D0, except for the order of the dual(S) aspects of the resulting tensor quantities (and when S is 1-dimensional, this order is genuinely irrelevant): the second derivative is symmetric (which might not quite be true on a smooth manifold, but leave that aside for now), which lets me swap DE terms back into ED order. The first two sets of three +/− pairs then cancel exactly (the −ve member having been rearranged into a copy of the +ve one; that's the looking ahead I mentioned) and we are left with

  = E0(v)*D0(w)*E1D1(u) −D0(w)*E0(v)*E1D1(u)
    +E0(w)*D0(u)*E1D1(v) −D0(u)*E0(w)*E1D1(v)
    +E0(u)*D0(v)*E1D1(w) −D0(v)*E0(u)*E1D1(w)
    +E0D0(u)*D1(w)*E1(v) −E0D0(u)*E1(v)*D1(w)
    +E0D0(v)*D1(u)*E1(w) −E0D0(v)*E1(w)*D1(u)
    +E0D0(w)*D1(v)*E1(u) −E0D0(w)*E1(u)*D1(v)

or, with a suitable reading of the action of the *s involved,

  = (E0(v)*D0(w) −D0(w)*E0(v))*E1D1(u)
    +(E0(w)*D0(u) −D0(u)*E0(w))*E1D1(v)
    +(E0(u)*D0(v) −D0(v)*E0(u))*E1D1(w)
    +E0D0(u)*(D1(w)*E1(v) −E1(v)*D1(w))
    +E0D0(v)*(D1(u)*E1(w) −E1(w)*D1(u))
    +E0D0(w)*(D1(v)*E1(u) −E1(u)*D1(v))

in which each +/− pair will vanish provided certain multiplications commute. Each term is a tensor of form dual(S)⊗dual(S)⊗X where X is the linear space of which u×v×w is a member regardless of the order in which u, v and w appear. Each pair is a difference of such terms with one end of X (seen as U⊗V⊗W in some order) held still and the other two aspects swapped: if the terms are symmetric under such swaps, the sum will be zero. This will happen if either of the aspects being swapped is 1-dimensional and under certain other circumstances (e.g. when * involves suitable contraction between U, V and W). Certainly when enough of U, V and W are {scalars} the two terms of each pair do give the same answer, and the same would hold if u, v and w were suitable differential operators on some smooth manifold (rather than tensor fields). So for the kinds of u, v and w entertained by Hamilton, we get the answer 0.
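To see one of these cancellations in conventional index notation, take the pair (E0(v)*D0(w) −D0(w)*E0(v))*E1D1(u) with scalar-valued u, v and w. The sketch below assumes coordinates q^i on Q and p_i on P with summation over repeated indices, none of which appears in the notation used on this page; the index i stands for the pair contracted as E0 with E1 and j for the pair contracted as D0 with D1.

```latex
\frac{\partial v}{\partial q^i}\,\frac{\partial w}{\partial q^j}\,
\frac{\partial^2 u}{\partial p_i\,\partial p_j}
\;-\;
\frac{\partial w}{\partial q^j}\,\frac{\partial v}{\partial q^i}\,
\frac{\partial^2 u}{\partial p_i\,\partial p_j}
\;=\; 0
```

The two products differ only in the order of the first two factors, so they agree whenever those factors commute, as scalars do. For values that do not commute, the difference need not vanish.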

Time to go back and explain why I used × between the things the Poisson bracket combined, and how to use tracing, effectively delegated to *, to infer the more interesting cases from the × cases.

Written by Eddy.