Associative binary Operators

This page is one of several that treat this topic, with much overlap; I'm working to collapse them all down to few.

We say that a uniform binary operator (A×A|g:A) left-associates over a binary operator (A×B|f:B) precisely if, for every a, c in A and b in B: f(g(a,c),b)=f(a,f(c,b)). We say that (A×A|g:A) right-associates over (B×A|f:A) precisely if, for every a, c in A and b in B: f(b,g(a,c))=f(f(b,a),c). A binary operator which associates over itself from either side (is uniform and) also associates over itself from the other side: it is called, simply, associative.

The importance of associativity lies in the power it gives us to re-arrange expressions: it introduces a form of transitivity to binary operators. For example, if g left-associates over f, as above, then any equation of form f(a,b)=b, for specified a and b, at last implies that a is something like an identify: consider any c in A, with f(c,b)= f(c,f(a,b)) = f(g(c,a),b). Thus, if f is right-cancellable, we obtain c= g(c,a) so a is a right-identity for g: in any case, we find that f1(b) = (A| c-> f(c,b) :B) can never distinguish any c from g(c,a). We need associativity to perform the central rearrangement: this then allows us to put expressions into forms which permit us to apply properties of the binary operators such as cancellation and completeness.

For any associative g, an equation of form g(a,b)=b implies that every c has g(g(c,a),b)=g(c,b) and g(a,g(b,c))=g(b,c). The former we had before; it tells us that a is a right-identity for g whenever g is right-cancellable. The latter tells us that g0(a) acts as the identity on ({b}×A|g|), which is (g1(b)|): thus, if g is right-complete (so g0(b) is epic) we have a as a left-identity. Analogously, any equation of form g(a,b)=a tells us that b is a right-identity for g if g is left-complete and a left-identity for it if it is left-cancellable. Thus associativity allows us, knowing only a's relationship to b, to determine (either b or) a's relationship with many (potentially all) other values.

Further Reading

There are natural parallels with the definition of distributivity.

Contents

While the main thread of building up interesting structures out of associative binary operators passes from this page to those under `Further Reading', above, I devote the rest of this page to various matters relating to associativity. These generally appear as details when seen from the primary discourse, though they have their importance.

Inverses

Note: the following could be split left- and right- by considering, for example, a left identity, e, of f lying in C (as well as A): a right-inverse of a in A is then some b for which f(a,b)=e.

For a binary operator (A×B|f:C), having an identity, e: e a member of the intersection of A with B, therefore also of their union, which is contained in C; we refer to (|f:{e}) as the collection of mutually inverse pairs. Each such is of form (a,b) with a in A, b in B and f(a,b)=e. In such a case, we describe a as a left-inverse for b; and b as a right-inverse for a. Notice that if f is associative and a member, h, of the intersection of A and B has a left-inverse, l, and a right-inverse, r, we have l = f(l,f(h,r)) = f(f(l,h),r) = r. Thus left- and right-inverses for associative binary operators necessarily coincide. When we have both f(a,b) and f(b,a) as f's identity, we describe each of a and b as the other's inverse, without left/right qualification.

Consider an associative binary operator (A×B|f:C) which has an identity (so A and B are contained in C). If every member of B has a left-inverse (in A), then f is necessarily left-cancellable: for, writing a as the left-inverse of b, f(b,c)=f(b,d) &implies; f(a,f(b,c))=f(a,f(b,d)) &implies; c=f(f(a,b),c)=f(f(a,b),d)=d. Likewise, if each member of A has a right-inverse, then f is necessarily right-cancellable.

Groups

A group is a complete, cancellable associative binary operator.

These conditions imply an identity for f in A, unless A is empty. Proof: given any element a of A (possible iff A is non-empty) completeness lets us solve f(a,e)=a. Then, for any c in A: f(a,c)=f(f(a,e),c)=f(a,f(e,c)) in which cancellation of a yields c=f(e,c). This being so for arbitrary c, we have that any a's solution, e, to f(a,e)=a is a left identity; it thus also solves f(e,a)=a whence arbitrary c satisfies f(c,a)=f(c,f(e,a))=f(f(c,e),a), which cancels to yield c=f(c,e). Thus e is also a right identity and, therefore, an identity.

Using this identity, e, apply completeness to f(a,c)=e with a given, to obtain a right inverse, c, for a; solve f(d,a)=e for a left inverse. As before, c=f(e,c)=f(f(d,a),c)=f(d,f(a,c))=f(d,e)=d proves that c and d are equal. This shows that any a has a unique inverse. Thus a non-empty group has both identity and inverses.

Given that an identity and inverses together imply both cancellability and completeness, one may equivalently define a group to be an associative binary operator with identity and inverses - differing from the above precisely in relation to the empty group. I have chosen to do otherwise mostly to illustrate how just one property, expressed in two chiral (left- and right-) forms and in two `dual' forms, cancellation and completeness, can produce (almost) the same result. I find the lack of conceptual orthogonality between identity and inverse inelegant and I find the generalisation of completeness and cancellability (to non-uniform binary operators) flows rather more naturally than do those of identity and inverse. In particular, cancellable complete binary operators look like reasonably interesting constructs even when they aren't uniform (let alone associative).

The interested reader might attempt the experiment of drawing the category-theoretic diagrams expressing the two definitions. Those for cancellation and completeness are comparatively trivial and can be written independently: they state that f0 and f1 yield monic and epic answers. I know the ones needed to express the standard definition are ghastly: but the part I remember as such was the expression of associativity, so the parts for identity and inverse might also be tidy.

Bulk Action

One natural operation which may be induced from an associative binary operator (A×A|f:A) is its bulk action on functions from non-empty natural numbers to A, denoted ({(n|:A): n non-empty, natural}| bulk[f] :A) and defined inductively as:

When the binary operator being thus extended has a left-identity, the first of these definitions can be replaced by one stating that the bulk action applied to a function from the empty set to A yields this left-identity: the given first rule is then deduced by a single application of the second. Whenever we can construct a suitable definition of the bulk action of f on a function from a limit ordinal to A, in terms of the bulk action of f on the restrictions of this to members of the limit ordinal, we can extend the bulk action to act on arbitrary functions to A from well-ordered sets. This last is not widely used ;^>

If we have a topology on A, with associative (A×A|f:A) continuous, we can declare a function from a limit ordinal to A, (N|x:A), to be bulkable if there is some function, h, from N to the compact non-empty subsets of A for which: whenever n < m in N, h(m) is a subset of h(n) and bulk[f]((m:x:)) is in h(n); and the intersection of (N|h|) is a singleton set, the sole member of which is then defined to be the bulk action of f on x: {f(x)} = intersection(N|h|). [This depends on intersection being definable, in general, on arbitrary functions yielding sets: and, in fact, it gives us arbitrary intersection as the bulk action of the binary operator that intersection defines on sets (with all functions yielding sets being intersection-bulkable). The topology this involves on the collection of sets is interesting: for any given set, A, the set of its subsets, P(A), is compact in the topology; so our topology on the collection of sets has compact(Set)= {P(S): S in Set}. I think this produces open(Set)= {P(S)\{S,{}}: S in Set}, or something similar.]

Applying the definition to an Abelian binary operator, we find that permutation of the ordered set has no effect on the bulk action, so that we can ignore the requirement of ordering and apply the bulk action of f to any function from a finite (non-empty, if we lack an identity) set to A.

In general, I shall describe a function as bulkable by some binary operator precisely if the above defines a bulk action for it. When we have an f which we regard as addition, `bulkable' becomes `summable'; addition's bulk action is called summation and converts a function from a finite set into the sum of that function; bulk[+]=sum. Likewise, the bulk action of a multiplicative operator is called its product, bulk[·]=product - and we shall meet situations (in linear algebra, at least) where multiplication is not Abelian: multipliable (multiplicable ? productable ? produceable ?) functions have then to be ordered, and products depend on order.

Notice that I have defined bulk actions on functions, not sets. We may be used to saying we sum a `set' of values: however, the `sets' we sum can include repeated elements - which sets don't do. In fact, what we sum is the image of some (indexing) set under a function: if the function takes some value several times, it appears repeatedly in the sum. For example, when we sum 2, 3, 2, 4, 2, and 5, we're really summing a function from a set with six distinct elements, three of which map to 2, the others mapping to 3, 4 and 5 variously. When we do actually want to sum a set, what we're doing can be understood as summing the identity function on that set: this fits well with my general habit of downplaying the distinction between a set and its identity.

The bulk action of an associative binary operator induces an algebra into which one can embed the binary operator's domain and see a natural identity for the operator. If the binary operator is also Abelian and cancellable, a related construction can be used to extend the binary operator to form a group.

Bootstrapping an associative operator

Because associativity allows us to rearrange some expressions in three (or more) constituents, it enables us to prove that various constructions (such as inverses, groups and bulk actions) make sense. It also allows us, as I shall describe in following sections, to extract more powerful binary operators from quite simple ones.

Induced from an action

In some situations one may start with an action, (A×B:f:f), and attempt to construct g from f: if f(c,f(a,b)) is guaranteed to be f(k,b) for some k in A (which will need to be unique, at least for most b) we can try using f(c,f(a,b)) = f(g(c,a),b) to define the function g. We're not always guaranteed that this'll work, or that g will turn out to have nice properties if it does ! (Vicious counter-case: let A be the space of antiautomorphisms of complex vector space B, with f being A's action as a function on B.)

An action f, of A on B, aka (A×B|f:B), supplies us with an embedding, f0, of A in {(B|:B)}. The latter supports an associative binary operator called composition, ({(B|:B)}×{(B|:B)}| (f,g)->f&on;g :{(B|:B)}). Under composition, each of {epic (B|:B)}, {monic (B|:B)} and their intersection, {iso (B|:B)}, is closed under composition: f0's images of A when f is right-complete, left-cancellable and both, resectively, lie in these sub-domains of {(B|:B)}. If (A|f0|) is closed under composition, we can use it to obtain a uniform (A×A|g:A) left-associating over f.

With (A|f0|) closed under composition in {(B|:B)}, let C=(A|f0|): on C, we have an action (C×C|compose:C) which left-associates over the action of C, as a subdomain of {(B|:B)}, on B, namely (C×B| (c,b)->c(b): B). But C is (A|f0|), so the illustration's c is f0(a) for some a in A and c(b) is just f(a,b). Closure of C under composition means that there is always some h in A, for any given a,b in A, for which f0(h)= f0(a)&on;f0(b). Taking solutions, h, to this equation for each pair a,b in A gives us (A×A|(a,b)->h:A) left-associating over f.

A might not be well-enough defined for us to construct its self-product, A×A, for all our ability to describe A. None the less, if f0 is monic it does imply such a product on A, even if we can't construct it (I don't remember ever proving this yet). In this, or any other case in which we cannot construct the above function (A×A|:A), we still have composition restricted to C=(A|f0|) as a binary operator on A.

Completed via cancellation

For any binary operator (A×B|f:C), we can define a relation on A×B by: (a,b) ~ (e,d) precisely when f(a,d)=f(e,b): this can be read as saying (using the language of addition for f) that the f-difference between a and b is equal to that between e and d: if the relation proves to be an equivalence relation, we can think of the equivalence classes as the collection of possible differences. The relation is transparently symmetric (that is: (a,b) ~ (e,d) &implies; (e,d) ~ (a,b)) and reflexive (every (a,b) ~ itself), so it will be an equivalence relation whenever it is transitive (that is, whenever (a,b) ~ (e,d) ~ (i,h), it implies that (a,b) ~ (i,h): one can always cut out the middle-part of a chain). The equivalence classes here are just the construction needed to render a binary operator complete.

Suppose f now to be cancellable, associative and Abelian. Then (a,b) ~ (e,d) ~ (i,h) tells us f(a,d)=f(e,b) and f(e,h)=f(i,d), from which we obtain f(f(a,d), f(e,h)) = f(f(e,b), f(i,d)), which associativity and Abel let us turn into f(f(e,d), f(a,h)) = f(f(e,d), f(i,b)), which cancellation turns into f(a,h) = f(i,b) whence (a,b) ~ (i,h): so our relation is transitive. Thus our relation, ~, is an equivalence relation whenever f is cancellable, associative and Abelian: when else ? Note that function action, ({(B|:C)}×B| (u,b)->u(b): C) is not, in general, completeable - though occasionally its restriction to some sub-domain of {(B|:C)} may be, especially if B and C are simple enough and/or equal.

For such an (A×A|f:A), we have a natural binary operator on A×A, namely ((A×A)×(A×A)| ((a,b),(e,d)) -> (f(a,e),f(b,d)): A×A), which respects this equivalence relation: that is, whenever (a,b)~(i,h), we obtain (f(a,e),f(b,d))~(f(i,e),f(h,d)). [Proof: f(a,h)=f(i,b), so (using associativity and Abel before and after substituting) f(f(a,e),f(h,d)) = f(f(a,h),f(e,d)) = f(f(i,b),f(e,d)) = f(f(i,e),f(b,d)). Likewise, whenever (e,d)~(i,h) we obtain (f(a,e),f(b,d))~f(f(a,i),f(h,d)).] A binary operator on A×A which respects the equivalence relation then naturally turns into a binary operator on the collection of equivalence classes. This has an identity (the equivalence class of (a,a) for any a in A - all such self-pairs being trivially equivalent) and inverses (because (a,b)'s equivalence class combines with that of (b,a) - its inverse - to yield that of (f(a,b),f(a,b)), which is the identity).

There is a natural embedding of A in this collection of equivalence classes, namely: map each a in A to the ~-equivalence class of (f(a,a),a); notice that this is the same ~-equivalence class as that of (f(a,b),b) for any b in A. Our binary operator on A×A combines (f(a,a),a) with (f(c,c),c) to give (f(f(a,a),f(c,c)), f(a,c)) which, by associativity and Abel, is (f(f(a,c),f(a,c)),f(a,c)), so the embedded images a and c combine to give the embedded image of f(a,c). Thus our natural embedding embeds f as our binary operator on the ~-equivalence classes of A×A: we therefore use f as the name for this. The closely related embedding which maps each a to the ~-equivalence class of (a,f(a,a)) provides us with f-inverses for (the embedded image of) A. Thus the collection of ~-equivalence classes under the binary operator constitutes an extension of A under f while adding (if not already present) an identity and inverses.

Because this `difference construction' can complete any associative cancellable Abelian binary operator, completeness can be treated as an optional property of a group. If leaving it out grants us some valuable freedoms, we can do so safe in the knowledge that, should we ever need our binary operator to form a group, it can be persuaded to do so via the difference construction.

It should be noted that there is an analogous construction whereby we can take a complete associative binary operator and reduce it, via the the transitive completion of `a~b ⇔, for some c, f(a,c)=f(b,c)', to obtain a group.

The bulk action yields an identity

Any a in A defines a function from the set 1 (ie {0}, or any other set with precisely one element) to A - namely, 0->a. This function is, by definition, always bulkable with bulk action a. Thus we have an embedding of A in the bulkable functions of an arbitrary associative binary operator on A - and the bulk action (from bulkable functions back to A) is left-inverse to it. If we combine the embedding and bulk action the other way, to obtain a function from bulkable functions, via A, to {(1|:A)} which is a subset of the bulkable functions. This acts on {(1|:A)} as the identity, so it is a projection.

There is a natural mechanism, the disjoint sum, whereby any two bulkable functions of an associative binary operator may be combined to yield another. The disjoint sum of two functions (n|x:A) and (m|y:A) is x+y = (n+m| (i,)-> x(i), (,j)->y(j) :A) where n+m is the union of {(i,): i in n} and {(,j): j in m}, which we take to be ordered by: (i,)<(h,) whenever i<h, (i,)<(,j), (,j)<(,h) whenever j<h. The direct sum n+m is a finite set whenever n and m are, so x+y has finite domain whenever x and y do. Thus whenever x+y is non-empty, it is bulkable: for it to be empty, both x and y must be empty, in which case it is bulkable precisely if they are (because it and they are equal). [If you manage to extend bulk action to limit ordinals, thus to all ordinals, you will, of course, have to see whether your definitions leave it possible to assert that the disjoint sum of bulkables is always bulkable.] Thus the direct sum of bulkable functions is always bulkable.

Disjoint sum is a binary operator on bulkable functions. It is not actually associative, but: for any (n|x:A), (m|y:A) and (p|z:A), there is a natural unique order-preserving isomorphism ((n+m)+p| ((i,),)->(i,), ((,j),)->(,(j,)), (,k)->(,(,k)) :n+(m+p)) which composes before x+(y+z) to give (x+y)+z, so disjoint sum is `associative modulo isomorphism'. In particular, this give us f(x+(y+z)) = f((x+y)+z).

It remains to discover the relation between the bulk action on a disjoint sum and on its constituents. As before, write f(x) for the bulk action of f on any (:x:|). Now, (S+empty| (s,)->s :S) is trivially iso, so the direct sum of any f-bulkable (S|x:A) with empty has f(x+empty)=f(x) (and f(empty) is a left identity in A precisely if empty is bulkable). For a singleton, (1| 0->a :A), with bulk action a, and any (n|x:A), we use the (unique natural) order-preserving isomorphism (n+1| (,0)->n, (i,)->i :successor(n)) to turn x+(:0->a:) into (successor(n): n->a, (n:x:) :A) with bulk action f(f(x),a).

Let Known(n) be the statement, for arbitrary natural number n, `every f-bulkable x and (n:y:A) have f(x+y)=f(f(x),f(y))'. We have just shown that Known(1): the bulk action on a direct sum is the result of combining the bulk actions of its constituents with f whenever the right-hand constituent's domain has at most one member. Now, for any natural number n and f-bulkable x, (successor(n)|y:A), we use the (unique natural) order-preserving isomorphism ((|x)+(n+1)| :((|x)+n)+1) to turn f(x+y), decomposed as f(x+((n|y:)+(1|0->y(n):A))), into f((x+(n|y:))+(:0->y(n):)) in which the second summand is a singleton, so we know the result to be f(f(x+(n|y:)),y(n)).

Whenever Known(n), any (successor(n):y:) either has (|y)=successor(n) or has a (unique natural) order-preserving isomorphism with some member of successor(n), which is inevitably a subset of n, via which we can factorise y as an (n::A) and apply Known(n) to it. Thus if Known(n) doesn't already tell us that some f-bulkable x and (successor(n):y:A) have f(x+y)=f(f(x),f(y)), then (|y) is successor n and we apply f(x+y)=f(f(x+(n|y:)),y(n)) to which we apply Known(n) to reveal f(x+(n|y:)) = f(f(x),f((n|y:))) whence, using associativity, f(x+y) = f(f(f(x),f((n|y:))),y(n)) = f(f(x),f(f((n|y:)),y(n))); and the definition of bulk action tells us that f(f((n|y:)),y(n)) is f(y), so we now have f(x+y)=f(f(x),f(y)). Consequently, Known(n) implies Known(successor(n)), whence, by induction, Known(1) (which we know) implies Known(n) for any natural number n. Notice that f(x+(y+z))=f((x+y)+z) now becomes just the result of applyin the associativity of f to f(x+y)=f(f(x),f(y)).

We thus combine a binary operator, disjoint sum, and a projection, embedded bulk action, on f-bulkable functions to obtain an associative binary operator on f's bulkable functions, (: (x,y)-> f(x+y) :). We now show that we can add the empty function to the bulkable functions and extend f consistently to encompass it. Thus, even without cancellation or completeness, we can obtain an identity for f.

What now can we say about the (empty) function, e, from the empty set to A ? For any bulkable function x, we have x+e and e+x equivalent to x (via the order-preserving isomorphism between the domain, n of (n|x:A) and n+e or e+n): hence, our binary operator must combine x with e, in either order, to yield the bulk action of x (as embedded, of course), ie f(e+x)=f(x)=f(x+e). Consequently e acts on our embedding of A as an identity for f: it acts as if A included an identity, f(e). If we do not in fact have an identity in A, our binary operator on finite functions to A can be consistently extended to let f(e) be e, since e+e is e.

Thus even when we have no left-identity with which to extend f's bulk action, as above, we can effectively add one if ever we find one more convenient (as long as f is associative).

Bulk action and measures.

When we look on an Abelian associative binary operator, (A×A|+:A), as addition (so we see our bulk action as summation) any function from a finite set to A defines a measure on that finite set (via summation on subsets). For (h|x:A) with p and q subsets of h, we can partition p and q (and hence their union, which I'll call u) into their intersection (which I'll call n) and their mutual exclusions, p\q and q\p. We then have natural isomorphisms equating p=n+(p\q), q=n+(q\p) and u=(q\p)+n+(p\q), whence n+u is isomorphic to n+(q\p)+n+(p\q) which is, in turn, isomorphic to p+q. Examining the restrictions of x to each of these, we obtain equal sums for (n|x:A)+(u|x:A) and (p|x:A)+(q|x:A), whence sum(n|x:A) + sum(u|x:A) = sum(p|x:A) + sum(q|x:A), the required relationship for a measure on two sets and on their intersection and union.

livery
Maintained by Eddy.