Article and note
A Series Learning through the MacWilliams IdentityPart 1 of 12
An Introduction to Möbius Inversion through the MacWilliams Identity
- Published:
- Updated:
- Reading time:
- 26 min (about 5,566 words)
Introduction
In coding theory, one of the basic theorems is the MacWilliams identity. For a linear code over a finite field with coordinate set , the weight enumerator of its dual code
can be computed from the weight enumerator of as follows:
This is the MacWilliams identity.
There are several proofs of the MacWilliams identity. In this series, I use proofs of the MacWilliams identity as a guide to an introduction to neighbouring areas and concepts. Accordingly, this series is aimed at readers who already know at least the following basic material:
We assume that the reader already knows, at least at a basic level, the following notions:
what a (finite) field is,
what a linear code over a finite field is,
what the Hamming weight is,
what the dual code is.
(There is no problem if you do not know a proof of the MacWilliams identity.)
In this series, I look at the proof methods for the MacWilliams identity under the following five broad families.
Fourier, character, and Poisson methods.
Möbius inversion, lattice theory, and shortening/puncturing methods.
Orthogonal polynomials and association schemes.
Matroids and Tutte polynomials.
Moments and double-counting methods.
This note focuses on the second of these: the Möbius inversion, lattice-theoretic, and shortening/puncturing approach. In other words, the aim of this first note is to use a proof of the MacWilliams identity as a guide to an introduction to lattice theory and Möbius inversion. If one only wants to prove the MacWilliams identity, it is enough to understand the Boolean lattice. However, as an introduction to the subject itself, I also introduce some basic lattices such as the lattice of subspaces and the partition lattice. Readers who are interested only in the proof of the MacWilliams identity may skip everything except the discussion of Boolean lattices.
When one hears Möbius inversion, one may think of a formula from elementary number theory such as
or of the inclusion–exclusion principle for finite sets.
In fact, both of these are special cases of the same theory. The systematic treatment of Möbius functions on posets was organised by Rota [Rot64] as one of the central tools of combinatorics. The common setting is that of a locally finite poset. If one considers the operation of “summing everything below”, then Möbius inversion appears as its inverse operation.
The route in this note is as follows:
posets → lattices → intervals → incidence algebras → Möbius functions → Möbius inversion → the MacWilliams identity
If one is concerned only with proving the MacWilliams identity, the poset ultimately needed is the set of all subsets of the coordinate set . This is a special lattice called the Boolean lattice. However, if one starts by treating only the Boolean lattice, it becomes harder to see the overall picture of Möbius inversion as a subject. So I first introduce Möbius inversion on general posets, then treat the Boolean lattice as one important example, and finally apply it to coding theory.
For incidence algebras and Möbius inversion on posets, Stanley [Sta12, Chapter 3] is a standard reference.
What is a poset?
The setting for Möbius inversion is a set equipped with an order. Let us begin with the most basic definitions. For the basics of posets and lattices, Davey–Priestley [DP02] is a readable standard introduction.
Definition 2.1.
Let be a set, and let be a binary relation on . We say that the pair is a partially ordered set or a poset if the following conditions hold for all :
(reflexivity)
If and , then (antisymmetry)
If and , then (transitivity)
When the order under discussion is clear from the context, I simply call a poset. For convenience, I also define
,
and ,
and ,
does not hold 1.
The word “partial” means that not every pair of elements must be comparable. If one puts the usual order on the integers , then any two integers can be compared. By contrast, if one considers the inclusion relation on subsets, it can happen that neither of two sets contains the other. The term poset is used precisely to treat such situations as orders.
Definition 2.2.
Let be a poset. Two elements are said to be comparable if either or holds. A poset is called a chain, or totally ordered, or linearly ordered, if every pair of elements is comparable.
Example 2.3.
The set
with the usual order is a chain.
Example 2.4.
Fix a finite set . The collection of all subsets of ,
becomes a poset when equipped with the inclusion relation . However, it is not a chain when . For example, if , then
but and are incomparable.
Example 2.5.
Fix a positive integer . Let
be the set of positive divisors of . Here , and means that divides , in other words that is a multiple of . If we define
then becomes a poset. As we shall see later, this example leads to the elementary number-theoretic Möbius function.
Lattices: joining two elements and taking what they have in common
Among posets, lattices are especially important. In a lattice, given two elements, one can consider the least element above both and the greatest element below both. These are called the join and the meet, respectively.
Definition 3.1.
Let be a poset, and let . An element is called an upper bound of and if
hold. If the least element among the upper bounds exists, it is called the join of and , and is denoted by .
Similarly, an element is called a lower bound of and if
hold. If the greatest element among the lower bounds exists, it is called the meet of and , and is denoted by .
Definition 3.2.
A poset is called a lattice if, for every pair of elements , both the join and the meet exist in .
Definition 3.3.
For a finite set , the lattice obtained by ordering the set of all subsets of by inclusion is called the Boolean lattice on .
In the Boolean lattice, for two subsets we have
So the join is the union and the meet is the intersection. Moreover, the least element is , and the greatest element is .
Example 3.4 (The lattice of subspaces).
Let be a finite-dimensional vector space, and write for the collection of all subspaces of . Ordered by inclusion, is a lattice. Indeed, for subspaces we have
In coding theory over finite fields, this example appears repeatedly.
Example 3.5 (The partition lattice).
Let denote the set of all partitions of a finite set . For , define to mean that “ is finer than ”. With this order, is a lattice. Its least element is the discrete partition, and its greatest element is the partition with a single block . The meet is the common refinement, namely the partition consisting of all non-empty intersections of a block of with a block of . The join is the finest partition which is coarser than both and , and corresponds to the equivalence relation generated by “belonging to the same block”.
The divisor poset is also a lattice. For two divisors , one has
So in the join is the least common multiple and the meet is the greatest common divisor.
However, being a lattice is not what Möbius inversion itself requires. In fact, Möbius inversion can be carried out on the slightly broader class of locally finite posets. It is best to think of lattices as an important class in which Möbius inversion appears especially naturally.
Intervals and local finiteness
For Möbius inversion, what matters is not so much the whole poset as the part lying between two elements.
Definition 4.1.
For example, if in , then
holds. This has the same form as the set of all subsets of . Indeed, can be written uniquely in the form with .
Definition 4.2.
A poset is called locally finite if, for every with , the interval is a finite set.
Of course, every finite poset is locally finite. An example which is locally finite but not finite is the ring of integers . There are infinitely many integers, so itself is infinite. However, for two integers , the integers lying between them are
and this is finite, with . So is a locally finite poset. By contrast, the rationals and the reals also become posets under the usual order, but whenever , the interval is infinite, so they are not locally finite.
The Boolean lattice used in this note for the MacWilliams identity is also locally finite, since it is finite. Local finiteness is needed because later we will consider sums such as . If only finitely many appear in such a sum, then one can add them without worrying about convergence.
The zeta transform as cumulative sums
In one sentence, Möbius inversion is a method for recovering the original values from cumulative sums.
First, consider a finite poset . Suppose that to each element a number is assigned. Define
by summing all the values at elements below . This is the cumulative sum of .
For example, on the chain ,
In this case the original function can be recovered by finite differences:
Here, for , one interprets this as .
On the subset poset ,
This means “sum over all subsets contained in ”. The formula for recovering in this case is the Inclusion–Exclusion Principle.
In this way, Möbius inversion is a single theory which includes ordinary finite differences and the inclusion–exclusion principle. In the context of posets, this cumulative-sum operation is called the zeta transform.
Incidence algebras
To write Möbius inversion cleanly, I introduce the incidence algebra. Despite the name “algebra”, at first it is enough to think of it as “a collection of functions assigning numbers to the intervals of a poset”.
Definition 6.1.
Let be a locally finite poset, and let be a commutative ring. The incidence algebra on 3 is the set of functions
satisfying
It is equipped with the pointwise addition, scalar multiplication, and convolution product defined below. That is, for and , define addition and scalar multiplication by
and for define the convolution product by
If you are not used to this, you may think of the commutative ring simply as a domain of numbers in which addition and multiplication can be performed, such as the rationals or the reals . When the coefficient ring is not specified, I will take the rational field and write .
The non-zero terms in the convolution sum are restricted, by the defining condition for functions in , to those with . Hence, if is locally finite, the sum is finite. With this operation, becomes a unital associative -algebra.
The convolution product resembles matrix multiplication. For the product of an matrix and an matrix , the -entry is
so one sums over the intermediate index . In the incidence algebra, the intermediate index is simply restricted to elements of , that is, elements satisfying .
The convolution product has a unit, namely the following delta function.
Definition 6.2.
Define by
This is called the delta function of the incidence algebra.
The following proposition shows that the delta function is indeed the unit.
Proposition 6.3.
For every ,
holds.
Proof
Fix arbitrary . First suppose that . Then there is no satisfying , so
Here the second equality follows from .
Now suppose that . Then the non-zero terms in the convolution product are restricted to the interval , so
Since is non-zero only when , this gives
Therefore .
Similarly, if , then , and if , then
Hence .
The zeta function and the Möbius function
Within the incidence algebra, the zeta function is especially important. The zeta function here is not the Riemann zeta function famous from the Riemann hypothesis, but the simplest incidence-algebra element attached to a poset.
Definition 7.1.
For a locally finite poset , define by
This is called the zeta function of .
Using the zeta function , the cumulative sum considered earlier,
can be written as
because under the condition .
The inverse of this zeta function is the Möbius function.
Definition 7.2.
The Möbius function of a locally finite poset is the inverse of in the incidence algebra . In other words, is called the Möbius function if
hold.
At first sight this definition looks somewhat abstract, but its meaning is simple. If the zeta function represents the operation “sum everything below”, then the Möbius function represents the inverse operation. The Möbius function can be computed by the following recursion.
Theorem 7.3.
For every locally finite poset , the Möbius function exists and is uniquely determined by the following conditions:
Proof
If , then because .
Expand the condition . For arbitrary with ,
holds. Since this must equal , in the case one must have . In the case one has
Separating the last term , we obtain
and therefore
Since the interval is finite by local finiteness, this is a recursion which determines successively from shorter intervals. Hence is uniquely determined.
Similarly, expanding the condition gives
One can verify by induction on the size of the interval that the constructed above also satisfies this formula. Therefore is a two-sided inverse of .
Möbius inversion
With the terminology prepared so far, Möbius inversion can be written very briefly. Before that, however, there is one point to note. The incidence algebra and the Möbius function themselves can be defined on a locally finite poset. However, the one-variable zeta transform appearing in the theorem below is defined as written only when this sum is finite. Local finiteness of a poset assumes only that each interval is finite, so the set over which the sum runs need not itself be finite. For example, is locally finite, but
is infinite for every .
The set ultimately used in this note, motivated by coding theory, is finite when is finite. So, to keep the exposition simple, I now restrict to finite posets.
Theorem 8.1 (Möbius inversion).
Let be a finite poset. Suppose that functions and on satisfy
Then, for every ,
holds, where is the Möbius function of .
Proof
By assumption, . Therefore
Here
Hence the expression above becomes
This theorem includes ordinary finite-difference formulas and the inclusion–exclusion principle in a single form. Let us now look at a few examples.
Example 8.2 (Chains give finite differences).
Consider the chain . If
then
Thus, in this case Möbius inversion is just the finite difference of a cumulative sum.
The Möbius function in this case is
Indeed, one recovers simply by subtracting from , so only the difference with the previous term remains.
Example 8.3 (On the divisibility poset, one recovers the number-theoretic Möbius function).
Turn the set of divisors of a positive integer into a poset using divisibility. Then, for ,
is a cumulative sum of the form considered here.
Möbius inversion then becomes, for ,
which is the familiar formula from elementary number theory. Here is the number-theoretic Möbius function.
Its relation with the Möbius function of the poset is
In other words, the number-theoretic Möbius function is a special case of the Möbius function of the divisibility poset.
This example shows that Möbius inversion is not merely a tool from coding theory, but a universal inversion principle which also arises naturally in number theory. What we eventually use for the MacWilliams identity is the Boolean lattice introduced in Definition 3.3. Its Möbius function is especially simple.
Theorem 8.4.
The Möbius function of the Boolean lattice is given, for , by
Proof
If , then the right-hand side is , which agrees with .
Assume . It is enough to verify the recursion for the Möbius function. What we need to show is
Here each can be written uniquely in the form
so the left-hand side becomes
In the last equality, we used the fact that because .
Therefore satisfies the recursion for the Möbius function. Since the Möbius function is uniquely determined, the conclusion follows.
Hence Möbius inversion on the Boolean lattice takes the following form.
Corollary 8.5 (Möbius inversion on the Boolean lattice).
Suppose functions on satisfy
Then
holds.
This is exactly the inclusion–exclusion principle. The important point, however, is not to remember inclusion–exclusion as an isolated formula, but to view it as one example of Möbius inversion on a poset.
Example 8.6 (The Möbius function of the subspace lattice).
Let be a finite-dimensional vector space over , and consider the lattice of subspaces . For subspaces , put
Then
Indeed, the interval can be identified with the lattice of subspaces of the quotient space , so the value depends only on . This formula can be derived from identities for Gaussian binomial coefficients.
Example 8.7 (The Möbius function of the partition lattice).
Consider the partition lattice of a finite set under the refinement order. Let , and for each block of , let denote the number of blocks of contained in . Then
holds. In particular, if is the discrete partition and is the one-block partition, then
This is a more advanced example, but it vividly illustrates how rich the combinatorial content of the Möbius function can be.
Translating into coding theory: support exactly equal to, and support contained in
From here on, I apply Möbius inversion to the MacWilliams identity.
For simplicity, let the coordinate set be , and let be a linear code. For a codeword , define its support to be the set of coordinate positions of its non-zero components:
Then the Hamming weight is the cardinality of the support:
The weight enumerator is
To decompose this by support, define the following numbers.
Definition 9.1.
For , define
This is the number of codewords whose support is exactly .
Using this notation, the weight enumerator can be written as
However, from the linear-algebraic point of view, is somewhat awkward to handle. The reason is that the condition “support is exactly ” contains the conditions
if , then ,
if , then .
Being non-zero is not a linear condition.
By contrast, the condition “the support is contained in ” is linear. So, for , consider the following subcode.
Definition 9.2.
Define
This is the subcode of consisting of all codewords whose support is contained in .
Definition 9.3.
For , define
This is the number of codewords whose support is contained in .
The relation between these two numbers is exactly the zeta transform on the Boolean lattice.
Theorem 9.4.
For every ,
holds.
Proof
is the number of codewords whose support is contained in . The support of such a codeword is uniquely determined as some subset . The number of codewords whose support is exactly is , so summing over all gives
Therefore Möbius inversion gives the following.
Corollary 9.5.
For every ,
holds.
The structure visible here is not specific to coding theory. On the poset , one is simply using Möbius inversion to recover the original function from the cumulative one
exactly → contained in .
Rewriting the weight enumerator via Möbius inversion
Substituting the Möbius inversion from the previous section into the weight enumerator gives a convenient form. This is a calculation which will be used repeatedly later in the proof.
Theorem 10.1.
Let be a linear code. Then
holds.
Proof
Writing the weight enumerator in terms of the number of codewords with support exactly , we obtain
Substituting
into this gives
Since can be written uniquely as with , the inner sum becomes
Hence the claim follows.
This formula illustrates well the role played by Möbius inversion in the proof of the MacWilliams identity. The variable comes from the Möbius function of the Boolean lattice. Meanwhile, comes from the size formula for orthogonal complements over finite fields, .
Dual codes, puncturing, and shortening
At this point we know that, via Möbius inversion, the weight enumerator can be written using
The remaining task is to express for in terms of for .
For this purpose, I introduce the notation for puncturing and shortening.
Definition 11.1.
For , define
and call it the punctured code obtained by deleting the coordinates in . Also define
and call it the shortened code on the coordinates in .
Note that both the superscript and the subscript indicate the set of coordinates being deleted. Accordingly, the puncturing which keeps the coordinates in is , and is naturally identified, by restriction to , with . In other words,
Similarly, is naturally identified with .
Theorem 11.2.
For every ,
holds.
Proof
The left-hand equality is simply the definition of . So it is enough to prove the equality on the right.
Fix . Consider the restriction map to the coordinates in ,
By the definition of puncturing,
Also, the kernel of consists of the codewords whose coordinates on are all zero, so
Hence, by the first isomorphism theorem,
Next, let us study the relation between and the orthogonal complement of . Equip with the usual inner product
For , write for the vector obtained by extending by zeros on . That is, define
Then
holds. Indeed, for ,
Since has all coordinates in equal to zero, the condition is equivalent to . This proves (11.2).
The zero-extension map
is a linear isomorphism between and the set of vectors in whose support is contained in . Therefore, by (11.2),
In a finite-dimensional vector space, for every subspace , one has
This follows from and . Applying this to yields
Substituting (11.1) into this gives
which is the claim.
This theorem is the linear-algebraic core of the proof in this note. Möbius inversion moves between “support is contained in” and “support is exactly”, and the theorem above links with . In the language of coding theory, what is used here is the relation together with . So the computation in the next section uses the relation between puncturing and shortening in a linear-algebraic way.
Proof of the MacWilliams identity
We are now ready to prove the MacWilliams identity. One only has to combine the ingredients prepared so far. The overall combinatorial proof based on puncturing, shortening, and supports is close to that in Brualdi–Pless–Beissinger [BPB88] and Goldwasser [Gol97].
First, applying Theorem 10.1 to gives
By Theorem 11.2,
so
Now put . Then , so
Next, substitute
Then the right-hand side of (12.1) becomes
Writing with , the inner sum becomes
Therefore
This proves the MacWilliams identity.
What Möbius inversion was doing in this proof
Let us summarise the role played by Möbius inversion in the proof.
From the coding-theoretic point of view, there were two kinds of numbers:
: the number of codewords whose support is exactly ,
: the number of codewords whose support is contained in .
These are related by
which is the zeta transform on the Boolean lattice. Hence, in the reverse direction, one has
This is Möbius inversion on the Boolean lattice.
The important point is that is more compatible with linear algebra than . The condition “support is exactly ” involves a non-zero condition, so it does not define a linear subspace. By contrast, the set of codewords whose support is contained in is the subcode
Thus, in this proof, the quantity one wants to count, namely , is first replaced by the linear-algebraically manageable quantity , and only at the end recovered by Möbius inversion. This viewpoint is common in many situations where Möbius inversion is used.
Concepts covered in this note
In this note, with the proof of the MacWilliams identity as our target, we introduced the basic tools of Möbius inversion. To summarise:
- Poset
A set equipped with an order satisfying reflexivity, antisymmetry, and transitivity. Not every pair of elements has to be comparable.
- Lattice
A poset in which every pair of elements has a join and a meet. On the set of all subsets, the join is union and the meet is intersection.
- Interval
For a poset , and with , the set of all elements between them,
is called the interval from to .
- Local finiteness
The property that every interval is finite. In Möbius inversion, this condition arises naturally because one considers finite sums over intervals.
- Incidence algebra
The algebra of functions on which vanish whenever . The multiplication is defined by convolution over all intermediate elements.
- Zeta function
The incidence-algebra element which returns when and otherwise. It represents cumulative summation on a poset.
- Möbius function
The inverse of the zeta function in the incidence algebra. It supplies the coefficients for recovering the original function from a cumulative sum.
- Möbius inversion
The inversion formula which turns
into
- Boolean lattice
The lattice obtained by ordering the set of all subsets by inclusion. Its Möbius function is
In the proof of the MacWilliams identity, Möbius inversion on the Boolean lattice was used. However, the Boolean lattice is only one example of Möbius inversion. On chains one gets finite differences; on divisor posets one gets the number-theoretic Möbius inversion; on Boolean lattices one gets the inclusion–exclusion principle. In this sense, Möbius inversion is a general principle for recovering local information from cumulative information.
Where this proof sits among the five systems
As stated at the beginning, the proof in this note belongs to the family
Möbius inversion, lattice theory, and shortening/puncturing methods.
On the surface, it is a proof using the supports of codewords. At a deeper level, however, it uses the zeta transform and Möbius inversion on the Boolean lattice.
The key point of the proof may be summarised in the following sentence:
Recover the information that the support is exactly equal to something from the cumulative information that the support is contained in something.
This idea of recovering local information from cumulative information is the core of Möbius inversion.
More generally, there is also a framework for studying MacWilliams-type identities for weights arising from support maps taking values in lattices; see [Rav18].
Next time
Next time, I turn to characters of finite abelian groups, which are the tools closest to the most standard proof of the MacWilliams identity.
In the present proof, arose from the restriction map to coordinates and the dimension count for orthogonal complements. In the character-theoretic proof next time, will arise from the orthogonality relation
Even though the identity is the same MacWilliams identity, the main actor in the Möbius-inversion proof was the inclusion relation on supports. In the character-theoretic proof, the main actors are group duality and orthogonality relations. By seeing this difference, one should gradually begin to recognise how the MacWilliams identity lies at the intersection of several areas of mathematics.
Footnotes
- Note that does not necessarily mean . It also happens when and are incomparable (cf. Definition 2.2 below). ↩
- Strictly speaking, one might call this a closed interval, but it seems standard simply to say interval. ↩
- It is also referred to as the incidence ring. ↩
References
- [Rot64] Gian-Carlo Rota. On the foundations of combinatorial theory. I. Theory of Möbius functions. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, vol. 2, pp. 340–368, 1964. doi:10.1007/BF00531932 ↩
- [Sta12] Richard P. Stanley. Enumerative combinatorics. Volume 1. Cambridge University Press, Cambridge, vol. 49, pp. xiv+626, 2012 ↩
- [DP02] B. A. Davey and H. A. Priestley. Introduction to lattices and order. Cambridge University Press, New York, pp. xii+298, 2002. doi:10.1017/CBO9780511809088 ↩
- [BPB88] Richard A. Brualdi, Vera S. Pless, and Janet S. Beissinger. On the MacWilliams identities for linear codes. Linear Algebra Appl., vol. 107, pp. 181–189, 1988. doi:10.1016/0024-3795(88)90244-3 ↩
- [Gol97] John L. Goldwasser. Shortened and punctured codes and the MacWilliams identities. Linear Algebra Appl., vol. 253, pp. 1–13, 1997. doi:10.1016/0024-3795(95)00671-0 ↩
- [Rav18] Alberto Ravagnani. Duality of codes supported on regular lattices, with an application to enumerative combinatorics. Des. Codes Cryptogr., vol. 86, no. 9, pp. 2035–2063, 2018. doi:10.1007/s10623-017-0436-3 ↩
This series
A Series Learning through the MacWilliams Identity · Part 1 of 12
Disclaimer
Articles on this site are based on the operator's personal understanding, investigation, and research notes. I try to keep the content accurate, but it may contain errors or incomplete explanations. I do not guarantee its accuracy, completeness, usefulness, or currentness.
Please use the information on this site at your own judgment and responsibility. To the extent permitted by law, the operator is not liable for damages, losses, or disadvantages arising from using, or being unable to use, information on this site.
If you notice an error, unclear explanation, broken link, or insufficient citation, please contact the operator. I will review the content and, when appropriate, correct, update, or remove it.