Article and note

A Series Learning through the MacWilliams IdentityPart 1 of 12

An Introduction to Möbius Inversion through the MacWilliams Identity

Among the five proof systems for the MacWilliams identity, this note focuses on the Möbius inversion, lattice-theoretic, and shortening/puncturing approach, and introduces posets, lattices, incidence algebras, and Möbius functions.

Published:: May 27, 2026
Updated:: Jun 6, 2026
Reading time:: 26 min (about 5,566 words)

Tagscoding theoryMacWilliams identitymobius inversionposetlatticeincidence algebrafinite fieldsexpository note

Download PDF

Introduction

In coding theory, one of the basic theorems is the MacWilliams identity. For a linear code $C \leq \mathbb{F}_{q}^{E}$ over a finite field $\mathbb{F}_{q}$ with coordinate set $E$ , the weight enumerator of its dual code

C^{\perp} = \{ u \in \mathbb{F}_{q}^{E} : u \cdot c = 0 \text{ for all } c \in C \}

can be computed from the weight enumerator of $C$ as follows:

W_{C^{\perp}}(X,Y) = \frac{1}{\card{C}} W_{C}\bigl(X + (q - 1)Y, X - Y \bigr).

This is the MacWilliams identity.

There are several proofs of the MacWilliams identity. In this series, I use proofs of the MacWilliams identity as a guide to an introduction to neighbouring areas and concepts. Accordingly, this series is aimed at readers who already know at least the following basic material:

We assume that the reader already knows, at least at a basic level, the following notions:

what a (finite) field is,
what a linear code over a finite field is,
what the Hamming weight is,
what the dual code is.

(There is no problem if you do not know a proof of the MacWilliams identity.)

In this series, I look at the proof methods for the MacWilliams identity under the following five broad families.

Fourier, character, and Poisson methods.
Möbius inversion, lattice theory, and shortening/puncturing methods.
Orthogonal polynomials and association schemes.
Matroids and Tutte polynomials.
Moments and double-counting methods.

This note focuses on the second of these: the Möbius inversion, lattice-theoretic, and shortening/puncturing approach. In other words, the aim of this first note is to use a proof of the MacWilliams identity as a guide to an introduction to lattice theory and Möbius inversion. If one only wants to prove the MacWilliams identity, it is enough to understand the Boolean lattice. However, as an introduction to the subject itself, I also introduce some basic lattices such as the lattice of subspaces and the partition lattice. Readers who are interested only in the proof of the MacWilliams identity may skip everything except the discussion of Boolean lattices.

When one hears Möbius inversion, one may think of a formula from elementary number theory such as

g(n) = \sum_{d \mid n}f(d) \quad\Longleftrightarrow\quad f(n) = \sum_{d \mid n}\mu(d)g(n/d)

or of the inclusion–exclusion principle for finite sets.

In fact, both of these are special cases of the same theory. The systematic treatment of Möbius functions on posets was organised by Rota [Rot64] as one of the central tools of combinatorics. The common setting is that of a locally finite poset. If one considers the operation of “summing everything below”, then Möbius inversion appears as its inverse operation.

The route in this note is as follows:

posets → lattices → intervals → incidence algebras → Möbius functions → Möbius inversion → the MacWilliams identity

If one is concerned only with proving the MacWilliams identity, the poset ultimately needed is the set $2^{E}$ of all subsets of the coordinate set $E$ . This is a special lattice called the Boolean lattice. However, if one starts by treating only the Boolean lattice, it becomes harder to see the overall picture of Möbius inversion as a subject. So I first introduce Möbius inversion on general posets, then treat the Boolean lattice as one important example, and finally apply it to coding theory.

For incidence algebras and Möbius inversion on posets, Stanley [Sta12, Chapter 3] is a standard reference.

What is a poset?

The setting for Möbius inversion is a set equipped with an order. Let us begin with the most basic definitions. For the basics of posets and lattices, Davey–Priestley [DP02] is a readable standard introduction.

Definition 2.1.

Let $P$ be a set, and let $\leq$ be a binary relation on $P$ . We say that the pair $(P, \leq)$ is a partially ordered set or a poset if the following conditions hold for all $x, y, z \in P$ :

$x \leq x$ (reflexivity)
If $x \leq y$ and $y \leq x$ , then $x=y$ (antisymmetry)
If $x \leq y$ and $y \leq z$ , then $x \leq z$ (transitivity)

When the order under discussion is clear from the context, I simply call $P$ a poset. For convenience, I also define

$x \geq y$ $\defarrow$ $y \leq x$ ,
$x < y$ $\defarrow$ $x \leq y$ and $x \neq y$ ,
$x > y$ $\defarrow$ $x \geq y$ and $x \neq y$ ,
$x \nleq y$ $\defarrow$ $x \leq y$ does not hold ^{Footnote¹Note that $x \nleq y$ does not necessarily mean $x > y$ . It also happens when $x$ and $y$ are incomparable (cf. Definition 2.2 below).1}.

The word “partial” means that not every pair of elements must be comparable. If one puts the usual order on the integers $\mathbb{Z}$ , then any two integers can be compared. By contrast, if one considers the inclusion relation on subsets, it can happen that neither of two sets contains the other. The term poset is used precisely to treat such situations as orders.

Definition 2.2.

Let $P$ be a poset. Two elements $x, y \in P$ are said to be comparable if either $x \leq y$ or $y \leq x$ holds. A poset is called a chain, or totally ordered, or linearly ordered, if every pair of elements is comparable.

Example 2.3.

The set

P = \{ 0, 1, 2, \dots, \ell \}

with the usual order is a chain.

Example 2.4.

Fix a finite set $E = \{ 1, 2, \dots, n \}$ . The collection of all subsets of $E$ ,

2^{E} = \{ S : S \subseteq E\},

becomes a poset when equipped with the inclusion relation $\subseteq$ . However, it is not a chain when $\card{E} \geq 2$ . For example, if $E = \{ 1, 2 \}$ , then

\emptyset \subseteq \{ 1 \},\qquad \emptyset \subseteq \{ 2 \},\qquad \{ 1 \} \subseteq \{ 1, 2 \},\qquad \{ 2 \} \subseteq \{ 1, 2 \},

but $\{ 1 \}$ and $\{ 2 \}$ are incomparable.

Example 2.5.

Fix a positive integer $N$ . Let

D(N) = \{ d \in \mathbb{Z}_{> 0} : d \mid N \}

be the set of positive divisors of $N$ . Here $\mathbb{Z}_{> 0} \coloneqq \{ m \in \mathbb{Z} \mid m > 0 \}$ , and $d \mid N$ means that $d$ divides $N$ , in other words that $N$ is a multiple of $d$ . If we define

d \leq e \quad\Longleftrightarrow\quad d \mid e,

then $D(N)$ becomes a poset. As we shall see later, this example leads to the elementary number-theoretic Möbius function.

Lattices: joining two elements and taking what they have in common

Among posets, lattices are especially important. In a lattice, given two elements, one can consider the least element above both and the greatest element below both. These are called the join and the meet, respectively.

Definition 3.1.

Let $P$ be a poset, and let $x, y \in P$ . An element $z \in P$ is called an upper bound of $x$ and $y$ if

x \leq z \quad\text{and}\quad y \leq z

hold. If the least element among the upper bounds exists, it is called the join of $x$ and $y$ , and is denoted by $x \vee y$ .

Similarly, an element $w \in P$ is called a lower bound of $x$ and $y$ if

w \leq x, \quad\text{and}\quad w \leq y

hold. If the greatest element among the lower bounds exists, it is called the meet of $x$ and $y$ , and is denoted by $x \wedge y$ .

Definition 3.2.

A poset $P$ is called a lattice if, for every pair of elements $x, y \in P$ , both the join $x \vee y$ and the meet $x \wedge y$ exist in $P$ .

Definition 3.3.

For a finite set $E$ , the lattice obtained by ordering the set $2^{E}$ of all subsets of $E$ by inclusion is called the Boolean lattice on $E$ .

In the Boolean lattice, for two subsets $S, T \subseteq E$ we have

S \vee T = S \cup T, \qquad S \wedge T = S \cap T.

So the join is the union and the meet is the intersection. Moreover, the least element is $\emptyset$ , and the greatest element is $E$ .

Example 3.4 (The lattice of subspaces).

Let $V$ be a finite-dimensional vector space, and write $L(V)$ for the collection of all subspaces of $V$ . Ordered by inclusion, $L(V)$ is a lattice. Indeed, for subspaces $U, W \leq V$ we have

U \vee W = U + W, \qquad U \wedge W = U \cap W.

In coding theory over finite fields, this example appears repeatedly.

Example 3.5 (The partition lattice).

Let $\Pi(E)$ denote the set of all partitions of a finite set $E$ . For $\pi, \sigma \in \Pi(E)$ , define $\pi \leq \sigma$ to mean that “ $\pi$ is finer than $\sigma$ ”. With this order, $\Pi(E)$ is a lattice. Its least element is the discrete partition, and its greatest element is the partition with a single block $E$ . The meet is the common refinement, namely the partition consisting of all non-empty intersections of a block of $\pi$ with a block of $\sigma$ . The join is the finest partition which is coarser than both $\pi$ and $\sigma$ , and corresponds to the equivalence relation generated by “belonging to the same block”.

The divisor poset $D(N)$ is also a lattice. For two divisors $d, e \mid N$ , one has

d \vee e = \lcm(d,e), \qquad d \wedge e = \gcd(d,e).

So in $D(N)$ the join is the least common multiple and the meet is the greatest common divisor.

However, being a lattice is not what Möbius inversion itself requires. In fact, Möbius inversion can be carried out on the slightly broader class of locally finite posets. It is best to think of lattices as an important class in which Möbius inversion appears especially naturally.

Intervals and local finiteness

For Möbius inversion, what matters is not so much the whole poset as the part lying between two elements.

Definition 4.1.

Let $(P, \leq)$ be a poset, and let $x, y \in P$ satisfy $x \leq y$ . Define

\lbrack x, y \rbrack \coloneqq \{ z \in P : x \leq z \leq y\}.

This is called the interval from $x$ to $y$ .^{Footnote²Strictly speaking, one might call this a closed interval, but it seems standard simply to say interval.2}

For example, if $S \subseteq T$ in $2^{E}$ , then

\lbrack S, T \rbrack = \{ U \subseteq E \mid S \subseteq U \subseteq T \}

holds. This has the same form as the set of all subsets of $T \setminus S$ . Indeed, $U$ can be written uniquely in the form $U = S \cup A$ with $A \subseteq T \setminus S$ .

Definition 4.2.

A poset $P$ is called locally finite if, for every $x, y \in P$ with $x \leq y$ , the interval $\lbrack x, y \rbrack$ is a finite set.

Of course, every finite poset is locally finite. An example which is locally finite but not finite is the ring of integers $\mathbb{Z}$ . There are infinitely many integers, so $\mathbb{Z}$ itself is infinite. However, for two integers $s, t \in \mathbb{Z}$ , the integers lying between them are

\lbrack s, t \rbrack = \{ s, s + 1, \dots, t - 1, t \},

and this is finite, with $\card{\lbrack s, t \rbrack} = t - s + 1 < \infty$ . So $\mathbb{Z}$ is a locally finite poset. By contrast, the rationals $\mathbb{Q}$ and the reals $\mathbb{R}$ also become posets under the usual order, but whenever $s < t$ , the interval $\lbrack s, t \rbrack$ is infinite, so they are not locally finite.

The Boolean lattice $2^{E}$ used in this note for the MacWilliams identity is also locally finite, since it is finite. Local finiteness is needed because later we will consider sums such as $\displaystyle \sum_{z \in \lbrack x, y \rbrack}$ . If only finitely many $z$ appear in such a sum, then one can add them without worrying about convergence.

The zeta transform as cumulative sums

In one sentence, Möbius inversion is a method for recovering the original values from cumulative sums.

First, consider a finite poset $P$ . Suppose that to each element $x \in P$ a number $f(x)$ is assigned. Define

g(x) \coloneqq \sum_{y \in P: y \leq x} f(y)

by summing all the values at elements below $x$ . This $g$ is the cumulative sum of $f$ .

For example, on the chain $0 < 1 < 2 < \dots < m$ ,

g(i) = f(0) + f(1) + \dots + f(i).

In this case the original function $f$ can be recovered by finite differences:

f(i) = g(i) - g(i-1).

Here, for $i = 0$ , one interprets this as $f(0) = g(0)$ .

On the subset poset $2^{E}$ ,

g(S) = \sum_{T \subseteq E : T \subseteq S} f(T).

This means “sum over all subsets $T$ contained in $S$ ”. The formula for recovering $f$ in this case is the Inclusion–Exclusion Principle.

In this way, Möbius inversion is a single theory which includes ordinary finite differences and the inclusion–exclusion principle. In the context of posets, this cumulative-sum operation is called the zeta transform.

Incidence algebras

To write Möbius inversion cleanly, I introduce the incidence algebra. Despite the name “algebra”, at first it is enough to think of it as “a collection of functions assigning numbers to the intervals of a poset”.

Definition 6.1.

Let $P$ be a locally finite poset, and let $R$ be a commutative ring. The incidence algebra on $P$ ^{Footnote³It is also referred to as the incidence ring.3} is the set $I(P;R)$ of functions

\alpha \colon P \times P \to R

satisfying

x \nleq y \quad \Longrightarrow \quad \alpha(x,y) = 0.

It is equipped with the pointwise addition, scalar multiplication, and convolution product defined below. That is, for $\alpha, \beta \in I(P; R)$ and $r \in R$ , define addition and scalar multiplication by

(\alpha + \beta)(x, y) \coloneqq \alpha(x, y) + \beta(x, y), \quad (r\alpha)(x, y) \coloneqq r \cdot \alpha(x, y) \qquad (\forall x, y \in P),

and for $\alpha, \beta \in I(P;R)$ define the convolution product by

(\alpha \ast \beta)(x, y) = \sum_{z \in P} \alpha(x, z) \beta(z, y).

If you are not used to this, you may think of the commutative ring $R$ simply as a domain of numbers in which addition and multiplication can be performed, such as the rationals $\mathbb{Q}$ or the reals $\mathbb{R}$ . When the coefficient ring is not specified, I will take the rational field and write $I(P) \coloneqq I(P;\mathbb{Q})$ .

The non-zero terms in the convolution sum are restricted, by the defining condition for functions in $I(P;R)$ , to those $z \in P$ with $x \leq z \leq y$ . Hence, if $P$ is locally finite, the sum is finite. With this operation, $I(P;R)$ becomes a unital associative $R$ -algebra.

The convolution product resembles matrix multiplication. For the product of an $n \times \ell$ matrix $A$ and an $\ell \times m$ matrix $B$ , the $(i, j)$ -entry is

(AB)_{ij} = \sum_{1 \leq k \leq \ell} A_{ik} B_{kj},

so one sums over the intermediate index $k$ . In the incidence algebra, the intermediate index $z$ is simply restricted to elements of $\lbrack x, y \rbrack$ , that is, elements satisfying $x \leq z \leq y$ .

The convolution product has a unit, namely the following delta function.

Definition 6.2.

Define $\delta \in I(P)$ by

\delta(x, y) = \begin{cases} 1, & x = y,\\ 0, & x \neq y. \end{cases}

This is called the delta function of the incidence algebra.

The following proposition shows that the delta function is indeed the unit.

Proposition 6.3.

For every $\alpha \in I(P)$ ,

\delta \ast \alpha = \alpha, \qquad \alpha \ast \delta = \alpha

holds.

Proof

Fix arbitrary $x, y \in P$ . First suppose that $x \nleq y$ . Then there is no $z$ satisfying $x \leq z \leq y$ , so

(\delta \ast \alpha)(x,y)=0=\alpha(x,y).

Here the second equality follows from $\alpha \in I(P)$ .

Now suppose that $x \leq y$ . Then the non-zero terms in the convolution product are restricted to the interval $\lbrack x, y \rbrack$ , so

(\delta \ast \alpha)(x,y) = \sum_{z \in \lbrack x, y \rbrack} \delta(x,z)\alpha(z,y).

Since $\delta(x,z)$ is non-zero only when $z=x$ , this gives

(\delta \ast \alpha)(x,y)=\alpha(x,y).

Therefore $\delta \ast \alpha = \alpha$ .

Similarly, if $x \nleq y$ , then $(\alpha \ast \delta)(x,y)=0=\alpha(x,y)$ , and if $x \leq y$ , then

(\alpha \ast \delta)(x,y) = \sum_{z \in \lbrack x, y \rbrack} \alpha(x,z)\delta(z,y) = \alpha(x,y).

Hence $\alpha \ast \delta = \alpha$ .

The zeta function and the Möbius function

Within the incidence algebra, the zeta function is especially important. The zeta function here is not the Riemann zeta function famous from the Riemann hypothesis, but the simplest incidence-algebra element attached to a poset.

Definition 7.1.

For a locally finite poset $P$ , define $\zeta \in I(P)$ by

\zeta(x, y) \coloneqq \begin{cases} 1, & x \leq y \text{ holds}, \\ 0, & x \nleq y \text{ holds}. \end{cases}

This is called the zeta function of $P$ .

Using the zeta function $\zeta$ , the cumulative sum considered earlier,

g(x) = \sum_{y \leq x} f(y),

can be written as

g(x) = \sum_{y \leq x} f(y) \zeta(y, x),

because $\zeta(y, x) = 1$ under the condition $y \leq x$ .

The inverse of this zeta function is the Möbius function.

Definition 7.2.

The Möbius function of a locally finite poset $P$ is the inverse of $\zeta$ in the incidence algebra $I(P)$ . In other words, $\mu \in I(P)$ is called the Möbius function if

\mu \ast \zeta = \delta, \quad\text{and}\quad \zeta \ast \mu = \delta

hold.

At first sight this definition looks somewhat abstract, but its meaning is simple. If the zeta function represents the operation “sum everything below”, then the Möbius function represents the inverse operation. The Möbius function can be computed by the following recursion.

Theorem 7.3.

For every locally finite poset $P$ , the Möbius function $\mu$ exists and is uniquely determined by the following conditions:

\mu(x, y) = \begin{cases} 0, & x \nleq y \text{ holds}, \\ 1, & x = y \text{ holds}, \\ \displaystyle -\sum_{\substack{z \in \lbrack x, y \rbrack \\ z \neq y}} \mu(x, z), & x < y \text{ holds}. \end{cases}

Proof

If $x \nleq y$ , then $\mu(x,y)=0$ because $\mu \in I(P)$ .

Expand the condition $\mu \ast \zeta = \delta$ . For arbitrary $x, y \in P$ with $x \leq y$ ,

(\mu \ast \zeta)(x, y) = \sum_{z \in \lbrack x, y \rbrack} \mu(x, z) \zeta(z, y) = \sum_{z \in \lbrack x, y \rbrack} \mu(x, z)

holds. Since this must equal $\delta(x, y)$ , in the case $x = y$ one must have $\mu(x, x) = 1$ . In the case $x < y$ one has

\sum_{z \in \lbrack x, y \rbrack} \mu(x, z) = 0.

Separating the last term $z = y$ , we obtain

\left( \sum_{z \in \lbrack x, y \rbrack, \, z \neq y} \mu(x, z) \right) + \mu(x, y) = 0,

and therefore

\mu(x, y) = -\sum_{z \in \lbrack x, y \rbrack, \, z \neq y} \mu(x, z).

Since the interval $\lbrack x, y \rbrack$ is finite by local finiteness, this is a recursion which determines $\mu(x, y)$ successively from shorter intervals. Hence $\mu$ is uniquely determined.

Similarly, expanding the condition $\zeta \ast \mu = \delta$ gives

\mu(x, y) = -\sum_{z \in \lbrack x, y \rbrack, \, z \neq x} \mu(z, y) \qquad (x < y).

One can verify by induction on the size of the interval that the $\mu$ constructed above also satisfies this formula. Therefore $\mu$ is a two-sided inverse of $\zeta$ .

Möbius inversion

With the terminology prepared so far, Möbius inversion can be written very briefly. Before that, however, there is one point to note. The incidence algebra and the Möbius function themselves can be defined on a locally finite poset. However, the one-variable zeta transform $\displaystyle g(x) = \sum_{y \leq x} f(y)$ appearing in the theorem below is defined as written only when this sum is finite. Local finiteness of a poset assumes only that each interval $\lbrack x, y \rbrack$ is finite, so the set $\{ y \in P : y \leq x \}$ over which the sum runs need not itself be finite. For example, $\mathbb{Z}$ is locally finite, but

\{ y \in \mathbb{Z} : y \leq x \}

is infinite for every $x \in \mathbb{Z}$ .

The set $2^{E}$ ultimately used in this note, motivated by coding theory, is finite when $E$ is finite. So, to keep the exposition simple, I now restrict to finite posets.

Theorem 8.1 (Möbius inversion).

Let $P$ be a finite poset. Suppose that functions $f$ and $g$ on $P$ satisfy

g(x) = \sum_{y \in P, \, y \leq x} f(y).

Then, for every $x \in P$ ,

f(x) = \sum_{y \in P, \, y \leq x} \mu(y, x) g(y)

holds, where $\mu$ is the Möbius function of $P$ .

Proof

By assumption, $\displaystyle g(y) = \sum_{\substack{z \in P \\ z \leq y}} f(z)$ . Therefore

\begin{aligned} \sum_{y \in P, \, y \leq x} \mu(y, x)g(y) &= \sum_{y \in P, \, y \leq x} \mu(y, x) \sum_{z \in P, \, z \leq y} f(z)\\ &= \sum_{z \in P, \, z \leq x} f(z) \sum_{y \in \lbrack z, x \rbrack} \mu(y, x). \end{aligned}

Here

\sum_{y \in \lbrack z, x \rbrack} \mu(y, x) = (\zeta \ast \mu)(z, x) = \delta(z, x).

Hence the expression above becomes

\sum_{z \in P, \, z \leq x} f(z) \delta(z, x) = f(x).

This theorem includes ordinary finite-difference formulas and the inclusion–exclusion principle in a single form. Let us now look at a few examples.

Example 8.2 (Chains give finite differences).

Consider the chain $0 < 1 < 2 < \dots < m$ . If

g(i) = \sum_{j = 0}^{i} f(j),

then

f(0) = g(0), \qquad f(i) = g(i) - g(i-1) \quad (i \geq 1).

Thus, in this case Möbius inversion is just the finite difference of a cumulative sum.

The Möbius function in this case is

\mu(i, j) = \begin{cases} 0, & j < i,\\ 1, & j = i,\\ -1, & j = i + 1,\\ 0, & j \geq i + 2. \end{cases}

Indeed, one recovers $f(i)$ simply by subtracting $g(i-1)$ from $g(i)$ , so only the difference with the previous term remains.

Example 8.3 (On the divisibility poset, one recovers the number-theoretic Möbius function).

Turn the set $D(N)$ of divisors of a positive integer $N$ into a poset using divisibility. Then, for $n \in D(N)$ ,

g(n) = \sum_{d \mid n} f(d)

is a cumulative sum of the form considered here.

Möbius inversion then becomes, for $n \in D(N)$ ,

f(n) = \sum_{d \mid n} \mu_{\mathbb{Z}}(d) g\left(\frac{n}{d}\right),

which is the familiar formula from elementary number theory. Here $\mu_{\mathbb{Z}}$ is the number-theoretic Möbius function.

Its relation with the Möbius function of the poset is

\mu_{D(N)}(a, b) = \mu_{\mathbb{Z}} \left(\frac{b}{a}\right) \qquad (a, b \in D(N),\ a \mid b).

In other words, the number-theoretic Möbius function is a special case of the Möbius function of the divisibility poset.

This example shows that Möbius inversion is not merely a tool from coding theory, but a universal inversion principle which also arises naturally in number theory. What we eventually use for the MacWilliams identity is the Boolean lattice introduced in Definition 3.3. Its Möbius function is especially simple.

Theorem 8.4.

The Möbius function of the Boolean lattice $2^{E}$ is given, for $S \subseteq T$ , by

\mu(S, T) = (-1)^{\card{T} - \card{S}}.

Proof

If $S = T$ , then the right-hand side is $1$ , which agrees with $\mu(S, S) = 1$ .

Assume $S \subsetneq T$ . It is enough to verify the recursion for the Möbius function. What we need to show is

\sum_{U \in \lbrack S, T \rbrack}(-1)^{\card{U} - \card{S}} = 0.

Here each $U$ can be written uniquely in the form

U = S \cup A, \qquad A \subseteq T \setminus S,

so the left-hand side becomes

\sum_{A \subseteq T \setminus S} (-1)^{\card{A}} = (1 - 1)^{\card{T\setminus S}} = 0.

In the last equality, we used the fact that $\card{T \setminus S} > 0$ because $S \subsetneq T$ .

Therefore $(-1)^{\card{T} - \card{S}}$ satisfies the recursion for the Möbius function. Since the Möbius function is uniquely determined, the conclusion follows.

Hence Möbius inversion on the Boolean lattice takes the following form.

Corollary 8.5 (Möbius inversion on the Boolean lattice).

Suppose functions $f, g$ on $2^{E}$ satisfy

g(S) = \sum_{T \subseteq S} f(T).

Then

f(S) = \sum_{T \subseteq S} (-1)^{\card{S} - \card{T}} g(T)

holds.

This is exactly the inclusion–exclusion principle. The important point, however, is not to remember inclusion–exclusion as an isolated formula, but to view it as one example of Möbius inversion on a poset.

Example 8.6 (The Möbius function of the subspace lattice).

Let $V$ be a finite-dimensional vector space over $\mathbb{F}_{q}$ , and consider the lattice of subspaces $L(V)$ . For subspaces $U \subseteq W$ , put

r \coloneqq \dim W - \dim U.

Then

\mu(U, W) = (-1)^{r} q^{\binom{r}{2}}.

Indeed, the interval $\lbrack U, W \rbrack$ can be identified with the lattice of subspaces of the quotient space $W/U$ , so the value depends only on $r$ . This formula can be derived from identities for Gaussian binomial coefficients.

Example 8.7 (The Möbius function of the partition lattice).

Consider the partition lattice $\Pi(E)$ of a finite set $E$ under the refinement order. Let $\pi \leq \sigma$ , and for each block $B$ of $\sigma$ , let $k_{B}$ denote the number of blocks of $\pi$ contained in $B$ . Then

\mu(\pi, \sigma) = \prod_{B \in \sigma} (-1)^{k_{B}-1}(k_{B}-1)!

holds. In particular, if $\hat{0}$ is the discrete partition and $\hat{1}$ is the one-block partition, then

\mu(\hat{0}, \hat{1}) = (-1)^{\card{E}-1}(\card{E}-1)!.

This is a more advanced example, but it vividly illustrates how rich the combinatorial content of the Möbius function can be.

Translating into coding theory: support exactly equal to, and support contained in

From here on, I apply Möbius inversion to the MacWilliams identity.

For simplicity, let the coordinate set be $E = \{ 1, 2, \dots, n \}$ , and let $C \leq \mathbb{F}_{q}^{E}$ be a linear code. For a codeword $c = (c_{1}, \dots, c_{n})$ , define its support to be the set of coordinate positions of its non-zero components:

\supp(c)=\{ i \in E:c_{i}\neq 0 \}.

Then the Hamming weight is the cardinality of the support:

\wt(c) = \card{\supp(c)}.

The weight enumerator is

W_{C}(X, Y) = \sum_{c \in C} X^{n-\wt(c)} Y^{\wt(c)}.

To decompose this by support, define the following numbers.

Definition 9.1.

For $S \subseteq E$ , define

A_{C}(S) = \card{\{ c \in C:\supp(c) = S \}}.

This is the number of codewords whose support is exactly $S$ .

Using this notation, the weight enumerator can be written as

W_{C}(X,Y) = \sum_{S \subseteq E} A_{C}(S) X^{n - \card{S}} Y^{\card{S}}. \tag{9.1}

However, from the linear-algebraic point of view, $A_{C}(S)$ is somewhat awkward to handle. The reason is that the condition “support is exactly $S$ ” contains the conditions

if $i \in S$ , then $c_{i} \neq 0$ ,
if $i \notin S$ , then $c_{i} = 0$ .

Being non-zero is not a linear condition.

By contrast, the condition “the support is contained in $S$ ” is linear. So, for $S \subseteq E$ , consider the following subcode.

Definition 9.2.

Define

C(S) \coloneqq \{ c \in C : \supp(c) \subseteq S \}.

This is the subcode of $C$ consisting of all codewords whose support is contained in $S$ .

Definition 9.3.

For $S \subseteq E$ , define

B_{C}(S) \coloneqq \card{C(S)}.

This is the number of codewords whose support is contained in $S$ .

The relation between these two numbers is exactly the zeta transform on the Boolean lattice.

Theorem 9.4.

For every $S \subseteq E$ ,

B_{C}(S) = \sum_{T \subseteq S} A_{C}(T)

holds.

Proof

$B_{C}(S)$ is the number of codewords whose support is contained in $S$ . The support of such a codeword is uniquely determined as some subset $T \subseteq S$ . The number of codewords whose support is exactly $T$ is $A_{C}(T)$ , so summing over all $T \subseteq S$ gives

B_{C}(S) = \sum_{T \subseteq S} A_{C}(T).

Therefore Möbius inversion gives the following.

Corollary 9.5.

For every $S \subseteq E$ ,

A_{C}(S) = \sum_{T \subseteq S} (-1)^{\card{S} - \card{T}} B_{C}(T) = \sum_{T \subseteq S} (-1)^{\card{S} - \card{T}} \card{C(T)}

holds.

The structure visible here is not specific to coding theory. On the poset $2^{E}$ , one is simply using Möbius inversion to recover the original function from the cumulative one

exactly $T$ → contained in $S$ .

Rewriting the weight enumerator via Möbius inversion

Substituting the Möbius inversion from the previous section into the weight enumerator gives a convenient form. This is a calculation which will be used repeatedly later in the proof.

Theorem 10.1.

Let $C \leq \mathbb{F}_{q}^{E}$ be a linear code. Then

W_{C}(X,Y) = \sum_{T \subseteq E} B_{C}(T) Y^{\card{T}}(X - Y)^{n - \card{T}}

holds.

Proof

Writing the weight enumerator in terms of the number of codewords with support exactly $S$ , we obtain

W_{C}(X, Y) = \sum_{S \subseteq E} A_{C}(S) X^{n-\card{S}} Y^{\card{S}}.

Substituting

A_{C}(S) = \sum_{T \subseteq S}(-1)^{\card{S}-\card{T}} B_{C}(T)

into this gives

\begin{aligned} W_{C}(X,Y) &= \sum_{S \subseteq E} \sum_{T \subseteq S} (-1)^{\card{S} - \card{T}} B_{C}(T) X^{n - \card{S}} Y^{\card{S}}\\ &= \sum_{T \subseteq E} B_{C}(T) \sum_{S \subseteq E, \, T \subseteq S} (-1)^{\card{S}-\card{T}} X^{n - \card{S}}Y^{\card{S}}. \end{aligned}

Since $S$ can be written uniquely as $S = T \cup A$ with $A \subseteq E \setminus T$ , the inner sum becomes

\begin{aligned} \sum_{A \subseteq E \setminus T} (-1)^{\card{A}} X^{n - \card{T} - \card{A}} Y^{\card{T} + \card{A}} &= Y^{\card{T}} \sum_{A \subseteq E \setminus T} X^{n - \card{T} - \card{A}} (-Y)^{\card{A}}\\ &= Y^{\card{T}} (X - Y)^{n - \card{T}}. \end{aligned}

Hence the claim follows.

This formula illustrates well the role played by Möbius inversion in the proof of the MacWilliams identity. The variable $X - Y$ comes from the Möbius function $(-1)^{\card{S} - \card{T}}$ of the Boolean lattice. Meanwhile, $qY$ comes from the size formula for orthogonal complements over finite fields, $\card{U}\card{U^\perp}=q^{\card{S}}$ .

Dual codes, puncturing, and shortening

At this point we know that, via Möbius inversion, the weight enumerator can be written using

B_{C}(S) = \card{C(S)}.

The remaining task is to express $B_{C^{\perp}}(S)$ for $C^{\perp}$ in terms of $B_{C}(S)$ for $C$ .

For this purpose, I introduce the notation for puncturing and shortening.

Definition 11.1.

For $X \subseteq E$ , define

C^{X} \coloneqq \{ c|_{E \setminus X} : c \in C \} \leq \mathbb{F}_{q}^{E \setminus X}

and call it the punctured code obtained by deleting the coordinates in $X$ . Also define

C_{X} \coloneqq \{ c|_{E \setminus X} : c \in C,\ \supp(c) \cap X = \emptyset \} \leq \mathbb{F}_{q}^{E \setminus X}

and call it the shortened code on the coordinates in $X$ .

Note that both the superscript and the subscript indicate the set of coordinates being deleted. Accordingly, the puncturing which keeps the coordinates in $S$ is $C^{E \setminus S}$ , and $C(S)$ is naturally identified, by restriction to $S$ , with $C_{E \setminus S}$ . In other words,

\card{C_{E \setminus S}} = \card{C(S)} = B_{C}(S).

Similarly, $C^{\perp}(S)$ is naturally identified with $(C^{\perp})_{E \setminus S}$ .

Theorem 11.2.

For every $S \subseteq E$ ,

B_{C^{\perp}}(S) = \card{(C^{\perp})(S)} = \frac{q^{\card{S}}}{\card{C}} B_{C}(E \setminus S)

holds.

Proof

The left-hand equality is simply the definition of $B_{C^{\perp}}(S)$ . So it is enough to prove the equality on the right.

Fix $S \subseteq E$ . Consider the restriction map to the coordinates in $S$ ,

\rho_{S} \colon C \to \mathbb{F}_{q}^{S}, \qquad c \mapsto c|_{S}.

By the definition of puncturing,

\Im(\rho_{S}) = C^{E \setminus S}.

Also, the kernel of $\rho_S$ consists of the codewords whose coordinates on $S$ are all zero, so

\Ker(\rho_{S}) = \{ c \in C : \supp(c) \subseteq E \setminus S \} = C(E \setminus S).

Hence, by the first isomorphism theorem,

\card{C^{E \setminus S}} = \card{\Im(\rho_S)} = \frac{\card{C}}{\card{\Ker(\rho_S)}} = \frac{\card{C}}{\card{C(E \setminus S)}} = \frac{\card{C}}{B_C(E \setminus S)}. \tag{11.1}

Next, let us study the relation between $(C^{\perp})(S)$ and the orthogonal complement of $C^{E\setminus S}$ . Equip $\mathbb{F}_{q}^{S}$ with the usual inner product

u \cdot v = \sum_{i \in S} u_i v_i.

For $u \in \mathbb{F}_{q}^{S}$ , write $\widetilde{u} \in \mathbb{F}_{q}^{E}$ for the vector obtained by extending $u$ by zeros on $E$ . That is, define

\widetilde{u}_i = \begin{cases} u_i, & i \in S,\\ 0, & i \in E \setminus S. \end{cases}

Then

\{ u \in \mathbb{F}_{q}^{S} : \widetilde{u} \in (C^{\perp})(S) \} = \left(C^{E \setminus S}\right)^{\perp} \tag{11.2}

holds. Indeed, for $u \in \mathbb{F}_{q}^{S}$ ,

\begin{aligned} u \in \left(C^{E \setminus S}\right)^{\perp} &\Longleftrightarrow u \cdot v = 0 \quad \text{for all } v \in C^{E \setminus S} \\ &\Longleftrightarrow u \cdot (c|_S) = 0 \quad \text{for all } c \in C \\ &\Longleftrightarrow \widetilde{u} \cdot c = 0 \quad \text{for all } c \in C \\ &\Longleftrightarrow \widetilde{u} \in C^{\perp}. \end{aligned}

Since $\widetilde{u}$ has all coordinates in $E \setminus S$ equal to zero, the condition $\widetilde{u} \in C^{\perp}$ is equivalent to $\widetilde{u} \in (C^{\perp})(S)$ . This proves (11.2).

The zero-extension map

\mathbb{F}_{q}^{S} \to \mathbb{F}_{q}^{E}, \qquad u \mapsto \widetilde{u}

is a linear isomorphism between $\mathbb{F}_{q}^{S}$ and the set of vectors in $\mathbb{F}_{q}^{E}$ whose support is contained in $S$ . Therefore, by (11.2),

\card{(C^{\perp})(S)} = \card{\left(C^{E \setminus S}\right)^{\perp}}.

In a finite-dimensional vector space, for every subspace $U \leq \mathbb{F}_{q}^{S}$ , one has

\card{U}\,\card{U^{\perp}} = q^{\card{S}}.

This follows from $\dim U + \dim U^{\perp} = \card{S}$ and $\card{U}=q^{\dim U}$ . Applying this to $U = C^{E \setminus S}$ yields

\card{(C^{\perp})(S)} = \card{\left(C^{E \setminus S}\right)^{\perp}} = \frac{q^{\card{S}}}{\card{C^{E \setminus S}}}.

Substituting $\card{C^{E \setminus S}} = \card{\Im(\rho_S)} = \frac{\card{C}}{\card{\Ker(\rho_S)}} = \frac{\card{C}}{\card{C(E \setminus S)}} = \frac{\card{C}}{B_C(E \setminus S)}. \tag{11.1}$ (11.1) into this gives

\card{(C^{\perp})(S)} = \frac{q^{\card{S}}}{\card{C}} B_C(E \setminus S),

which is the claim.

This theorem is the linear-algebraic core of the proof in this note. Möbius inversion moves between “support is contained in” and “support is exactly”, and the theorem above links $C$ with $C^{\perp}$ . In the language of coding theory, what is used here is the relation $C^{E \setminus S}$ together with $C(S) \cong C_{E \setminus S}$ . So the computation in the next section uses the relation between puncturing and shortening in a linear-algebraic way.

Proof of the MacWilliams identity

We are now ready to prove the MacWilliams identity. One only has to combine the ingredients prepared so far. The overall combinatorial proof based on puncturing, shortening, and supports is close to that in Brualdi–Pless–Beissinger [BPB88] and Goldwasser [Gol97].

First, applying Theorem 10.1 to $C^{\perp}$ gives

W_{C^{\perp}}(X, Y) = \sum_{S \subseteq E} B_{C^{\perp}}(S) Y^{\card{S}} (X - Y)^{n - \card{S}}.

By Theorem 11.2,

B_{C^{\perp}}(S) = \frac{q^{\card{S}}}{\card{C}} B_{C}(E \setminus S),

\begin{aligned} W_{C^{\perp}}(X, Y) &= \frac{1}{\card{C}} \sum_{S \subseteq E} q^{\card{S}} B_{C}(E \setminus S) Y^{\card{S}} (X - Y)^{n - \card{S}}. \end{aligned}

Now put $J = E \setminus S$ . Then $\card{S} = n - \card{J}$ , so

\begin{aligned} W_{C^{\perp}}(X, Y) &= \frac{1}{\card{C}} \sum_{J \subseteq E} (qY)^{n-\card{J}} B_{C}(J) (X - Y)^{\card{J}}. \tag{12.1} \end{aligned}

Next, substitute

B_{C}(J) = \sum_{U \subseteq J} A_{C}(U).

Then the right-hand side of (12.1) becomes

\begin{aligned} &\frac{1}{\card{C}} \sum_{J \subseteq E} (qY)^{n - \card{J}}(X - Y)^{\card{J}} \sum_{U \subseteq J} A_{C}(U)\\ &= \frac{1}{\card{C}} \sum_{U \subseteq E} A_{C}(U) \sum_{J \subseteq E: U \subseteq J} (qY)^{n - \card{J}} (X - Y)^{\card{J}}. \end{aligned}

Writing $J = U \cup A$ with $A \subseteq E \setminus U$ , the inner sum becomes

\begin{aligned} &\sum_{A \subseteq E \setminus U} (qY)^{n-\card{U}-\card{A}} (X - Y)^{\card{U} + \card{A}} \\ &\quad= (X - Y)^{\card{U}} \sum_{A \subseteq E \setminus U} (qY)^{n-\card{U}-\card{A}}(X - Y)^{\card{A}}\\ &\quad= (X - Y)^{\card{U}} \bigl(qY + X - Y \bigr)^{n - \card{U}}\\ &\quad= (X - Y)^{\card{U}} \bigl(X + (q - 1)Y \bigr)^{n - \card{U}}. \end{aligned}

Therefore

\begin{aligned} W_{C^{\perp}}(X,Y) &= \frac{1}{\card{C}} \sum_{U \subseteq E} A_{C}(U) \bigl(X + (q - 1)Y \bigr)^{n - \card{U}}(X - Y)^{\card{U}} \\ &= \frac{1}{\card{C}} W_{C}\bigl( X + (q - 1)Y, X - Y \bigr). \end{aligned}

This proves the MacWilliams identity.

What Möbius inversion was doing in this proof

Let us summarise the role played by Möbius inversion in the proof.

From the coding-theoretic point of view, there were two kinds of numbers:

$A_{C}(S)$ : the number of codewords whose support is exactly $S$ ,
$B_{C}(S)$ : the number of codewords whose support is contained in $S$ .

These are related by

B_{C}(S) = \sum_{T \subseteq S} A_{C}(T),

which is the zeta transform on the Boolean lattice. Hence, in the reverse direction, one has

A_{C}(S) = \sum_{T \subseteq S}(-1)^{\card{S} - \card{T}} B_{C}(T).

This is Möbius inversion on the Boolean lattice.

The important point is that $B_{C}(S)$ is more compatible with linear algebra than $A_{C}(S)$ . The condition “support is exactly $S$ ” involves a non-zero condition, so it does not define a linear subspace. By contrast, the set of codewords whose support is contained in $S$ is the subcode

C(S) = \{ c \in C : \supp(c) \subseteq S \}.

Thus, in this proof, the quantity one wants to count, namely $A_{C}(S)$ , is first replaced by the linear-algebraically manageable quantity $B_{C}(S)=\card{C(S)}$ , and only at the end recovered by Möbius inversion. This viewpoint is common in many situations where Möbius inversion is used.

Concepts covered in this note

In this note, with the proof of the MacWilliams identity as our target, we introduced the basic tools of Möbius inversion. To summarise:

Poset

A set equipped with an order satisfying reflexivity, antisymmetry, and transitivity. Not every pair of elements has to be comparable.

Lattice

A poset in which every pair of elements has a join and a meet. On the set $2^{E}$ of all subsets, the join is union and the meet is intersection.

Interval

For a poset $P$ , and $x, y \in P$ with $x \leq y$ , the set of all elements between them,

\lbrack x, y \rbrack = \{ z \in P : x \leq z \leq y\},

is called the interval from $x$ to $y$ .

Local finiteness

The property that every interval is finite. In Möbius inversion, this condition arises naturally because one considers finite sums over intervals.

Incidence algebra

The algebra of functions on $P \times P$ which vanish whenever $x \nleq y$ . The multiplication is defined by convolution over all intermediate elements.

Zeta function

The incidence-algebra element which returns $1$ when $x \leq y$ and $0$ otherwise. It represents cumulative summation on a poset.

Möbius function

The inverse of the zeta function in the incidence algebra. It supplies the coefficients for recovering the original function from a cumulative sum.

Möbius inversion

The inversion formula which turns

g(x)=\sum_{y \in P, \, y \leq x}f(y)

into

f(x) = \sum_{y \in P, \, y \leq x}\mu(y,x)g(y).

Boolean lattice

The lattice obtained by ordering the set $2^{E}$ of all subsets by inclusion. Its Möbius function is

\mu(S, T) = (-1)^{\card{T} - \card{S}}.

In the proof of the MacWilliams identity, Möbius inversion on the Boolean lattice was used. However, the Boolean lattice is only one example of Möbius inversion. On chains one gets finite differences; on divisor posets one gets the number-theoretic Möbius inversion; on Boolean lattices one gets the inclusion–exclusion principle. In this sense, Möbius inversion is a general principle for recovering local information from cumulative information.

Where this proof sits among the five systems

As stated at the beginning, the proof in this note belongs to the family

Möbius inversion, lattice theory, and shortening/puncturing methods.

On the surface, it is a proof using the supports of codewords. At a deeper level, however, it uses the zeta transform and Möbius inversion on the Boolean lattice.

The key point of the proof may be summarised in the following sentence:

Recover the information that the support is exactly equal to something from the cumulative information that the support is contained in something.

This idea of recovering local information from cumulative information is the core of Möbius inversion.

More generally, there is also a framework for studying MacWilliams-type identities for weights arising from support maps taking values in lattices; see [Rav18].

Next time

Next time, I turn to characters of finite abelian groups, which are the tools closest to the most standard proof of the MacWilliams identity.

In the present proof, $C^{\perp}$ arose from the restriction map to coordinates and the dimension count for orthogonal complements. In the character-theoretic proof next time, $C^{\perp}$ will arise from the orthogonality relation

\sum_{c \in C}\chi(u\cdot c) = \begin{cases} \card{C}, & u \in C^{\perp},\\ 0, & u \notin C^{\perp}. \end{cases}

Even though the identity is the same MacWilliams identity, the main actor in the Möbius-inversion proof was the inclusion relation on supports. In the character-theoretic proof, the main actors are group duality and orthogonality relations. By seeing this difference, one should gradually begin to recognise how the MacWilliams identity lies at the intersection of several areas of mathematics.

Footnotes

Note that $x \nleq y$ does not necessarily mean $x > y$ . It also happens when $x$ and $y$ are incomparable (cf. Definition 2.2 below). ↩
Strictly speaking, one might call this a closed interval, but it seems standard simply to say interval. ↩
It is also referred to as the incidence ring. ↩

References

[Rot64] Gian-Carlo Rota. On the foundations of combinatorial theory. I. Theory of Möbius functions. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, vol. 2, pp. 340–368, 1964. doi:10.1007/BF00531932 ↩
[Sta12] Richard P. Stanley. Enumerative combinatorics. Volume 1. Cambridge University Press, Cambridge, vol. 49, pp. xiv+626, 2012 ↩
[DP02] B. A. Davey and H. A. Priestley. Introduction to lattices and order. Cambridge University Press, New York, pp. xii+298, 2002. doi:10.1017/CBO9780511809088 ↩
[BPB88] Richard A. Brualdi, Vera S. Pless, and Janet S. Beissinger. On the MacWilliams identities for linear codes. Linear Algebra Appl., vol. 107, pp. 181–189, 1988. doi:10.1016/0024-3795(88)90244-3 ↩
[Gol97] John L. Goldwasser. Shortened and punctured codes and the MacWilliams identities. Linear Algebra Appl., vol. 253, pp. 1–13, 1997. doi:10.1016/0024-3795(95)00671-0 ↩
[Rav18] Alberto Ravagnani. Duality of codes supported on regular lattices, with an application to enumerative combinatorics. Des. Codes Cryptogr., vol. 86, no. 9, pp. 2035–2063, 2018. doi:10.1007/s10623-017-0436-3 ↩

This series

A Series Learning through the MacWilliams Identity · Part 1 of 12

NextAn Introduction to Character Theory through the MacWilliams Identity

Back to series list

Disclaimer

Articles on this site are based on the operator's personal understanding, investigation, and research notes. I try to keep the content accurate, but it may contain errors or incomplete explanations. I do not guarantee its accuracy, completeness, usefulness, or currentness.

Please use the information on this site at your own judgment and responsibility. To the extent permitted by law, the operator is not liable for damages, losses, or disadvantages arising from using, or being unable to use, information on this site.

If you notice an error, unclear explanation, broken link, or insufficient citation, please contact the operator. I will review the content and, when appropriate, correct, update, or remove it.

§1Introduction

§2What is a poset?

§3Lattices: joining two elements and taking what they have in common

§4Intervals and local finiteness

§5The zeta transform as cumulative sums

§6Incidence algebras

§7The zeta function and the Möbius function

§8Möbius inversion

§9Translating into coding theory: support exactly equal to, and support contained in

§10Rewriting the weight enumerator via Möbius inversion

§11Dual codes, puncturing, and shortening

§12Proof of the MacWilliams identity

§13What Möbius inversion was doing in this proof

§14Concepts covered in this note

§15Where this proof sits among the five systems

§16Next time

Footnotes

References

This series

Disclaimer

Introduction

What is a poset?

Lattices: joining two elements and taking what they have in common

Intervals and local finiteness

The zeta transform as cumulative sums

Incidence algebras

The zeta function and the Möbius function

Möbius inversion

Translating into coding theory: support exactly equal to, and support contained in

Rewriting the weight enumerator via Möbius inversion

Dual codes, puncturing, and shortening

Proof of the MacWilliams identity

What Möbius inversion was doing in this proof

Concepts covered in this note

Where this proof sits among the five systems

Next time