Back to resources

Article and note

A Series Learning through the MacWilliams IdentityPart 9 of 12

An Introduction to Factor Graphs and Partition Functions through the MacWilliams Identity

Among the five proof systems for the MacWilliams identity, this note focuses on the proof that appears through factor graphs and partition functions, and introduces local factors, constraint factors, partition functions, Tanner-graph-type factor graphs, local Fourier transforms, and dualisation.
Published:
Updated:
Reading time:
33 min (about 7,187 words)
Tagscoding theoryMacWilliams identityfactor graphspartition functionsnormal factor graphsFourier transformfinite fieldsweight enumeratorexpository note

Introduction

One of the fundamental theorems in coding theory is the MacWilliams identity. Let EE be the coordinate set, let nEn \coloneqq \card{E}, and, for a linear code CFqEC \leq \F_{q}^{E} over the finite field Fq\F_{q}, consider its dual code

C{uFqE:uc=0 for all cC},C^{\perp} \coloneqq \{ u \in \F_{q}^{E} : u \cdot c = 0 \text{ for all } c \in C \},

where

uc=eEuece.u \cdot c = \sum_{e \in E} u_{e} c_{e}.

For a codeword cFqEc \in \F_{q}^{E}, write its support and Hamming weight as

supp(c){eE:ce0},wt(c)supp(c)\supp(c) \coloneqq \{ e \in E : c_{e} \neq 0 \}, \qquad \wt(c) \coloneqq \card{\supp(c)}

respectively. Define the weight enumerator of the linear code CC by

WC(X,Y)=cCXnwt(c)Ywt(c).W_{C}(X, Y) = \sum_{c \in C} X^{n - \wt(c)} Y^{\wt(c)}.

The MacWilliams identity is the formula saying that the weight enumerator of the dual code can be computed from the weight enumerator of CC as follows:

WC(X,Y)=1CWC(X+(q1)Y,XY).W_{C^{\perp}}(X, Y) = \frac{1}{\card{C}} W_{C}\bigl( X + (q - 1)Y, X - Y \bigr).

This is the MacWilliams identity.

In this note, we prove this MacWilliams identity in the language of factor graphs and partition functions. The note introduces the necessary notation and concepts in the text so that it can be read on its own. The only prerequisites are the basics of linear codes over finite fields, Hamming weight, and dual codes. More concretely, we assume familiarity with the following level of material:

We assume basic familiarity with elementary coding theory, namely with

  • what a (finite) field is,

  • what a linear code over a finite field is,

  • what the Hamming weight is,

  • what the dual code is.

(It is not necessary to know a proof of the MacWilliams identity.)

This note does not presuppose Parts 1–8. In the series as a whole, we compare several proofs of the MacWilliams identity, but the classification across the whole series is not needed in order to read the proof in this note. Parity-check matrices, Tanner graphs, factor graphs, partition functions, and Fourier transforms are explained in the text to the extent needed. Even if you have not read the other parts of the series, the necessary notation and calculations are introduced in order so that you can follow the proof of the MacWilliams identity itself. The aim of this note is to use a proof of the MacWilliams identity as a guide to an introduction to factor graphs and partition functions.

A factor graph is a tool for representing a large function depending on many variables as a product of small factors, each depending on only a few variables, and for visualising those dependencies by a graph. A partition function is obtained by summing that product over all values of all variables. In statistical physics it appears as a total weight over all states, and in coding theory it appears as a total weight over words satisfying constraints.

When a linear code CFqEC \leq \F_{q}^{E} is written by a parity-check matrix HH, CC is the set of all words x=(xe)eEFqEx = (x_{e})_{e \in E} \in \F_{q}^{E} satisfying Hx=0Hx^{\top} = 0. This condition decomposes into local constraints, one for each check equation. Then the weight enumerator can be represented as

the partition function obtained by multiplying the coordinate weights for each word xx satisfying all check equations, and then summing over all such words.

When each local constraint in this partition function is Fourier-expanded, the dual code CC^{\perp} appears naturally. This is the MacWilliams identity as seen from factor graphs and partition functions.

The flow of the note is as follows.

partition functions → factor graphs → constraint factors → Tanner-graph-type factor graphs → local Fourier transform → dualisation → the MacWilliams identity

As a standard introduction to factor graphs and the sum–product algorithm, see Kschischang–Frey–Loeliger [KFL01]. For Forney-style normal factor graphs, Loeliger [Loe04] is readable, and Forney [For01] is a basic reference for normal realisations of codes on graphs. For the relation between partition-function duality for normal factor graphs and the MacWilliams identity, see Forney [For11]. In this note, we do not develop all this general theory, but instead extract only the part needed for the ordinary Hamming-weight MacWilliams identity for linear codes over finite fields.

Partition Functions: Weighted Sums with Constraints

We begin with the term partition function. Rather than entering into its physical interpretation, we use it here as a particular kind of finite sum. The word weight is not restricted here to probabilities or positive real numbers: polynomial-valued and complex-valued weights will be treated in the same way. Following the notation often used in statistical physics, we denote it by Z\Zpart. Every partition function in this note is a finite sum. Consequently, we may later change the order of summation, or decompose a sum over a finite direct product into a product of coordinatewise sums, without any analytic issue.

Consider finitely many variables x1,x2,,xmx_{1}, x_{2}, \dots, x_{m}. Assume that each variable xix_{i} takes values in a finite set Xi\mathcal{X}_{i}. The total state space is

X1×X2××Xm.\mathcal{X}_{1} \times \mathcal{X}_{2} \times \dots \times \mathcal{X}_{m}.

If a weight F(x)F(x) is assigned to each state x=(x1,,xm)x = (x_{1}, \dots, x_{m}), consider the total sum

Zx1X1x2X2xmXmF(x1,,xm).\Zpart \coloneqq \sum_{x_{1} \in \mathcal{X}_{1}} \sum_{x_{2} \in \mathcal{X}_{2}} \dots \sum_{x_{m} \in \mathcal{X}_{m}} F(x_{1}, \dots, x_{m}).

In this note, we call such a quantity a partition function.

This definition alone may make a partition function look like just a finite sum. The important point is that, in many cases, the weight F(x)F(x) can be written as a product of small factors. For example, suppose that it can be written as

F(x1,x2,x3,x4)=f12(x1,x2)f23(x2,x3)f34(x3,x4).F(x_{1}, x_{2}, x_{3}, x_{4}) = f_{12}(x_{1}, x_{2}) f_{23}(x_{2}, x_{3}) f_{34}(x_{3}, x_{4}).

Then the partition function is

Z=x1,x2,x3,x4f12(x1,x2)f23(x2,x3)f34(x3,x4).\Zpart = \sum_{x_{1}, x_{2}, x_{3}, x_{4}} f_{12}(x_{1}, x_{2}) f_{23}(x_{2}, x_{3}) f_{34}(x_{3}, x_{4}).

This expression is not merely the sum of an arbitrary function of four variables, but the sum of a product of local factors, each depending only on neighbouring variables. A factor graph is a graph representation of this “local dependency relation”.

Partition functions also appear naturally in coding theory. For example, let CFqEC \leq \F_{q}^{E} be a linear code, and assign a weight λ(a)\lambda(a) to each coordinate value aFqa \in \F_{q}. Then

cCeEλ(ce)\sum_{c \in C} \prod_{e \in E} \lambda(c_{e})

is a weighted sum over codewords. This is exactly a partition function. If λ(0)=X\lambda(0) = X and λ(a)=Y\lambda(a) = Y (a0a \neq 0), then this partition function becomes the ordinary weight enumerator WC(X,Y)W_{C}(X, Y). In other words, a weight enumerator is a partition function on the codeword space.

Factor Graphs

A factor graph is a tool for representing a factorisation of a function as a graph. Here we define the most basic bipartite-graph type factor graph.

Definition 3.1 (Factor graph).

Let the variable set be II, and let the factor set be A\mathcal{A}. Suppose that a finite set Xi\mathcal{X}_{i} is assigned to each variable iIi \in I. Suppose that, for each factor αA\alpha \in \mathcal{A}, we are given both the set of variables on which the factor depends, αI\partial \alpha \subseteq I, and a function

fα ⁣:iαXiKf_{\alpha} \colon \prod_{i \in \partial\alpha} \mathcal{X}_{i} \to \mathcal{K}

1.

Represent this data by a bipartite graph consisting of variable vertices iIi \in I and factor vertices αA\alpha \in \mathcal{A}, and join the variable vertex ii and the factor vertex α\alpha by an edge when iαi \in \partial \alpha. This is called a factor graph.

The large function represented by the factor graph is

F(x)=αAfα(xα).F(x) = \prod_{\alpha \in \mathcal{A}} f_{\alpha}(x_{\partial\alpha}).

Here x=(xi)iIx = (x_{i})_{i \in I} is the value of all variables, and xαx_{\partial\alpha} is obtained by taking only the variables belonging to α\partial\alpha.

Define the partition function of this function by

ZxiIXiαAfα(xα).\Zpart \coloneqq \sum_{x \in \prod_{i \in I} \mathcal{X}_{i}} \prod_{\alpha \in \mathcal{A}} f_{\alpha}(x_{\partial\alpha}).

Thus a factor graph is a diagrammatic representation of

a function locally decomposed as a product, and the sum of that function over all states.

Example 3.2 (Three local factors).

Suppose that the variables are x1x_{1}, x2x_{2}, x3x_{3}, x4x_{4}, and that the factors are f12(x1,x2)f_{12}(x_{1}, x_{2}), f23(x2,x3)f_{23}(x_{2}, x_{3}), f34(x3,x4)f_{34}(x_{3}, x_{4}). Then the partition function is

Z=x1,x2,x3,x4f12(x1,x2)f23(x2,x3)f34(x3,x4).\Zpart = \sum_{x_{1}, x_{2}, x_{3}, x_{4}} f_{12}(x_{1}, x_{2}) f_{23}(x_{2}, x_{3}) f_{34}(x_{3}, x_{4}).

In the factor graph, f12f_{12} is connected only to x1x_{1}, x2x_{2}, f23f_{23} only to x2x_{2}, x3x_{3}, and f34f_{34} only to x3x_{3}, x4x_{4}.

Figure 1An example of a factor graph consisting of three local factors. Circles are variable nodes, and rectangles are factor nodes.

Figure 1 is the dependency relation in the expression

f12(x1,x2)f23(x2,x3)f34(x3,x4)f_{12}(x_{1}, x_{2})f_{23}(x_{2}, x_{3})f_{34}(x_{3}, x_{4})

drawn as a diagram.

As this example shows, a factor graph makes the structure of an expression visible. If the graph is a tree, the sum–product algorithm can compute the partition function and marginal sums efficiently. Even when the graph has cycles, factor graphs are useful as a language for describing dependency relations. In this note, we do not go deeply into the sum–product algorithm as a computational algorithm, and instead use factor graphs as a language for transforming partition functions and deriving the MacWilliams identity.

Representing Constraints as Factors

In factor graphs, not only weights but also constraints can be represented as factors. For this purpose, we use indicator functions.

Definition 4.1 (Indicator function).

For a set SS and a subset TST \subseteq S, define the indicator function of TT by

1T(x){1,xT,0,xT.\one_{T}(x) \coloneqq \begin{cases} 1, & x \in T,\\ 0, & x \notin T \end{cases}.

Over the finite field Fq\F_{q}, we write the indicator function of the zero element as

δ0(t){1,t=0,0,t0.\delta_{0}(t) \coloneqq \begin{cases} 1, & t = 0,\\ 0, & t \neq 0 \end{cases}.

That is, δ0=1{0}\delta_{0} = \one_{\{ 0 \}}. This is the finite-field version of the Kronecker delta.

For example, if we want to keep only the pairs satisfying the condition x+y=0x + y = 0, it is enough to multiply by the factor δ0(x+y)\delta_{0}(x + y). This factor is 11 when x+y=0x + y = 0, and 00 otherwise. Therefore, in the sum defining the partition function, contributions from states not satisfying the condition vanish. For example, over F2\F_{2}, we have the following.

xxyyδ0(x+y)\delta_{0}(x + y)
000011
001100
110000
111111

In other words, only 0000 and 1111 remain, and the contributions of 0101 and 1010 vanish.

Example 4.2 (One linear constraint).

Suppose that, for x1,x2,x3Fqx_{1}, x_{2}, x_{3} \in \F_{q}, we want to impose the constraint

a1x1+a2x2+a3x3=0.a_{1} x_{1} + a_{2} x_{2} + a_{3} x_{3} = 0.

This constraint can be represented by the factor

f(x1,x2,x3)=δ0(a1x1+a2x2+a3x3).f(x_{1}, x_{2}, x_{3}) = \delta_{0}(a_{1} x_{1} + a_{2} x_{2} + a_{3} x_{3}).

This factor is 11 only when the constraint is satisfied, and is 00 when it is not satisfied.

Using constraint factors, one can represent a linear code as a product of local constraints. This is the Tanner-graph-type factor graph in the next section.

Viewing a Code as a Tanner-Graph-Type Factor Graph

Let CFqEC \leq \F_{q}^{E} be a linear code. Set kdimCk \coloneqq \dim C. The dual code CC^{\perp} has dimension nkn - k. Choose a basis of CC^{\perp}, and let HH be the matrix obtained by arranging that basis as its rows. Since (C)=C(C^{\perp})^{\perp} = C, this HH gives

C={xFqE:Hx=0}.C = \{ x \in \F_{q}^{E} : Hx^{\top} = 0 \}.

Here Hx=0Hx^{\top} = 0 means that xx is orthogonal to every row of HH. Since the rows of HH are a basis of CC^{\perp}, this means that xx is orthogonal to every element of CC^{\perp}. Therefore x(C)=Cx \in (C^{\perp})^{\perp} = C, and conversely, if xCx \in C, then xx is orthogonal to every element of CC^{\perp}, so Hx=0Hx^{\top} = 0 holds. When writing the abstract coordinate set EE and the row set in matrix notation, one may think of having chosen an order on each of them. From now on we use indexed notation and write things in a form independent of that order.

Let the row set be JJ, and write H=(Hj,e)jJ,eEH = (H_{j, e})_{j \in J,\,e \in E}. Here J=nk\card{J} = n - k, and the rows of HH are a basis of CC^{\perp}. Then

C={xFqE:eEHj,exe=0 for all jJ}.C = \left\{ x \in \F_{q}^{E}: \sum_{e \in E} H_{j, e} x_{e} = 0 \text{ for all } j \in J \right\}.

Remark 5.1.

In general, one also uses parity-check matrices containing redundant check equations. In this note, in order that the map yyHy\mapsto yH appearing later runs through CC^{\perp} exactly once, we use an HH with no redundant rows, namely an HH whose rows form a basis of CC^{\perp}. Under this assumption, choices of yy and dual codewords u=yHu = yH correspond one-to-one, so later we do not have to consider extra multiplicities. Even when redundant rows are included, the conclusion is the same, but the map yyHy\mapsto yH is no longer one-to-one, so multiplicities must be handled. Also, even for the same code, the shape of the Tanner graph and the edge labels change depending on the choice of parity-check matrix. To keep the proof simple, in this note we fix one HH whose rows form a basis of CC^{\perp}.

In coding theory, this bipartite graph consisting of variable nodes corresponding to coordinates and check nodes corresponding to check equations is called a Tanner graph [Tan81]. The variable nodes correspond to coordinates eEe \in E, and the check nodes correspond to parity-check equations jJj \in J. We join the check node jj and the variable node ee by an edge exactly when Hj,e0H_{j, e} \neq 0. In the language of factor graphs, this is viewed as a Tanner graph whose check nodes carry zero-constraint factors. For a binary code, all non-zero coefficients are 11, so the check equations can largely be represented by the presence or absence of edges alone. Over a general Fq\F_{q}, however, the values of the non-zero coefficients Hj,eH_{j,e} also affect the check equations. Therefore, for a qq-ary Tanner graph, we regard the edge jjee as carrying the coefficient Hj,eH_{j, e} as a label.

For each jJj \in J, write

j={eE:Hj,e0}\partial j = \{ e \in E : H_{j, e} \neq 0 \}

for the set of coordinates connected to the check node jj, that is, the set of variables on which the check factor hjh_{j} depends. Here j\partial j is not a derivative, but a notation often used in factor graphs for a “neighbourhood” or “set of adjacent variables”. Then the check factor is

hj((xe)ej)=δ0(ejHj,exe).h_{j}((x_{e})_{e \in \partial j}) = \delta_{0} \left( \sum_{e \in \partial j} H_{j, e} x_{e} \right).

It is the same if the sum is written over all of EE, because the terms with Hj,e=0H_{j, e} = 0 vanish. In the factor graph, it is connected only to those variables xex_{e} for which eje \in \partial j.

For example, over F2\F_{2}, if

H=(11010110),H = \begin{pmatrix} 1 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{pmatrix},

then the two check factors are

h1(x1,x2,x4)=δ0(x1+x2+x4),h2(x2,x3)=δ0(x2+x3).h_{1}(x_{1}, x_{2}, x_{4}) = \delta_{0}(x_{1} + x_{2} + x_{4}), \qquad h_{2}(x_{2}, x_{3}) = \delta_{0}(x_{2} + x_{3}).

In this example, h1h_{1} is connected only to x1x_{1}, x2x_{2}, x4x_{4}, and h2h_{2} is connected only to x2x_{2}, x3x_{3}.

In addition, put a one-coordinate weight factor λ(xe)\lambda(x_{e}) at each coordinate eEe \in E. Here λ ⁣:FqK\lambda \colon \F_{q} \to \mathcal{K} is an arbitrary weight function. Later we specialise to λ(0)=X\lambda(0) = X and λ(a)=Y\lambda(a) = Y (a0a \neq 0). An ordinary Tanner graph consists of variable nodes and check nodes, but in the partition function considered here we additionally attach a one-variable weight factor λ(xe)\lambda(x_{e}) to each coordinate. Strictly speaking, we are considering the factor graph obtained by adding coordinate weight factors to the Tanner graph.

Figure 2A schematic factor graph obtained by adding one-coordinate weight factors to a Tanner graph. Edge labels represent the non-zero coefficients in the check equations.

In Figure 2, the rectangular check nodes h1h_{1}, h2h_{2} represent zero-constraint factors, and the small rectangles below or above represent one-coordinate weight factors.

The partition function of this factor graph is

ZC(λ)xFqE(jJδ0(eEHj,exe))eEλ(xe).(5.1)\begin{aligned} \Zpart_{C}(\lambda) &\coloneqq \sum_{x \in \F_{q}^{E}} \left( \prod_{j \in J} \delta_{0} \left( \sum_{e \in E} H_{j, e} x_{e} \right) \right) \prod_{e \in E}\lambda(x_{e}). \tag{5.1} \end{aligned}

The product of the constraint factors is 11 exactly when xCx \in C, and is 00 otherwise. Therefore

ZC(λ)=cCeEλ(ce).(5.2)\Zpart_{C}(\lambda) = \sum_{c \in C} \prod_{e \in E} \lambda(c_{e}). \tag{5.2}

Thus ZC(λ)\Zpart_{C}(\lambda) is a weighted sum over codewords. From now on, for any linear code DFqED \leq \F_{q}^{E}, we write

ZD(λ)=dDeEλ(de).\Zpart_{D}(\lambda) = \sum_{d \in D} \prod_{e \in E} \lambda(d_{e}).

The first definition (5.1) is a constrained sum using a parity-check matrix, and (5.2) confirms that it agrees with the weighted sum over codewords. With this notation, ZC(λ^)\Zpart_{C^{\perp}}(\what{\lambda}) appearing later means the partition function obtained by summing the product of the weights λ^\what{\lambda} over the dual codewords. In particular, if

λX,Y(a){X,a=0,Y,a0(5.3)\lambda_{X, Y}(a) \coloneqq \begin{cases} X, & a = 0,\\ Y, & a \neq 0 \end{cases} \tag{5.3}

then

ZC(λX,Y)=WC(X,Y).\Zpart_{C}(\lambda_{X, Y}) = W_{C}(X,Y).

In words, the discussion so far says the following.

A weight enumerator is the partition function of a Tanner-graph-type factor graph.

This observation is the starting point of this note.

One-Coordinate Fourier Transform

We now prepare the Fourier transform used to dualise local factors. Fix once and for all a non-trivial additive character ψ ⁣:FqC×\psi \colon \F_{q} \to \C^{\times} of the finite field Fq\F_{q}, that is, a map satisfying

ψ(a+b)=ψ(a)ψ(b)\psi(a + b) = \psi(a)\psi(b)

and not identically equal to 11. When q=pq = p, one may think of

ψ(a)=exp(2π1a/p).\psi(a) = \exp(2\pi\sqrt{-1}a/p).

Here aFpa \in \F_{p} is represented by one of 0,1,,p10, 1, \dots, p - 1. For general q=pmq = p^{m}, one can use the trace map and define

ψ(a)=exp(2π1pTrFq/Fp(a)).\psi(a) = \exp\left( \frac{2\pi\sqrt{-1}}{p} \Tr_{\F_{q}/\F_{p}}(a) \right).

The basic property we need is the following orthogonality relation.

Lemma 6.1.

For bFqb \in \F_{q},

aFqψ(ab)={q,b=0,0,b0\sum_{a \in \F_{q}} \psi(ab) = \begin{cases} q, & b = 0,\\ 0, & b \neq 0 \end{cases}

holds.

Proof

If b=0b = 0, every term is 11, and the sum is qq. Suppose b0b \neq 0. Then the map aaba \mapsto ab is a bijection of Fq\F_{q}, so

aFqψ(ab)=tFqψ(t).\sum_{a \in \F_{q}} \psi(ab) = \sum_{t \in \F_{q}} \psi(t).

Put the right-hand side equal to SS. Since ψ\psi is non-trivial, there exists dFqd \in \F_{q} such that ψ(d)1\psi(d) \neq 1. As tt runs through all of Fq\F_{q}, so does t+dt + d, and hence

S=tFqψ(t+d)=ψ(d)tFqψ(t)=ψ(d)S.S = \sum_{t \in \F_{q}} \psi(t + d) = \psi(d) \sum_{t \in \F_{q}} \psi(t) = \psi(d)S.

Since ψ(d)1\psi(d) \neq 1, we have S=0S = 0.

End of proof

For a one-coordinate weight function λ ⁣:FqK\lambda \colon \F_{q} \to \mathcal{K}, define its Fourier transform by

λ^(b)aFqλ(a)ψ(ab)(bFq).(6.1)\what{\lambda}(b) \coloneqq \sum_{a \in \F_{q}} \lambda(a)\psi(ab) \qquad (b \in \F_{q}). \tag{6.1}

In the Fourier transform, the values ψ(ab)\psi(ab) are complex numbers, so we enlarge the coefficient ring to a ring containing C\C as needed. For the Hamming weight enumerator, it is enough to work in C[X,Y]\C\lbrack X, Y \rbrack, and for the complete weight enumerator, in C[Ta:aFq]\C\lbrack T_{a} : a \in \F_{q} \rbrack, which appears later. Here the Fourier transform is defined in the unnormalised form. Therefore, a factor 1/q1/q appears in the expansion of the zero constraint. The coefficient 1/C1/\card{C^{\perp}} appearing later is obtained by multiplying this 1/q1/q once for each check equation.

Let us compute the Fourier transform of the weight

λX,Y(a)={X,a=0,Y,a0\lambda_{X,Y}(a) = \begin{cases} X, & a = 0,\\ Y, & a \neq 0 \end{cases}

appearing in the Hamming weight enumerator.

Lemma 6.2.

Let λX,Y\lambda_{X, Y} be defined by (5.3). Then

λ^X,Y(b)={X+(q1)Y,b=0,XY,b0\what{\lambda}_{X, Y}(b) = \begin{cases} X + (q - 1)Y, & b = 0,\\ X - Y, & b \neq 0 \end{cases}

holds.

Proof

When b=0b = 0,

λ^X,Y(0)=aFqλX,Y(a)=X+(q1)Y.\what{\lambda}_{X, Y}(0) = \sum_{a \in \F_{q}} \lambda_{X, Y}(a) = X + (q - 1)Y.

Next suppose b0b \neq 0. By Lemma 6.1,

aFqψ(ab)=0.\sum_{a \in \F_{q}} \psi(ab) = 0.

Hence

aFq×ψ(ab)=1.\sum_{a \in \F_{q}^{\times}} \psi(ab) = -1.

Therefore

λ^X,Y(b)=X+YaFq×ψ(ab)=XY.\what{\lambda}_{X, Y}(b) = X + Y\sum_{a \in \F_{q}^{\times}} \psi(ab) = X - Y.
End of proof

This local calculation is precisely the source of the change of variables in the MacWilliams identity,

XX+(q1)Y,YXY.X \mapsto X + (q - 1)Y, \qquad Y \mapsto X-Y.

In the language of factor graphs, it is the result of Fourier-transforming a one-coordinate weight factor.

Fourier Expansion of the Zero-Constraint Factor

Next, we Fourier-expand the zero constraint δ0(t)\delta_{0}(t) appearing in the check factors.

Lemma 7.1.

For any tFqt \in \F_{q},

δ0(t)=1qyFqψ(yt)(7.1)\delta_{0}(t) = \frac{1}{q}\sum_{y \in \F_{q}}\psi(yt) \tag{7.1}

holds.

Proof

If t=0t = 0, the right-hand side is q/q=1q/q = 1. If t0t \neq 0, then by Lemma 6.1,

yFqψ(yt)=0.\sum_{y \in \F_{q}} \psi(yt) = 0.

This agrees with the left-hand side δ0(t)=0\delta_{0}(t) = 0.

End of proof

This formula rewrites a constraint factor as a sum over a dual variable yy. In factor-graph terms, it says that when a check factor is Fourier-expanded, one dual variable appears at that check factor. This dual variable will later form a linear combination of the rows of the parity-check matrix and produce a codeword of the dual code CC^{\perp}.

Fourier-Expanding the Partition Function

Apply Lemma 7.1 to the partition function

ZC(λ)=xFqE(jJδ0(eEHj,exe))eEλ(xe)\Zpart_{C}(\lambda) = \sum_{x \in \F_{q}^{E}} \left( \prod_{j \in J} \delta_{0} \left(\sum_{e \in E} H_{j,e} x_{e} \right) \right) \prod_{e \in E}\lambda(x_{e})

of the Tanner-graph-type factor graph. Here we write the sum over all of EE, including terms with Hj,e=0H_{j, e} = 0. This is the same as summing over j\partial j, but it makes the later coordinatewise organisation easier to see. All sums are finite sums, so below we freely change the order of summation. Also, when a product separates coordinate by coordinate, we rewrite a sum over a finite direct product as a product of coordinatewise sums.

For each jJj \in J,

δ0(eEHj,exe)=1qyjFqψ(yjeEHj,exe).\delta_{0} \left(\sum_{e \in E} H_{j,e} x_{e} \right) = \frac{1}{q}\sum_{y_{j} \in \F_{q}} \psi\left(y_{j} \sum_{e \in E} H_{j, e} x_{e} \right).

Substituting this into all check equations gives

ZC(λ)=qJyFqJxFqEjJψ(yjeEHj,exe)eEλ(xe).(8.1)\begin{aligned} \Zpart_{C}(\lambda) &= q^{-\card{J}} \sum_{y \in \F_{q}^{J}} \sum_{x \in \F_{q}^{E}} \prod_{j \in J} \psi\left( y_{j} \sum_{e \in E} H_{j, e} x_{e} \right) \prod_{e\in E}\lambda(x_e). \tag{8.1} \end{aligned}

Let us pause to organise the roles of the notation. xex_{e} is the ee-coordinate of the original candidate codeword xFqEx \in \F_{q}^{E}. On the other hand, the capital letters XX, YY are formal variables used later in the Hamming weight enumerator, and are different from xex_{e} and from yjy_{j} appearing here. y=(yj)jJy = (y_{j})_{j \in J} is the dual variable corresponding to the check factors, and yjy_{j} is the coefficient appearing when check equation jj is Fourier-expanded. yjy_{j} is not a coordinate of a dual codeword. The actual dual codeword is obtained later as u=yHu = yH, and its ee-coordinate is

ue=(yH)e=jJyjHj,e.u_{e} = (yH)_{e} = \sum_{j \in J} y_{j} H_{j,e}.

Thus, in this calculation, the role of the notation moves from the original coordinate variable xex_{e}, through the coefficients yjy_{j} attached to check equations, to the coordinates ueu_{e} of a dual codeword.

\begin{tabular}{c|p{0.68\linewidth}} Notation & Role \\ \hline xex_{e} & the ee-coordinate of the original candidate codeword xx \\ yjy_{j} & the summation variable appearing when check equation jj is Fourier-expanded, or the coefficient used to linearly combine row jj of HH \\ ueu_{e} & the ee-coordinate of the dual codeword u=yHu = yH \\ X,YX, Y & formal variables in the Hamming weight enumerator \end{tabular}

By the property ψ(s+t)=ψ(s)ψ(t)\psi(s+t)=\psi(s)\psi(t) of an additive character, the inner character factors separate coordinate by coordinate. Indeed,

jJψ(yjeEHj,exe)=ψ(jJyjeEHj,exe)=ψ(eE(jJyjHj,e)xe)=eEψ((jJyjHj,e)xe).\begin{aligned} \prod_{j \in J} \psi\left(y_{j} \sum_{e \in E} H_{j, e} x_{e} \right) &= \psi\left(\sum_{j \in J} y_{j} \sum_{e \in E} H_{j, e} x_{e} \right) \\ &= \psi\left(\sum_{e \in E}\left(\sum_{j \in J} y_{j} H_{j, e} \right) x_{e} \right) \\ &= \prod_{e \in E} \psi\left(\left(\sum_{j \in J} y_{j} H_{j, e} \right) x_{e} \right). \end{aligned}

Here we are using the basic decomposition of a sum over a finite direct product. For one-variable functions geg_{e} for each ee, one has

xFqEeEge(xe)=eEaFqge(a).\sum_{x \in \F_{q}^{E}} \prod_{e \in E} g_{e}(x_{e}) = \prod_{e \in E} \sum_{a \in \F_{q}} g_{e}(a).

This is obtained by repeating the two-variable identity

x1,x2g1(x1)g2(x2)=(x1g1(x1))(x2g2(x2))\sum_{x_{1}, x_{2}} g_{1}(x_{1}) g_{2}(x_{2}) = \left(\sum_{x_{1}} g_{1}(x_{1}) \right) \left(\sum_{x_{2}} g_{2}(x_{2}) \right)

for the number of coordinates. Therefore the inner sum over xx in (8.1) is

xFqEeEλ(xe)ψ((jJyjHj,e)xe)=eEaFqλ(a)ψ((jJyjHj,e)a)=eEλ^(jJyjHj,e).\begin{aligned} &\sum_{x \in \F_{q}^{E}} \prod_{e \in E} \lambda(x_{e}) \psi\left(\left(\sum_{j \in J} y_{j} H_{j, e} \right) x_{e} \right) \\ &\qquad = \prod_{e \in E} \sum_{a \in \F_{q}} \lambda(a) \psi\left(\left(\sum_{j \in J} y_{j} H_{j, e} \right) a \right) \\ &\qquad = \prod_{e \in E} \what{\lambda}\left(\sum_{j \in J} y_{j} H_{j, e} \right). \end{aligned}

Define yHFqEyH \in \F_{q}^{E} by

(yH)ejJyjHj,e.(yH)_{e} \coloneqq \sum_{j \in J} y_{j} H_{j,e}.

This is a linear combination of the rows of HH. Thus the above calculation gives

ZC(λ)=qJyFqJeEλ^((yH)e).(8.2)\Zpart_{C}(\lambda) = q^{-\card{J}} \sum_{y \in \F_{q}^{J}} \prod_{e \in E} \what{\lambda}((yH)_{e}). \tag{8.2}

Now the rows of HH have been chosen as a basis of CC^{\perp}, so the map

FqJC,yyH\F_{q}^{J} \to C^{\perp}, \qquad y \mapsto yH

is a bijection. Also J=dimC=nk\card{J} = \dim C^{\perp} = n - k, and hence

qJ=C.q^{\card{J}} = \card{C^{\perp}}.

Therefore (8.2) becomes

ZC(λ)=1CuCeEλ^(ue).(8.3)\Zpart_{C}(\lambda) = \frac{1}{\card{C^{\perp}}} \sum_{u \in C^{\perp}} \prod_{e \in E} \what{\lambda}(u_{e}). \tag{8.3}

The right-hand side is the partition function over the dual code CC^{\perp}. In other words,

ZC(λ)=1CZC(λ^).(8.4)\Zpart_{C}(\lambda) = \frac{1}{\card{C^{\perp}}} \Zpart_{C^{\perp}}(\what{\lambda}). \tag{8.4}

This is the central formula in the direction obtained directly from the calculation. In other words, it writes the partition function of CC in terms of the partition function on the CC^{\perp} side. By contrast, the MacWilliams identity is usually written in the opposite direction: it expresses the weight enumerator of CC^{\perp} in terms of that of CC. To obtain that standard direction, we apply this general formula to CC^{\perp} and reverse the direction. In general, applying the same formula to a linear code DFqED \leq \F_{q}^{E} gives

ZD(λ)=1DZD(λ^).(8.5)\Zpart_{D}(\lambda) = \frac{1}{\card{D^{\perp}}} \Zpart_{D^{\perp}}(\what{\lambda}). \tag{8.5}

Now set DCD \coloneqq C^{\perp}. Since D=CD^{\perp} = C, we obtain

ZC(λ)=1CZC(λ^).(8.6)\Zpart_{C^{\perp}}(\lambda) = \frac{1}{\card{C}} \Zpart_{C}(\what{\lambda}). \tag{8.6}

The coefficient changes from 1/C1/\card{C^{\perp}} to 1/C1/\card{C} because, in the standard direction, we put D=CD = C^{\perp}, and the dual code DD^{\perp} in that case is CC.

In words, the directly obtained direction says the following.

More precisely, the partition function of the code CC is 1/C1/\card{C^{\perp}} times the partition function over the dual code CC^{\perp} using the Fourier-transformed one-coordinate weight λ^\what{\lambda}.

Reading This as Factor-Graph Duality

(8.4) is partition-function duality for linear codes over finite fields. Here we reread this formula in the language of factor graphs.

The original factor graph had the following two types of factors.

Check factors

For each jJj \in J, the factor representing the linear constraint

δ0(eEHj,exe).\delta_{0}\left(\sum_{e \in E} H_{j, e} x_{e} \right).
Weight factors

For each coordinate eEe \in E, the one-coordinate weight factor λ(xe)\lambda(x_{e}).

When a check factor is Fourier-expanded, a dual variable yjy_{j} appears for each check. Using the values of these dual variables, for each coordinate we obtain

(yH)e=jJyjHj,e.(yH)_{e} = \sum_{j \in J} y_{j} H_{j,e}.

This is a linear combination of the rows of the parity-check matrix HH. Therefore, as yy varies, yHyH runs through all of CC^{\perp}.

On the other hand, the sum over the coordinate variable xex_{e} locally becomes

aFqλ(a)ψ(a(yH)e)=λ^((yH)e).\sum_{a \in \F_{q}}\lambda(a)\psi(a(yH)_{e}) = \what{\lambda}((yH)_{e}).

Thus the coordinate weight factor λ\lambda is changed into the Fourier-transformed weight factor λ^\what{\lambda}.

In factor-graph terms, when the original check factors are Fourier-expanded, summation variables yjy_{j} corresponding to those check factors appear. These yjy_{j} are coefficients used to linearly combine the rows of HH and form u=yHu = yH. The coordinates of a dual codeword are the ueu_{e} indexed by eEe \in E, and not the yjy_{j}. Also, the original coordinate weight factor λ(xe)\lambda(x_{e}), after summing over xex_{e}, becomes the Fourier-transformed weight factor λ^(ue)\what{\lambda}(u_{e}).

In the general theory of normal factor graphs, one performs this simultaneously for each local factor. Depending on the convention, sign-inversion factors and normalisation constants appear, but roughly speaking, one Fourier-transforms each local factor and replaces edge variables by dual variables. The partition function of the factor graph on the dual side obtained in this way is related to the original partition function by an explicit constant factor. The calculation in this note is the specialisation of that general theory to a Tanner-graph-type factor graph.

Specialisation to the Hamming Weight Enumerator

We now return to the ordinary Hamming weight enumerator. Let the one-coordinate weight be λX,Y\lambda_{X, Y} from (5.3). Then, for any code DFqED \leq \F_{q}^{E},

ZD(λX,Y)=WD(X,Y).\Zpart_{D}(\lambda_{X, Y}) = W_{D}(X,Y).

Also, by Lemma 6.2, we had

λ^X,Y(b)={X+(q1)Y,b=0,XY,b0.\what{\lambda}_{X, Y}(b) = \begin{cases} X + (q - 1)Y, & b = 0,\\ X - Y, & b \neq 0 \end{cases}.

This λ^X,Y\what{\lambda}_{X, Y} is a one-coordinate weight which returns X+(q1)YX + (q - 1)Y when the input is 00, and returns XYX - Y when the input is non-zero. Therefore, multiplying this weight over CC and summing is the same as substituting X+(q1)YX + (q - 1)Y for XX, and XYX - Y for YY, in WC(X,Y)W_{C}(X, Y). Thus, substituting λ=λX,Y\lambda = \lambda_{X,Y} into (8.6), we obtain

WC(X,Y)=ZC(λX,Y)=1CZC(λ^X,Y)=1CWC(X+(q1)Y,XY).\begin{aligned} W_{C^{\perp}}(X, Y) &= \Zpart_{C^{\perp}}(\lambda_{X, Y}) \\ &= \frac{1}{\card{C}}\Zpart_{C}(\what{\lambda}_{X, Y}) \\ &= \frac{1}{\card{C}} W_{C} \bigl( X + (q - 1)Y, X - Y\bigr). \end{aligned}

That is,

WC(X,Y)=1CWC(X+(q1)Y,XY).W_{C^{\perp}}(X, Y) = \frac{1}{\card{C}} W_{C} \bigl( X + (q - 1)Y, X - Y \bigr).

This is precisely the MacWilliams identity.

The important point in this proof is that the change of variables

XX+(q1)Y,YXYX \mapsto X + (q - 1)Y, \qquad Y \mapsto X - Y

does not appear suddenly at the global level. It is the result of applying the one-coordinate Fourier transform to the weight factor λX,Y\lambda_{X, Y} at each coordinate. In other words, the MacWilliams transform is obtained by gluing together the local Fourier transforms at the coordinates of the factor graph across the whole graph.

A Small Example: The Binary Repetition Code

Finally, let us check the formula in a small example. Let E={1,2}E = \{ 1, 2 \}, and consider the binary repetition code

C={00,11}F22.C = \{ 00, 11 \} \leq \F_{2}^{2}.

This code is self-dual, that is, it satisfies C=CC^{\perp} = C. Indeed, the condition that u=(u1,u2)u = (u_{1}, u_{2}) be orthogonal to every codeword of CC is

u1+u2=0u_{1} + u_{2} = 0

and over F2\F_{2} this means that u1=u2u_{1} = u_{2}. Therefore C=CC^{\perp} = C.

We may take

H=(11)H = \begin{pmatrix} 1 & 1 \end{pmatrix}

as a parity-check matrix. The partition function is

ZC(λ)=x1,x2F2δ0(x1+x2)λ(x1)λ(x2)=λ(0)2+λ(1)2.\begin{aligned} \Zpart_{C}(\lambda) &= \sum_{x_{1}, x_{2} \in \F_{2}} \delta_{0}(x_{1} + x_{2})\lambda(x_{1})\lambda(x_{2}) \\ &= \lambda(0)^{2} + \lambda(1)^{2}. \end{aligned}

If λ(0)=X\lambda(0) = X and λ(1)=Y\lambda(1) = Y, then

WC(X,Y)=X2+Y2.W_{C}(X, Y) = X^{2} + Y^{2}.

The one-coordinate Fourier transform in the binary case is

λ^(0)=X+Y,λ^(1)=XY.\what{\lambda}(0) = X + Y, \qquad \what{\lambda}(1) = X - Y.

In this example too, the same local calculation as in the general case appears directly. Writing the additive character of F2\F_{2} as ψ(t)=(1)t\psi(t) = (-1)^{t}, we have

δ0(x1+x2)=12yF2(1)y(x1+x2).\delta_{0}(x_{1} + x_{2}) = \frac{1}{2}\sum_{y \in \F_{2}}(-1)^{y(x_{1} + x_{2})}.

Substituting this into the partition function, for fixed yF2y \in \F_{2} we get

x1,x2F2λ(x1)λ(x2)(1)yx1(1)yx2=(x1F2λ(x1)(1)yx1)(x2F2λ(x2)(1)yx2)=λ^(y)2.\begin{aligned} &\sum_{x_{1}, x_{2} \in \F_{2}} \lambda(x_{1}) \lambda(x_{2}) (-1)^{yx_{1}} (-1)^{yx_{2}} \\ &\qquad = \left(\sum_{x_{1} \in \F_{2}} \lambda(x_{1})(-1)^{yx_{1}} \right) \left(\sum_{x_{2} \in \F_{2}} \lambda(x_{2})(-1)^{yx_{2}} \right) \\ &\qquad = \what{\lambda}(y)^{2}. \end{aligned}

Thus, even in this small example, the general calculation appears: after Fourier-expanding the constraint factor and taking the coordinatewise sums, the one-coordinate weight factor changes into λ^\what{\lambda}. Formula (8.4) asserts, since C=CC = C^{\perp} and C=2\card{C^{\perp}} = 2, that

ZC(λ)=12ZC(λ^).\Zpart_{C}(\lambda) = \frac{1}{2} \Zpart_{C}(\what{\lambda}).

Computing the right-hand side gives

12((X+Y)2+(XY)2)=X2+Y2.\frac{1}{2}\left((X + Y)^{2} + (X - Y)^{2} \right) = X^{2} + Y^{2}.

It indeed agrees with the original weight enumerator.

Since this example is self-dual, the distinction between the CC side and the CC^{\perp} side is not very visible. In general, one first obtains

ZC(λ)=1CZC(λ^)\Zpart_{C}(\lambda) = \frac{1}{\card{C^{\perp}}}\Zpart_{C^{\perp}}(\what{\lambda})

and then applies the same formula to CC^{\perp} in order to put it in the usual direction of the MacWilliams identity.

Although the graph in this example is very small, the structure is the same as in the general case. Fourier-expand the constraint factor δ0(x1+x2)\delta_{0}(x_{1} + x_{2}) and take the coordinatewise sums; then the one-coordinate weight factor is Fourier transformed. As a result, the partition function over the dual code appears.

Let us also look at a non-self-dual example where the standard direction is easier to see. Over F2\F_{2}, take

C={000,111}F23.C = \{ 000, 111 \} \leq \F_{2}^{3}.

Then

WC(X,Y)=X3+Y3.W_{C}(X, Y) = X^{3} + Y^{3}.

The dual code is

C={000,011,101,110}C^{\perp} = \{ 000, 011, 101, 110 \}

and

WC(X,Y)=X3+3XY2.W_{C^{\perp}}(X, Y) = X^{3} + 3XY^{2}.

For example, we can take

H=(110101)H = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}

as a parity-check matrix. Then

yH=(y1+y2,y1,y2).yH = (y_{1} + y_{2}, y_{1}, y_{2}).

As y1,y2F2y_{1}, y_{2} \in \F_{2} vary,

000,110,101,011000,\quad 110,\quad 101,\quad 011

appear exactly once each. Here again, y1,y2y_{1}, y_{2} are not coordinates of a dual codeword, but coefficients used to linearly combine the two rows of HH and produce the dual codeword u=yHu = yH. The right-hand side of the MacWilliams identity is

12WC(X+Y,XY)=12{(X+Y)3+(XY)3}=X3+3XY2\begin{aligned} \frac{1}{2} W_{C}(X + Y, X - Y) &= \frac{1}{2}\{(X + Y)^{3} + (X - Y)^{3} \} \\ &= X^{3} + 3XY^{2} \end{aligned}

which indeed agrees with WC(X,Y)W_{C^{\perp}}(X, Y). Since this example is not self-dual,

WC(X,Y)=1CWC(X+Y,XY)W_{C^{\perp}}(X, Y) = \frac{1}{\card{C}} W_{C}(X + Y, X - Y)

shows the standard direction more clearly.

Supplement: How to View Normal Factor Graphs

This section supplies background that was not used in the proof in the main text. It is included as a supplement for placing the calculation in this note within the general theory of normal factor graphs, after one has read the derivation of the MacWilliams identity. The proof above did not invoke a general theorem on normal factor graphs; it directly Fourier-expanded the partition function of a Tanner-graph-type factor graph. For readers unfamiliar with normal factor graphs, it is enough to read this section as background explanation.

The proof itself proceeded by following formulae for an ordinary bipartite factor graph of Tanner-graph type. On the other hand, normal factor graphs provide background for understanding that this local Fourier transform is a special case of a more general graph duality.

In a Forney-style normal factor graph, variables are drawn as edges and factors as vertices. In an ordinary factor graph, both variables and factors are vertices. In a normal factor graph, variables become edges connecting factors to factors. In a Forney-style normal factor graph, internal variables are drawn as edges and external variables as half-edges.

The basic advantage of a normal factor graph is that the operation of taking a dual can be described locally. Depending on the convention, sign-inversion factors and normalisation constants appear, but roughly speaking, by Fourier-transforming each factor and replacing each edge variable by a dual variable, one obtains what is called the dual graph in the general theory of normal factor graphs. Its partition function is related to the original partition function by a constant factor. This general theorem is Forney-style normal-factor-graph duality.

We do not develop the general theory of normal factor graphs in full here. Instead, we record the underlying idea.

In an ordinary factor graph, a variable xx can appear in three or more factors. In normal form, one creates copies x(1),x(2),x(3),x^{(1)}, x^{(2)}, x^{(3)}, \dots of that variable and inserts an equality factor eq(x(1),x(2),x(3),)\eq(x^{(1)}, x^{(2)}, x^{(3)},\dots) forcing all of them to be equal. Here

eq(x(1),x(2),,x(s)){1,x(1)=x(2)==x(s),0,otherwise.\eq(x^{(1)}, x^{(2)}, \dots, x^{(s)}) \coloneqq \begin{cases} 1, & x^{(1)} = x^{(2)} = \dots = x^{(s)}, \\ 0, & \text{otherwise} \end{cases}.

In this way, each variable copy can be made adjacent to at most two factors.

For the Tanner-graph-type factor graph in this note as well, if one wants a strict Forney-style normal factor graph, one introduces copies of each coordinate variable and equality factors. However, the essence of the calculation deriving the MacWilliams identity lies in Fourier-expanding each check factor and locally separating the coordinatewise sums. For this reason, this note did not go deeply into the details of diagrammatic normalisation, but followed the local dualisation inside the formulae.

Let us emphasise this point. The proof here uses the same Fourier principle as the ordinary character-theoretic proof. However, instead of writing one large character sum all at once, it proceeds as the local operation

Fourier-expand each check factor, and separate the sum for each coordinate variable.

This is the factor-graph viewpoint.

Supplement: Complete Weight Enumerator Version

This section is supplementary. If you only want to read about the ordinary Hamming weight enumerator, you may skip it. Here we look at the finer weight enumerator before all non-zero elements are collected into the same variable YY.

Writing the one-coordinate weight λ\lambda with separate variables gives the complete weight enumerator. Introduce a variable TaT_{a} for each aFqa \in \F_{q}, and put λ(a)Ta\lambda(a) \coloneqq T_{a}. Then

ZC(λ)=cCeETce.\Zpart_{C}(\lambda) = \sum_{c \in C} \prod_{e \in E} T_{c_e}.

This is the complete weight enumerator of CC. In this note, we write it as

cweC((Ta)aFq)=cCeETce.\cwe_{C}((T_{a})_{a \in \F_{q}}) = \sum_{c \in C} \prod_{e \in E} T_{c_{e}}.

The one-coordinate Fourier transform appears as the change of variables

TbF=aFqTaψ(ab)(bFq).T_{b}^{\mathrm{F}} = \sum_{a \in \F_{q}} T_{a} \psi(ab) \qquad (b \in \F_{q}).

The superscript F\mathrm{F} is a symbol indicating that the variable is after Fourier transform, and is different from the symbol F\F denoting a finite field. Here we enlarge the coefficient ring to C[Ta:aFq]\C\lbrack T_{a} : a \in \F_{q} \rbrack and compute there. Therefore (8.4) can be written as

cweC((Ta)aFq)=1CcweC((TbF)bFq).(13.1)\cwe_{C}((T_{a})_{a \in \F_{q}}) = \frac{1}{\card{C^{\perp}}} \cwe_{C^{\perp}}((T_{b}^{\mathrm{F}})_{b \in \F_{q}}). \tag{13.1}

This is one direction of the MacWilliams identity for the complete weight enumerator.

To put it in the form usually seen, apply (13.1) to CC^{\perp}. Since (C)=C(C^{\perp})^{\perp} = C and (C)=C\card{(C^{\perp})^{\perp}}=\card{C}, we obtain

cweC((Ta)aFq)=1CcweC((TbF)bFq).(13.2)\cwe_{C^{\perp}}((T_{a})_{a \in \F_{q}}) = \frac{1}{\card{C}} \cwe_{C}((T_{b}^{\mathrm{F}})_{b \in \F_{q}}). \tag{13.2}

This formula is the MacWilliams identity for the complete weight enumerator.

One point is worth noting: depending on the convention for the Fourier transform, ψ(ab)\psi(-ab) may appear instead of ψ(ab)\psi(ab). For the complete weight enumerator, this difference appears as the replacement of the index of the transformed variable from bb to b-b. On the other hand, when specialising to the Hamming weight enumerator, all non-zero values are sent to the same variable YY, so this replacement of indices becomes invisible. Therefore it does not affect the final Hamming-weight version of the MacWilliams identity.

What Factor Graphs and Partition Functions Did in This Proof

Let us look back at the proof in this note.

First, we viewed the weight enumerator as a partition function. The weight enumerator is obtained by multiplying coordinatewise weights for each codeword cCc \in C and summing all of them. That is,

WC(X,Y)=cCeEλX,Y(ce).W_{C}(X,Y) = \sum_{c \in C} \prod_{e \in E} \lambda_{X, Y}(c_{e}).

This viewpoint lets us regard the weight enumerator as a “weighted sum over states satisfying constraints”.

Second, we represented the code CC as a product of local constraints. Using a parity-check matrix HH, the code CC is the simultaneous solution set of the constraints, one for each row jj,

eEHj,exe=0.\sum_{e \in E} H_{j, e} x_{e} = 0.

In a qq-ary Tanner graph, not only the presence or absence of an edge, but also the edge label Hj,eH_{j, e}, is part of the data determining the check equation. This constraint is incorporated into the partition function as the factor

δ0(eEHj,exe).\delta_{0} \left( \sum_{e \in E} H_{j, e} x_{e} \right).

Thus the code is represented not as one large set, but as a collection of local constraints, one for each check equation. The factor graph actually used in this note was obtained by adding the one-coordinate weight factor λ(xe)\lambda(x_{e}) at each coordinate to this Tanner graph.

Third, we Fourier-expanded the zero constraint. The constraint factor can be written as

δ0(t)=1qyFqψ(yt).\delta_{0}(t) = \frac{1}{q} \sum_{y \in \F_{q}} \psi(yt).

This is an operation that transforms a constraint into a sum over a dual variable yy. Applying this transformation for each check equation produces the dual variables yjy_{j}. Here again, yjy_{j} is not a coordinate of a dual codeword, but a coefficient used to linearly combine check rows.

Fourth, the coordinatewise sums became one-coordinate Fourier transforms. When the dual variables yjy_{j} from the check equations are collected together, each coordinate contains

(yH)e=jJyjHj,e.(yH)_{e} = \sum_{j \in J} y_{j} H_{j,e}.

Taking the sum over xex_{e} at that coordinate gives

aFqλ(a)ψ(a(yH)e)=λ^((yH)e).\sum_{a \in \F_{q}} \lambda(a)\psi(a(yH)_{e}) = \what{\lambda}((yH)_{e}).

Thus the original weight factor λ\lambda changes into the Fourier-transformed weight factor λ^\what{\lambda}.

Fifth, yHyH ran through the dual code. Since the rows of HH have been chosen as a basis of CC^{\perp}, as yy runs through FqJ\F_{q}^{J}, yHyH runs through all of CC^{\perp} exactly once. Writing the dual codeword as u=yHu = yH, its coordinates are ue=(yH)eu_{e} = (yH)_{e}. Thus the partition function of the original code CC is rewritten, with the coefficient 1/C1/\card{C^{\perp}} and the Fourier-transformed one-coordinate weight λ^\what{\lambda}, as a partition function over the dual code CC^{\perp}. In the usual direction of the MacWilliams identity, the same general formula is applied to CC^{\perp}, so the coefficient becomes 1/C1/\card{C}.

In one sentence, the point is as follows.

The MacWilliams identity is the fact that, for the partition function of a Tanner-graph-type factor graph, if one locally Fourier-expands the zero-constraint factors and then takes the coordinatewise sums, a partition function over the dual code appears, together with a constant factor and the Fourier-transformed one-coordinate weight.

Concepts Seen in This Note

Along the way to proving the MacWilliams identity, this note introduced the basic tools of factor graphs and partition functions. They can be organised as follows.

Partition function

The total sum of weights over all states. In this note, we treated it as a sum over finite sets,

Z=xαfα(xα).\Zpart = \sum_{x} \prod_{\alpha} f_{\alpha}(x_{\partial\alpha}).
Factor

A local function depending on only a small number of variables. By decomposing a large function into a product of factors, one can make the dependency relations easier to see.

Factor graph

A representation of the dependency relation between variables and factors as a bipartite graph. One prepares variable vertices and factor vertices, and joins a factor to the variables on which it depends.

Constraint factor

A factor which is 11 when a condition is satisfied and 00 when it is not. We represented linear constraints using δ0\delta_{0}.

Tanner-graph-type factor graph

A factor graph representing a linear code as local constraints on a Tanner graph consisting of variable nodes corresponding to coordinates and check nodes corresponding to check equations. For a qq-ary code, the edge labels Hj,eH_{j, e} are also data determining the check equations. In this note, we further added the one-coordinate weight factor λ(xe)\lambda(x_{e}) at each coordinate. The weight enumerator can be represented as the partition function of this graph.

Normal factor graph

A Forney-style factor graph in which variables are drawn as edges and factors as vertices. It is a framework in which dualisation can be described factor by factor locally.

One-coordinate Fourier transform

The operation transforming a one-coordinate weight factor λ\lambda by

λ^(b)=aFqλ(a)ψ(ab).\what{\lambda}(b) = \sum_{a \in \F_{q}} \lambda(a)\psi(ab).

For the Hamming weight factor, X+(q1)YX + (q - 1)Y and XYX - Y appear.

Fourier expansion of the zero constraint

The formula writing the constraint factor δ0(t)\delta_{0}(t) as

δ0(t)=1qyFqψ(yt).\delta_{0}(t) = \frac{1}{q} \sum_{y \in \F_{q}} \psi(yt).

This produces dual variables from local constraints.

Partition-function duality

The principle that, when local factors are Fourier-transformed, the original partition function and the partition function of the factor graph on the dual side are related by a constant factor. In the general theory of normal factor graphs, this factor graph on the dual side is treated as the dual graph. In this note, as the most basic example of this, we derived the MacWilliams identity for linear codes.

Review of the Proof Family in This Note

On the surface, the proof in this note is a proof using factor graphs and partition functions. We represented a code as local constraints, one for each parity-check equation, and read the weight enumerator as the corresponding constrained partition function. Then we Fourier-expanded each constraint factor and separated the coordinatewise sums.

Within the five-family classification used in this series, the proof in this note belongs to the

Fourier, character, and Poisson family.

The reason is that the fundamental principle by which the dual code appears is the orthogonality relation for additive characters of finite fields. Indeed, the expansion of the zero-constraint factor,

δ0(t)=1qyFqψ(yt)\delta_{0}(t) = \frac{1}{q}\sum_{y \in \F_{q}} \psi(yt)

is precisely a finite Fourier transform formula.

However, the viewpoint in this note differs from the ordinary character-theoretic proof. In the usual proof, one treats the character sum over the whole code CC,

cCψ(uc)\sum_{c \in C}\psi(u \cdot c)

all at once. By contrast, in this note we started from local Fourier expansions for each check factor and transformed the one-coordinate weight factors through coordinatewise sums. That is, instead of applying the Fourier transform “once globally”, we glued together local operations on the factor graph.

This difference is the value of the factor-graph viewpoint. Even for the same MacWilliams identity, one can view it as “global character orthogonality”, or as “duality of partition functions with local constraints”. The latter viewpoint extends naturally to convolutional codes, tail-biting codes, codes on graphs, and models in statistical physics.

Towards the Next Part

At this point, the derivation of the MacWilliams identity from factor graphs and partition functions, which is the main claim of this note, is complete.

In the next note, we look at the MacWilliams identity from the perspective of analytic trace formulae.

In this proof, we represented the weight enumerator as the partition function of a Tanner-graph-type factor graph, and moved to the partition function of the dual code by a local Fourier transform. In the next note, we consider operators on Hamming spaces, especially heat kernels and Markov operators on Hamming graphs, and compute their traces in two ways. In one calculation, the dual code appears from the spectral side, and in the other calculation, the weight enumerator of the original code appears from the trace of a translation action.

Thus the main ingredients next time are

Hamming graph → operators → heat kernel → trace formula → the MacWilliams identity

Even though the proof belongs to the same Fourier, character, and Poisson family, the next note will show not a “partition function with local constraints”, but the analytic viewpoint of “computing the trace of an operator in two ways”.

Footnotes

  1. Here K\mathcal{K} is a range of coefficients in which addition and multiplication can be performed, for example C\C or a polynomial ring. At first it is enough to think of a commutative semiring or a commutative ring. From the section where Fourier expansion is used onward, because we handle 1/q1/q and character values, we enlarge the coefficient ring to one containing C\C as needed.

References

  1. [KFL01] Frank R. Kschischang, Brendan J. Frey, and Hans-Andrea Loeliger. Factor graphs and the sum-product algorithm. IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 498–519, 2001. doi:10.1109/18.910572
  2. [Loe04] Hans-Andrea Loeliger. An introduction to factor graphs. IEEE Signal Process. Mag., vol. 21, no. 1, pp. 28-41, 2004. doi:10.1109/MSP.2004.1267047 https://api.semanticscholar.org/CorpusID:7722934
  3. [For01] Jr. G. David Forney. Codes on graphs: normal realizations. IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 520–548, 2001. doi:10.1109/18.910573
  4. [For11] Jr. G. David Forney. Codes on graphs: duality and MacWilliams identities. IEEE Trans. Inform. Theory, vol. 57, no. 3, pp. 1382–1397, 2011. doi:10.1109/TIT.2011.2104994
  5. [Tan81] R. Michael Tanner. A recursive approach to low complexity codes. IEEE Trans. Inform. Theory, vol. 27, no. 5, pp. 533–547, 1981. doi:10.1109/TIT.1981.1056404

This series

A Series Learning through the MacWilliams Identity · Part 9 of 12

Back to series list

Disclaimer

Articles on this site are based on the operator's personal understanding, investigation, and research notes. I try to keep the content accurate, but it may contain errors or incomplete explanations. I do not guarantee its accuracy, completeness, usefulness, or currentness.

Please use the information on this site at your own judgment and responsibility. To the extent permitted by law, the operator is not liable for damages, losses, or disadvantages arising from using, or being unable to use, information on this site.

If you notice an error, unclear explanation, broken link, or insufficient citation, please contact the operator. I will review the content and, when appropriate, correct, update, or remove it.