Back to resources

Article and note

A Series Learning through the MacWilliams IdentityPart 5 of 12

An Introduction to Association Schemes through the MacWilliams Identity

Among the five proof systems for the MacWilliams identity, this note focuses on the orthogonal-polynomial and association-scheme approach, and introduces relation partitions, adjacency matrices, Bose–Mesner algebras, primitive idempotents, Hamming schemes, and inner and dual distributions.
Published:
Updated:
Reading time:
42 min (about 9,141 words)
Tagscoding theoryMacWilliams identityassociation schemesBose-Mesner algebraHamming schemeKrawtchouk polynomialsfinite fieldsexpository note
Download PDF

Introduction

One of the fundamental theorems in coding theory is the MacWilliams identity. Let EE be the coordinate set, let n=En = \card{E}, and let CFqEC\leq \F_{q}^{E} be a linear code over the finite field Fq\F_{q}. Consider its dual code

C={uFqE:uc=0 for all cC}.C^{\perp} = \{ u \in \F_{q}^{E} : u \cdot c = 0 \text{ for all } c \in C \}.

Here

uc=eEuece.u \cdot c = \sum_{e \in E} u_{e} c_{e}.

First, define the weight enumerator of the linear code CC by

WC(X,Y)=cCXnwt(c)Ywt(c).W_{C}(X,Y) = \sum_{c \in C} X^{n - \wt(c)} Y^{\wt(c)}.

Also write the weight distribution as

Ai(C)={cC:wt(c)=i}(0in).A_{i}(C) = \card{\{ c \in C : \wt(c) = i\}} \qquad (0 \leq i\leq n).

Then

WC(X,Y)=i=0nAi(C)XniYi.W_{C}(X, Y) = \sum_{i = 0}^{n} A_{i}(C) X^{n - i} Y^{i}.

The MacWilliams identity is the formula saying that the weight enumerator of the dual code can be computed from the weight enumerator of CC as follows:

WC(X,Y)=1CWC(X+(q1)Y,XY).W_{C^{\perp}}(X, Y) = \frac{1}{\card{C}} W_{C}\bigl( X + (q - 1)Y, X - Y \bigr).

This is the MacWilliams identity.

In this series, I use proofs of the MacWilliams identity as a guide to an introduction to neighbouring areas and concepts. Accordingly, this series is aimed at readers who already know the following basic material in coding theory:

We assume familiarity, at least at a basic level, with

  • what a (finite) field is,

  • what a linear code over a finite field is,

  • what the Hamming weight is,

  • what the dual code is.

(There is no problem if you do not know a proof of the MacWilliams identity.)

This note does not assume Parts 1–4. Krawtchouk polynomials, tensor products, and additive characters appear along the way, but the parts needed in this note are explained when they are used. If you have read Part 4, the outlook will be clearer, but the note is written so that it can be read independently.

In this series, I look at proof methods for the MacWilliams identity under the following five broad families.

  1. Fourier, character, and Poisson methods.

  2. Möbius inversion, lattice-theoretic, and shortening/puncturing methods.

  3. Orthogonal-polynomial and association-scheme methods.

  4. Matroid and Tutte-polynomial methods.

  5. Moment and double-counting methods.

In this note, I focus on the third of these: the orthogonal-polynomial and association-scheme approach. Written using Krawtchouk polynomials, the MacWilliams identity appears as

a Krawtchouk transform of the weight distribution.

Here the Krawtchouk transform means a transform of the form

(ai)i=0n(i=0naiKj(i))j=0n.(a_{i})_{i=0}^{n} \longmapsto \left( \sum_{i = 0}^{n} a_{i} \Kraw_{j}(i) \right)_{j=0}^{n}.

Let me first say how to read the subscripts. If we write Ki(w)\Kraw_{i}(w), this means the eigenvalue of the distance-ii adjacency matrix on the ww-th eigenspace. On the other hand, in the MacWilliams transform we write the dual-side weight as jj and the original weight as ii, so the coefficient appears in the form Kj(i)\Kraw_{j}(i). It is the same Krawtchouk polynomial, but one has to read from context which subscript belongs to the distance side and which belongs to the eigenspace side. We will organise this point once more after introducing the eigenspaces of the Hamming scheme. In this note, I explain in the language of association schemes why these Krawtchouk polynomials are naturally attached to Hamming space. In other words, the aim of this note is to explain why the coefficients Kj(i)\Kraw_j(i) naturally appear as eigenvalues of the Hamming scheme. The goal of this instalment is to use a proof of the MacWilliams identity as a guide to an introduction to association schemes, especially the Hamming scheme and the Bose–Mesner algebra.

An association scheme abstracts the situation where binary relations on a finite set are divided into several types and these relations behave in a sufficiently regular way. In Hamming space, the relation between two words x,yFqEx, y \in \F_{q}^{E} is classified according to which of

dist(x,y)=0,1,,n\dist(x, y) = 0, 1, \dots, n

holds. Here dist(x,y)=wt(xy)\dist(x, y) = \wt(x - y) is the Hamming distance. From these distance relations we form adjacency matrices; when we look at the commutative algebra spanned by them, Krawtchouk polynomials appear as its eigenvalues.

The route in this note is as follows.

relation partitions → adjacency matrices → association schemes → Bose–Mesner algebras → primitive idempotents → inner and dual distributions → the MacWilliams identity

The classical starting point for coding theory from the viewpoint of association schemes is Delsarte [Del73]. For a survey of the relation between association schemes and coding theory, including distance distributions, MacWilliams transforms, and linear programming, see Delsarte–Levenshtein [DL98]. For a systematic reference on association schemes, Bannai–Ito [BI84] is standard. For a detailed account including the relation with distance-regular graphs, see Brouwer–Cohen–Neumaier [BCN89]. For the MacWilliams identity and Krawtchouk polynomials in coding theory, MacWilliams–Sloane [MS77, Chapter 5] is also a classical standard reference. For basic material on association schemes used in coding theory, Camion [Cam98] is also useful.

Viewing Relations on a Finite Set as Matrices

Before entering association schemes, we look at how to represent relations on a finite set by matrices.

Let Ω\Omega be a finite set. If we think of the elements of Ω\Omega as vertices, then a subset RΩ×ΩR \subseteq \Omega \times \Omega can be viewed as a set of directed edges. The condition (x,y)R(x,y)\in R means that there is relation RR from xx to yy.

Definition 2.1.

For a relation RΩ×ΩR\subseteq \Omega\times \Omega on a finite set Ω\Omega, define the matrix AR\mathsf{A}_R, whose rows and columns are indexed by Ω\Omega, by

(AR)x,y={1,(x,y)R,0,(x,y)R.(\mathsf{A}_R)_{x,y} = \begin{cases} 1, & (x,y)\in R,\\ 0, & (x,y)\notin R. \end{cases}

This is called the adjacency matrix of RR.

This definition generalises the adjacency matrix of a graph. For an ordinary simple graph, we record by 00 and 11 whether two vertices are joined by an edge. Here we are simply recording, in matrix form, whether a pair belongs to the relation RR.

Let CΩ\C^{\Omega} be the set of all complex-valued functions on Ω\Omega. Equivalently, one may view it as the complex vector space with basis vectors exe_x corresponding to the elements xx of Ω\Omega. The matrix AR\mathsf{A}_R acts as a linear map on CΩ\C^{\Omega}. For a function f ⁣:ΩCf\colon \Omega\to\C,

(ARf)(x)=yΩ(x,y)Rf(y).(\mathsf{A}_R f)(x) = \sum_{\substack{y\in \Omega\\(x,y)\in R}} f(y).

Thus AR\mathsf{A}_R is the operator that “adds the values over all yy related to xx by RR”.

Example 2.2 (The one-coordinate relation of a complete graph).

Let Ω=Fq\Omega=\F_q. Consider the two relations

R0={(a,a):aFq},R1={(a,b):a,bFq,ab}.\begin{aligned} R_{0} &= \{ (a, a) : a \in \F_{q} \},\\ R_{1} &= \{ (a, b) : a, b \in \F_{q},\, a \neq b\}. \end{aligned}

The adjacency matrix of R0R_{0} is the identity matrix II, and the adjacency matrix of R1R_{1} is JIJ - I. Here JJ is the matrix all of whose entries are 11. This is the adjacency matrix of the complete graph KqK_{q}.

In what follows, II and JJ denote the identity matrix and the all-one matrix of the size appropriate to the space on which they act. For instance, in the example above they are q×qq\times q matrices indexed by Fq\F_{q}. On the other hand, the II and JJ which later appear as operators on Ω=FqE\Omega=\F_{q}^{E} are qn×qnq^{n}\times q^{n} matrices indexed by Ω\Omega. Where necessary, I will specify whether an operator acts on one coordinate or on the full multi-coordinate space.

When we treat several relations at the same time, the important case is when they partition Ω×Ω\Omega\times \Omega.

Definition 2.3.

Relations R0,R1,,RdΩ×ΩR_{0}, R_{1}, \dots, R_{d} \subseteq \Omega \times \Omega on a finite set Ω\Omega form a relation partition of Ω×Ω\Omega\times \Omega if

Ω×Ω=R0R1Rd.\Omega\times \Omega=R_0\sqcup R_1\sqcup\cdots\sqcup R_d.

A relation partition corresponds to colouring the relation between any two points x,yΩx, y \in \Omega with exactly one colour. For this reason, association schemes are sometimes described as complete graphs with coloured edges. Strictly speaking, however, the diagonal pairs (x,x)(x, x) are also coloured with the colour R0R_{0}, so it is natural to think of a complete graph with loops, whose edges have been coloured. Of course, arbitrary colouring is not enough. We require strong regularity in the colouring. That is the definition in the next section.

Association Schemes

In this note, we treat the most basic case, sufficient for coding theory: symmetric association schemes. There are non-symmetric association schemes in general, but Hamming schemes are symmetric, so it is clearest to begin with this case.

Definition 3.1.

Let Ω\Omega be a finite set, and let R0,R1,,RdΩ×ΩR_{0}, R_{1}, \dots, R_{d} \subseteq \Omega \times \Omega be non-empty relations giving a relation partition

Ω×Ω=R0R1Rd.\Omega \times \Omega = R_{0} \sqcup R_{1} \sqcup \dots \sqcup R_{d}.

This partition is called a symmetric association scheme if it satisfies:

  1. R0={(x,x):xΩ}R_{0} = \{ (x, x) : x \in \Omega \}.

  2. For each ii, if (x,y)Ri(x, y) \in R_{i}, then (y,x)Ri(y, x) \in R_{i}.

  3. For any 0i,j,kd0 \leq i, j ,k \leq d, if x,yΩx, y \in \Omega satisfy (x,y)Rk(x, y) \in R_{k}, then

    {zΩ:(x,z)Ri, (z,y)Rj}\card{\{ z \in \Omega : (x, z) \in R_{i},\ (z, y) \in R_{j} \}}

    is independent of the choice of xx and yy, and depends only on ii, jj, and kk.

Since each RkR_{k} is assumed to be non-empty, there is no ambiguity in choosing (x,y)Rk(x, y)\in R_{k} in condition (A3).

The number appearing in condition (A3) is written pijkp_{ij}^{k} and is called an intersection number.

In words, the definition means the following. Suppose that the relation between two points xx and yy is RkR_{k}. Then the number of points zz which are in relation RiR_{i} from the viewpoint of xx and in relation RjR_{j} from the viewpoint of yy does not depend on the concrete choice of xx and yy, but only on the types of relations ii, jj, and kk. Thus we are requiring the colouring of the relations to be highly uniform.

In terms of adjacency matrices, this condition looks very natural. Write the adjacency matrix of RiR_{i} as AiARi\mathsf{A}_{i} \coloneqq \mathsf{A}_{R_{i}}. Then the (x,y)(x, y) entry of the matrix product AiAj\mathsf{A}_{i}\mathsf{A}_{j} is

(AiAj)x,y=zΩ(Ai)x,z(Aj)z,y.(\mathsf{A}_i\mathsf{A}_j)_{x,y} = \sum_{z\in \Omega}(\mathsf{A}_i)_{x,z}(\mathsf{A}_j)_{z,y}.

This counts the number of zz such that (x,z)Ri(x, z) \in R_{i} and (z,y)Rj(z, y) \in R_{j}. Therefore the association-scheme condition says that this product of matrices can again be written as a linear combination of the same adjacency matrices.

Theorem 3.2.

The adjacency matrices of a symmetric association scheme satisfy

AiAj=k=0dpijkAk.\mathsf{A}_{i} \mathsf{A}_{j} = \sum_{k=0}^{d}p_{ij}^{k}\mathsf{A}_{k}.
Proof

Compare the (x,y)(x,y) entries of both sides. Suppose that (x,y)Rk(x,y)\in R_{k}. The (x,y)(x,y) entry of the left-hand side is the number of zz satisfying (x,z)Ri(x,z)\in R_{i} and (z,y)Rj(z,y)\in R_{j}. By the definition of an association scheme, this is equal to pijkp_{ij}^{k}. On the other hand, since the relation containing (x,y)(x,y) is RkR_{k}, the (x,y)(x,y) entry of the right-hand side is

=0dpij(A)x,y=pijk.\sum_{\ell=0}^{d}p_{ij}^{\ell}(\mathsf{A}_{\ell})_{x,y} = p_{ij}^{k}.

Hence the entries of the two sides agree.

End of proof

Thus the intersection-number condition guarantees that the operation “follow relation RiR_i, then follow relation RjR_j” can again be expressed as a linear combination of the original relations R0,,RdR_0,\dots,R_d. For this reason, a relation partition is not merely a classification; it produces a matrix algebra formed by the adjacency matrices.

Also, because the relations partition Ω×Ω\Omega \times \Omega,

A0+A1++Ad=J.\mathsf{A}_{0} + \mathsf{A}_{1} + \dots + \mathsf{A}_{d} = J.

Here JJ is the all-one matrix. Moreover, since R0R_0 is the diagonal relation, A0=I\mathsf{A}_{0} = I.

Bose–Mesner Algebras

The adjacency matrices of an association scheme are closed under matrix multiplication. We therefore consider the vector space spanned by these matrices.

Definition 4.1.

Let (Ω,{Ri}i=0d)(\Omega, \{R_{i} \}_{i = 0}^{d}) be a symmetric association scheme, and let A0,,Ad\mathsf{A}_{0}, \dots, \mathsf{A}_{d} be its adjacency matrices. The complex vector space spanned by these matrices,

A=SpanC{A0,A1,,Ad}MatΩ(C),\mathcal{A} = \Span_{\C}\{\mathsf{A}_0,\mathsf{A}_1,\dots,\mathsf{A}_d\} \subseteq \Mat_{\Omega}(\C),

is called the Bose–Mesner algebra of the scheme.

By Theorem 3.2, A\mathcal{A} is closed under matrix multiplication. It also has an identity element, since A0=I\mathsf{A}_0=I. Furthermore, since R0,,RdR_0,\dots,R_d partition Ω×Ω\Omega\times \Omega, the matrices A0,,Ad\mathsf{A}_0,\dots,\mathsf{A}_d have 11's only in disjoint positions. Therefore, if

c0A0++cdAd=0,c_{0} \mathsf{A}_{0} + \dots + c_{d}\mathsf{A}_{d} = 0,

then by looking at an entry with (x,y)Rk(x, y) \in R_{k} we get ck=0c_{k} = 0. Hence A0,,Ad\mathsf{A}_{0}, \dots, \mathsf{A}_{d} are linearly independent, and dimCA=d+1\dim_{\C} \mathcal{A} = d + 1. For a symmetric association scheme, the matrices Ai\mathsf{A}_i are real symmetric matrices, and they also commute with one another. In general, two real symmetric matrices need not commute. The commutativity here follows because, by Theorem 3.2,

AiAj=k=0dpijkAk\mathsf{A}_i\mathsf{A}_j = \sum_{k=0}^{d}p_{ij}^{k}\mathsf{A}_k

is a linear combination of symmetric matrices, and is therefore symmetric. Indeed,

AiAj=(AiAj)T=AjTAiT=AjAi.\mathsf{A}_i\mathsf{A}_j = (\mathsf{A}_i\mathsf{A}_j)^T = \mathsf{A}_j^T\mathsf{A}_i^T = \mathsf{A}_j\mathsf{A}_i.

Thus A\mathcal{A} is a commutative complex matrix algebra spanned by the real symmetric, mutually commuting adjacency matrices A0,,Ad\mathsf{A}_{0}, \dots, \mathsf{A}_{d}. These adjacency matrices can be simultaneously diagonalised in the sense of linear algebra. From this simultaneous diagonalisation one obtains another basis of the Bose–Mesner algebra: the primitive idempotents. We now verify this fact using only finite-dimensional linear algebra.

Theorem 4.2.

In the Bose–Mesner algebra A\mathcal{A} of a symmetric association scheme, there exist matrices E0,E1,,EdA\mathsf{E}_{0}, \mathsf{E}_{1}, \dots, \mathsf{E}_{d} \in \mathcal{A} such that:

  1. ErEs=0\mathsf{E}_{r} \mathsf{E}_{s} = 0 (rsr \neq s).

  2. Er2=Er\mathsf{E}_{r}^{2} = \mathsf{E}_{r}.

  3. E0+E1++Ed=I\mathsf{E}_{0} + \mathsf{E}_{1} + \dots + \mathsf{E}_{d} = I.

  4. E0,E1,,Ed\mathsf{E}_{0}, \mathsf{E}_{1}, \dots, \mathsf{E}_{d} form a basis of A\mathcal{A}.

  5. Each Ai\mathsf{A}_{i} acts as a scalar multiple on the image of each Er\mathsf{E}_{r}.

  6. No Er\mathsf{E}_{r} can be decomposed further, inside A\mathcal{A}, as a non-trivial sum of orthogonal idempotents.

These Er\mathsf{E}_{r} are called the primitive idempotents.

Proof

As seen above, A0,,Ad\mathsf{A}_{0}, \dots, \mathsf{A}_{d} are mutually commuting real symmetric matrices. Hence, by the simultaneous diagonalisation theorem from linear algebra, CΩ\C^{\Omega} decomposes as an orthogonal direct sum of their common eigenspaces.

Let Λ\Lambda denote the set of all common eigenvalue tuples. Thus an element λΛ\lambda \in \Lambda is a tuple of the form

λ=(λ0,λ1,,λd)Cd+1\lambda = (\lambda_{0}, \lambda_{1}, \dots, \lambda_{d}) \in \C^{d+1}

for which the corresponding common eigenspace

Wλ={vCΩ:Aiv=λiv for all 0id}W_{\lambda} = \{ v \in \C^{\Omega}: \mathsf{A}_{i} v = \lambda_{i} v \text{ for all } 0 \leq i \leq d\}

is non-zero. Simultaneous diagonalisation gives the orthogonal direct-sum decomposition

CΩ=λΛWλ.\C^{\Omega} = \bigoplus_{\lambda\in\Lambda} W_{\lambda}.

For each λΛ\lambda \in \Lambda, write Fλ\mathsf{F}_{\lambda} for the orthogonal projection onto WλW_{\lambda}.

Basic properties of the projections

From the orthogonal direct-sum decomposition,

FλFμ=0(λμ),Fλ2=Fλ,λΛFλ=I\mathsf{F}_{\lambda}\mathsf{F}_{\mu} = 0 \quad(\lambda \neq \mu), \qquad \mathsf{F}_{\lambda}^{2} = \mathsf{F}_{\lambda}, \qquad \sum_{\lambda \in \Lambda} \mathsf{F}_{\lambda} = I

holds. Also, since each Ai\mathsf{A}_{i} acts as multiplication by λi\lambda_{i} on WλW_{\lambda},

Ai=λΛλiFλ.\mathsf{A}_{i} = \sum_{\lambda \in \Lambda} \lambda_{i} \mathsf{F}_{\lambda}.

Next we show that each Fλ\mathsf{F}_{\lambda} actually belongs to A\mathcal{A}.

That FλA\mathsf{F}_{\lambda} \in \mathcal{A}

Fix λΛ\lambda \in \Lambda. If μΛ\mu \in \Lambda and μλ\mu \neq \lambda, then the common eigenvalue tuples are different, so there exists an index i=i(μ)i = i(\mu) such that λi(μ)μi(μ)\lambda_{i(\mu)} \neq \mu_{i(\mu)}. Consider the polynomial

pλ(X0,,Xd)=μΛμλXi(μ)μi(μ)λi(μ)μi(μ).p_\lambda(X_{0}, \dots, X_{d}) = \prod_{\substack{\mu \in \Lambda \\ \mu \neq \lambda}} \frac{X_{i(\mu)}-\mu_{i(\mu)}}{\lambda_{i(\mu)}-\mu_{i(\mu)}}.

Substitute the commuting matrices A0,,Ad\mathsf{A}_{0}, \dots, \mathsf{A}_{d} into this polynomial. On WλW_{\lambda}, each Ai\mathsf{A}_{i} acts as multiplication by λi\lambda_{i}, so pλ(A0,,Ad)p_\lambda(\mathsf{A}_{0}, \dots, \mathsf{A}_{d}) acts as multiplication by 11 on WλW_{\lambda}. On the other hand, if νλ\nu \neq \lambda, then on WνW_{\nu} the factor corresponding to μ=ν\mu = \nu in the product becomes 00. Hence pλ(A0,,Ad)p_\lambda(\mathsf{A}_{0}, \dots, \mathsf{A}_{d}) acts as multiplication by 00 on WνW_{\nu}.

Therefore

pλ(A0,,Ad)=Fλ.p_\lambda(\mathsf{A}_{0}, \dots, \mathsf{A}_{d}) = \mathsf{F}_{\lambda}.

The left-hand side is built from sums and products of A0,,Ad\mathsf{A}_{0}, \dots, \mathsf{A}_{d}. Since A\mathcal{A} is closed under matrix multiplication, it follows that FλA\mathsf{F}_{\lambda} \in \mathcal{A}.

We next determine the number of elements of Λ\Lambda.

The number of common eigenvalue tuples is d+1d+1

Let mm be the number of elements of Λ\Lambda. First, as shown above, the Fλ\mathsf{F}_{\lambda} are non-zero orthogonal idempotents belonging to A\mathcal{A}. In particular, they are linearly independent. Indeed, if

λΛcλFλ=0,\sum_{\lambda \in \Lambda} c_{\lambda}\mathsf{F}_{\lambda} = 0,

then restricting both sides to WμW_{\mu} gives cμ=0c_{\mu} = 0. Hence

mdimCA.m \leq \dim_{\C} \mathcal{A}.

We prove the opposite inequality. Any MA\mathsf{M} \in \mathcal{A} can be written in the form

M=c0A0++cdAd.\mathsf{M} = c_{0}\mathsf{A}_{0} + \dots + c_{d}\mathsf{A}_{d}.

Then on WλW_{\lambda}, M\mathsf{M} acts as multiplication by c0λ0++cdλdc_{0}\lambda_{0} + \dots + c_{d}\lambda_{d}. Thus the action of M\mathsf{M} is completely determined by its eigenvalue on each WλW_{\lambda}.

Define a map Φ ⁣:ACΛ\Phi \colon \mathcal{A} \to \C^{\Lambda} by

Φ(M)(the eigenvalue of M on Wλ)λΛ.\Phi(\mathsf{M}) \coloneqq \bigl( \text{the eigenvalue of $\mathsf{M}$ on $W_\lambda$} \bigr)_{\lambda\in\Lambda}.

If Φ(M)=0\Phi(\mathsf{M}) = 0, then M\mathsf{M} acts as 00 on every WλW_{\lambda}. By the orthogonal direct-sum decomposition

CΩ=λΛWλ,\C^{\Omega} = \bigoplus_{\lambda \in \Lambda} W_{\lambda},

this means that M\mathsf{M} is the zero matrix. Thus Φ\Phi is injective, and

dimCAm.\dim_{\C} \mathcal{A} \leq m.

Therefore

m=dimCA.m = \dim_{\C} \mathcal{A}.

Since we already saw that A0,,Ad\mathsf{A}_{0}, \dots, \mathsf{A}_{d} are linearly independent,

dimCA=d+1.\dim_{\C} \mathcal{A} = d + 1.

Hence m=d+1m = d + 1.

We can therefore number the elements of Λ\Lambda as λ(0),λ(1),,λ(d)\lambda^{(0)}, \lambda^{(1)}, \dots, \lambda^{(d)}. Define

ErFλ(r)(0rd).\mathsf{E}_{r} \coloneqq \mathsf{F}_{\lambda^{(r)}} \qquad (0 \leq r \leq d).
Checking the listed properties

Since Er\mathsf{E}_{r} is an orthogonal projection, Er2=Er\mathsf{E}_{r}^{2} = \mathsf{E}_{r}. Since these are projections onto distinct common eigenspaces,

ErEs=0(rs)\mathsf{E}_{r} \mathsf{E}_{s} = 0 \qquad (r \neq s)

holds. Since the common eigenspaces sum to the whole space,

E0+E1++Ed=I\mathsf{E}_{0} + \mathsf{E}_{1} + \dots + \mathsf{E}_{d} = I

also holds.

Moreover, E0,,Ed\mathsf{E}_{0}, \dots, \mathsf{E}_{d} are d+1d + 1 linearly independent elements of A\mathcal{A}. Since dimCA=d+1\dim_{\C} \mathcal{A} = d + 1, they form a basis of A\mathcal{A}.

Finally, each Ai\mathsf{A}_{i} acts on Wλ(r)=Er(CΩ)W_{\lambda^{(r)}} = \mathsf{E}_{r}(\C^{\Omega}) as multiplication by λi(r)\lambda_{i}^{(r)}. Thus each Ai\mathsf{A}_{i} acts as a scalar multiple on the image of each Er\mathsf{E}_{r}.

It remains to check that these idempotents are primitive inside A\mathcal{A}.

Primitivity

Suppose that for some rr, Er=G+H\mathsf{E}_{r} = \mathsf{G} + \mathsf{H} is a decomposition as a sum of two orthogonal idempotents G,H\mathsf{G},\mathsf{H} belonging to A\mathcal{A}. Thus

G2=G,H2=H,GH=0.\mathsf{G}^{2} = \mathsf{G}, \qquad \mathsf{H}^{2} = \mathsf{H}, \qquad \mathsf{G}\mathsf{H} = 0.

Since G\mathsf{G} and H\mathsf{H} belong to A\mathcal{A}, they act as scalar multiples on each common eigenspace Wλ(s)W_{\lambda^{(s)}}. Write gsg_{s} for the eigenvalue of G\mathsf{G} on Wλ(s)W_{\lambda^{(s)}}, and hsh_{s} for the eigenvalue of H\mathsf{H} on Wλ(s)W_{\lambda^{(s)}}. Since G\mathsf{G} and H\mathsf{H} are idempotents,

gs2=gs,hs2=hs.g_{s}^{2} = g_{s}, \qquad h_{s}^{2} = h_{s}.

Hence gs,hs{0,1}g_{s}, h_{s} \in \{ 0, 1 \}.

On the other hand, Er=G+H\mathsf{E}_{r} = \mathsf{G} + \mathsf{H}. If srs \neq r, then Er\mathsf{E}_{r} acts as 00 on Wλ(s)W_{\lambda^{(s)}}, so gs+hs=0g_{s} + h_{s} = 0. Since gs,hs{0,1}g_{s}, h_{s} \in \{ 0, 1 \},

gs=hs=0.g_{s} = h_{s} = 0.

If s=rs = r, then Er\mathsf{E}_{r} acts as 11 on Wλ(r)W_{\lambda^{(r)}}, so gr+hr=1g_{r} + h_{r} = 1. Hence

(gr,hr)=(1,0)or(gr,hr)=(0,1).(g_{r}, h_{r}) = (1, 0) \quad\text{or}\quad (g_{r}, h_{r}) = (0, 1).

It follows that one of G\mathsf{G} and H\mathsf{H} acts as 11 only on Wλ(r)W_{\lambda^{(r)}}, and as 00 on all other common eigenspaces. Thus one of them is Er\mathsf{E}_{r}. The other acts as 00 on every common eigenspace, and is therefore the zero matrix.

Consequently

{G,H}={Er,0}.\{ \mathsf{G}, \mathsf{H} \} = \{ \mathsf{E}_{r}, 0 \}.

Thus Er\mathsf{E}_{r} cannot be decomposed further, inside A\mathcal{A}, as a non-trivial sum of orthogonal idempotents.

This proves the theorem.

End of proof

First let us check that this numbering is natural. In a symmetric association scheme, every adjacency matrix Ai\mathsf{A}_{i} has constant row sum. Indeed, since (x,x)R0(x,x)\in R_{0} for xΩx\in\Omega, the definition of the intersection number pii0p_{ii}^{0} gives

pii0={zΩ:(x,z)Ri, (z,x)Ri}.p_{ii}^{0} = \card{\{z\in\Omega:(x,z)\in R_{i},\ (z,x)\in R_{i}\}}.

By symmetry, (z,x)Ri(z,x)\in R_{i} is equivalent to (x,z)Ri(x,z)\in R_{i}, so

pii0={zΩ:(x,z)Ri}.p_{ii}^{0} = \card{\{z\in\Omega:(x,z)\in R_{i}\}}.

The right-hand side is the sum of the xx-row of Ai\mathsf{A}_{i}, and the intersection number does not depend on xx. Thus the row sum of Ai\mathsf{A}_{i} is constant. Therefore the space of constant functions is a common eigenspace for all the Ai\mathsf{A}_{i}.

From now on, we number the projection onto the one-dimensional space of constant functions as E0\mathsf{E}_{0}. This projection is

E0=1ΩJ.\mathsf{E}_{0} = \frac{1}{\card{\Omega}}J.

Indeed, since JJ is the matrix all of whose entries are 11, Ω1J\card{\Omega}^{-1}J is the orthogonal projection onto the space of constant functions. Also,

J=A0+A1++Ad,J= \mathsf{A}_{0} + \mathsf{A}_{1} + \dots + \mathsf{A}_{d},

so this projection also belongs to A\mathcal A.

An idempotent is an element whose square is itself. A general idempotent matrix represents a projection along some direct-sum decomposition. The primitive idempotents here are also real symmetric, or self-adjoint, and hence are orthogonal projections with respect to the standard inner product. Thus primitive idempotents are the orthogonal projections onto the natural eigenspaces seen by the Bose–Mesner algebra. Here “primitive” means that, inside the Bose–Mesner algebra, the idempotent cannot be decomposed further as a non-trivial sum of orthogonal idempotents. It does not mean that the matrix has rank 11. Indeed, common eigenspaces need not be one-dimensional in general.

The adjacency matrices and the primitive idempotents are both bases of the Bose–Mesner algebra. Therefore each can be expressed as a linear combination of the other. We will write the concrete coefficients in the section on the Hamming scheme.

Thus, in an association scheme, we move between the following two bases.

basis representing relations: A0,,Ad\mathsf{A}_{0}, \dots, \mathsf{A}_{d} ↔ basis representing projections onto eigenspaces: E0,,Ed\mathsf{E}_{0}, \dots, \mathsf{E}_{d}.

In the association-scheme proof of the MacWilliams identity, this passage between the “relation side” and the “eigenspace side” is essential. The weight distribution of a code is defined on the relation side, while the dual distribution is defined by projection onto the eigenspace side. For a readable reference on the linear-algebraic treatment of Bose–Mesner algebras and primitive idempotents discussed in this section, see also Godsil [God93].

The Hamming Scheme

We now look at the Hamming scheme, the most important association scheme appearing in coding theory.

Let EE be a finite coordinate set, and let n=En = \card{E}. This is a different symbol from the primitive idempotents Ej\mathsf{E}_j which appear later; the fonts distinguish them. Let the vertex set be ΩFqE\Omega \coloneqq \F_{q}^{E}. Thus the vertices are qq-ary words of length nn. For two words x,yΩx, y \in \Omega, define the Hamming distance by

dist(x,y)wt(xy).\dist(x, y) \coloneqq \wt(x - y).

Definition 5.1.

For 0in0\leq i\leq n, define

Ri{(x,y)Ω×Ω:dist(x,y)=i}.R_{i} \coloneqq \{ (x, y) \in \Omega \times \Omega : \dist(x, y) = i \}.

The association scheme obtained from this relation partition is called the Hamming scheme, and is denoted by H(n,q)H(n, q).

Write Ai\mathsf{A}_{i} for the adjacency matrix of RiR_{i}. Thus Ai\mathsf{A}_{i} is the matrix joining two words whose distance is ii.

Theorem 5.2.

H(n,q)H(n,q) is a symmetric association scheme.

Proof

R0R_{0} is the diagonal relation, and since the Hamming distance is symmetric, each RiR_{i} is symmetric. Also, any two words have exactly one distance among 0,1,,n0, 1, \dots, n, so R0,,RnR_{0}, \dots, R_{n} partition Ω×Ω\Omega \times \Omega. Each RiR_{i} is non-empty as well. Indeed, take the zero word 0FqE0 \in \F_{q}^{E} and a word yy which is non-zero in exactly ii coordinates; then dist(0,y)=i\dist(0, y) = i.

We check the intersection-number condition. Suppose dist(x,y)=k\dist(x, y) = k. We want to count the number of zΩz\in \Omega satisfying

dist(x,z)=i,dist(z,y)=j.\dist(x, z) = i, \qquad \dist(z, y) = j.

Since dist(x,y)=k\dist(x, y) = k, there are nkn - k coordinates with xe=yex_{e} = y_{e}, and kk coordinates with xeyex_{e} \neq y_{e}.

The contribution at each coordinate is summarised in the following table, where Δx\Delta_x and Δy\Delta_y denote the one-coordinate contributions to dist(x,z)\dist(x, z) and dist(z,y)\dist(z, y):

case(Δx,Δy)(\Delta_x, \Delta_y)choices
xe=ye, ze=xe=yex_{e} = y_{e},\ z_{e} = x_{e} = y_{e}(0,0)(0, 0)11
xe=ye, zexex_{e} = y_{e},\ z_{e} \neq x_{e}(1,1)(1, 1)q1q - 1
xeye, ze=xex_{e} \neq y_{e},\ z_{e} = x_{e}(0,1)(0, 1)11
xeye, ze=yex_{e} \neq y_{e},\ z_{e} = y_{e}(1,0)(1, 0)11
xeye, zexe,yex_{e} \neq y_{e},\ z_{e} \neq x_{e}, y_{e}(1,1)(1, 1)q2q - 2

Recording the contribution to dist(x,z)\dist(x, z) by ss and the contribution to dist(z,y)\dist(z, y) by tt, the first two rows show that the one-coordinate generating function for a coordinate with xe=yex_{e} = y_{e} is

1+(q1)st,1 + (q - 1)st,

and the last three rows show that the one-coordinate generating function for a coordinate with xeyex_e\neq y_e is

s+t+(q2)st.s + t + (q - 2)st.

Since the choices at different coordinates are independent, the total number is given by the product of these one-coordinate generating functions. Here the exponent of ss records the contribution to dist(x,z)\dist(x,z), and the exponent of tt records the contribution to dist(z,y)\dist(z,y). Thus the number of zz satisfying dist(x,z)=i\dist(x,z)=i and dist(z,y)=j\dist(z,y)=j can be written, for example, as

[sitj](1+(q1)st)nk(s+t+(q2)st)k.\lbrack s^{i} t^{j} \rbrack \bigl( 1 + (q - 1)st \bigr)^{n - k} \bigl( s + t + (q - 2)st \bigr)^{k}.

Here [sitj]\lbrack s^{i} t^{j} \rbrack denotes extraction of the coefficient of sitjs^{i} t^{j} in the expanded polynomial. When q=2q = 2, the term q2=0q - 2 = 0 expresses that there is no option “different from both”. This depends only on nn, qq, ii, jj, and kk, not on the concrete values of xx and yy. Hence the intersection-number condition holds.

End of proof

In this proof we did not use an explicit formula for the intersection numbers. The important point is that the counting is governed not by the concrete positions of the two points xx and yy, but only by their distance kk. This is the homogeneity of Hamming space.

The row sum of Ai\mathsf{A}_{i} is the number of words at distance ii from a fixed word. It is

vi=(ni)(q1)i.v_{i} = \binom{n}{i}(q-1)^{i}.

This viv_{i} is called the valency of the relation RiR_{i}. Indeed, there are (ni)\binom{n}{i} ways to choose the coordinates at distance ii, and in each such coordinate there are q1q - 1 choices different from the original value.

Eigenspaces of the Hamming Scheme

To understand the Bose–Mesner algebra of the Hamming scheme, we need to simultaneously diagonalise the adjacency matrices A0,,An\mathsf{A}_{0}, \dots, \mathsf{A}_{n}. This is where Krawtchouk polynomials appear.

We begin with one coordinate. Let the one-coordinate function space be U=CFqU = \C^{\F_q}. Inside UU there is the one-dimensional subspace U0=C1U_{0} = \C\one consisting of constant functions. Also define

U1={fU:aFqf(a)=0},U_{1} = \left\{ f \in U : \sum_{a \in \F_{q}} f(a) = 0 \right\},

the subspace of functions whose values sum to 00. With respect to the standard inner product, we have the orthogonal decomposition U=U0U1U = U_{0} \oplus U_{1}.

Let N=JIN = J - I be the one-coordinate adjacency matrix for “moving to a different value”. This is the adjacency matrix of the complete graph KqK_{q}. The matrix NN acts on U0U_{0} with eigenvalue q1q - 1, and on U1U_{1} with eigenvalue 1-1. Indeed, on constant functions, from each point there are q1q - 1 other points to go to, so the eigenvalue is q1q - 1. On the other hand, if fU1f \in U_{1}, then

(Nf)(a)=baf(b)=bFqf(b)f(a)=f(a).(Nf)(a) = \sum_{b \neq a} f(b) = \sum_{b \in \F_{q}}f(b) - f(a) = -f(a).

For this note, it is enough to think of the tensor product as the operation which makes a function of many variables by multiplying one-variable functions over the coordinates, and which lets one-coordinate operators act independently on each coordinate. No general theory of tensor products is needed to understand this article.

The multi-coordinate space is

CΩ=CFqEeEU.\C^{\Omega} = \C^{\F_{q}^{E}} \cong \bigotimes_{e \in E} U.

If for each eEe \in E we choose a one-coordinate function feUf_{e} \in U, then the pure tensor eEfe\bigotimes_{e \in E} f_{e} represents the function

x=(xe)eEeEfe(xe).x = (x_{e})_{e \in E} \longmapsto \prod_{e \in E} f_{e}(x_{e}).

Also, the tensor product eETe\bigotimes_{e \in E} T_{e} of one-coordinate operators Te ⁣:UUT_{e} \colon U \to U should be understood as applying TeT_{e} independently to the component in the ee-th coordinate. For example, when E={1,2}E = \{ 1, 2 \}, the tensor f1f2f_{1} \otimes f_{2} represents the two-variable function

(a,b)f1(a)f2(b).(a, b) \longmapsto f_{1}(a)f_{2}(b).

Also, T1T2T_{1} \otimes T_{2} acts by

(T1T2)(f1f2)=(T1f1)(T2f2),(T_{1} \otimes T_{2})(f_{1} \otimes f_{2}) = (T_{1} f_{1}) \otimes (T_{2} f_{2}),

so it is the operator which applies T1T_{1} to the first coordinate and T2T_{2} to the second coordinate. For a general EE, this is simply extended to all coordinates. Decomposing each coordinate as U=U0U1U = U_{0} \oplus U_{1} gives the orthogonal decomposition

CΩ=SE(eSU1eSU0).\C^{\Omega} = \bigoplus_{S\subseteq E} \left( \bigotimes_{e\in S}U_1 \otimes \bigotimes_{e\notin S}U_0 \right).

Here SS denotes the set of coordinates in which we have chosen the non-constant direction.

Definition 6.1.

For 0wn0 \leq w \leq n, define

Vw=SES=w(eSU1eSU0).V_w = \bigoplus_{\substack{S\subseteq E\\ \card{S} = w}} \left( \bigotimes_{e\in S}U_1 \otimes \bigotimes_{e\notin S}U_0 \right).

Then

CΩ=V0V1Vn.\C^{\Omega} = V_{0} \oplus V_{1} \oplus \dots \oplus V_{n}.

For each direct-summand component with S=w\card{S}=w, the dimension is (dimU1)w=(q1)w(\dim U_{1})^{w} = (q - 1)^{w}, and there are (nw)\binom{n}{w} such sets SS. Thus

dimVw=(nw)(q1)w.\dim V_{w} = \binom{n}{w}(q - 1)^{w}.

Intuitively, VwV_{w} is the component with non-constant direction in ww coordinates. This decomposition will be the eigenspace decomposition of the Hamming scheme.

Next consider the generating function collecting the distance matrices Aj\mathsf{A}_j. The II and NN on the right-hand side are operators on the one-coordinate space U=CFqU = \C^{\F_{q}}. On the other hand, the Aj\mathsf{A}_{j} on the left-hand side are operators on the multi-coordinate space CΩ=CFqE\C^{\Omega} = \C^{\F_{q}^{E}}. Taking a tensor product extends the one-coordinate action independently to all coordinates. In one coordinate, keeping the same value is the action II, and moving to a different value is the action N=JIN = J - I. Therefore, on the multi-coordinate space,

j=0nAjzj=eE(I+zN)(6.1)\sum_{j=0}^{n}\mathsf{A}_{j} z^{j} = \bigotimes_{e \in E}(I + zN) \tag{6.1}

holds. Indeed, when the right-hand side is expanded, the coefficient of zjz^{j} is exactly the operator which changes the value in precisely jj coordinates and keeps the remaining coordinates unchanged.

Definition 6.2.

For 0wn0 \leq w \leq n, define numbers K0(w),,Kn(w)\Kraw_{0}(w), \dots, \Kraw_{n}(w) by

j=0nKj(w)zj=(1+(q1)z)nw(1z)w.(6.2)\sum_{j=0}^{n}\Kraw_{j}(w)z^{j} = \bigl( 1 + (q - 1)z \bigr)^{n - w}(1 - z)^{w}. \tag{6.2}

These are called the values of the Krawtchouk polynomials associated with the qq-ary Hamming scheme.

Expanding the generating function gives

Kj(w)==0j(1)(q1)j(w)(nwj).\Kraw_j(w) = \sum_{\ell=0}^{j} (-1)^\ell (q-1)^{j-\ell} \binom{w}{\ell} \binom{n-w}{j-\ell}.

Here binomial coefficients outside their range are interpreted as 00.

This definition may look as though we are introducing Krawtchouk polynomials as polynomials defined by a generating function, but the meaning here is slightly different. Equation (6.2) is the eigenvalue generating function for the adjacency matrices of the Hamming scheme.

Theorem 6.3.

Let 0j,wn0 \leq j, w \leq n. The adjacency matrix Aj\mathsf{A}_{j} of the Hamming scheme acts on the eigenspace VwV_{w} as multiplication by Kj(w)\Kraw_{j}(w).

Proof

An element of VwV_w is a sum of tensors which have the U1U_{1} direction in ww coordinates and the U0U_{0} direction in the remaining nwn - w coordinates. The one-coordinate matrix NN acts by q1q - 1 on U0U_{0} and by 1-1 on U1U_{1}. Therefore jAjzj\sum_{j} \mathsf{A}_{j} z^{j} acts on VwV_{w} as multiplication by

(1+(q1)z)nw(1z)w.\bigl( 1 + (q - 1)z \bigr)^{n - w}(1 - z)^{w}.

Comparing the coefficient of zjz^{j}, the eigenvalue of Aj\mathsf{A}_{j} is Kj(w)\Kraw_{j}(w).

End of proof

This theorem gives the following interpretation of the Krawtchouk polynomials.

A Krawtchouk polynomial is the eigenvalue of the distance-jj adjacency matrix of the Hamming scheme on the ww-th eigenspace VwV_{w}.

As mentioned earlier, let us now organise the roles of the subscripts. The subscript ii of Ai\mathsf{A}_{i} is the distance-side index, while the subscript ww of VwV_{w}, and later of Ew\mathsf{E}_{w}, is the eigenspace-side index. Thus Ki(w)\Kraw_{i}(w) denotes the eigenvalue of the distance-ii adjacency matrix Ai\mathsf{A}_{i} on the ww-th eigenspace VwV_w. On the other hand, when we later write

Ej=1qni=0nKj(i)Ai,\mathsf{E}_{j} = \frac{1}{q^{n}} \sum_{i=0}^{n} \Kraw_{j}(i) \mathsf{A}_{i},

jj is on the eigenspace side and ii is on the distance side. In the Hamming scheme, these two sides are connected by Krawtchouk polynomials.

In particular, when j=1j = 1,

K1(w)=(q1)(nw)w=(q1)nqw,\Kraw_{1}(w) = (q - 1)(n - w) - w = (q - 1)n - qw,

which takes distinct values for w=0,1,,nw = 0, 1, \dots, n. Thus V0,V1,,VnV_{0}, V_{1}, \dots, V_{n} are already distinguished by A1\mathsf{A}_{1} alone. Consequently, they are the common eigenspaces for the Bose–Mesner algebra of the Hamming scheme, and the corresponding orthogonal projections are the primitive idempotents. Indeed, since the eigenvalues of A1\mathsf{A}_{1} on V0,,VnV_{0}, \dots, V_{n} are distinct, Lagrange interpolation gives

Ew=u=0uwnA1K1(u)IK1(w)K1(u).\mathsf{E}_{w} = \prod_{\substack{u = 0 \\ u \neq w}}^{n} \frac{\mathsf{A}_{1} - \Kraw_{1}(u)I}{\Kraw_{1}(w)-\Kraw_{1}(u)}.

This formula extracts the projection onto VwV_{w} as a polynomial in A1\mathsf{A}_{1}. Therefore Ew\mathsf{E}_{w} belongs to the Bose–Mesner algebra. In this sense, for the Hamming scheme, primitive idempotents can be understood explicitly without relying on the general theory of semisimple algebras.

This viewpoint is the heart of the association-scheme perspective. The polynomial does not come first; rather, when one simultaneously diagonalises the matrix algebra built from the distance relations, the Krawtchouk polynomials appear as its eigenvalues.

Writing the Primitive Idempotents Explicitly

Write Ew\mathsf{E}_{w} for the orthogonal projection onto VwV_{w}. Then E0,E1,,En\mathsf{E}_{0}, \mathsf{E}_{1}, \dots, \mathsf{E}_{n} are the primitive idempotents of the Bose–Mesner algebra of the Hamming scheme. That is,

ErEs=δr,sEr,r=0nEr=I,\mathsf{E}_{r}\mathsf{E}_{s} = \delta_{r, s} \mathsf{E}_{r}, \qquad \sum_{r=0}^{n}\mathsf{E}_{r} = I,

and each Aj\mathsf{A}_{j} acts on the image of Ew\mathsf{E}_{w} as multiplication by Kj(w)\Kraw_{j}(w). The image is exactly VwV_w, and

dimEw(CΩ)=dimVw=(nw)(q1)w.\dim \mathsf{E}_w(\C^{\Omega}) = \dim V_{w} = \binom{n}{w}(q-1)^{w}.

Thus the primitive idempotents of the Hamming scheme are also generally not of rank 11.

We can write this projection explicitly as a linear combination of the adjacency matrices Ai\mathsf{A}_i. For this, look at the one-coordinate projections. Let Π0\Pi_{0} be the orthogonal projection onto U0U_{0}, and let Π1\Pi_{1} be the orthogonal projection onto U1U_{1}. Then

Π0=1qJ,Π1=I1qJ.\Pi_{0} = \frac{1}{q}J, \qquad \Pi_{1} = I - \frac{1}{q}J.

In components, for a,bFqa, b \in \F_{q},

Π0(a,b)=1qΠ1(a,b)={q1q,a=b,1q,ab.\Pi_{0}(a, b) = \frac{1}{q} \qquad \Pi_{1}(a,b) = \begin{cases} \dfrac{q-1}{q}, & a = b,\\[4pt] -\dfrac{1}{q}, & a \neq b. \end{cases}

For SES\subseteq E, consider the projection obtained by applying Π1\Pi_1 in the coordinates of SS and Π0\Pi_0 in the remaining coordinates. Summing all such projections with S=w\card{S} = w gives Ew\mathsf{E}_{w}. When we compute its entries, Krawtchouk polynomials appear again.

Theorem 7.1.

The primitive idempotents of the Hamming scheme H(n,q)H(n,q) can be written as

Ej=1qni=0nKj(i)Ai.(7.1)\mathsf{E}_j = \frac{1}{q^n} \sum_{i=0}^{n}\Kraw_j(i)\mathsf{A}_i. \tag{7.1}
Proof

Take x,yΩ=FqEx,y\in \Omega=\F_q^E, and suppose dist(x,y)=i\dist(x,y)=i. We compute the (x,y)(x,y) entry of Ej\mathsf{E}_j. The projection Ej\mathsf{E}_j is the sum, over subsets SES\subseteq E with S=j\card{S}=j, of the projections which apply Π1\Pi_1 in the coordinates of SS and Π0\Pi_0 in the remaining coordinates.

Putting these together as a generating function,

j=0nqn(Ej)x,yzj=eE(qΠ0(xe,ye)+qΠ1(xe,ye)z).\begin{aligned} \sum_{j=0}^{n} q^n(\mathsf{E}_j)_{x,y}z^j &= \prod_{e\in E} \left(q\Pi_0(x_e,y_e)+q\Pi_1(x_e,y_e)z\right). \end{aligned}

If xe=yex_e=y_e, then

qΠ0(xe,ye)=1,qΠ1(xe,ye)=q1.q\Pi_0(x_e,y_e)=1, \qquad q\Pi_1(x_e,y_e)=q-1.

If xeyex_e\neq y_e, then

qΠ0(xe,ye)=1,qΠ1(xe,ye)=1.q\Pi_0(x_e,y_e)=1, \qquad q\Pi_1(x_e,y_e)=-1.

Since xx and yy differ in ii coordinates and agree in nin-i coordinates,

j=0nqn(Ej)x,yzj=(1+(q1)z)ni(1z)i=j=0nKj(i)zj.\sum_{j=0}^{n} q^n(\mathsf{E}_j)_{x,y}z^j = \bigl(1+(q-1)z\bigr)^{n-i}(1-z)^i = \sum_{j=0}^{n}\Kraw_j(i)z^j.

Hence

(Ej)x,y=1qnKj(i).(\mathsf{E}_j)_{x,y} = \frac{1}{q^n}\Kraw_j(i).

This means that the entry is qnKj(i)q^{-n}\Kraw_j(i) at positions of distance ii. On the other hand,

1qni=0nKj(i)Ai\frac{1}{q^n}\sum_{i=0}^{n}\Kraw_j(i)\mathsf{A}_i

is the matrix which places exactly qnKj(i)q^{-n}\Kraw_j(i) in the entries of distance ii, since Ai\mathsf{A}_i has 11 exactly in positions of distance ii. Therefore

Ej=1qni=0nKj(i)Ai.\mathsf{E}_j = \frac{1}{q^n}\sum_{i=0}^{n}\Kraw_j(i)\mathsf{A}_i.

All entries agree, so the matrices are equal.

End of proof

This formula is very important. The left-hand side is a projection onto an eigenspace. The right-hand side is a linear combination of adjacency matrices representing distance relations. Thus the Krawtchouk polynomials are

the coefficients connecting the relation-side basis Ai\mathsf{A}_i and the eigenspace-side basis Ej\mathsf{E}_{j}.

Let us relate this to the general theory. For a general association scheme, when we write

Ai=r=0dPr(i)Er,Er=1Ωi=0dQr(i)Ai,\mathsf{A}_i = \sum_{r=0}^{d}P_r(i)\mathsf{E}_r, \qquad \mathsf{E}_r = \frac{1}{\card{\Omega}} \sum_{i=0}^{d}Q_r(i)\mathsf{A}_i,

the coefficients are called the first eigenmatrix PP and the second eigenmatrix QQ, respectively. In the notation for the Hamming scheme used in this note,

Ai=w=0nKi(w)Ew,Ej=1qni=0nKj(i)Ai,\mathsf{A}_i=\sum_{w=0}^{n}\Kraw_i(w)\mathsf{E}_w, \qquad \mathsf{E}_j=\frac{1}{q^n}\sum_{i=0}^{n}\Kraw_j(i)\mathsf{A}_i,

so this corresponds to Pw(i)=Ki(w)P_w(i)=\Kraw_i(w) and Qj(i)=Kj(i)Q_j(i)=\Kraw_j(i).

In the Hamming scheme, the same type of Krawtchouk polynomial appears in the two directions of this change of basis. However, the simple symmetry Kj(i)=Ki(j)\Kraw_{j}(i) = \Kraw_{i}(j) does not hold in general. More precisely, using the valency vi=(ni)(q1)iv_{i} = \binom{n}{i}(q-1)^{i}, we have the weighted symmetry

viKj(i)=vjKi(j).v_{i} \Kraw_{j}(i) = v_{j}\Kraw_{i}(j).

Inner Distributions: Viewing Codes from the Relation Side

We now return to codes. In an association scheme, for a subset DΩD\subseteq \Omega of the finite set Ω\Omega, we count which relations connect pairs of points inside that subset. This is the inner distribution. Inner and dual distributions are fundamental quantities in Delsarte's association-scheme approach to coding theory [Del73; Cam98].

Let us first organise some similar notation. Ai\mathsf{A}_i is the adjacency matrix for distance ii, Ai(C)A_i(C) is the number of codewords of weight ii in a linear code CC, and ai(D)a_i(D) is the inner distribution of a subset DD.

Definition 8.1.

For a subset DΩD\subseteq \Omega, define its characteristic vector by

1D=xDexCΩ.\one_D = \sum_{x\in D}e_x \in \C^{\Omega}.

We use the standard inner product

f,g=xΩf(x)g(x).\langle f,g\rangle = \sum_{x\in \Omega}f(x)\overline{g(x)}.

Using the adjacency matrix Ai\mathsf{A}_i, the number of ordered pairs in DD at distance ii is Ai1D,1D\langle \mathsf{A}_{i} \one_{D}, \one_{D} \rangle.

Definition 8.2.

Let DΩD\subseteq \Omega be a non-empty subset. The inner distribution of DD is defined by

ai(D)=1DAi1D,1D(0in).a_i(D) = \frac{1}{\card{D}} \langle \mathsf{A}_i\one_D,\one_D\rangle \qquad (0\leq i\leq n).

With this normalisation, a0(D)=1a_0(D)=1. Indeed, since A0=I\mathsf{A}_0=I,

a0(D)=1D1D,1D=1.a_{0}(D) = \frac{1}{\card{D}} \langle \one_{D}, \one_{D} \rangle = 1.

Also,

i=0nai(D)=1Di=0nAi1D,1D=1DJ1D,1D=D,\sum_{i=0}^{n}a_i(D) = \frac{1}{\card{D}} \left\langle \sum_{i=0}^{n}\mathsf{A}_i\one_D,\one_D \right\rangle = \frac{1}{\card{D}}\langle J\one_D,\one_D\rangle = \card{D},

so with this normalisation it is not a probability distribution. It is useful to think of ai(D)a_i(D) as the average number of points of DD which are in relation RiR_i to a fixed point of DD. In the Hamming scheme, this reads as the average number of points of DD at distance ii.

For a linear code, the inner distribution agrees with the usual weight distribution.

Theorem 8.3.

Let CFqEC \leq \F_{q}^{E} be a linear code. Write the usual weight distribution as Ai(C)={cC:wt(c)=i}A_{i}(C) = \card{\{ c \in C : \wt(c) = i \}}. Then ai(C)=Ai(C)a_{i}(C) = A_{i}(C).

Proof

The quantity Ai1C,1C\langle \mathsf{A}_i\one_C,\one_C\rangle is the number of ordered pairs (x,y)(x,y) satisfying

x,yC,dist(x,y)=i.x,y\in C, \qquad \dist(x,y)=i.

Since CC is a linear code, the difference yxy-x again belongs to CC. Also, dist(x,y)=wt(yx)\dist(x,y)=\wt(y-x).

Fix a codeword dCd\in C of weight ii. The pairs (x,y)C2(x,y)\in C^2 satisfying yx=dy-x=d are obtained by choosing xCx\in C arbitrarily and putting y=x+dy=x+d, so there are exactly C\card{C} such pairs. Since there are Ai(C)A_i(C) codewords of weight ii,

Ai1C,1C=CAi(C).\langle \mathsf{A}_i\one_C,\one_C\rangle = \card{C}A_i(C).

Hence ai(C)=Ai(C)a_{i}(C) = A_{i}(C).

End of proof

This theorem shows that the language of association schemes contains the weight distribution of coding theory. For a general subset DΩD\subseteq \Omega, ai(D)a_{i}(D) is the distance distribution inside DD. For a linear code, taking differences returns us to the usual weight distribution.

Dual Distributions: Viewing Codes from the Eigenspace Side

The inner distribution was defined using the adjacency matrices Ai\mathsf{A}_i. This is information on the relation side. In an association scheme, there is another distribution, defined using the primitive idempotents Ej\mathsf{E}_j. This is the dual distribution.

Definition 9.1.

Let DΩD\subseteq \Omega be a non-empty subset. The dual distribution of DD is defined by

aj(D)=ΩD2Ej1D,1D(0jn).a_j^{\ast}(D) = \frac{\card{\Omega}}{\card{D}^2} \langle \mathsf{E}_j\one_D,\one_D\rangle \qquad (0\leq j\leq n).

Here

Ej1D,1D=Ej1D,Ej1D\langle \mathsf{E}_j\one_D,\one_D\rangle = \langle \mathsf{E}_j\one_D,\mathsf{E}_j\one_D\rangle

is the squared norm of the VjV_j component of the characteristic vector 1D\one_D. If we used this quantity as it stands, its scale would not match the weight distribution of the dual code in the case of a linear code. Thus we multiply by the factor ΩD2\displaystyle \frac{\card{\Omega}}{\card{D}^2}, normalising it so that aj(C)=Aj(C)a_j^{\ast}(C)=A_j(C^\perp) when D=CD=C is a linear code.

With this normalisation, a0(D)=1a_0^{\ast}(D)=1. Indeed, the projection onto the space of constant functions is E0=Ω1J\mathsf{E}_0=\card{\Omega}^{-1}J, so

E01D,1D=D2Ω.\langle \mathsf{E}_0\one_D,\one_D\rangle = \frac{\card{D}^2}{\card{\Omega}}.

Also,

j=0naj(D)=ΩD2j=0nEj1D,1D=ΩD.\sum_{j=0}^{n}a_j^{\ast}(D) = \frac{\card{\Omega}}{\card{D}^2} \left\langle \sum_{j=0}^{n}\mathsf{E}_j\one_D,\one_D \right\rangle = \frac{\card{\Omega}}{\card{D}}.

Therefore the dual distribution is not a probability distribution with this normalisation either. For a linear code CC, this corresponds to C=Ω/C\card{C^{\perp}}=\card{\Omega}/\card{C} and agrees with the sum of the weight distribution of the dual code. For a general subset DΩD\subseteq \Omega, there is no dual code DD^{\perp} in the usual sense. Nevertheless, the distribution aj(D)a_j^{\ast}(D) seen from the eigenspace side of the Bose–Mesner algebra can still be defined. Non-negativity follows from the fact that Ej\mathsf{E}_j is an orthogonal projection.

As a warning, for a general subset DD, the number aj(D)a_j^{\ast}(D) need not be an integer, and it is not directly counting the elements of anything. It is the size of the eigenspace component of the characteristic vector 1D\one_D, normalised so that, in the case of a linear code, it agrees with the weight distribution of the dual code.

This definition may look a little abstract. But its meaning is simple. We decompose the characteristic vector 1D\one_D into the eigenspaces V0,V1,,VnV_{0}, V_{1}, \dots, V_{n} of the Hamming scheme, and look at the size of each component. Since Ej\mathsf{E}_j is the orthogonal projection onto VjV_j,

Ej1D,1D=Ej1D,Ej1D0.\langle \mathsf{E}_j\one_D,\one_D\rangle = \langle \mathsf{E}_j\one_D,\mathsf{E}_j\one_D\rangle \geq 0.

Thus the dual distribution is non-negative.

For a general subset DD, aj(D)a_j^{\ast}(D) is the dual distribution in Delsarte's sense. For a linear code, it is genuinely the weight distribution of the dual code. This fact connects the association-scheme proof with ordinary coding theory.

In the next theorem we use additive characters just once. The purpose is to check that the dual distribution defined in the association scheme agrees, for a linear code, with the usual weight distribution of the dual code CC^\perp. Thus what is needed here is not the whole theory of characters, but only the basic orthogonality relation for additive characters of finite fields.

Theorem 9.2.

Let CFqEC \leq \F_{q}^{E} be a linear code. Then aj(C)=Aj(C)a_{j}^{\ast}(C) = A_{j}(C^{\perp}).

Proof

Fix a non-trivial additive character ψ ⁣:FqC×\psi \colon \F_{q} \to \C^{\times} of the finite field. For example, when q=pmq=p^m,

ψ(a)=exp(2π1pTrFq/Fp(a))\psi(a) = \exp\left(\frac{2\pi \sqrt{-1}}{p}\Tr_{\F_q/\F_p}(a)\right)

gives a non-trivial additive character. However, we will not enter the details of this construction. In what follows, we use only the basic orthogonality relation for a non-trivial additive character:

aFqψ(ta)={q,t=0,0,t0.\sum_{a\in\F_q}\psi(ta) = \begin{cases} q, & t=0,\\ 0, & t\neq 0. \end{cases}

For uFqEu\in\F_q^E, put

χu(x)=ψ(ux)(xFqE).\chi_u(x)=\psi(u\cdot x) \qquad (x\in\F_q^E).

This decomposes coordinatewise as

χu(x)=ψ(ux)=eEψ(uexe).\chi_u(x) = \psi(u\cdot x) = \prod_{e\in E}\psi(u_ex_e).

Thus, in one coordinate, if ue=0u_e=0 then the function lies in the constant direction, and if ue0u_e\neq 0 then it is naturally seen to lie in the direction where the sum of values is 00. This is why, later, χuVj\chi_u\in V_j corresponds to wt(u)=j\wt(u)=j. We also use the fact that additive characters have complex absolute value 11, so

ψ(vx)=ψ(vx).\overline{\psi(v\cdot x)}=\psi(-v\cdot x).

After normalisation, these functions form an orthonormal basis of CΩ\C^{\Omega}. Indeed, for u,vFqEu,v\in\F_q^E,

χu,χv=xFqEψ((uv)x)=eExeFqψ((ueve)xe).\begin{aligned} \langle \chi_u,\chi_v\rangle &= \sum_{x\in\F_q^E}\psi((u-v)\cdot x)\\ &= \prod_{e\in E} \sum_{x_e\in\F_q}\psi((u_e-v_e)x_e). \end{aligned}

If u=vu=v, this product is qnq^n. If uvu\neq v, then for some coordinate ueve0u_e-v_e\neq 0, and the corresponding factor is 00. Hence

{qn/2χu:uFqE}\{q^{-n/2}\chi_u:u\in\F_q^E\}

is an orthonormal basis of CΩ\C^{\Omega}.

In one coordinate, when ue=0u_e=0, the function xeψ(uexe)x_e\mapsto\psi(u_ex_e) is constant, and when ue0u_e\neq0, the sum of its values is 00. Therefore χu\chi_u belongs to VjV_j if and only if wt(u)=j\wt(u)=j. Thus Ej\mathsf{E}_j is the projection onto the span of the characters of weight jj.

We now compute the norm of the weight-jj component of 1C\one_C. By Parseval's identity for the orthonormal basis,

Ej1C,1C=uFqEwt(u)=j1C,qn/2χu2.\begin{aligned} \langle \mathsf{E}_j\one_C,\one_C\rangle &= \sum_{\substack{u\in\F_q^E\\ \wt(u)=j}} \left\lvert \left\langle \one_C,q^{-n/2}\chi_u\right\rangle \right\rvert^2. \end{aligned}

Here

1C,qn/2χu=qn/2cCψ(uc).\left\langle \one_C,q^{-n/2}\chi_u\right\rangle = q^{-n/2}\sum_{c\in C}\overline{\psi(u\cdot c)}.

The right-hand side is qn/2Suq^{-n/2}\overline{S_u}, where

SucCψ(uc).S_u\coloneqq\sum_{c\in C}\psi(u\cdot c).

If uCu\in C^{\perp}, then uc=0u\cdot c=0 for all cCc\in C, so

Su=C.S_u=\card{C}.

On the other hand, if uCu\notin C^{\perp}, then cucc \mapsto u \cdot c is a non-zero Fq\F_q-linear map on CC and is therefore surjective onto Fq\F_q. Each value occurs exactly C/q\card{C}/q times. Thus

Su=CqaFqψ(a)=0.S_u = \frac{\card{C}}{q}\sum_{a\in\F_q}\psi(a) = 0.

Therefore

Ej1C,1C=C2qn{uC:wt(u)=j}=C2qnAj(C).\langle \mathsf{E}_j\one_C,\one_C\rangle = \frac{\card{C}^2}{q^n} \card{\{u\in C^{\perp}:\wt(u)=j\}} = \frac{\card{C}^2}{q^n}A_j(C^{\perp}).

Since Ω=qn\card{\Omega}=q^n, the definition of the dual distribution gives

aj(C)=qnC2Ej1C,1C=Aj(C).a_j^{\ast}(C) = \frac{q^n}{\card{C}^2} \langle \mathsf{E}_j\one_C,\one_C\rangle = A_j(C^{\perp}).
End of proof

Characters appeared only briefly in this theorem. Their role was to check that, in the case of a linear code, the eigenspace side of the Hamming scheme is seeing the usual dual code. From the association-scheme viewpoint, CC^{\perp} first appears as the dual distribution. For a linear code, this dual distribution agrees with the usual weight distribution of the dual code.

The Coefficient-Level MacWilliams Identity

With the preparation so far, the MacWilliams identity is almost proved. First we write a transform formula for a general subset.

Proposition 10.1.

Let DΩD\subseteq \Omega be a non-empty subset. Then

aj(D)=1Di=0nai(D)Kj(i)(10.1)a_j^\ast(D) = \frac{1}{\card{D}} \sum_{i=0}^{n}a_i(D)\Kraw_j(i) \tag{10.1}

holds.

Proof

By the definition of the dual distribution,

aj(D)=ΩD2Ej1D,1D.a_j^\ast(D) = \frac{\card{\Omega}}{\card{D}^2} \langle \mathsf{E}_j\one_D,\one_D\rangle.

Substituting Theorem 7.1 and using Ω=qn\card{\Omega}=q^n, we get

aj(D)=qnD21qni=0nKj(i)Ai1D,1D=1D2i=0nKj(i)Ai1D,1D=1Di=0nKj(i)ai(D).\begin{aligned} a_j^\ast(D) &= \frac{q^n}{\card{D}^2} \left\langle \frac{1}{q^n}\sum_{i=0}^{n}\Kraw_j(i)\mathsf{A}_i\one_D, \one_D \right\rangle\\ &= \frac{1}{\card{D}^2} \sum_{i=0}^{n}\Kraw_j(i)\langle \mathsf{A}_i\one_D,\one_D\rangle\\ &= \frac{1}{\card{D}} \sum_{i=0}^{n}\Kraw_j(i)a_i(D). \end{aligned}
End of proof

For a linear code CC, by Theorem 8.3 and Theorem 9.2,

ai(C)=Ai(C),aj(C)=Aj(C).a_i(C)=A_i(C), \qquad a_j^\ast(C)=A_j(C^\perp).

Thus this general formula is exactly the coefficient-level MacWilliams identity.

Theorem 10.2 (The Coefficient-Level MacWilliams Identity).

Let CFqEC\leq\F_q^E be a linear code. Put Ai=Ai(C)A_i=A_i(C) and Bj=Aj(C)B_j=A_j(C^{\perp}). Then

Bj=1Ci=0nAiKj(i)(10.2)B_j = \frac{1}{\card{C}} \sum_{i=0}^{n}A_i\Kraw_j(i) \tag{10.2}

holds.

Proof

The structure of this proof has three stages.

  1. The weight distribution is expressed as the relation-side quantity

    Ai1C,1C.\langle \mathsf{A}_i\one_C,\one_C\rangle.
  2. The weight distribution of the dual code is expressed as the eigenspace-side quantity

    Ej1C,1C.\langle \mathsf{E}_j\one_C,\one_C\rangle.
  3. Expanding Ej\mathsf{E}_j as a linear combination of the Ai\mathsf{A}_i, Krawtchouk polynomials appear as the coefficients.

Thus the MacWilliams identity appears as

the change-of-basis formula between the two bases of the Bose–Mesner algebra: the adjacency-matrix basis and the primitive-idempotent basis.

The Polynomial Form of the MacWilliams Identity

We now derive the usual polynomial form of the MacWilliams identity from the coefficient-level formula.

Theorem 11.1 (MacWilliams Identity).

Let CFqEC\leq\F_q^E be a linear code. Then

WC(X,Y)=1CWC(X+(q1)Y,XY)W_{C^{\perp}}(X,Y) = \frac{1}{\card{C}} W_C\bigl(X+(q-1)Y,X-Y\bigr)

holds.

Proof

Put Ai=Ai(C)A_i=A_i(C) and Bj=Aj(C)B_j=A_j(C^{\perp}). By Theorem 10.2,

WC(X,Y)=j=0nBjXnjYj=1Cj=0ni=0nAiKj(i)XnjYj=1Ci=0nAij=0nKj(i)XnjYj.\begin{aligned} W_{C^{\perp}}(X,Y) &= \sum_{j=0}^{n}B_jX^{n-j}Y^j\\ &= \frac{1}{\card{C}} \sum_{j=0}^{n} \sum_{i=0}^{n}A_i\Kraw_j(i)X^{n-j}Y^j\\ &= \frac{1}{\card{C}} \sum_{i=0}^{n}A_i \sum_{j=0}^{n}\Kraw_j(i)X^{n-j}Y^j. \end{aligned}

In the generating function for Krawtchouk polynomials,

j=0nKj(i)zj=(1+(q1)z)ni(1z)i,\sum_{j=0}^{n}\Kraw_j(i)z^j = \bigl(1+(q-1)z\bigr)^{n-i}(1-z)^i,

multiply both sides by XnX^n and read zjz^j as XjYjX^{-j}Y^j. This gives

j=0nKj(i)XnjYj=(X+(q1)Y)ni(XY)i.\sum_{j=0}^{n}\Kraw_j(i)X^{n-j}Y^j = \bigl(X+(q-1)Y\bigr)^{n-i}(X-Y)^i.

This is a calculation of polynomial identities; we are not dividing by XX as a number. Therefore

WC(X,Y)=1Ci=0nAi(X+(q1)Y)ni(XY)i=1CWC(X+(q1)Y,XY).\begin{aligned} W_{C^{\perp}}(X,Y) &= \frac{1}{\card{C}} \sum_{i=0}^{n}A_i \bigl(X+(q-1)Y\bigr)^{n-i}(X-Y)^i\\ &= \frac{1}{\card{C}} W_C\bigl(X+(q-1)Y,X-Y\bigr). \end{aligned}

This proves the MacWilliams identity.

End of proof

In this proof, the change of variables

XX+(q1)Y,YXYX\mapsto X+(q-1)Y, \qquad Y\mapsto X-Y

came from the eigenvalue generating function of the Hamming scheme. Thus the change of variables is not merely an algebraic accident: it packages the eigenvalues of the matrices representing the distance relations.

A Small Example: H(2,2)H(2,2) and the Binary Repetition Code

Finally, we check the association-scheme viewpoint in a small example. Let Ω=F22\Omega=\F_2^2, and consider the Hamming scheme H(2,2)H(2,2). The vertices are

00,01,10,11.00,01,10,11.

The relations are divided by distance as

R0={(x,x):xΩ},R1={(x,y):dist(x,y)=1},R2={(x,y):dist(x,y)=2}.\begin{aligned} R_0&=\{(x,x):x\in \Omega\},\\ R_1&=\{(x,y):\dist(x,y)=1\},\\ R_2&=\{(x,y):\dist(x,y)=2\}. \end{aligned}

If the vertices are ordered as 00,01,10,1100,01,10,11, the corresponding adjacency matrices are

A0=I,A1=(0110100110010110),A2=(0001001001001000).\mathsf{A}_0=I, \qquad \mathsf{A}_1= \begin{pmatrix} 0&1&1&0\\ 1&0&0&1\\ 1&0&0&1\\ 0&1&1&0 \end{pmatrix}, \qquad \mathsf{A}_2= \begin{pmatrix} 0&0&0&1\\ 0&0&1&0\\ 0&1&0&0\\ 1&0&0&0 \end{pmatrix}.

For q=2q=2 and n=2n=2, the Krawtchouk table is

w=0w=1w=2j=0111j=1202j=2111.\begin{array}{c|rrr} & w=0 & w=1 & w=2 \\ \hline j=0 & 1 & 1 & 1 \\ j=1 & 2 & 0 & -2 \\ j=2 & 1 & -1 & 1 \end{array}.

This is the table of eigenvalues of Aj\mathsf{A}_j on the eigenspaces VwV_w. The row index jj is on the distance side, and the column index ww is on the eigenspace side.

Consider the binary repetition code

C={00,11}F22.C=\{00,11\}\leq\F_2^2.

This code is self-dual. Its weight distribution is

(A0,A1,A2)=(1,0,1).(A_0,A_1,A_2)=(1,0,1).

The coefficient-level MacWilliams identity says

Bj=1Ci=02AiKj(i).B_j = \frac{1}{\card{C}} \sum_{i=0}^{2}A_i\Kraw_j(i).

Here C=2\card{C}=2, A0=A2=1A_0=A_2=1, and A1=0A_1=0, so

Bj=12(Kj(0)+Kj(2)).B_j = \frac{1}{2}\bigl(\Kraw_j(0)+\Kraw_j(2)\bigr).

Using the values in the table above,

B0=12(1+1)=1,B1=12(22)=0,B2=12(1+1)=1.B_{0} = \frac{1}{2}(1 + 1) = 1, \qquad B_{1} = \frac{1}{2}(2 - 2) = 0, \qquad B_{2} = \frac{1}{2}(1 + 1) = 1.

Therefore

(B0,B1,B2)=(1,0,1),(B_{0}, B_{1}, B_{2}) = (1, 0, 1),

which indeed agrees with the weight distribution of C=CC^{\perp} = C. In association-scheme terms, this says that

a(C)=(1,0,1),a(C)=(1,0,1).a(C)=(1,0,1), \qquad a^\ast(C)=(1,0,1).

The latter is the dual distribution obtained from the sizes of the components of 1C\one_C after projecting it to V0,V1,V2V_0,V_1,V_2. Indeed, with vertex order 00,01,10,1100,01,10,11,

1C=(1,0,0,1),\one_{C} = (1,0,0,1),

and

1C=12(1,1,1,1)+12(1,1,1,1).\one_{C} = \frac{1}{2}(1,1,1,1) + \frac{1}{2}(1,-1,-1,1).

The first component is a constant vector, so it belongs to V0V_0. The second component can be written as

(1,1,1,1)=(1,1)(1,1),(1,-1,-1,1) = (1, -1) \otimes (1, -1),

so it belongs to V2V_{2}, and the V1V_{1} component is 00. The squared norms of these components correspond to 1,0,11,0,1, respectively, so the fact that the dual distribution is (1,0,1)(1,0,1) can be seen directly from this projection decomposition, not only from the Krawtchouk transform.

From the association-scheme viewpoint, this calculation looks as follows. First, the distance distribution of pairs inside CC is measured by the adjacency matrices A0,A1,A2\mathsf{A}_0,\mathsf{A}_1,\mathsf{A}_2. Next, the characteristic vector 1C\one_C is projected onto the eigenspaces V0V_{0}, V1V_{1}, V2V_{2}, and we look at the sizes of those projected components. The transform coefficients connecting these two pieces of information are precisely the Krawtchouk values appearing in the table.

What Association Schemes Did in This Proof

Let us organise the role that association schemes played in the proof.

First, the association scheme turned the distance structure of Hamming space into a matrix algebra. From the relation of distance ii, we formed the adjacency matrix Ai\mathsf{A}_i, and considered the Bose–Mesner algebra spanned by these matrices. This algebra contains the distance structure of Hamming space in a compressed form.

Second, two bases of the Bose–Mesner algebra appeared. One is the adjacency-matrix basis A0,,An\mathsf{A}_{0}, \dots, \mathsf{A}_{n}, which represents distance relations. The other is the primitive-idempotent basis E0,,En\mathsf{E}_{0}, \dots, \mathsf{E}_{n}, which represents projections onto eigenspaces. The MacWilliams identity appeared as the change of basis between these two bases.

Third, the Krawtchouk polynomials appeared as eigenvalues. The adjacency matrix Aj\mathsf{A}_{j} acted on the eigenspace VwV_{w} as multiplication by Kj(w)\Kraw_j(w). Thus the Krawtchouk polynomials are the table recording the eigenvalues of the Hamming scheme.

Fourth, the weight distribution of a code appeared as an inner distribution. For a linear code CC, the inner distribution

ai(C)=1CAi1C,1Ca_{i}(C) = \frac{1}{\card{C}}\langle \mathsf{A}_{i} \one_{C}, \one_{C} \rangle

agreed with the usual weight distribution Ai(C)A_{i}(C).

Fifth, the weight distribution of the dual code appeared as a dual distribution. Using primitive idempotents, define

aj(C)=qnC2Ej1C,1C.a_{j}^{\ast}(C) = \frac{q^{n}}{\card{C}^{2}} \langle \mathsf{E}_{j} \one_{C}, \one_{C} \rangle.

For a linear code, aj(C)=Aj(C)a_{j}^{\ast}(C) = A_{j}(C^{\perp}).

Thus the point of this proof can be summarised in one sentence:

The MacWilliams identity is the change of basis, in the Bose–Mesner algebra of the Hamming scheme, between the adjacency-matrix basis and the primitive-idempotent basis.

Concepts Seen in This Instalment

In this instalment, while aiming at a proof of the MacWilliams identity, we introduced the basic tools of association schemes. They can be summarised as follows.

Relation partition

This is a partition of Ω×Ω\Omega\times \Omega, for a finite set Ω\Omega, into relations R0,R1,,RdR_{0}, R_{1}, \dots, R_{d}. In the Hamming scheme, the partition is according to whether the distance between two words is 0,1,,n0, 1, \dots, n.

Adjacency matrix

This is the 00-11 matrix representing a relation RiR_{i}. Turning relations into matrices allows us to use tools from linear algebra.

Association scheme

This is a situation where a relation partition is sufficiently regular and the intersection numbers pijkp_{ij}^{k} are defined. In matrix terms, it is the situation where the adjacency matrices are closed under multiplication.

Bose–Mesner algebra

This is the matrix algebra spanned by the adjacency matrices A0,,Ad\mathsf{A}_{0}, \dots, \mathsf{A}_{d}. In a symmetric association scheme, it is generated by real symmetric, mutually commuting adjacency matrices, so these adjacency matrices can be simultaneously diagonalised. In the language of abstract algebra, it is a commutative semisimple algebra.

Primitive idempotents

These are the projections onto the common eigenspaces of the Bose–Mesner algebra. In the Hamming scheme, they appear as the projections E0,,En\mathsf{E}_{0}, \dots, \mathsf{E}_{n} onto V0,,VnV_{0}, \dots, V_{n}.

Hamming scheme

This is the association scheme on Ω=FqE\Omega = \F_{q}^{E} in which relations are divided according to the Hamming distance between two words. It is the example most directly connected with coding theory.

Krawtchouk polynomials

These are the eigenvalues of the adjacency matrix Aj\mathsf{A}_{j} of the Hamming scheme on the eigenspace VwV_{w}. The generating function is

j=0nKj(w)zj=(1+(q1)z)nw(1z)w.\sum_{j=0}^{n} \Kraw_{j}(w) z^{j} = \bigl( 1 + (q - 1)z \bigr)^{n - w}(1 - z)^{w}.
Inner distribution

This is the distribution counting how often two points in a subset DΩD\subseteq \Omega are in each relation RiR_{i}. For a linear code, it agrees with the usual weight distribution.

Dual distribution

This is the distribution obtained by projecting the characteristic vector by the primitive idempotents and measuring the size of each eigenspace component. For a linear code, it agrees with the usual weight distribution of the dual code.

From the association-scheme viewpoint, a code is treated not merely as a linear subspace, but as a subset of the finite set Hamming space, which has a distance structure. The MacWilliams identity appears at the point where the distance distribution of that subset is connected to the eigenspace decomposition of the Bose–Mesner algebra.

Looking Back at This Proof Family

As stated at the beginning, the proof in this note belongs to the

orthogonal-polynomial and association-scheme family.

On the surface, it is a proof using adjacency matrices and the Bose–Mesner algebra built from the Hamming distance. At a deeper level, the distance structure of Hamming space forms a commutative matrix algebra, and its eigenvalue theory controls the transform of weight distributions.

When Krawtchouk polynomials are brought to the foreground, the MacWilliams identity is

a Krawtchouk transform of the weight distribution.

From the association-scheme viewpoint, one can understand this one step more deeply as

the Krawtchouk transform is the change of basis in the Bose–Mesner algebra of the Hamming scheme.

The advantage of this viewpoint is that it places the MacWilliams identity in a broader framework. Not only in the Hamming scheme, but also in many association schemes such as the Johnson schemes, the Grassmann schemes, and the dual polar schemes, notions such as inner distributions, dual distributions, eigenvalues, and linear-programming bounds appear in the same form. Delsarte's linear programming method in coding theory is also naturally derived from this association-scheme perspective. For this broader framework, see Delsarte [Del73] and Delsarte–Levenshtein [DL98].

The point of the proof in this note is the following sentence.

When the Bose–Mesner algebra of the complete graph coloured by Hamming distance is diagonalised, the Krawtchouk polynomials appear as its eigenvalues, and the MacWilliams identity is obtained as the formula connecting the relation-side distribution and the eigenspace-side distribution.

Next Time

Next time, we look at the MacWilliams identity from the viewpoint of moment identities.

In this proof, we transformed the weight distribution itself using the adjacency matrices and primitive idempotents of the Hamming scheme. Next time, instead of transforming the weight distribution directly, we first focus on moments such as

i=0nAi(C)irori=0nAi(C)(ir).\sum_{i=0}^{n} A_{i}(C) i^{r} \quad\text{or}\quad \sum_{i=0}^{n} A_{i}(C) \binom{i}{r}.

Moments of the weight distribution can be computed by counting pairs consisting of a codeword and a coordinate subset in two ways. From this, the Pless power moment identities appear. Collecting sufficiently many moment identities lets us recover the whole MacWilliams transform of the weight distribution.

Thus the protagonists next time will be

weight distributions → moments → double counting → the MacWilliams identity

Even for the same MacWilliams identity, we will see a counting-theoretic form quite different from the eigenvalue and matrix-algebra viewpoint of this note.

References

  1. [Del73] P. Delsarte. An algebraic approach to the association schemes of coding theory. Philips Res. Rep. Suppl., no. 10, pp. vi+97, 1973 ↩1 ↩2 ↩3
  2. [DL98] Philippe Delsarte and Vladimir I. Levenshtein. Association schemes and coding theory. IEEE Trans. Inform. Theory, vol. 44, no. 6, pp. 2477–2504, 1998. doi:10.1109/18.720545 ↩1 ↩2
  3. [BI84] Eiichi Bannai and Tatsuro Ito. Algebraic combinatorics. I. The Benjamin/Cummings Publishing Co., Inc., Menlo Park, CA, pp. xxiv+425, 1984
  4. [BCN89] A. E. Brouwer, A. M. Cohen, and A. Neumaier. Distance-regular graphs. Springer-Verlag, Berlin, vol. 18, pp. xviii+495, 1989. doi:10.1007/978-3-642-74341-2
  5. [MS77] F. J. MacWilliams and N. J. A. Sloane. The theory of error-correcting codes. I. North-Holland Publishing Co., Amsterdam-New York-Oxford, vol. Vol. 16, pp. i–xv and 1–369, 1977
  6. [Cam98] Paul Camion. Codes and association schemes: basic properties of association schemes relevant to coding. Handbook of coding theory, Vol. I, II, pp. 1441–1566, 1998 ↩1 ↩2
  7. [God93] C. D. Godsil. Algebraic combinatorics. Chapman & Hall, New York, pp. xvi+362, 1993

This series

A Series Learning through the MacWilliams Identity · Part 5 of 12

Back to series list

Disclaimer

Articles on this site are based on the operator's personal understanding, investigation, and research notes. I try to keep the content accurate, but it may contain errors or incomplete explanations. I do not guarantee its accuracy, completeness, usefulness, or currentness.

Please use the information on this site at your own judgment and responsibility. To the extent permitted by law, the operator is not liable for damages, losses, or disadvantages arising from using, or being unable to use, information on this site.

If you notice an error, unclear explanation, broken link, or insufficient citation, please contact the operator. I will review the content and, when appropriate, correct, update, or remove it.