Article and note
A Series Learning through the MacWilliams IdentityPart 5 of 12
An Introduction to Association Schemes through the MacWilliams Identity
- Published:
- Updated:
- Reading time:
- 42 min (about 9,141 words)
Introduction
One of the fundamental theorems in coding theory is the MacWilliams identity. Let be the coordinate set, let , and let be a linear code over the finite field . Consider its dual code
Here
First, define the weight enumerator of the linear code by
Also write the weight distribution as
Then
The MacWilliams identity is the formula saying that the weight enumerator of the dual code can be computed from the weight enumerator of as follows:
This is the MacWilliams identity.
In this series, I use proofs of the MacWilliams identity as a guide to an introduction to neighbouring areas and concepts. Accordingly, this series is aimed at readers who already know the following basic material in coding theory:
We assume familiarity, at least at a basic level, with
what a (finite) field is,
what a linear code over a finite field is,
what the Hamming weight is,
what the dual code is.
(There is no problem if you do not know a proof of the MacWilliams identity.)
This note does not assume Parts 1–4. Krawtchouk polynomials, tensor products, and additive characters appear along the way, but the parts needed in this note are explained when they are used. If you have read Part 4, the outlook will be clearer, but the note is written so that it can be read independently.
In this series, I look at proof methods for the MacWilliams identity under the following five broad families.
Fourier, character, and Poisson methods.
Möbius inversion, lattice-theoretic, and shortening/puncturing methods.
Orthogonal-polynomial and association-scheme methods.
Matroid and Tutte-polynomial methods.
Moment and double-counting methods.
In this note, I focus on the third of these: the orthogonal-polynomial and association-scheme approach. Written using Krawtchouk polynomials, the MacWilliams identity appears as
a Krawtchouk transform of the weight distribution.
Here the Krawtchouk transform means a transform of the form
Let me first say how to read the subscripts. If we write , this means the eigenvalue of the distance- adjacency matrix on the -th eigenspace. On the other hand, in the MacWilliams transform we write the dual-side weight as and the original weight as , so the coefficient appears in the form . It is the same Krawtchouk polynomial, but one has to read from context which subscript belongs to the distance side and which belongs to the eigenspace side. We will organise this point once more after introducing the eigenspaces of the Hamming scheme. In this note, I explain in the language of association schemes why these Krawtchouk polynomials are naturally attached to Hamming space. In other words, the aim of this note is to explain why the coefficients naturally appear as eigenvalues of the Hamming scheme. The goal of this instalment is to use a proof of the MacWilliams identity as a guide to an introduction to association schemes, especially the Hamming scheme and the Bose–Mesner algebra.
An association scheme abstracts the situation where binary relations on a finite set are divided into several types and these relations behave in a sufficiently regular way. In Hamming space, the relation between two words is classified according to which of
holds. Here is the Hamming distance. From these distance relations we form adjacency matrices; when we look at the commutative algebra spanned by them, Krawtchouk polynomials appear as its eigenvalues.
The route in this note is as follows.
relation partitions → adjacency matrices → association schemes → Bose–Mesner algebras → primitive idempotents → inner and dual distributions → the MacWilliams identity
The classical starting point for coding theory from the viewpoint of association schemes is Delsarte [Del73]. For a survey of the relation between association schemes and coding theory, including distance distributions, MacWilliams transforms, and linear programming, see Delsarte–Levenshtein [DL98]. For a systematic reference on association schemes, Bannai–Ito [BI84] is standard. For a detailed account including the relation with distance-regular graphs, see Brouwer–Cohen–Neumaier [BCN89]. For the MacWilliams identity and Krawtchouk polynomials in coding theory, MacWilliams–Sloane [MS77, Chapter 5] is also a classical standard reference. For basic material on association schemes used in coding theory, Camion [Cam98] is also useful.
Viewing Relations on a Finite Set as Matrices
Before entering association schemes, we look at how to represent relations on a finite set by matrices.
Let be a finite set. If we think of the elements of as vertices, then a subset can be viewed as a set of directed edges. The condition means that there is relation from to .
Definition 2.1.
For a relation on a finite set , define the matrix , whose rows and columns are indexed by , by
This is called the adjacency matrix of .
This definition generalises the adjacency matrix of a graph. For an ordinary simple graph, we record by and whether two vertices are joined by an edge. Here we are simply recording, in matrix form, whether a pair belongs to the relation .
Let be the set of all complex-valued functions on . Equivalently, one may view it as the complex vector space with basis vectors corresponding to the elements of . The matrix acts as a linear map on . For a function ,
Thus is the operator that “adds the values over all related to by ”.
Example 2.2 (The one-coordinate relation of a complete graph).
Let . Consider the two relations
The adjacency matrix of is the identity matrix , and the adjacency matrix of is . Here is the matrix all of whose entries are . This is the adjacency matrix of the complete graph .
In what follows, and denote the identity matrix and the all-one matrix of the size appropriate to the space on which they act. For instance, in the example above they are matrices indexed by . On the other hand, the and which later appear as operators on are matrices indexed by . Where necessary, I will specify whether an operator acts on one coordinate or on the full multi-coordinate space.
When we treat several relations at the same time, the important case is when they partition .
Definition 2.3.
Relations on a finite set form a relation partition of if
A relation partition corresponds to colouring the relation between any two points with exactly one colour. For this reason, association schemes are sometimes described as complete graphs with coloured edges. Strictly speaking, however, the diagonal pairs are also coloured with the colour , so it is natural to think of a complete graph with loops, whose edges have been coloured. Of course, arbitrary colouring is not enough. We require strong regularity in the colouring. That is the definition in the next section.
Association Schemes
In this note, we treat the most basic case, sufficient for coding theory: symmetric association schemes. There are non-symmetric association schemes in general, but Hamming schemes are symmetric, so it is clearest to begin with this case.
Definition 3.1.
Let be a finite set, and let be non-empty relations giving a relation partition
This partition is called a symmetric association scheme if it satisfies:
.
For each , if , then .
For any , if satisfy , then
is independent of the choice of and , and depends only on , , and .
Since each is assumed to be non-empty, there is no ambiguity in choosing in condition (A3).
The number appearing in condition (A3) is written and is called an intersection number.
In words, the definition means the following. Suppose that the relation between two points and is . Then the number of points which are in relation from the viewpoint of and in relation from the viewpoint of does not depend on the concrete choice of and , but only on the types of relations , , and . Thus we are requiring the colouring of the relations to be highly uniform.
In terms of adjacency matrices, this condition looks very natural. Write the adjacency matrix of as . Then the entry of the matrix product is
This counts the number of such that and . Therefore the association-scheme condition says that this product of matrices can again be written as a linear combination of the same adjacency matrices.
Theorem 3.2.
The adjacency matrices of a symmetric association scheme satisfy
Proof
Compare the entries of both sides. Suppose that . The entry of the left-hand side is the number of satisfying and . By the definition of an association scheme, this is equal to . On the other hand, since the relation containing is , the entry of the right-hand side is
Hence the entries of the two sides agree.
Thus the intersection-number condition guarantees that the operation “follow relation , then follow relation ” can again be expressed as a linear combination of the original relations . For this reason, a relation partition is not merely a classification; it produces a matrix algebra formed by the adjacency matrices.
Also, because the relations partition ,
Here is the all-one matrix. Moreover, since is the diagonal relation, .
Bose–Mesner Algebras
The adjacency matrices of an association scheme are closed under matrix multiplication. We therefore consider the vector space spanned by these matrices.
Definition 4.1.
Let be a symmetric association scheme, and let be its adjacency matrices. The complex vector space spanned by these matrices,
is called the Bose–Mesner algebra of the scheme.
By Theorem 3.2, is closed under matrix multiplication. It also has an identity element, since . Furthermore, since partition , the matrices have 's only in disjoint positions. Therefore, if
then by looking at an entry with we get . Hence are linearly independent, and . For a symmetric association scheme, the matrices are real symmetric matrices, and they also commute with one another. In general, two real symmetric matrices need not commute. The commutativity here follows because, by Theorem 3.2,
is a linear combination of symmetric matrices, and is therefore symmetric. Indeed,
Thus is a commutative complex matrix algebra spanned by the real symmetric, mutually commuting adjacency matrices . These adjacency matrices can be simultaneously diagonalised in the sense of linear algebra. From this simultaneous diagonalisation one obtains another basis of the Bose–Mesner algebra: the primitive idempotents. We now verify this fact using only finite-dimensional linear algebra.
Theorem 4.2.
In the Bose–Mesner algebra of a symmetric association scheme, there exist matrices such that:
().
.
.
form a basis of .
Each acts as a scalar multiple on the image of each .
No can be decomposed further, inside , as a non-trivial sum of orthogonal idempotents.
These are called the primitive idempotents.
Proof
As seen above, are mutually commuting real symmetric matrices. Hence, by the simultaneous diagonalisation theorem from linear algebra, decomposes as an orthogonal direct sum of their common eigenspaces.
Let denote the set of all common eigenvalue tuples. Thus an element is a tuple of the form
for which the corresponding common eigenspace
is non-zero. Simultaneous diagonalisation gives the orthogonal direct-sum decomposition
For each , write for the orthogonal projection onto .
Basic properties of the projections
From the orthogonal direct-sum decomposition,
holds. Also, since each acts as multiplication by on ,
Next we show that each actually belongs to .
That
Fix . If and , then the common eigenvalue tuples are different, so there exists an index such that . Consider the polynomial
Substitute the commuting matrices into this polynomial. On , each acts as multiplication by , so acts as multiplication by on . On the other hand, if , then on the factor corresponding to in the product becomes . Hence acts as multiplication by on .
Therefore
The left-hand side is built from sums and products of . Since is closed under matrix multiplication, it follows that .
We next determine the number of elements of .
The number of common eigenvalue tuples is
Let be the number of elements of . First, as shown above, the are non-zero orthogonal idempotents belonging to . In particular, they are linearly independent. Indeed, if
then restricting both sides to gives . Hence
We prove the opposite inequality. Any can be written in the form
Then on , acts as multiplication by . Thus the action of is completely determined by its eigenvalue on each .
Define a map by
If , then acts as on every . By the orthogonal direct-sum decomposition
this means that is the zero matrix. Thus is injective, and
Therefore
Since we already saw that are linearly independent,
Hence .
We can therefore number the elements of as . Define
Checking the listed properties
Since is an orthogonal projection, . Since these are projections onto distinct common eigenspaces,
holds. Since the common eigenspaces sum to the whole space,
also holds.
Moreover, are linearly independent elements of . Since , they form a basis of .
Finally, each acts on as multiplication by . Thus each acts as a scalar multiple on the image of each .
It remains to check that these idempotents are primitive inside .
Primitivity
Suppose that for some , is a decomposition as a sum of two orthogonal idempotents belonging to . Thus
Since and belong to , they act as scalar multiples on each common eigenspace . Write for the eigenvalue of on , and for the eigenvalue of on . Since and are idempotents,
Hence .
On the other hand, . If , then acts as on , so . Since ,
If , then acts as on , so . Hence
It follows that one of and acts as only on , and as on all other common eigenspaces. Thus one of them is . The other acts as on every common eigenspace, and is therefore the zero matrix.
Consequently
Thus cannot be decomposed further, inside , as a non-trivial sum of orthogonal idempotents.
This proves the theorem.
First let us check that this numbering is natural. In a symmetric association scheme, every adjacency matrix has constant row sum. Indeed, since for , the definition of the intersection number gives
By symmetry, is equivalent to , so
The right-hand side is the sum of the -row of , and the intersection number does not depend on . Thus the row sum of is constant. Therefore the space of constant functions is a common eigenspace for all the .
From now on, we number the projection onto the one-dimensional space of constant functions as . This projection is
Indeed, since is the matrix all of whose entries are , is the orthogonal projection onto the space of constant functions. Also,
so this projection also belongs to .
An idempotent is an element whose square is itself. A general idempotent matrix represents a projection along some direct-sum decomposition. The primitive idempotents here are also real symmetric, or self-adjoint, and hence are orthogonal projections with respect to the standard inner product. Thus primitive idempotents are the orthogonal projections onto the natural eigenspaces seen by the Bose–Mesner algebra. Here “primitive” means that, inside the Bose–Mesner algebra, the idempotent cannot be decomposed further as a non-trivial sum of orthogonal idempotents. It does not mean that the matrix has rank . Indeed, common eigenspaces need not be one-dimensional in general.
The adjacency matrices and the primitive idempotents are both bases of the Bose–Mesner algebra. Therefore each can be expressed as a linear combination of the other. We will write the concrete coefficients in the section on the Hamming scheme.
Thus, in an association scheme, we move between the following two bases.
basis representing relations: ↔ basis representing projections onto eigenspaces: .
In the association-scheme proof of the MacWilliams identity, this passage between the “relation side” and the “eigenspace side” is essential. The weight distribution of a code is defined on the relation side, while the dual distribution is defined by projection onto the eigenspace side. For a readable reference on the linear-algebraic treatment of Bose–Mesner algebras and primitive idempotents discussed in this section, see also Godsil [God93].
The Hamming Scheme
We now look at the Hamming scheme, the most important association scheme appearing in coding theory.
Let be a finite coordinate set, and let . This is a different symbol from the primitive idempotents which appear later; the fonts distinguish them. Let the vertex set be . Thus the vertices are -ary words of length . For two words , define the Hamming distance by
Definition 5.1.
For , define
The association scheme obtained from this relation partition is called the Hamming scheme, and is denoted by .
Write for the adjacency matrix of . Thus is the matrix joining two words whose distance is .
Theorem 5.2.
is a symmetric association scheme.
Proof
is the diagonal relation, and since the Hamming distance is symmetric, each is symmetric. Also, any two words have exactly one distance among , so partition . Each is non-empty as well. Indeed, take the zero word and a word which is non-zero in exactly coordinates; then .
We check the intersection-number condition. Suppose . We want to count the number of satisfying
Since , there are coordinates with , and coordinates with .
The contribution at each coordinate is summarised in the following table, where and denote the one-coordinate contributions to and :
| case | choices | |
|---|---|---|
Recording the contribution to by and the contribution to by , the first two rows show that the one-coordinate generating function for a coordinate with is
and the last three rows show that the one-coordinate generating function for a coordinate with is
Since the choices at different coordinates are independent, the total number is given by the product of these one-coordinate generating functions. Here the exponent of records the contribution to , and the exponent of records the contribution to . Thus the number of satisfying and can be written, for example, as
Here denotes extraction of the coefficient of in the expanded polynomial. When , the term expresses that there is no option “different from both”. This depends only on , , , , and , not on the concrete values of and . Hence the intersection-number condition holds.
In this proof we did not use an explicit formula for the intersection numbers. The important point is that the counting is governed not by the concrete positions of the two points and , but only by their distance . This is the homogeneity of Hamming space.
The row sum of is the number of words at distance from a fixed word. It is
This is called the valency of the relation . Indeed, there are ways to choose the coordinates at distance , and in each such coordinate there are choices different from the original value.
Eigenspaces of the Hamming Scheme
To understand the Bose–Mesner algebra of the Hamming scheme, we need to simultaneously diagonalise the adjacency matrices . This is where Krawtchouk polynomials appear.
We begin with one coordinate. Let the one-coordinate function space be . Inside there is the one-dimensional subspace consisting of constant functions. Also define
the subspace of functions whose values sum to . With respect to the standard inner product, we have the orthogonal decomposition .
Let be the one-coordinate adjacency matrix for “moving to a different value”. This is the adjacency matrix of the complete graph . The matrix acts on with eigenvalue , and on with eigenvalue . Indeed, on constant functions, from each point there are other points to go to, so the eigenvalue is . On the other hand, if , then
For this note, it is enough to think of the tensor product as the operation which makes a function of many variables by multiplying one-variable functions over the coordinates, and which lets one-coordinate operators act independently on each coordinate. No general theory of tensor products is needed to understand this article.
The multi-coordinate space is
If for each we choose a one-coordinate function , then the pure tensor represents the function
Also, the tensor product of one-coordinate operators should be understood as applying independently to the component in the -th coordinate. For example, when , the tensor represents the two-variable function
Also, acts by
so it is the operator which applies to the first coordinate and to the second coordinate. For a general , this is simply extended to all coordinates. Decomposing each coordinate as gives the orthogonal decomposition
Here denotes the set of coordinates in which we have chosen the non-constant direction.
Definition 6.1.
For , define
Then
For each direct-summand component with , the dimension is , and there are such sets . Thus
Intuitively, is the component with non-constant direction in coordinates. This decomposition will be the eigenspace decomposition of the Hamming scheme.
Next consider the generating function collecting the distance matrices . The and on the right-hand side are operators on the one-coordinate space . On the other hand, the on the left-hand side are operators on the multi-coordinate space . Taking a tensor product extends the one-coordinate action independently to all coordinates. In one coordinate, keeping the same value is the action , and moving to a different value is the action . Therefore, on the multi-coordinate space,
holds. Indeed, when the right-hand side is expanded, the coefficient of is exactly the operator which changes the value in precisely coordinates and keeps the remaining coordinates unchanged.
Definition 6.2.
For , define numbers by
These are called the values of the Krawtchouk polynomials associated with the -ary Hamming scheme.
Expanding the generating function gives
Here binomial coefficients outside their range are interpreted as .
This definition may look as though we are introducing Krawtchouk polynomials as polynomials defined by a generating function, but the meaning here is slightly different. Equation (6.2) is the eigenvalue generating function for the adjacency matrices of the Hamming scheme.
Theorem 6.3.
Let . The adjacency matrix of the Hamming scheme acts on the eigenspace as multiplication by .
Proof
An element of is a sum of tensors which have the direction in coordinates and the direction in the remaining coordinates. The one-coordinate matrix acts by on and by on . Therefore acts on as multiplication by
Comparing the coefficient of , the eigenvalue of is .
This theorem gives the following interpretation of the Krawtchouk polynomials.
A Krawtchouk polynomial is the eigenvalue of the distance- adjacency matrix of the Hamming scheme on the -th eigenspace .
As mentioned earlier, let us now organise the roles of the subscripts. The subscript of is the distance-side index, while the subscript of , and later of , is the eigenspace-side index. Thus denotes the eigenvalue of the distance- adjacency matrix on the -th eigenspace . On the other hand, when we later write
is on the eigenspace side and is on the distance side. In the Hamming scheme, these two sides are connected by Krawtchouk polynomials.
In particular, when ,
which takes distinct values for . Thus are already distinguished by alone. Consequently, they are the common eigenspaces for the Bose–Mesner algebra of the Hamming scheme, and the corresponding orthogonal projections are the primitive idempotents. Indeed, since the eigenvalues of on are distinct, Lagrange interpolation gives
This formula extracts the projection onto as a polynomial in . Therefore belongs to the Bose–Mesner algebra. In this sense, for the Hamming scheme, primitive idempotents can be understood explicitly without relying on the general theory of semisimple algebras.
This viewpoint is the heart of the association-scheme perspective. The polynomial does not come first; rather, when one simultaneously diagonalises the matrix algebra built from the distance relations, the Krawtchouk polynomials appear as its eigenvalues.
Writing the Primitive Idempotents Explicitly
Write for the orthogonal projection onto . Then are the primitive idempotents of the Bose–Mesner algebra of the Hamming scheme. That is,
and each acts on the image of as multiplication by . The image is exactly , and
Thus the primitive idempotents of the Hamming scheme are also generally not of rank .
We can write this projection explicitly as a linear combination of the adjacency matrices . For this, look at the one-coordinate projections. Let be the orthogonal projection onto , and let be the orthogonal projection onto . Then
In components, for ,
For , consider the projection obtained by applying in the coordinates of and in the remaining coordinates. Summing all such projections with gives . When we compute its entries, Krawtchouk polynomials appear again.
Theorem 7.1.
The primitive idempotents of the Hamming scheme can be written as
Proof
Take , and suppose . We compute the entry of . The projection is the sum, over subsets with , of the projections which apply in the coordinates of and in the remaining coordinates.
Putting these together as a generating function,
If , then
If , then
Since and differ in coordinates and agree in coordinates,
Hence
This means that the entry is at positions of distance . On the other hand,
is the matrix which places exactly in the entries of distance , since has exactly in positions of distance . Therefore
All entries agree, so the matrices are equal.
This formula is very important. The left-hand side is a projection onto an eigenspace. The right-hand side is a linear combination of adjacency matrices representing distance relations. Thus the Krawtchouk polynomials are
the coefficients connecting the relation-side basis and the eigenspace-side basis .
Let us relate this to the general theory. For a general association scheme, when we write
the coefficients are called the first eigenmatrix and the second eigenmatrix , respectively. In the notation for the Hamming scheme used in this note,
so this corresponds to and .
In the Hamming scheme, the same type of Krawtchouk polynomial appears in the two directions of this change of basis. However, the simple symmetry does not hold in general. More precisely, using the valency , we have the weighted symmetry
Inner Distributions: Viewing Codes from the Relation Side
We now return to codes. In an association scheme, for a subset of the finite set , we count which relations connect pairs of points inside that subset. This is the inner distribution. Inner and dual distributions are fundamental quantities in Delsarte's association-scheme approach to coding theory [Del73; Cam98].
Let us first organise some similar notation. is the adjacency matrix for distance , is the number of codewords of weight in a linear code , and is the inner distribution of a subset .
Definition 8.1.
For a subset , define its characteristic vector by
We use the standard inner product
Using the adjacency matrix , the number of ordered pairs in at distance is .
Definition 8.2.
Let be a non-empty subset. The inner distribution of is defined by
With this normalisation, . Indeed, since ,
Also,
so with this normalisation it is not a probability distribution. It is useful to think of as the average number of points of which are in relation to a fixed point of . In the Hamming scheme, this reads as the average number of points of at distance .
For a linear code, the inner distribution agrees with the usual weight distribution.
Theorem 8.3.
Let be a linear code. Write the usual weight distribution as . Then .
Proof
The quantity is the number of ordered pairs satisfying
Since is a linear code, the difference again belongs to . Also, .
Fix a codeword of weight . The pairs satisfying are obtained by choosing arbitrarily and putting , so there are exactly such pairs. Since there are codewords of weight ,
Hence .
This theorem shows that the language of association schemes contains the weight distribution of coding theory. For a general subset , is the distance distribution inside . For a linear code, taking differences returns us to the usual weight distribution.
Dual Distributions: Viewing Codes from the Eigenspace Side
The inner distribution was defined using the adjacency matrices . This is information on the relation side. In an association scheme, there is another distribution, defined using the primitive idempotents . This is the dual distribution.
Definition 9.1.
Let be a non-empty subset. The dual distribution of is defined by
Here
is the squared norm of the component of the characteristic vector . If we used this quantity as it stands, its scale would not match the weight distribution of the dual code in the case of a linear code. Thus we multiply by the factor , normalising it so that when is a linear code.
With this normalisation, . Indeed, the projection onto the space of constant functions is , so
Also,
Therefore the dual distribution is not a probability distribution with this normalisation either. For a linear code , this corresponds to and agrees with the sum of the weight distribution of the dual code. For a general subset , there is no dual code in the usual sense. Nevertheless, the distribution seen from the eigenspace side of the Bose–Mesner algebra can still be defined. Non-negativity follows from the fact that is an orthogonal projection.
As a warning, for a general subset , the number need not be an integer, and it is not directly counting the elements of anything. It is the size of the eigenspace component of the characteristic vector , normalised so that, in the case of a linear code, it agrees with the weight distribution of the dual code.
This definition may look a little abstract. But its meaning is simple. We decompose the characteristic vector into the eigenspaces of the Hamming scheme, and look at the size of each component. Since is the orthogonal projection onto ,
Thus the dual distribution is non-negative.
For a general subset , is the dual distribution in Delsarte's sense. For a linear code, it is genuinely the weight distribution of the dual code. This fact connects the association-scheme proof with ordinary coding theory.
In the next theorem we use additive characters just once. The purpose is to check that the dual distribution defined in the association scheme agrees, for a linear code, with the usual weight distribution of the dual code . Thus what is needed here is not the whole theory of characters, but only the basic orthogonality relation for additive characters of finite fields.
Theorem 9.2.
Let be a linear code. Then .
Proof
Fix a non-trivial additive character of the finite field. For example, when ,
gives a non-trivial additive character. However, we will not enter the details of this construction. In what follows, we use only the basic orthogonality relation for a non-trivial additive character:
For , put
This decomposes coordinatewise as
Thus, in one coordinate, if then the function lies in the constant direction, and if then it is naturally seen to lie in the direction where the sum of values is . This is why, later, corresponds to . We also use the fact that additive characters have complex absolute value , so
After normalisation, these functions form an orthonormal basis of . Indeed, for ,
If , this product is . If , then for some coordinate , and the corresponding factor is . Hence
is an orthonormal basis of .
In one coordinate, when , the function is constant, and when , the sum of its values is . Therefore belongs to if and only if . Thus is the projection onto the span of the characters of weight .
We now compute the norm of the weight- component of . By Parseval's identity for the orthonormal basis,
Here
The right-hand side is , where
If , then for all , so
On the other hand, if , then is a non-zero -linear map on and is therefore surjective onto . Each value occurs exactly times. Thus
Therefore
Since , the definition of the dual distribution gives
Characters appeared only briefly in this theorem. Their role was to check that, in the case of a linear code, the eigenspace side of the Hamming scheme is seeing the usual dual code. From the association-scheme viewpoint, first appears as the dual distribution. For a linear code, this dual distribution agrees with the usual weight distribution of the dual code.
The Coefficient-Level MacWilliams Identity
With the preparation so far, the MacWilliams identity is almost proved. First we write a transform formula for a general subset.
Proposition 10.1.
Let be a non-empty subset. Then
holds.
Proof
By the definition of the dual distribution,
Substituting Theorem 7.1 and using , we get
For a linear code , by Theorem 8.3 and Theorem 9.2,
Thus this general formula is exactly the coefficient-level MacWilliams identity.
Theorem 10.2 (The Coefficient-Level MacWilliams Identity).
Let be a linear code. Put and . Then
holds.
Proof
The structure of this proof has three stages.
The weight distribution is expressed as the relation-side quantity
The weight distribution of the dual code is expressed as the eigenspace-side quantity
Expanding as a linear combination of the , Krawtchouk polynomials appear as the coefficients.
Thus the MacWilliams identity appears as
the change-of-basis formula between the two bases of the Bose–Mesner algebra: the adjacency-matrix basis and the primitive-idempotent basis.
The Polynomial Form of the MacWilliams Identity
We now derive the usual polynomial form of the MacWilliams identity from the coefficient-level formula.
Theorem 11.1 (MacWilliams Identity).
Let be a linear code. Then
holds.
Proof
Put and . By Theorem 10.2,
In the generating function for Krawtchouk polynomials,
multiply both sides by and read as . This gives
This is a calculation of polynomial identities; we are not dividing by as a number. Therefore
This proves the MacWilliams identity.
In this proof, the change of variables
came from the eigenvalue generating function of the Hamming scheme. Thus the change of variables is not merely an algebraic accident: it packages the eigenvalues of the matrices representing the distance relations.
A Small Example: and the Binary Repetition Code
Finally, we check the association-scheme viewpoint in a small example. Let , and consider the Hamming scheme . The vertices are
The relations are divided by distance as
If the vertices are ordered as , the corresponding adjacency matrices are
For and , the Krawtchouk table is
This is the table of eigenvalues of on the eigenspaces . The row index is on the distance side, and the column index is on the eigenspace side.
Consider the binary repetition code
This code is self-dual. Its weight distribution is
The coefficient-level MacWilliams identity says
Here , , and , so
Using the values in the table above,
Therefore
which indeed agrees with the weight distribution of . In association-scheme terms, this says that
The latter is the dual distribution obtained from the sizes of the components of after projecting it to . Indeed, with vertex order ,
and
The first component is a constant vector, so it belongs to . The second component can be written as
so it belongs to , and the component is . The squared norms of these components correspond to , respectively, so the fact that the dual distribution is can be seen directly from this projection decomposition, not only from the Krawtchouk transform.
From the association-scheme viewpoint, this calculation looks as follows. First, the distance distribution of pairs inside is measured by the adjacency matrices . Next, the characteristic vector is projected onto the eigenspaces , , , and we look at the sizes of those projected components. The transform coefficients connecting these two pieces of information are precisely the Krawtchouk values appearing in the table.
What Association Schemes Did in This Proof
Let us organise the role that association schemes played in the proof.
First, the association scheme turned the distance structure of Hamming space into a matrix algebra. From the relation of distance , we formed the adjacency matrix , and considered the Bose–Mesner algebra spanned by these matrices. This algebra contains the distance structure of Hamming space in a compressed form.
Second, two bases of the Bose–Mesner algebra appeared. One is the adjacency-matrix basis , which represents distance relations. The other is the primitive-idempotent basis , which represents projections onto eigenspaces. The MacWilliams identity appeared as the change of basis between these two bases.
Third, the Krawtchouk polynomials appeared as eigenvalues. The adjacency matrix acted on the eigenspace as multiplication by . Thus the Krawtchouk polynomials are the table recording the eigenvalues of the Hamming scheme.
Fourth, the weight distribution of a code appeared as an inner distribution. For a linear code , the inner distribution
agreed with the usual weight distribution .
Fifth, the weight distribution of the dual code appeared as a dual distribution. Using primitive idempotents, define
For a linear code, .
Thus the point of this proof can be summarised in one sentence:
The MacWilliams identity is the change of basis, in the Bose–Mesner algebra of the Hamming scheme, between the adjacency-matrix basis and the primitive-idempotent basis.
Concepts Seen in This Instalment
In this instalment, while aiming at a proof of the MacWilliams identity, we introduced the basic tools of association schemes. They can be summarised as follows.
- Relation partition
This is a partition of , for a finite set , into relations . In the Hamming scheme, the partition is according to whether the distance between two words is .
- Adjacency matrix
This is the - matrix representing a relation . Turning relations into matrices allows us to use tools from linear algebra.
- Association scheme
This is a situation where a relation partition is sufficiently regular and the intersection numbers are defined. In matrix terms, it is the situation where the adjacency matrices are closed under multiplication.
- Bose–Mesner algebra
This is the matrix algebra spanned by the adjacency matrices . In a symmetric association scheme, it is generated by real symmetric, mutually commuting adjacency matrices, so these adjacency matrices can be simultaneously diagonalised. In the language of abstract algebra, it is a commutative semisimple algebra.
- Primitive idempotents
These are the projections onto the common eigenspaces of the Bose–Mesner algebra. In the Hamming scheme, they appear as the projections onto .
- Hamming scheme
This is the association scheme on in which relations are divided according to the Hamming distance between two words. It is the example most directly connected with coding theory.
- Krawtchouk polynomials
These are the eigenvalues of the adjacency matrix of the Hamming scheme on the eigenspace . The generating function is
- Inner distribution
This is the distribution counting how often two points in a subset are in each relation . For a linear code, it agrees with the usual weight distribution.
- Dual distribution
This is the distribution obtained by projecting the characteristic vector by the primitive idempotents and measuring the size of each eigenspace component. For a linear code, it agrees with the usual weight distribution of the dual code.
From the association-scheme viewpoint, a code is treated not merely as a linear subspace, but as a subset of the finite set Hamming space, which has a distance structure. The MacWilliams identity appears at the point where the distance distribution of that subset is connected to the eigenspace decomposition of the Bose–Mesner algebra.
Looking Back at This Proof Family
As stated at the beginning, the proof in this note belongs to the
orthogonal-polynomial and association-scheme family.
On the surface, it is a proof using adjacency matrices and the Bose–Mesner algebra built from the Hamming distance. At a deeper level, the distance structure of Hamming space forms a commutative matrix algebra, and its eigenvalue theory controls the transform of weight distributions.
When Krawtchouk polynomials are brought to the foreground, the MacWilliams identity is
a Krawtchouk transform of the weight distribution.
From the association-scheme viewpoint, one can understand this one step more deeply as
the Krawtchouk transform is the change of basis in the Bose–Mesner algebra of the Hamming scheme.
The advantage of this viewpoint is that it places the MacWilliams identity in a broader framework. Not only in the Hamming scheme, but also in many association schemes such as the Johnson schemes, the Grassmann schemes, and the dual polar schemes, notions such as inner distributions, dual distributions, eigenvalues, and linear-programming bounds appear in the same form. Delsarte's linear programming method in coding theory is also naturally derived from this association-scheme perspective. For this broader framework, see Delsarte [Del73] and Delsarte–Levenshtein [DL98].
The point of the proof in this note is the following sentence.
When the Bose–Mesner algebra of the complete graph coloured by Hamming distance is diagonalised, the Krawtchouk polynomials appear as its eigenvalues, and the MacWilliams identity is obtained as the formula connecting the relation-side distribution and the eigenspace-side distribution.
Next Time
Next time, we look at the MacWilliams identity from the viewpoint of moment identities.
In this proof, we transformed the weight distribution itself using the adjacency matrices and primitive idempotents of the Hamming scheme. Next time, instead of transforming the weight distribution directly, we first focus on moments such as
Moments of the weight distribution can be computed by counting pairs consisting of a codeword and a coordinate subset in two ways. From this, the Pless power moment identities appear. Collecting sufficiently many moment identities lets us recover the whole MacWilliams transform of the weight distribution.
Thus the protagonists next time will be
weight distributions → moments → double counting → the MacWilliams identity
Even for the same MacWilliams identity, we will see a counting-theoretic form quite different from the eigenvalue and matrix-algebra viewpoint of this note.
References
- [Del73] P. Delsarte. An algebraic approach to the association schemes of coding theory. Philips Res. Rep. Suppl., no. 10, pp. vi+97, 1973 ↩1 ↩2 ↩3
- [DL98] Philippe Delsarte and Vladimir I. Levenshtein. Association schemes and coding theory. IEEE Trans. Inform. Theory, vol. 44, no. 6, pp. 2477–2504, 1998. doi:10.1109/18.720545 ↩1 ↩2
- [BI84] Eiichi Bannai and Tatsuro Ito. Algebraic combinatorics. I. The Benjamin/Cummings Publishing Co., Inc., Menlo Park, CA, pp. xxiv+425, 1984 ↩
- [BCN89] A. E. Brouwer, A. M. Cohen, and A. Neumaier. Distance-regular graphs. Springer-Verlag, Berlin, vol. 18, pp. xviii+495, 1989. doi:10.1007/978-3-642-74341-2 ↩
- [MS77] F. J. MacWilliams and N. J. A. Sloane. The theory of error-correcting codes. I. North-Holland Publishing Co., Amsterdam-New York-Oxford, vol. Vol. 16, pp. i–xv and 1–369, 1977 ↩
- [Cam98] Paul Camion. Codes and association schemes: basic properties of association schemes relevant to coding. Handbook of coding theory, Vol. I, II, pp. 1441–1566, 1998 ↩1 ↩2
- [God93] C. D. Godsil. Algebraic combinatorics. Chapman & Hall, New York, pp. xvi+362, 1993 ↩
This series
A Series Learning through the MacWilliams Identity · Part 5 of 12
Disclaimer
Articles on this site are based on the operator's personal understanding, investigation, and research notes. I try to keep the content accurate, but it may contain errors or incomplete explanations. I do not guarantee its accuracy, completeness, usefulness, or currentness.
Please use the information on this site at your own judgment and responsibility. To the extent permitted by law, the operator is not liable for damages, losses, or disadvantages arising from using, or being unable to use, information on this site.
If you notice an error, unclear explanation, broken link, or insufficient citation, please contact the operator. I will review the content and, when appropriate, correct, update, or remove it.