Back to resources

Article and note

A Series Learning through the MacWilliams IdentityPart 4 of 12

An Introduction to Krawtchouk Polynomials through the MacWilliams Identity

Among the five proof systems for the MacWilliams identity, this note focuses on the orthogonal-polynomial and association-scheme approach, and introduces weight distributions, Krawtchouk polynomials, generating functions, orthogonality, and the Krawtchouk transform.
Published:
Updated:
Reading time:
22 min (about 4,790 words)
Tagscoding theoryMacWilliams identityKrawtchouk polynomialsorthogonal polynomialsweight distributionfinite fieldsexpository note
Download PDF

Introduction

In coding theory, one of the basic theorems is the MacWilliams identity. Let EE be the coordinate set, let n=En=\card{E}, and let CFqEC \leq \mathbb{F}_{q}^{E} be a linear code over the finite field Fq\mathbb{F}_{q}. For its dual code

C={uFqE:uc=0 for all cC},C^{\perp} = \{ u \in \mathbb{F}_{q}^{E} : u \cdot c = 0 \text{ for all } c \in C \},

where

uc=eEuece,u\cdot c=\sum_{e\in E}u_e c_e,

the weight enumerator of the dual code can be computed from the weight enumerator of CC as follows:

WC(X,Y)=1CWC(X+(q1)Y,XY).W_{C^{\perp}}(X,Y) = \frac{1}{\card{C}} W_{C}\bigl(X + (q - 1)Y, X - Y \bigr).

This is the MacWilliams identity.

In this series, I use proofs of the MacWilliams identity as a guide to an introduction to neighbouring areas and concepts. Accordingly, this series is aimed at readers who already know at least the following basic material:

We assume that the reader already knows, at least at a basic level, the following notions:

  • what a (finite) field is,

  • what a linear code over a finite field is,

  • what the Hamming weight is,

  • what the dual code is.

(There is no problem if you do not know a proof of the MacWilliams identity.)

In this series, I look at proof methods for the MacWilliams identity under the following five broad families.

  1. Fourier, character, and Poisson methods.

  2. Möbius inversion, lattice theory, and shortening/puncturing methods.

  3. Orthogonal polynomials and association schemes.

  4. Matroids and Tutte polynomials.

  5. Moments and double-counting methods.

In this note, I focus on the third of these: the orthogonal-polynomial and association-scheme approach. However, this note does not yet enter the full theory of association schemes. Using character sums only as a minimal tool, we reread the MacWilliams identity as a Krawtchouk transform on weight distributions. This is an entrance to the orthogonal-polynomial and association-scheme viewpoint. The aim of this note is to use a proof of the MacWilliams identity as a guide to an introduction to Krawtchouk polynomials.

The MacWilliams identity is usually written as the polynomial identity

WC(X,Y)=1CWC(X+(q1)Y,XY).W_{C^{\perp}}(X,Y) = \frac{1}{\card{C}} W_C\bigl(X+(q-1)Y,X-Y\bigr).

In this note, we look at this identity at the coefficient level. Write the weight distribution of a code CC as

Aw(C)={cC:wt(c)=w}(0wn).A_{w}(C) = \card{\{ c \in C: \wt(c) = w \}} \qquad (0 \leq w \leq n).

Then the weight enumerator is

WC(X,Y)=w=0nAw(C)XnwYw.W_{C}(X, Y) = \sum_{w = 0}^{n} A_{w}(C) X^{n - w} Y^w.

The MacWilliams identity is also a transform which computes the weight distribution vector

(A0(C),A1(C),,An(C))(A_{0}(C^{\perp}), A_{1}(C^{\perp}), \dots, A_{n}(C^{\perp}))

of the dual code from the weight distribution vector

(A0(C),A1(C),,An(C))(A_{0}(C), A_{1}(C), \dots, A_{n}(C))

of CC. The coefficients appearing in this transform are the Krawtchouk polynomials.

The route in this note is as follows.

  • Look at weight distributions.

  • Define Krawtchouk polynomials by a generating function.

  • Become familiar with the explicit formula and small examples.

  • See their meaning as character sums.

  • Check orthogonality and the Krawtchouk transform.

  • Derive the coefficient-level MacWilliams identity.

  • Return to the usual polynomial form.

Krawtchouk polynomials are a representative example of discrete orthogonal polynomials. In the classical world of orthogonal polynomials, Hermite polynomials, Laguerre polynomials, Jacobi polynomials, and so on are well known. These have orthogonality with respect to integrals over continuous intervals. On the other hand, Krawtchouk polynomials are orthogonal with respect to sums over the finite set {0,1,,n}\{0,1,\dots,n\}. This feature, being “orthogonal polynomials on a finite set”, fits very well with the Hamming weight.

The original paper on Krawtchouk polynomials is Krawtchouk [Kra29]. For their systematic place as orthogonal polynomials, Koekoek–Lesky–Swarttouw [KLS10, Chapter 9] is a standard reference. For Krawtchouk polynomials and the MacWilliams identity in coding theory, MacWilliams–Sloane [MS77, Chapter 5] is the classical standard reference.

From weight enumerators to weight distributions

First, we reinterpret the weight enumerator as a coefficient vector. Let EE be a finite set and let n=En=\card{E}. For a word cFqEc\in\mathbb{F}_{q}^{E}, write its Hamming weight as

wt(c)={eE:ce0}.\wt(c)=\card{\{e\in E:c_e\neq 0\}}.

Definition 2.1.

For a linear code CFqEC\leq \mathbb{F}_{q}^{E}, define

Aw(C)={cC:wt(c)=w}(0wn).A_w(C) = \card{\{c\in C:\wt(c)=w\}} \qquad (0\leq w\leq n).

The sequence

(A0(C),A1(C),,An(C))(A_0(C),A_1(C),\dots,A_n(C))

is called the weight distribution of CC.

Using this notation, the weight enumerator can be written as

WC(X,Y)=w=0nAw(C)XnwYw.W_C(X,Y) = \sum_{w=0}^{n}A_w(C)X^{n-w}Y^w.

The weight enumerator packages the weight distribution as a single polynomial in two variables.

In order to write the MacWilliams identity at the coefficient level, write the weight distribution of the dual code as

BjAj(C)(0jn).B_j \coloneqq A_j(C^{\perp}) \qquad (0\leq j\leq n).

The aim is to express BjB_j in terms of the numbers Aw(C)A_w(C). The form will be

Bj=1Cw=0nAw(C)Kj(w).B_{j} = \frac{1}{\card{C}} \sum_{w=0}^{n}A_w(C)\Kraw_j(w).

Here Kj(w)\Kraw_j(w) is a value of a Krawtchouk polynomial.

Thus, in the proof in this note, we view the MacWilliams identity as

a linear transform on the weight distribution vector.

The entries of the matrix describing this linear transform are the Krawtchouk polynomials.

Definition of Krawtchouk polynomials

We now define Krawtchouk polynomials. In this note, we use the normalisation adapted to the qq-ary Hamming space which naturally appears in coding theory. There are several conventions in the literature for the placement of indices and for normalisation, but here we use the following definition.

Definition 3.1.

Fix nn and qq. For 0wn0\leq w\leq n, define the numbers K0(w),K1(w),,Kn(w)\Kraw_0(w),\Kraw_1(w),\dots,\Kraw_n(w) by the generating function

j=0nKj(w)zj=(1+(q1)z)nw(1z)w.(3.1)\sum_{j=0}^{n}\Kraw_{j}(w)z^{j} = \bigl( 1 + (q - 1)z \bigr)^{n - w}(1 - z)^{w}. \tag{3.1}

That is, Kj(w)\Kraw_j(w) is the coefficient of zjz^j when the right-hand side is expanded as a polynomial in zz. By the explicit formula, this value is realised by substituting ww into a polynomial. That polynomial is called the degree-jj qq-ary Krawtchouk polynomial Kj(x)=Kj(n,q)(x)\Kraw_j(x)=\Kraw_j^{(n,q)}(x).

Here the subscript jj means “which Krawtchouk polynomial”, while the ww inside the parentheses means the Hamming weight substituted into that polynomial. Thus Kj(w)\Kraw_j(w) is the value of the jj-th polynomial Kj(x)\Kraw_j(x) at x=wx=w. In coding theory, one mainly uses the values for w=0,1,,nw=0,1,\dots,n.

With this definition, we do not first read the right-hand side with xx as a general variable. Instead, we first define the values for the weights w=0,1,,nw=0,1,\dots,n. The next explicit formula shows that Kj(x)\Kraw_j(x) is genuinely a polynomial in xx of degree at most jj, and that substituting the integer ww satisfies 3.1. In coding theory, one mainly uses it by substituting weights x=wx=w.

Expanding 3.1 gives the following explicit formula.

Theorem 3.2.

For 0jn0\leq j\leq n,

Kj(x)==0j(1)(q1)j(x)(nxj)(3.2)\Kraw_j(x) = \sum_{\ell=0}^{j} (-1)^{\ell}(q-1)^{j-\ell} \binom{x}{\ell}\binom{n-x}{j-\ell} \tag{3.2}

holds.

Proof

First fix 0wn0\leq w\leq n, and expand the right-hand side of the generating function by the binomial theorem:

(1+(q1)z)nw(1z)w=(a0(nwa)(q1)aza)(b0(wb)(1)bzb).\begin{aligned} &\bigl(1+(q-1)z\bigr)^{n-w}(1-z)^w \\ ={}& \left( \sum_{a\geq 0}\binom{n-w}{a}(q-1)^a z^a \right) \left( \sum_{b\geq 0}\binom{w}{b}(-1)^b z^b \right) . \end{aligned}

Taking the coefficient of zjz^j, we have a+b=ja+b=j, so putting b=b=\ell gives

=0j(1)(q1)j(w)(nwj).\sum_{\ell=0}^{j} (-1)^{\ell}(q-1)^{j-\ell} \binom{w}{\ell}\binom{n-w}{j-\ell}.

This is equal to the coefficient Kj(w)\Kraw_j(w) of zjz^j on the left-hand side of the generating function. Replacing ww in the right-hand side by the variable xx gives the polynomial in (3.2).

End of proof

Here (x)\binom{x}{\ell} is read as the polynomial in xx defined by

(x)=x(x1)(x+1)!.\binom{x}{\ell} = \frac{x(x-1)\cdots(x-\ell+1)}{\ell!}.

Similarly, (nxj)\binom{n-x}{j-\ell} is also a polynomial in xx. When an integer x=wx=w is substituted, it agrees with the usual value of the binomial coefficient. In this sense, (3.2) gives Kj(x)\Kraw_j(x) as a polynomial in xx.

This formula already shows a coding-theoretic meaning. Fix one word of weight ww. Among the nn coordinates, ww coordinates are non-zero and nwn-w coordinates are zero. When considering another word of weight jj, one encounters choices in which \ell of its non-zero coordinates overlap with the non-zero coordinates of the fixed word, and the remaining jj-\ell lie on the zero-coordinate side. Krawtchouk polynomials arise when a signed sum is inserted into this counting. The point to notice is that this is not merely a sum of cardinalities. We shall explain it later as a character sum using a non-trivial additive character ψ\psi: at the positions overlapping with the non-zero coordinates of the fixed word, instead of simply counting the non-zero value in q1q-1 ways, the oscillating sum

aFq×ψ(acr)=1\sum_{a\in\mathbb{F}_{q}^{\times}}\psi(ac_r)=-1

appears. Therefore the contribution of the overlapping coordinates is not (q1)(q-1)^{\ell} but (1)(-1)^{\ell}. Viewing this value as a function of ww gives Kj(w)\Kraw_j(w), and the explicit formula expresses it as the polynomial Kj(x)\Kraw_j(x). This signed counting leads to the later interpretation as a character sum.

First examples

To become familiar with the definition, let us compute a few low-degree polynomials. The constant term of the generating function

j=0nKj(w)zj=(1+(q1)z)nw(1z)w\sum_{j=0}^{n}\Kraw_j(w)z^j = \bigl(1+(q-1)z\bigr)^{n-w}(1-z)^w

for weight ww is 11, so

K0(x)=1.\Kraw_0(x)=1.

Next, looking at the coefficient of zz, for weight ww we get

K1(w)=(q1)(nw)w.\Kraw_1(w) = (q-1)(n-w)-w.

Written as a polynomial in xx, this is

K1(x)=(q1)(nx)x=(q1)nqx.\Kraw_1(x) = (q-1)(n-x)-x = (q-1)n-qx.

Thus K1(x)\Kraw_1(x) is a linear polynomial in xx.

In particular, in the binary case, that is, when q=2q=2, we have

K1(x)=n2x.\Kraw_1(x)=n-2x.

This is a quantity which appears very often in the binary Hamming space. A word of weight ww has nwn-w zero coordinates and ww non-zero coordinates. In the binary case, a non-zero coordinate contributes 1-1 with sign, while a zero coordinate contributes +1+1, so the difference

(nw)w=n2w(n-w)-w=n-2w

appears.

Example 4.1 (The binary case with n=3n=3).

Let q=2q=2 and n=3n=3. The generating function is

j=03Kj(w)zj=(1+z)3w(1z)w.\sum_{j=0}^{3}\Kraw_j(w)z^j = (1+z)^{3-w}(1-z)^w.

The values for x=0,1,2,3x=0,1,2,3 are as follows:

x=0x=1x=2x=3K0(x)1111K1(x)3113K2(x)3113K3(x)1111\begin{array}{c|rrrr} & x=0 & x=1 & x=2 & x=3 \\ \hline \Kraw_0(x) & 1 & 1 & 1 & 1 \\ \Kraw_1(x) & 3 & 1 & -1 & -3 \\ \Kraw_2(x) & 3 & -1 & -1 & 3 \\ \Kraw_3(x) & 1 & -1 & 1 & -1 \end{array}

We shall use this table later in the example of the binary repetition code. Here, notice that after fixing jj, we get a function of xx. For instance, K2(x)\Kraw_2(x) takes the values 3,1,1,33,-1,-1,3 at x=0,1,2,3x=0,1,2,3.

Combinatorial meaning of Krawtchouk polynomials

Krawtchouk polynomials are not merely polynomials defined by a generating function. In the Hamming space, they appear as the following natural sums.

Here, fix one non-trivial additive character of the finite field Fq\mathbb{F}_{q},

ψ ⁣:FqC×.\psi\colon \mathbb{F}_{q}\to \mathbb{C}^{\times}.

That is, ψ(a+b)=ψ(a)ψ(b)\psi(a+b)=\psi(a)\psi(b) holds, and ψ\psi is not identically 11. Finite fields have such additive characters. Indeed, if q=pmq=p^m and L ⁣:FqFpL\colon \mathbb{F}_{q}\to\mathbb{F}_{p} is a non-zero Fp\mathbb{F}_{p}-linear map, then

ψ(a)=exp(2πiL(a)/p)\psi(a)=\exp(2\pi i L(a)/p)

is a non-trivial additive character. Here L(a)FpL(a)\in\mathbb{F}_{p} is read through the representative 0,1,,p10,1,\dots,p-1. Character theory is not the main subject of this note, but in order to see how Krawtchouk polynomials arise from the Hamming space, we use just this one tool. We do not assume the general theory of characters, and prove only the properties needed below. Concretely, we use only the one-coordinate character sum proved here and the formula used later to extract the indicator function of the dual code. Thus the proof in this section is arranged so that it can be followed without systematic knowledge of character theory.

The needed property of character sums is the following single fact.

Lemma 5.1.

For bFqb\in\mathbb{F}_{q},

aFqψ(ab)={q,b=0,0,b0\sum_{a\in\mathbb{F}_{q}}\psi(ab) = \begin{cases} q, & b=0,\\ 0, & b\neq 0 \end{cases}

holds.

Proof

If b=0b=0, then ψ(ab)=ψ(0)=1\psi(ab)=\psi(0)=1 for every aa, so the sum is qq. Suppose b0b\neq 0. Then aaba\mapsto ab is a bijection of Fq\mathbb{F}_{q}, so

aFqψ(ab)=tFqψ(t).\sum_{a\in\mathbb{F}_{q}}\psi(ab) = \sum_{t\in\mathbb{F}_{q}}\psi(t).

Put the right-hand side equal to SS. Since ψ\psi is non-trivial, there is some dFqd\in\mathbb{F}_{q} such that ψ(d)1\psi(d)\neq 1. As tt runs through all of Fq\mathbb{F}_{q}, so does t+dt+d, hence

S=tFqψ(t+d)=tFqψ(t)ψ(d)=ψ(d)S.S = \sum_{t\in\mathbb{F}_{q}}\psi(t+d) = \sum_{t\in\mathbb{F}_{q}}\psi(t)\psi(d) = \psi(d)S.

Since ψ(d)1\psi(d)\neq 1, we get S=0S=0.

End of proof

Theorem 5.2.

Let cFqEc\in\mathbb{F}_{q}^{E} be a word of weight ww. Then, for every 0jn0\leq j\leq n,

uFqEwt(u)=jψ(uc)=Kj(w)(5.1)\sum_{\substack{u\in\mathbb{F}_{q}^{E}\\ \wt(u)=j}} \psi(u\cdot c) = \Kraw_j(w) \tag{5.1}

holds.

Proof

Compute the generating function obtained by collecting the left-hand side over all jj:

j=0n(uFqEwt(u)=jψ(uc))zj=uFqEψ(uc)zwt(u)=uFqErEψ(urcr)z1ur0=rE(aFqψ(acr)z1a0).\begin{aligned} \sum_{j=0}^{n} \left( \sum_{\substack{u\in\mathbb{F}_{q}^{E}\\ \wt(u)=j}} \psi(u\cdot c) \right)z^j &= \sum_{u\in\mathbb{F}_{q}^{E}} \psi(u\cdot c)z^{\wt(u)} \\ &= \sum_{u\in\mathbb{F}_{q}^{E}} \prod_{r\in E}\psi(u_r c_r)z^{\mathbf{1}_{u_r\neq 0}} \\ &= \prod_{r\in E} \left( \sum_{a\in\mathbb{F}_{q}} \psi(a c_r)z^{\mathbf{1}_{a\neq 0}} \right). \end{aligned}

Here 1ur0\mathbf{1}_{u_r\neq 0} is 11 when ur0u_r\neq 0, and 00 otherwise.

For a coordinate rr, if cr=0c_r=0, then

aFqψ(acr)z1a0=1+(q1)z.\sum_{a\in\mathbb{F}_{q}} \psi(a c_r)z^{\mathbf{1}_{a\neq 0}} = 1+(q-1)z.

On the other hand, if cr0c_r\neq 0, then aacra\mapsto ac_r is a bijection of Fq\mathbb{F}_{q}, and by Lemma 5.1,

aFqψ(acr)=0.\sum_{a\in\mathbb{F}_{q}}\psi(ac_r)=0.

Therefore

aFq×ψ(acr)=1.\sum_{a\in\mathbb{F}_{q}^{\times}}\psi(ac_r)=-1.

Hence

aFqψ(acr)z1a0=1z.\sum_{a\in\mathbb{F}_{q}} \psi(a c_r)z^{\mathbf{1}_{a\neq 0}} = 1-z.

Since cc has weight ww, there are ww coordinates with cr0c_r\neq 0 and nwn-w coordinates with cr=0c_r=0. Hence the generating function above is

(1+(q1)z)nw(1z)w.\bigl(1+(q-1)z\bigr)^{n-w}(1-z)^w.

This is equal to the generating function of the Krawtchouk polynomials,

j=0nKj(w)zj.\sum_{j=0}^{n}\Kraw_j(w)z^j.

Comparing the coefficients of zjz^j gives the claim.

End of proof

This theorem expresses well the coding-theoretic meaning of Krawtchouk polynomials. Kj(w)\Kraw_j(w) is the sum of the character values ψ(uc)\psi(u\cdot c) over the words uu of weight jj, after fixing a word cc of weight ww. In other words, Krawtchouk polynomials are

oscillating sums obtained by viewing the weight-jj layer of the Hamming space from a word of weight ww.

Here we have explained this using character sums, but if one only looks at (3.2), it is also a sum of binomial coefficients counting the ways in which coordinates overlap. Krawtchouk polynomials as orthogonal polynomials bridge these two viewpoints.

Krawtchouk polynomials as orthogonal polynomials

We now see why Krawtchouk polynomials are called “orthogonal polynomials”. Orthogonality means, roughly, that the inner product of polynomials of different degrees is 00. For Krawtchouk polynomials, the inner product is defined by a sum over the finite set

{0,1,,n}.\{0,1,\dots,n\}.

For 0wn0\leq w\leq n, put

vw=(nw)(q1)w.v_w = \binom{n}{w}(q-1)^w.

This is the number of words of weight ww in FqE\mathbb{F}_{q}^{E}. Indeed, there are (nw)\binom{n}{w} ways to choose the ww coordinates which are non-zero, and each non-zero coordinate has q1q-1 possible values.

Definition 6.1.

For functions f,g ⁣:{0,1,,n}Cf,g\colon \{0,1,\dots,n\}\to\mathbb{C}, define

f,g=w=0n(nw)(q1)wf(w)g(w).\langle f,g\rangle = \sum_{w=0}^{n}\binom{n}{w}(q-1)^w f(w)\overline{g(w)}.

This inner product has as weights the sizes of the weight layers of the Hamming space. Krawtchouk polynomials are orthogonal with respect to this inner product. Since the values of Krawtchouk polynomials are real, the orthogonality formula is the same whether or not one writes complex conjugation.

Theorem 6.2 (Orthogonality of Krawtchouk polynomials).

For 0r,sn0\leq r,s\leq n,

w=0n(nw)(q1)wKr(w)Ks(w)=qn(nr)(q1)rδr,s(6.1)\sum_{w=0}^{n} \binom{n}{w}(q-1)^w \Kraw_r(w)\Kraw_s(w) = q^n \binom{n}{r}(q-1)^r \delta_{r,s} \tag{6.1}

holds. Here δr,s\delta_{r,s} is the Kronecker delta.

Proof

Compute the generating function in two variables:

H(z,t)w=0n(nw)(q1)w(r=0nKr(w)zr)(s=0nKs(w)ts)=w=0n(nw)(q1)w(1+(q1)z)nw(1z)w(1+(q1)t)nw(1t)w=w=0n(nw)((1+(q1)z)(1+(q1)t))nw((q1)(1z)(1t))w=((1+(q1)z)(1+(q1)t)+(q1)(1z)(1t))n.\begin{aligned} &H(z,t) \\ \coloneqq{}& \sum_{w=0}^{n}\binom{n}{w}(q-1)^w \left(\sum_{r=0}^{n}\Kraw_r(w)z^r\right) \left(\sum_{s=0}^{n}\Kraw_s(w)t^s\right) \\ &= \sum_{w=0}^{n}\binom{n}{w}(q-1)^w \bigl(1+(q-1)z\bigr)^{n-w}(1-z)^w \bigl(1+(q-1)t\bigr)^{n-w}(1-t)^w \\ &= \sum_{w=0}^{n}\binom{n}{w} \left((1+(q-1)z)(1+(q-1)t)\right)^{n-w} \left((q-1)(1-z)(1-t)\right)^w \\ &= \left( (1+(q-1)z)(1+(q-1)t) +(q-1)(1-z)(1-t) \right)^n . \end{aligned}

Simplifying the expression inside the parentheses gives

(1+(q1)z)(1+(q1)t)+(q1)(1z)(1t)=q(1+(q1)zt).\begin{aligned} &(1+(q-1)z)(1+(q-1)t) +(q-1)(1-z)(1-t) \\ &\qquad= q\bigl(1+(q-1)zt\bigr). \end{aligned}

Therefore

H(z,t)=qn(1+(q1)zt)n=qnr=0n(nr)(q1)rzrtr.H(z,t) = q^n\bigl(1+(q-1)zt\bigr)^n = q^n\sum_{r=0}^{n}\binom{n}{r}(q-1)^r z^r t^r.

On the other hand, by the definition of H(z,t)H(z,t), the coefficient of zrtsz^r t^s is the left-hand side

w=0n(nw)(q1)wKr(w)Ks(w).\sum_{w=0}^{n}\binom{n}{w}(q-1)^w\Kraw_r(w)\Kraw_s(w).

Comparing coefficients gives (6.1).

End of proof

This theorem shows that K0,K1,,Kn\Kraw_0,\Kraw_1,\dots,\Kraw_n form an orthogonal basis for the function space on the finite set {0,1,,n}\{0,1,\dots,n\}. This is the important point. Krawtchouk polynomials are not merely convenient coefficients; they are the orthogonal basis naturally attached to the weight layers of the Hamming space.

The Krawtchouk transform

Using Krawtchouk polynomials, one can define a transform on vectors of length n+1n+1. This is the Krawtchouk transform.

Definition 7.1.

For a vector a=(a0,a1,,an)a=(a_0,a_1,\dots,a_n), define its Krawtchouk transform by

(Ka)j=w=0nawKj(w)(0jn).(\mathcal{K}a)_j = \sum_{w=0}^{n}a_w\Kraw_j(w) \qquad (0\leq j\leq n).

In matrix form, the Krawtchouk transform is the operation of multiplying by the (n+1)×(n+1)(n+1)\times(n+1) matrix

K=(Kj(w))0j,wn.K=(\Kraw_j(w))_{0\leq j,w\leq n}.

This matrix is sometimes called the Krawtchouk matrix.

The orthogonality shown just above concerns the inner product with weights

vw=(nw)(q1)w,v_w=\binom{n}{w}(q-1)^w,

and the sum has the form

wvwKr(w)Ks(w).\sum_w v_w\Kraw_r(w)\Kraw_s(w).

On the other hand, the self-inverse property shown here concerns multiplying the Krawtchouk matrix itself twice, and the sum has the form

jKr(j)Kj(w).\sum_j \Kraw_r(j)\Kraw_j(w).

The two are closely related, but the summation index and the placement of the weights differ, so we keep them separate here. Orthogonality is a formula for the weighted inner product of the functions Kr,Ks\Kraw_r,\Kraw_s, while self-inverseness is a formula for the product of the Krawtchouk matrix K=(Kj(w))K=(\Kraw_j(w)).

The Krawtchouk transform is its own inverse, up to normalisation.

Theorem 7.2.

For every vector a=(a0,,an)a=(a_0,\dots,a_n),

K2a=qna\mathcal{K}^{2}a=q^n a

holds. That is, applying the Krawtchouk transform twice multiplies the vector by qnq^n.

Proof

It is enough to show that, for all 0r,wn0\leq r,w\leq n,

j=0nKr(j)Kj(w)=qnδr,w(7.1)\sum_{j=0}^{n}\Kraw_r(j)\Kraw_j(w) = q^n\delta_{r,w} \tag{7.1}

holds. For fixed ww, collect the left-hand side into a generating function in rr:

r=0n(j=0nKr(j)Kj(w))tr=j=0nKj(w)(r=0nKr(j)tr)=j=0nKj(w)(1+(q1)t)nj(1t)j=(1+(q1)t)nj=0nKj(w)(1t1+(q1)t)j.\begin{aligned} \sum_{r=0}^{n} \left( \sum_{j=0}^{n}\Kraw_r(j)\Kraw_j(w) \right)t^r &= \sum_{j=0}^{n}\Kraw_j(w) \left( \sum_{r=0}^{n}\Kraw_r(j)t^r \right) \\ &= \sum_{j=0}^{n}\Kraw_j(w) \bigl(1+(q-1)t\bigr)^{n-j}(1-t)^j \\ &= \bigl(1+(q-1)t\bigr)^n \sum_{j=0}^{n}\Kraw_j(w) \left(\frac{1-t}{1+(q-1)t}\right)^j . \end{aligned}

Substitute

s=1t1+(q1)ts=\frac{1-t}{1+(q-1)t}

into the generating function of the Krawtchouk polynomials:

j=0nKj(w)sj=(1+(q1)s)nw(1s)w.\begin{aligned} \sum_{j=0}^{n}\Kraw_j(w)s^j &= \bigl(1+(q-1)s\bigr)^{n-w}(1-s)^w. \end{aligned}

Here

1+(q1)s=q1+(q1)t,1s=qt1+(q1)t.1+(q-1)s = \frac{q}{1+(q-1)t}, \qquad 1-s = \frac{qt}{1+(q-1)t}.

Therefore

(1+(q1)t)nj=0nKj(w)sj=(1+(q1)t)n(q1+(q1)t)nw(qt1+(q1)t)w=qntw.\begin{aligned} &\bigl(1+(q-1)t\bigr)^n \sum_{j=0}^{n}\Kraw_j(w)s^j \\ ={}& \bigl(1+(q-1)t\bigr)^n \left(\frac{q}{1+(q-1)t}\right)^{n-w} \left(\frac{qt}{1+(q-1)t}\right)^w \\ ={}& q^n t^w. \end{aligned}

Hence the coefficient of trt^r is qnq^n when r=wr=w, and 00 otherwise. This is (7.1), and the claim follows.

End of proof

This self-inverse property corresponds to the symmetry of the MacWilliams identity. Indeed, taking the dual of CC once more gives (C)=C(C^{\perp})^{\perp}=C. Also, for a linear code over a finite field,

CC=qn.\card{C}\card{C^{\perp}}=q^n.

This follows from dimC+dimC=n\dim C+\dim C^{\perp}=n. Thus the fact that the transform on weight distributions becomes multiplication by qnq^n after applying it twice is compatible with returning to the original code after taking the dual twice.

The weight distribution of the dual code and the Krawtchouk transform

We now prove the MacWilliams identity using Krawtchouk polynomials. First we prove the coefficient-level formula.

In this section too, fix one non-trivial additive character ψ\psi of the finite field Fq\mathbb{F}_{q}. We do not develop character theory itself in detail, but use only the following orthogonal-complement extraction formula.

Lemma 8.1.

For every uFqEu\in\mathbb{F}_{q}^{E},

1CcCψ(uc)={1,uC,0,uC(8.1)\frac{1}{\card{C}} \sum_{c\in C}\psi(u\cdot c) = \begin{cases} 1, & u\in C^{\perp},\\ 0, & u\notin C^{\perp} \end{cases} \tag{8.1}

holds.

Proof

If uCu\in C^{\perp}, then uc=0u\cdot c=0 for every cCc\in C, so each term is ψ(0)=1\psi(0)=1, and the sum is C\card{C}.

Suppose uCu\notin C^{\perp}. Then there exists c0Cc_0\in C such that d=uc00d=u\cdot c_0\neq 0. Since d0d\neq 0, the map aada\mapsto ad is a bijection of Fq\mathbb{F}_{q}. Since ψ\psi is non-trivial, there is some aFqa\in\mathbb{F}_{q} such that ψ(ad)1\psi(ad)\neq 1.

Put

S=cCψ(uc).S=\sum_{c\in C}\psi(u\cdot c).

Since CC is a linear code, ac0Cac_0\in C, and as cc runs through all of CC, c+ac0c+ac_0 also runs through all of CC exactly once. Therefore

S=cCψ(u(c+ac0))=cCψ(uc)ψ(a(uc0))=ψ(ad)S.\begin{aligned} S &= \sum_{c\in C}\psi(u\cdot(c+ac_0)) \\ &= \sum_{c\in C}\psi(u\cdot c)\psi(a(u\cdot c_0)) \\ &= \psi(ad)S. \end{aligned}

Since ψ(ad)1\psi(ad)\neq 1, we have S=0S=0.

End of proof

This lemma lets us convert sums over CC^{\perp} into sums over CC. We then substitute the Krawtchouk-polynomial sum (5.1).

Theorem 8.2 (Coefficient-level MacWilliams identity).

Let CFqEC\leq\mathbb{F}_{q}^{E} be a linear code. Put Aw=Aw(C)A_w=A_w(C) and Bj=Aj(C)B_j=A_j(C^{\perp}). Then, for every 0jn0\leq j\leq n,

Bj=1Cw=0nAwKj(w)(8.2)B_j = \frac{1}{\card{C}} \sum_{w=0}^{n}A_w\Kraw_j(w) \tag{8.2}

holds.

Proof

Writing the number of words of weight jj in the dual code using the indicator function, we have

Bj=uFqEwt(u)=j1C(u).B_j = \sum_{\substack{u\in\mathbb{F}_{q}^{E}\\\wt(u)=j}} \mathbf{1}_{C^{\perp}}(u).

Substituting Lemma 8.1, we get

Bj=1CuFqEwt(u)=jcCψ(uc)=1CcCuFqEwt(u)=jψ(uc).\begin{aligned} B_j &= \frac{1}{\card{C}} \sum_{\substack{u\in\mathbb{F}_{q}^{E}\\\wt(u)=j}} \sum_{c\in C}\psi(u\cdot c) \\ &= \frac{1}{\card{C}} \sum_{c\in C} \sum_{\substack{u\in\mathbb{F}_{q}^{E}\\\wt(u)=j}} \psi(u\cdot c). \end{aligned}

By Theorem 5.2, the inner sum is equal to

Kj(wt(c)).\Kraw_j(\wt(c)).

Therefore

Bj=1CcCKj(wt(c)).B_j = \frac{1}{\card{C}} \sum_{c\in C}\Kraw_j(\wt(c)).

Since there are AwA_w codewords of weight ww, this becomes

Bj=1Cw=0nAwKj(w).B_j = \frac{1}{\card{C}} \sum_{w=0}^{n}A_w\Kraw_j(w).
End of proof

This theorem is the coefficient version of the MacWilliams identity through Krawtchouk polynomials. In other words, the weight distribution of the dual code is the Krawtchouk transform of the weight distribution of the original code, divided by C\card{C}:

(A0(C),,An(C))=1CK(A0(C),,An(C)).(A_0(C^{\perp}),\dots,A_n(C^{\perp})) = \frac{1}{\card{C}}\mathcal{K}(A_0(C),\dots,A_n(C)).

Returning to the polynomial form of the MacWilliams identity

From the coefficient-level formula, we derive the usual polynomial form of the MacWilliams identity.

Theorem 9.1 (MacWilliams identity).

Let CFqEC\leq\mathbb{F}_{q}^{E} be a linear code. Then

WC(X,Y)=1CWC(X+(q1)Y,XY)W_{C^{\perp}}(X,Y) = \frac{1}{\card{C}} W_C\bigl(X+(q-1)Y,X-Y\bigr)

holds.

Proof

Put Aw=Aw(C)A_w=A_w(C) and Bj=Aj(C)B_j=A_j(C^{\perp}). By the coefficient-level formula (8.2),

WC(X,Y)=j=0nBjXnjYj=1Cj=0nw=0nAwKj(w)XnjYj=1Cw=0nAwj=0nKj(w)XnjYj.\begin{aligned} W_{C^{\perp}}(X,Y) &= \sum_{j=0}^{n}B_jX^{n-j}Y^j \\ &= \frac{1}{\card{C}} \sum_{j=0}^{n} \sum_{w=0}^{n}A_w\Kraw_j(w)X^{n-j}Y^j \\ &= \frac{1}{\card{C}} \sum_{w=0}^{n}A_w \sum_{j=0}^{n}\Kraw_j(w)X^{n-j}Y^j. \end{aligned}

Now use the generating function of the Krawtchouk polynomials. The expression Y/XY/X here is a formal calculation, and ultimately means that both sides agree as polynomials in X,YX,Y.

j=0nKj(w)XnjYj=Xnj=0nKj(w)(YX)j=Xn(1+(q1)YX)nw(1YX)w=(X+(q1)Y)nw(XY)w.\begin{aligned} \sum_{j=0}^{n}\Kraw_j(w)X^{n-j}Y^j &= X^n\sum_{j=0}^{n}\Kraw_j(w)\left(\frac{Y}{X}\right)^j \\ &= X^n \left(1+(q-1)\frac{Y}{X}\right)^{n-w} \left(1-\frac{Y}{X}\right)^w \\ &= \bigl(X+(q-1)Y\bigr)^{n-w}(X-Y)^w. \end{aligned}

Therefore

WC(X,Y)=1Cw=0nAw(X+(q1)Y)nw(XY)w=1CWC(X+(q1)Y,XY).\begin{aligned} W_{C^{\perp}}(X,Y) &= \frac{1}{\card{C}} \sum_{w=0}^{n}A_w \bigl(X+(q-1)Y\bigr)^{n-w}(X-Y)^w \\ &= \frac{1}{\card{C}} W_C\bigl(X+(q-1)Y,X-Y\bigr). \end{aligned}

This proves the MacWilliams identity.

End of proof

In this proof, the variable substitution in the MacWilliams identity,

XX+(q1)Y,YXY,X\mapsto X+(q-1)Y, \qquad Y\mapsto X-Y,

appeared from the generating function of the Krawtchouk polynomials. Thus the MacWilliams identity can be viewed as

the Krawtchouk transform repackaged by the generating function called the weight enumerator.

A small example: the binary repetition code of length 33

Let us check the coefficient-level formula in a small example. Consider the repetition code over F23\mathbb{F}_{2}^{3},

C={000,111}.C=\{ 000, 111 \}.

Its weight distribution is

(A0,A1,A2,A3)=(1,0,0,1).(A_0,A_1,A_2,A_3)=(1,0,0,1).

The dual code is

C={uF23:u1+u2+u3=0},C^{\perp} = \{u\in\mathbb{F}_{2}^{3}:u_1+u_2+u_3=0\},

namely the set of all words of even weight. Therefore

C={000,011,101,110},C^{\perp}=\{000,011,101,110\},

and its weight distribution is

(B0,B1,B2,B3)=(1,0,3,0).(B_0,B_1,B_2,B_3)=(1,0,3,0).

We check that this comes from the Krawtchouk transform.

By Example 4.1, the Krawtchouk table for q=2q=2 and n=3n=3 is

w=0w=1w=2w=3K0(w)1111K1(w)3113K2(w)3113K3(w)1111\begin{array}{c|rrrr} & w=0 & w=1 & w=2 & w=3 \\ \hline \Kraw_0(w) & 1 & 1 & 1 & 1 \\ \Kraw_1(w) & 3 & 1 & -1 & -3 \\ \Kraw_2(w) & 3 & -1 & -1 & 3 \\ \Kraw_3(w) & 1 & -1 & 1 & -1 \end{array}

The coefficient-level MacWilliams identity says

Bj=1Cw=03AwKj(w).B_j = \frac{1}{\card{C}} \sum_{w=0}^{3}A_w\Kraw_j(w).

Here C=2\card{C}=2, and A0=A3=1A_0=A_3=1, A1=A2=0A_1=A_2=0, so

Bj=12(Kj(0)+Kj(3)).B_j = \frac{1}{2}\bigl(\Kraw_j(0)+\Kraw_j(3)\bigr).

From the table,

B0=12(1+1)=1,B1=12(33)=0,B2=12(3+3)=3,B3=12(11)=0.\begin{aligned} B_0 &= \frac{1}{2}(1+1)=1,\\ B_1 &= \frac{1}{2}(3-3)=0,\\ B_2 &= \frac{1}{2}(3+3)=3,\\ B_3 &= \frac{1}{2}(1-1)=0. \end{aligned}

Therefore

(B0,B1,B2,B3)=(1,0,3,0),(B_0,B_1,B_2,B_3)=(1,0,3,0),

which indeed agrees with the weight distribution of CC^{\perp}.

In polynomial form,

WC(X,Y)=X3+Y3,W_C(X,Y)=X^3+Y^3,

and the right-hand side of the MacWilliams identity is

12WC(X+Y,XY)=12((X+Y)3+(XY)3)=X3+3XY2.\begin{aligned} \frac{1}{2}W_C(X+Y,X-Y) &= \frac{1}{2}\left((X+Y)^3+(X-Y)^3\right)\\ &= X^3+3XY^2. \end{aligned}

This agrees with

WC(X,Y)=X3+3XY2.W_{C^{\perp}}(X,Y)=X^3+3XY^2.

In this example, multiplying each row jj of the Krawtchouk table by the weight distribution (1,0,0,1)(1,0,0,1) gives the jj-th coefficient of the weight distribution of the dual code.

What Krawtchouk polynomials were doing in this proof

Let us organise the roles played by Krawtchouk polynomials in the proof.

First, Krawtchouk polynomials were the transform coefficients for weight distributions. Writing the weight distribution of CC as Aw=Aw(C)A_w=A_w(C) and that of CC^{\perp} as Bj=Aj(C)B_j=A_j(C^{\perp}), we had

Bj=1Cw=0nAwKj(w).B_j = \frac{1}{\card{C}} \sum_{w=0}^{n}A_w\Kraw_j(w).

This is the MacWilliams identity viewed at the coefficient level.

Second, Krawtchouk polynomials were oscillating sums over weight layers of the Hamming space. After fixing a word cc of weight ww, we had

uFqEwt(u)=jψ(uc)=Kj(w).\sum_{\substack{u\in\mathbb{F}_{q}^{E}\\\wt(u)=j}} \psi(u\cdot c) = \Kraw_j(w).

Thus Kj(w)\Kraw_j(w) is the sum obtained by viewing the layer of weight jj from a word of weight ww.

Third, Krawtchouk polynomials formed an orthogonal basis. On the finite set {0,,n}\{0,\dots,n\}, with weights

(nw)(q1)w,\binom{n}{w}(q-1)^w,

the polynomials K0,,Kn\Kraw_0,\dots,\Kraw_n are mutually orthogonal. This shows that Krawtchouk polynomials are not merely computational coefficients, but orthogonal polynomials attached to the distance structure of the Hamming space.

Fourth, the generating function produced the MacWilliams variable substitution. Formally substitute z=Y/Xz=Y/X into

j=0nKj(w)zj=(1+(q1)z)nw(1z)w.\sum_{j=0}^{n}\Kraw_j(w)z^j = \bigl(1+(q-1)z\bigr)^{n-w}(1-z)^w.

The Y/XY/X here is a formal substitution, and the final result is an equality of polynomials in X,YX,Y:

j=0nKj(w)XnjYj=(X+(q1)Y)nw(XY)w.\sum_{j=0}^{n}\Kraw_j(w)X^{n-j}Y^j = \bigl(X+(q-1)Y\bigr)^{n-w}(X-Y)^w.

This is the variable substitution in the polynomial form of the MacWilliams identity.

Thus, in the proof in this note, Krawtchouk polynomials played four roles:

  1. transform coefficients for computing the weight distribution of the dual code,

  2. oscillating sums over weight layers of the Hamming space,

  3. orthogonal polynomials on a finite set,

  4. the generating function which produces the MacWilliams variable substitution.

Concepts seen in this part

In this part, while aiming at a proof of the MacWilliams identity, we introduced the basic tools around Krawtchouk polynomials. Here is the summary.

Weight distribution

This is the sequence defined for a code CC by

Aw(C)={cC:wt(c)=w}.A_w(C)=\card{\{c\in C:\wt(c)=w\}}.

The weight enumerator packages the weight distribution as a polynomial in two variables.

Krawtchouk polynomials

First define their values at weights w=0,1,,nw=0,1,\dots,n by the generating function

j=0nKj(w)zj=(1+(q1)z)nw(1z)w.\sum_{j=0}^{n}\Kraw_j(w)z^j = \bigl(1+(q-1)z\bigr)^{n-w}(1-z)^w.

By the explicit formula, these values come from polynomials Kj(x)\Kraw_j(x) in xx. In coding theory, one mainly uses them by substituting weights 0,1,,n0,1,\dots,n for xx.

Explicit formula

Krawtchouk polynomials can be written as

Kj(x)==0j(1)(q1)j(x)(nxj).\Kraw_j(x) = \sum_{\ell=0}^{j} (-1)^{\ell}(q-1)^{j-\ell} \binom{x}{\ell}\binom{n-x}{j-\ell}.

This can be interpreted as a sum in which binomial coefficients counting the ways coordinates overlap are combined with signed contributions coming from character values.

Orthogonality

K0,,Kn\Kraw_0,\dots,\Kraw_n are orthogonal on the finite set {0,,n}\{0,\dots,n\} with weights

(nw)(q1)w.\binom{n}{w}(q-1)^w.

This weight is the number of words of weight ww in the Hamming space.

Krawtchouk transform

This is the transform defined for a vector a=(a0,,an)a=(a_0,\dots,a_n) by

(Ka)j=w=0nawKj(w).(\mathcal{K}a)_j=\sum_{w=0}^{n}a_w\Kraw_j(w).

Applying this transform twice multiplies the vector by qnq^n.

Coefficient-level MacWilliams identity

If Aw=Aw(C)A_w=A_w(C) and Bj=Aj(C)B_j=A_j(C^{\perp}), then

Bj=1Cw=0nAwKj(w).B_j = \frac{1}{\card{C}} \sum_{w=0}^{n}A_w\Kraw_j(w).

This means that the weight distribution of the dual code is obtained by applying the Krawtchouk transform to the weight distribution of the original code.

At first, orthogonal polynomials may look like analytic objects. However, Krawtchouk polynomials are highly discrete orthogonal polynomials: their orthogonality is expressed entirely by sums over a finite set. This is why they fit so well with Hamming spaces and coding theory.

Looking back at the proof family of this note

As stated at the beginning, the proof in this note can be read as an entrance to

the orthogonal-polynomial and association-scheme approach.

On the surface, it is a coefficient calculation using Krawtchouk polynomials. We also used character sums as a minimal tool to prove the coefficient-level MacWilliams identity. But at a deeper level, the orthogonal polynomials attached to the distance structure of the Hamming space control the transform of weight distributions.

The main point of the proof in this note can be summarised in one sentence:

The MacWilliams identity is the Krawtchouk transform on weight distributions.

From the Fourier viewpoint, one handles a transform on the whole of FqE\mathbb{F}_{q}^{E}. In the viewpoint of this note, that information has been compressed down to “weights only”. Instead of distinguishing all words, we look only at the coefficients for weights 0,1,,n0,1,\dots,n. In this compressed world, Krawtchouk polynomials appear as the shadow of the Fourier-type transform.

This viewpoint naturally leads, in a more advanced direction, to association schemes. In the Hamming space, pairs of words can be divided according to whether their distance is 0,1,,n0,1,\dots,n. The commutative algebra built from these relations is the Bose–Mesner algebra of the Hamming association scheme. The eigenvalues which appear there are precisely the Krawtchouk polynomials. Delsarte [Del73] is the classical starting point for Delsarte's association-scheme approach to coding theory, and Delsarte–Levenshtein [DL98] surveys the relationship between association schemes and coding theory. Levenshtein [Lev95] is also a representative reference for the appearance of Krawtchouk polynomials in bounds for codes and designs in the Hamming space.

Further direction: towards association schemes

We have now completed the viewpoint on the MacWilliams identity using Krawtchouk polynomials, which was the aim of this note. From here, as a more advanced viewpoint, let us introduce the direction of reinterpreting the Krawtchouk polynomials appearing in this note in the language of association schemes.

In the proof in this note, we defined Krawtchouk polynomials by a generating function and used them as transform coefficients for weight distributions. But why are these polynomials naturally attached to the Hamming space? The answer lies in the matrices built from distance relations in the Hamming space.

Take the set of all words of length nn as the vertex set, and write MjM_j for the matrix which records whether the distance between two words is jj. Here MjM_j is the adjacency matrix of the distance relation. These matrices form a mutually commuting algebra. When this algebra is simultaneously diagonalised, Krawtchouk polynomials appear as its eigenvalues.

Thus, in the more advanced viewpoint, the main flow is

Hamming distance → adjacency matrices → Bose–Mesner algebra → Krawtchouk polynomials as eigenvalues

The MacWilliams identity appears in this eigenvalue theory as the dual transform of weight distributions.

Even for the same Krawtchouk polynomials, the view changes substantially depending on whether one sees them, as in this note, as “orthogonal polynomials defined by a generating function”, or more advancedly as “eigenvalues of the Hamming scheme”. This difference is the natural entrance to association schemes.

References

  1. [Kra29] M. Krawtchouk. Sur une généralisation des polynômes d'Hermite. Comptes rendus hebdomadaires des séances de l'Académie des sciences, vol. 189, pp. 620–622, 1929
  2. [KLS10] Roelof Koekoek, Peter A. Lesky, and René F. Swarttouw. Hypergeometric orthogonal polynomials and their q-analogues. Springer-Verlag, Berlin, pp. xx+578, 2010. doi:10.1007/978-3-642-05014-5
  3. [MS77] F. J. MacWilliams and N. J. A. Sloane. The theory of error-correcting codes. I. North-Holland Publishing Co., Amsterdam-New York-Oxford, vol. Vol. 16, pp. i–xv and 1–369, 1977
  4. [Del73] P. Delsarte. An algebraic approach to the association schemes of coding theory. Philips Res. Rep. Suppl., no. 10, pp. vi+97, 1973
  5. [DL98] Philippe Delsarte and Vladimir I. Levenshtein. Association schemes and coding theory. IEEE Trans. Inform. Theory, vol. 44, no. 6, pp. 2477–2504, 1998. doi:10.1109/18.720545
  6. [Lev95] Vladimir I. Levenshtein. Krawtchouk polynomials and universal bounds for codes and designs in Hamming spaces. IEEE Trans. Inform. Theory, vol. 41, no. 5, pp. 1303–1321, 1995. doi:10.1109/18.412678

This series

A Series Learning through the MacWilliams Identity · Part 4 of 12

Back to series list

Disclaimer

Articles on this site are based on the operator's personal understanding, investigation, and research notes. I try to keep the content accurate, but it may contain errors or incomplete explanations. I do not guarantee its accuracy, completeness, usefulness, or currentness.

Please use the information on this site at your own judgment and responsibility. To the extent permitted by law, the operator is not liable for damages, losses, or disadvantages arising from using, or being unable to use, information on this site.

If you notice an error, unclear explanation, broken link, or insufficient citation, please contact the operator. I will review the content and, when appropriate, correct, update, or remove it.