## Backwards random codebook generation

$$X \longrightarrow \fbox{$\phantom{\int}P_{Y|X}\phantom{\int}$}\longrightarrow Y$$

The information capacity of this channel is $C=\max_{P_X} I(X;Y)$. Any rate $C-\varepsilon$ can be achieved by fixing a large enough blocklength $n$ and associating each message $m\in \{1,\dots,2^{n(C-\varepsilon)}\}$ with a codeword $x^n_m\in \mathcal{X}^n$ whose components are drawn i.i.d. from $P^\ast_X$, where $P^\ast_X$ is the distribution that maximizes $I(X;Y)$. The distributions these codewords …
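The standard random codebook construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the binary symmetric channel with crossover probability 0.1, the values of `eps` and `n`, and the helper names `mutual_information` and `random_codebook` are all hypothetical choices made here, and the number of messages is rounded up to an integer with `ceil`. For the binary symmetric channel the uniform input distribution is known to maximize $I(X;Y)$, so it plays the role of $P^\ast_X$.

```python
import math
import random


def mutual_information(p_x, channel):
    """Return I(X;Y) in bits, where channel[x][y] = P(Y = y | X = x)."""
    n_y = len(channel[0])
    # Output marginal P_Y(y) = sum_x P_X(x) P(y | x).
    p_y = [sum(p_x[x] * channel[x][y] for x in range(len(p_x)))
           for y in range(n_y)]
    total = 0.0
    for x, px in enumerate(p_x):
        for y in range(n_y):
            p_xy = px * channel[x][y]
            if p_xy > 0:
                total += p_xy * math.log2(channel[x][y] / p_y[y])
    return total


def random_codebook(p_x, n, rate, rng):
    """Draw ceil(2^(n * rate)) codewords of length n, symbols i.i.d. from p_x."""
    num_messages = math.ceil(2 ** (n * rate))  # assumption: round up
    symbols = list(range(len(p_x)))
    return [tuple(rng.choices(symbols, weights=p_x, k=n))
            for _ in range(num_messages)]


# Hypothetical example: binary symmetric channel, crossover probability 0.1.
delta = 0.1
bsc = [[1 - delta, delta],
       [delta, 1 - delta]]
p_star = [0.5, 0.5]          # uniform input maximizes I(X;Y) for the BSC
capacity = mutual_information(p_star, bsc)

eps = 0.1                    # rate back-off, chosen arbitrarily
n = 20                       # small blocklength, for illustration only
codebook = random_codebook(p_star, n, capacity - eps, random.Random(0))

print(f"C = {capacity:.4f} bits per channel use")
print(f"{len(codebook)} codewords of length {n}")
```

Message $m$ corresponds to `codebook[m - 1]`. The blocklength here is far too small for the achievability argument to apply; it only keeps the codebook small enough to inspect.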