Shane Chu

Convolutions

The textbook definition of (linear) convolution is as follows: a convolution of two vectors fRn\bm f\in\mathbb R^n and xRm\bm x\in\mathbb R^m is a vector fxRn+m1\bm f * \bm x\in\mathbb R^{n+m-1}, where

(fx)[k]=i+j=k+1f[i]x[j],(\bm f * \bm x)[k] = \sum_{i+j=k+1}f[i] x[j],

i.e., the kk-th entry of the vector fx\bm f * \bm x is the sum of all product f[i]x[j]\bm f[i]\bm x[j] such that the indices i,ji,j satisfy the condition i+j=k+1i+j=k+1.

This definition doesn't show convolution is related to applcations (it's more useful for proof exercises). Ok, what is an intuitive way to think about convolution?

Note that convolution is a linear operation, for which we can write it down as a matrix-vector multiplication. To make things concrete, suppose f=(f1,f2,f3)\bm f=(f_1,f_2,f_3)^\top and x=(x1,x2,x3,x4)\bm x=(x_1,x_2,x_3,x_4)^\top, then we can write

fx=(f1000f2f100f3f2f100f3f2f100f3f2000f3)(x1x2x3x4),\bm f * \bm x = \begin{pmatrix} f_1 & 0 & 0 & 0 \\ f_2 & f_1 & 0 & 0 \\ f_3 & f_2 & f_1 & 0 \\ 0 & f_3 & f_2 & f_1 \\ 0 & 0 & f_3 & f_2 \\ 0 & 0 & 0 & f_3 \end{pmatrix}\begin{pmatrix}x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix},

which is the same as what's going on in equation (1).

What's interesting from this perspective is when the vector x\bm x is sparse, i.e. x\bm x has mostly zero entries. Let's say x=(0,1,0,0)\bm x=(0,1,0,0)^\top. Substitute this to the above, we have that

fx=(f1000f2f100f3f2f100f3f2f100f3f2000f3)(0100)=(0f1f2f300)\bm f * \bm x = \begin{pmatrix} f_1 & 0 & 0 & 0 \\ f_2 & f_1 & 0 & 0 \\ f_3 & f_2 & f_1 & 0 \\ 0 & f_3 & f_2 & f_1 \\ 0 & 0 & f_3 & f_2 \\ 0 & 0 & 0 & f_3 \end{pmatrix}\begin{pmatrix}0 \\ 1 \\ 0 \\ 0 \end{pmatrix}=\begin{pmatrix} 0 \\ f_1 \\ f_2 \\ f_3 \\ 0 \\ 0 \end{pmatrix}

What does this say? If we think about that fx\bm f * \bm x as some kind of signal, then this is saying that f\bm f as a feature is presented in the span from the 2nd to the 4th entries.

We can leverage this intuition to think about applications related to bioinformatics. If the signal is a DNA string, and the feature represents a motif, e.g., it could be a k-mer, or a position frequency matrix, etc., then we have something meaningful: the sparse vector x\bm x indicates where the motif f\bm f is presented in the DNA string. Moreover, the non-zero entry in x\bm x doesn't need to be 11. It could be an arbitrary value signifying the strength of the presense of f\bm f in the DNA string.

The description above is just an analogy. To develop this idea further, we look at a classic problem in signal processing: convolutional dictionary learning.

Contact me by E-mail | Github | Linkedin
This work is licensed under CC BY-SA 4.0. Last modified: January 15, 2025.
Website built with Franklin.jl and the Julia language.