After implementing the EPIC correlator on the Long Wavelength Array in New Mexico, my attention turned to the problem of imaging wide fields of view with a direct-imaging telescope such as EPIC. In radio interferometry this is a serious problem for arrays that are not built on a flat surface (which is impossible for truly gargantuan arrays like the Square Kilometre Array, due to the curvature of the Earth) or that are observing a field away from zenith. This can be understood from the interferometric measurement equation: $$I(l,m,n) = \iiint V(u,v,w) \exp \big[ 2\pi i \big(ul + vm + w \big( 1-\sqrt{1-l^2 - m^2} \big) \big) \big] \, du\: dv\: dw$$ The \( w \) term is the non-coplanar term, with \( u,v \) defining a flat plane; \( u,v,w \) form a set of orthonormal basis vectors.
This problem has been well studied for years in interferometric imaging using the normal data product output by a radio correlator: the visibility (see Thompson, Moran and Swenson for an introduction). Standard practice is to convolve the visibility with a \( w \)-kernel, the Fourier transform of the Fresnel kernel $$G(l,m,w) = \exp \big[ 2 \pi i w \big(1-\sqrt{1-l^2 -m^2} \big) \big] \mbox{,}$$ a method known as \( w \)-projection. By the multiplication-convolution theorem the correction can instead be applied as a multiplication in the image domain, a method known as \( w \)-stacking. Both are accurate methods that have been well optimised in modern data-reduction packages such as CASA and wsclean.
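To make the Fresnel kernel concrete, here is a minimal numpy sketch. The grid extent, pixel count, and the value of \( w \) are all illustrative choices, not values from any real array; the point is simply that \( G(l,m,w) \) is a pure phase screen that equals unity at the phase centre, where the square-root term is exactly 1.

```python
import numpy as np

def fresnel_kernel(l, m, w):
    """Image-plane w-correction kernel G(l,m,w) = exp[2*pi*i*w*(1 - sqrt(1 - l^2 - m^2))].

    l, m : direction cosines (broadcastable arrays), requiring l^2 + m^2 < 1.
    w    : baseline w coordinate in wavelengths.
    """
    n = np.sqrt(1.0 - l**2 - m**2)
    return np.exp(2j * np.pi * w * (1.0 - n))

# Small illustrative grid of direction cosines (an 11x11 patch of sky).
l, m = np.meshgrid(np.linspace(-0.1, 0.1, 11), np.linspace(-0.1, 0.1, 11))
G = fresnel_kernel(l, m, w=100.0)
```

In \( w \)-stacking this multiplication is applied per \( w \)-plane in the image domain; in \( w \)-projection its Fourier transform is used as a convolution kernel on the visibilities.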
Unfortunately these methods do not carry over as easily to direct radio imaging, which never forms the visibilities. Mathematically the two approaches are exactly equivalent: the direct-imaging formula can be re-arranged (by the multiplication-convolution theorem again) into visibility imaging and vice versa. However, direct imaging is held to a hard real-time requirement; if the correction for the \( w \) term cannot be done in real time, then the whole direct-imaging system fails.
Now my motivation here comes from a great idea that Landman Bester came up with off the cuff one day on a sweltering Cambridge afternoon: “Perhaps you could use a DFT?”, meaning a direct implementation of the discrete Fourier transform. At first I balked, but after some investigation by Prof. Steve Gull and me it turned out to be not only possible but extremely effective; we realised, however, that the maths needs to be re-arranged to reveal this property. The key to this solution is not new maths (the theory has been set in stone for decades), but re-arranging the maths in a way that is optimal for modern compute hardware; it is a classic case of where the rubber meets the road, or rather working out how it should meet the road.
The secret sauce is to realise that a DFT is simply a linear operator with a single mathematical operation: a matrix multiplication, $$ S = \| DX \|^2 \mbox{,} $$ where \( S \) represents a vector of sky pixels of size \( N_k \), and \( X \) a vector of electric fields at each time-step of size \( N_a \), the number of antennas. \( D \) is the DFT matrix, which is of the form $$ D = \frac{1}{N_a}\begin{bmatrix} \omega_{0,0} & \omega_{0,j} & \dots & \omega_{0,N_a} \\ \omega_{i,0} & \ddots & & \omega_{i,N_a} \\ \vdots & & \ddots & \vdots \\ \omega_{N_k,0} & \dots & & \omega_{N_k,N_a} \\ \end{bmatrix} \mbox{,} $$ where $$ \omega_{i,j} = \exp \big[ -2 \pi i \, k_i \cdot r_j \big] \mbox{,} $$ with \( i \) representing the index of the sky cosine co-ordinate being sampled, and \( j \) the index of the antenna. Time batching is also possible by promoting \( X \), and thus \( S \), to matrices with \( N_T \) columns.
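The scheme above can be sketched in a few lines of numpy. The antenna positions and sky directions here are random stand-ins for a real array layout, and the sizes are arbitrary; what the sketch shows is that because \( k_i \cdot r_j \) is a full 3-D dot product (the direction vector includes the \( n = \sqrt{1-l^2-m^2} \) component), the non-coplanar \( w \) term is handled exactly, with no separate correction step.

```python
import numpy as np

rng = np.random.default_rng(42)
n_ant, n_pix, n_t = 64, 256, 100   # illustrative sizes: antennas, sky pixels, time steps

# Hypothetical geometry: antenna positions r_j in wavelengths, and unit sky
# direction vectors k_i built from direction cosines (l, m, n).
r = rng.uniform(-50.0, 50.0, size=(n_ant, 3))
l = rng.uniform(-0.1, 0.1, size=n_pix)
m = rng.uniform(-0.1, 0.1, size=n_pix)
k = np.stack([l, m, np.sqrt(1.0 - l**2 - m**2)], axis=1)

# DFT matrix: omega_{i,j} = exp(-2*pi*i * k_i . r_j), shape (n_pix, n_ant).
D = np.exp(-2j * np.pi * (k @ r.T)) / n_ant

# Time-batched electric fields X with n_t columns: one matrix multiply per
# batch, then an incoherent time average to form the dirty image S.
X = rng.standard_normal((n_ant, n_t)) + 1j * rng.standard_normal((n_ant, n_t))
S = (np.abs(D @ X) ** 2).mean(axis=1)
```

The time batching turns the per-sample matrix-vector product into a single matrix-matrix product, which is exactly the shape of workload that modern GPUs and optimised BLAS libraries are built for.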
The overall implementation, and benchmarks showing that this works, are covered in a new paper: arXiv:1909.03973