Written by: Joppe Van Rumst
Canonical correlation analysis (CCA) is a state-of-the-art classification method for SSVEP. For each stimulation frequency, the method finds the linear transformations that maximize the correlation between two matrices: the signal and an assumption matrix for that frequency. The maximized correlations between an unseen signal and the assumption matrices for all frequencies can then be used to determine which frequency was attended. CCA was first applied to SSVEP classification by Lin et al. At that time, it outperformed the best existing methods for SSVEP classification, such as power spectral density analysis.
To use CCA, we define the two matrices whose correlation we want to calculate. In this case, the first matrix is the multichannel EEG signal, and the second matrix contains the assumptions. These assumptions are a sine and a cosine signal at the fundamental frequency of one of the presented targets.
For better accuracy, the sine and cosine representation of the harmonics of the target signal could be added to the assumption matrix.
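A minimal sketch of how such an assumption matrix could be built is shown below. The function name `make_reference`, the sampling rate, and the number of harmonics are illustrative assumptions and are not taken from the original implementation.

```python
import numpy as np

def make_reference(freq, n_samples, fs, n_harmonics=2):
    """Return a (2 * n_harmonics, n_samples) matrix of sine/cosine references.

    freq, fs and n_harmonics are hypothetical parameters chosen for this sketch.
    """
    t = np.arange(n_samples) / fs  # time axis in seconds
    rows = []
    for h in range(1, n_harmonics + 1):
        # h = 1 is the fundamental frequency, h > 1 are the harmonics
        rows.append(np.sin(2 * np.pi * h * freq * t))
        rows.append(np.cos(2 * np.pi * h * freq * t))
    return np.vstack(rows)

# Example: 10 Hz target, 2 s of data at an assumed 250 Hz sampling rate
Y = make_reference(freq=10.0, n_samples=500, fs=250.0, n_harmonics=2)
print(Y.shape)  # (4, 500)
```

Each harmonic adds two rows, so the matrix grows with the number of harmonics while the number of samples stays tied to the EEG segment length.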
CCA finds two weight vectors. The first linearly combines the EEG channels into a single value per time sample, and the second does the same for the sine and cosine components of the target frequency and its harmonics. The weights are chosen to maximize the correlation between these two combined signals. This process is repeated for every target frequency, and the target with the highest correlation is taken to be the one the subject is gazing at.
The maths behind the method is best explained by the following figure from Pan et al. (2011).
Figure 1. CCA scheme. (Pan et al., 2011)
In the following derivations, three variables are defined:
To classify SSVEP signals with CCA, we construct a CCA model
Now we define the weights:
The idea of CCA is to find
The correlation value is saved for all the different stimulation frequencies. The one with the highest correlation value is the winner.
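The selection step can be sketched with plain NumPy as follows. This sketch computes the largest canonical correlation through a QR decomposition followed by an SVD, which is a standard way to obtain it but not necessarily the route taken in the original code. All function names, the candidate frequencies, and the synthetic data are assumptions made for illustration, and both matrices are assumed to have full row rank.

```python
import numpy as np

def make_reference(freq, n_samples, fs, n_harmonics=2):
    # Sine/cosine references for the fundamental and its harmonics
    t = np.arange(n_samples) / fs
    return np.vstack([fn(2 * np.pi * h * freq * t)
                      for h in range(1, n_harmonics + 1)
                      for fn in (np.sin, np.cos)])

def max_canonical_corr(X, Y):
    """Largest canonical correlation between X (channels x samples)
    and Y (references x samples)."""
    # Centre each row, then put samples on the first axis
    Xc = (X - X.mean(axis=1, keepdims=True)).T
    Yc = (Y - Y.mean(axis=1, keepdims=True)).T
    # Orthonormal bases of the two column spaces
    qx, _ = np.linalg.qr(Xc)
    qy, _ = np.linalg.qr(Yc)
    # The singular values of qx^T qy are the canonical correlations
    s = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return float(np.clip(s[0], 0.0, 1.0))

def classify(X, freqs, fs, n_harmonics=2):
    # One correlation per candidate frequency; the highest one wins
    rhos = [max_canonical_corr(X, make_reference(f, X.shape[1], fs, n_harmonics))
            for f in freqs]
    return freqs[int(np.argmax(rhos))], rhos

# Synthetic demo: three noisy channels carrying a 12 Hz oscillation
rng = np.random.default_rng(0)
fs, n_samples = 250.0, 500
t = np.arange(n_samples) / fs
freqs = [8.0, 10.0, 12.0, 15.0]
X = np.vstack([np.sin(2 * np.pi * 12.0 * t + p) for p in (0.0, 0.5, 1.0)])
X = X + 0.5 * rng.standard_normal(X.shape)

winner, rhos = classify(X, freqs, fs)
print(winner, np.round(rhos, 2))
```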
Figure 2. Extended CCA scheme. (Nakanishi et al., 2015)
To explain extended CCA, you must first understand the basic principles of individual template CCA (IT-CCA). This method was first introduced to detect temporal features of EEG signals using the canonical correlation between the test data and individual template signals. Here, the correlation is not calculated between the signal and precalculated sine and cosine references but between the signal and template signals determined from responses obtained in a training phase.
In the case of SSVEP, the template signal is calculated for each frequency. For a given set of
$$ \rho_f = \max_{W_{x,f},W_{y,f}}\frac{E[W_{x,f}^\intercal X \bar{X}_f^\intercal W_{y,f}]}{\sqrt{E[W_{x,f}^\intercal X X^\intercal W_{x,f}]\,E[W_{y,f}^\intercal \bar{X}_f \bar{X}_f^\intercal W_{y,f}]}} $$
Extended CCA is a combination of CCA and IT-CCA. Correlation coefficients between projections of a test set
where
The
The following flowchart can explain the implementation. The code implementing this method can be found here, which contains both regular CCA and extended CCA. In the current implementation, only regular CCA is used.
Figure 3. CCA implementation scheme.
The filtered data from the preprocessing, together with a template containing sine and cosine signals for one reference frequency and its harmonics, is passed to the CCA module. The CCA module is imported from the scikit-learn library. Fitting it yields the weighting vectors explained above. We then apply these weighting vectors to the template and the data, and calculate the correlation between the weighted signal and the weighted template. This value is stored. The process is repeated for every reference frequency, and the reference with the highest correlation value is picked as the winner.
The dots indicate how we could upgrade the regular CCA to the extended CCA. We could increase the method's accuracy by adding training data to the template matrix. This data is first averaged over trials for each frequency while keeping the channels separate. The final template will have dimensions (`number of frequencies x number of channels x number of samples`).
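The averaging step could be sketched as below. The array layout of the training data, the integer label encoding, and the function name `build_templates` are assumptions for this example. The sketch also assumes every frequency has at least one training trial.

```python
import numpy as np

def build_templates(trials, labels, n_freqs):
    """Average training trials per stimulation frequency.

    trials: (n_trials, n_channels, n_samples) array of training epochs
    labels: (n_trials,) integer frequency index of each trial, in [0, n_freqs)
    Returns an array of shape (n_freqs, n_channels, n_samples).
    """
    trials = np.asarray(trials, dtype=float)
    labels = np.asarray(labels)
    # Average over trials only, so each channel keeps its own template
    return np.stack([trials[labels == k].mean(axis=0) for k in range(n_freqs)])

# Example: 4 frequencies, 3 training trials each, 8 channels, 500 samples
rng = np.random.default_rng(2)
trials = rng.standard_normal((12, 8, 500))
labels = np.repeat(np.arange(4), 3)

templates = build_templates(trials, labels, n_freqs=4)
print(templates.shape)  # (4, 8, 500)
```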