Example Case: Strong Locomotion Artifacts in ICA Decomposition

This post is still being written and revised, so please take any information you can use, but beware of its alpha status.

On this page I take an in-depth look at an atypical ICA decomposition of a data set consisting of concatenated recordings of a single subject across four conditions (sitting, standing, walking and cycling). Even though EEG researchers more experienced with pre-processing data for ICA would probably no longer face the situation below, I believe it is worth examining closely how this decomposition came about, as it reveals a lot about what ICA ‘does’ in subjective terms, and particularly what challenges one faces when applying the algorithm to mobile EEG recordings. Moreover, given that movement artifacts are inevitable in mobile brain imaging data, my aim was to find out what data analyses one can still safely perform.

Concatenating the data was recommended to me by Prof. Dr. Klaus Gramann of the BeMoBI(L) Lab at Technische Universität Berlin, who argued that if the primary aim is to uncover (assumed) spatially stable neural sources of activity, it should not matter if different conditions (recorded in a single subject) are fed into the same ICA. It should be noted that ICA is said to require relative stationarity of the signals it is given, which in our study is naturally strongly affected by the different passive and active conditions, which introduce not only movement-related muscle activity and movement artifacts, but also movement-related brain activity. However, once enough channel data is available (and the degrees of freedom increase accordingly) it should present less of an issue.

Why are degrees of freedom important for ICA? And how much 'freedom' is enough?

The higher the degrees of freedom in the data (in EEG defined by the number of channels), the higher the number of components an ICA can produce, and the more freedom the algorithm has to sort parts of the signal into different components.

Imagine a noisy linear mixture of three independent, equally strong waveforms that would have to be explained by using signal mixtures registered by four sensors. Assuming the noise is independent for each channel, but explains little variance in the signal, ICA should have no problem finding the three main components, and likely ends up filling up the fourth component with mostly noise registered at a single channel (the remaining, low intensity noise gets mixed in with every other component). Now assume the opposite situation, where one has a noisy linear mixture of four waveforms that one has recorded with three sensors and then attempts to decompose into three components. This time, it is not only the low intensity noise that gets mixed into one or multiple other components, but a whole, high energy waveform. This results in a situation where one cannot reliably draw conclusions from the algorithm’s outcome, as it will be anyone’s guess how the remaining signal of the unaccounted-for component was divided among the three ‘winners’.
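The thought experiment above can be made concrete with a small simulation. What follows is only a sketch under stated assumptions, not an actual ICA run: it asks how well each source could be recovered by the best possible linear unmixing of the sensor data (a least-squares fit that is allowed to see the true sources), which is an upper bound on what any linear decomposition, ICA included, can achieve. The function name, the uniform source distribution, the orthonormal mixing matrix and the noise level are all illustrative choices of mine.

```python
import numpy as np

def best_linear_recovery(n_sources, n_sensors, n_samples=5000,
                         noise_sd=0.01, seed=0):
    """Return, per source, the fraction of variance (R^2) that the best
    linear combination of the sensor signals can explain."""
    rng = np.random.default_rng(seed)
    # Independent, non-Gaussian sources with unit variance (assumption).
    half_width = np.sqrt(3.0)
    S = rng.uniform(-half_width, half_width, size=(n_sources, n_samples))
    # Well-conditioned mixing matrix (sensors x sources), built from a QR
    # decomposition so the result does not hinge on an unlucky random draw.
    big, small = max(n_sensors, n_sources), min(n_sensors, n_sources)
    Q, _ = np.linalg.qr(rng.standard_normal((big, small)))
    A = Q if n_sensors >= n_sources else Q.T
    # Sensor data: linear mixture plus weak, channel-independent noise.
    X = A @ S + noise_sd * rng.standard_normal((n_sensors, n_samples))
    # Least-squares unmixing: find W so that X.T @ W approximates S.T.
    W, *_ = np.linalg.lstsq(X.T, S.T, rcond=None)
    S_hat = (X.T @ W).T
    residual = ((S - S_hat) ** 2).sum(axis=1)
    total = ((S - S.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
    return 1.0 - residual / total

# Three sources, four sensors: every source is recoverable.
r2_enough = best_linear_recovery(n_sources=3, n_sensors=4)
# Four sources, three sensors: one source's worth of variance is lost.
r2_short = best_linear_recovery(n_sources=4, n_sensors=3)
print("3 sources / 4 sensors:", np.round(r2_enough, 3))
print("4 sources / 3 sensors:", np.round(r2_short, 3))
```

With more sensors than sources, every R^2 value is close to 1. With fewer sensors than sources, the values sum to roughly 3 rather than 4, because the sensor data span only three dimensions. How that deficit is divided among the sources depends entirely on the mixing matrix, which is the 'anyone's guess' situation described above.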

One consequence of this is that high-energy artifactual components in the data drastically reduce the ‘freedom’ left in an ICA decomposition for components originating from neural sources.

 

In fact, with the relatively short duration of each condition in this particular experiment (around 7-9 minutes) concatenation of the data proved essential to get clean dipolar results among the first 30-40 components (see below) – which is indicative of a successful blind source separation (Delorme, Palmer, Onton, Oostenveld & Makeig, 2012). We assumed the neural sources of the signals of interest remained stable across conditions, as our focus has been on early visual processing.

Subject ‘Ex’

Back projected 2D scalp maps of components 1-40 of example subject (“Ex”) – Screenshot from EEGLab.

The experiment from which this data was taken (‘BM2’) had participants perform a simple visual task that required fast left or right thumb-presses in response to features of laterally presented single target pop-out stimuli among a circular array of distractors. Data was high-pass filtered at 1 Hz and visually inspected to remove atypical signals and excessive EMG activity. It was then re-referenced to an average reference and fed into the AMICA algorithm (single model; max 2000 iterations) with degrees of freedom corrected for inclusion of the original reference channel and, wherever applicable, interpolated channels (in this case none).

As expected, the first IC that came out of the ICA contained nicely isolated blink related signals, judged by its clean projection to frontal sites. However, going through the following maps, it immediately became apparent that electrode CP2 must have picked up some remarkable (and strong!) signals. It not only takes center-stage in the back projections of both components 2 and 4 (respectively explaining away the second and fourth largest amounts of variance/energy in the signal), but it also managed to make cameo appearances in maps 6, 9, 11, 13, 15, 19, 23, somewhat in 37 and finally 38.

BM2 Subject ‘Ex’ – Detailed properties of ICs 2 and 4, showing clearly that almost all the energy contained in them originates from only one condition. (Beware that the scales in both figures differ.) – Screenshots from EEGLab.

Generally, when back projected scalp maps reveal such focal components, with virtually all activity projecting to a single electrode, this tends to be mere noise resulting from a bad connection/high impedance. This can cause random spiking patterns that correlate with no other sensor level activity, and as such are highly temporally and spatially isolated. In this case, there was clearly more going on, as becomes clear from the detailed plots above.
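How focal a scalp map is can also be put into a number, rather than judged by eye alone. The sketch below is a hypothetical helper of my own, not part of EEGLab. It takes a channels-by-components mixing matrix (in EEGLab terms this would correspond to the inverse weight matrix, EEG.icawinv) and reports, for each component, which channel dominates its map and what share of the map's squared weight that channel carries. The toy matrix is made up for illustration.

```python
import numpy as np

def map_focality(mixing):
    """For each component (column of a channels x components mixing matrix)
    return the index of the dominant channel and the share of the map's
    squared weight carried by that channel."""
    energy = np.asarray(mixing, dtype=float) ** 2
    share = energy / energy.sum(axis=0, keepdims=True)
    return share.argmax(axis=0), share.max(axis=0)

# Toy mixing matrix: 4 channels x 2 components (made-up numbers).
# Component 0 loads almost exclusively on channel 2; component 1 is broad.
toy_mixing = np.array([
    [0.1, 1.0],
    [0.1, 1.0],
    [3.0, 1.0],
    [0.1, 1.0],
])
dominant_channel, dominant_share = map_focality(toy_mixing)
print(dominant_channel)   # index of the strongest channel per component
print(dominant_share)     # share of squared map weight at that channel
```

A share close to 1 flags a component whose map is essentially a single electrode. As the case on this page shows, such a flag is a reason to look at the time course and spectrum, not proof that the component is mere contact noise.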

Another look at all the back projected components reveals that a clear eye-movement (HEOG) component is missing, with no clean laterally oriented dipole at fronto-lateral electrode sites to be found. Instead we see numerous components that account for only parts of the frontal eye-movement activity, and that are grouped with activity at CP2 (ICs 6 and 9 mostly, but also 11, 13, 15, 27). Given the nature of ICA, this tells us that the eye-movement activity was sufficiently correlated with parts of the signal recorded at CP2 to not be considered independent. This is an important thing to keep in mind, which I’ll return to later. In short, this must be the result of our movement conditions, which not only caused cable and sensor movement across the head, but naturally also triggered the vestibulo-ocular reflex while subjects kept their gaze aimed at the fixation cross.

Raw and Component Data Scrolls

BM2 Subject ‘Ex’ – Raw channel data segment recorded during walking – Screenshot from EEGLab.

Looking at the raw data of the walking condition, it is very clear that particularly sensor CP2 recorded large, very stable movement artifacts. (Likely, this was because there was tension on the cable, causing the cable bunch to pull on the electrode with every step.) What is also apparent is that the amplitude of that signal is immense, and reaches peak-to-peak levels normally only measured in blinks.

These two highly consistent patterns look like an easy case for ICA to disentangle into separate components, very similar to some examples others have given of what the technique is capable of. Interesting to note here is that, based on the raw data alone, one would be able to identify blink occurrences from nearly every single channel displayed above. In contrast, among the channels displayed, the artifactual movement signal can only be easily made out at channel CP2. (It is also clearly visible at all the frontal channels, which are not displayed here.)

Yet, even though our human eyes are so quick to identify this channel as an ‘isolated’ case (which would normally end with its removal before the ICA), the ICA decomposed data gave a different perspective on things, as had already become apparent from the 2D scalp maps.

BM2 Subject ‘Ex’ – Same time window of the same data decomposed with ICA, displaying components 1-16 – Screenshot from EEGLab.

This is the same time window of the same data decomposed with ICA. My eyes, and quite likely yours, are immediately drawn to components 2, 4 and 13, which subjectively appear to display highly similar signals, all presumably coming from the same source that generated the signal at CP2.

Having already observed in the raw data that the strong waveform recorded at CP2 appeared to be an isolated case, what happened? The presumed single, ‘isolated’ signal recorded at CP2 has been divided among multiple components, whereas the blink artifacts (that were so prevalent across all channels in the raw data) are neatly reduced to a single component/source, IC 1.

It seems the ICA ‘was led astray’, for instead of reducing signals of no interest to us (movement) to a single or few components, it actually appears to have given a detailed deconstruction of them. The very strong movement-related signal at CP2 must have contained multiple rhythms that were all strong and independent enough to each make it to the scoreboard on their own. This explains ICs 2 and 4 loading almost exclusively on the same electrode.

IC 13, however, actually originates from frontal channels that picked up very similar, yet still statistically independent, movement artifacts.

 

[to be continued]

 

Conclusions

  • Eye movements are necessarily strongly tied to locomotive bodily movements (vestibulo-ocular reflex).
  • Due to their tendency to correlate with each other, ICA may end up grouping EOG signals with (repetitive, locomotion related) movement artifacts. This greatly complicates using the accepted practice of subtracting solely eye-movement related components from mobile EEG data.

Where from here?

One can decide to do very drastic cleaning of the data, by removing or interpolating any channels that contain any direct and indirect movement related artifacts. However, considering the relatively low dimensionality of data available in our case, as well as the fact that the (most) problematic channel here was located close to one of the sites of interest, we ended up excluding only data from the walking condition from the ICA, in order to be able to keep the channel in the data sets of other conditions. This may have the consequence that certain signal sources are ignored by the ICA (mostly muscular, but of course also neural motor-activity), which is naturally not the preferred solution.

One solution that has been proposed for this kind of scenario is to transplant the ICA weights from one successfully decomposed data subset to another where ICA would be led astray by such activity.
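Mechanically, transplanting weights amounts to applying an unmixing matrix learned on one data subset to the channel data of another, zeroing the unwanted components and projecting back to the channels. The sketch below shows only that linear algebra on toy data. The function name, the mixing matrix and the sources are hypothetical, and the inverse of a known mixing matrix stands in for an ICA result (in EEGLab the unmixing matrix would be the product of EEG.icaweights and EEG.icasphere). Whether the transplanted weights are actually valid for the second data set is exactly the open question here, and the code does not answer it.

```python
import numpy as np

def transplant_and_clean(unmixing, data, reject):
    """Apply an unmixing matrix learned on another data subset to `data`
    (channels x samples), zero the rejected components, and back-project
    the remaining components to channel space."""
    mixing = np.linalg.pinv(unmixing)      # components -> channels
    activations = unmixing @ data          # component time courses
    activations[list(reject), :] = 0.0     # drop the unwanted components
    return mixing @ activations

# Toy setup: three channels, three sources, fixed (made-up) mixing matrix.
A = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.1, 1.0]])
# Stand-in for an unmixing matrix obtained from a 'clean' subset.
W_clean_subset = np.linalg.inv(A)

# A second data subset generated by the same sources; source 0 plays
# the role of the artifact we want to remove.
rng = np.random.default_rng(1)
S_other = rng.laplace(size=(3, 1000))
X_other = A @ S_other

cleaned = transplant_and_clean(W_clean_subset, X_other, reject=[0])
expected = A[:, 1:] @ S_other[1:, :]       # data without source 0
print(np.allclose(cleaned, expected))
```

The toy example works because both subsets share the same mixing matrix. If the artifact in the second subset projects to the scalp differently than anything seen in the first (as a cable pulling on CP2 might), the transplanted weights have no component that isolates it, and the artifact will be spread over several components instead.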

However, considering that the active bicycle condition as well as the standing condition must have seen activity arising from the various neck muscles, it might still occur that the sources are in fact identified correctly.

Future Questions:

How independent is neural motor activity?

Can we feed EEG data with controlled amounts of EMG into an ICA to do a successful source separation/localization, only to then use the weight matrices on another data set to deal with stronger motion and muscle artifacts?