Wang Neuroscience Lab

Projects

Our lab is looking for research participants. Individuals with autism spectrum disorder are particularly welcome to participate. We conduct both eye tracking and fMRI (functional magnetic resonance imaging) experiments to study visual attention and social perception. If you are interested in participating in our research, please contact Shuo Wang.

Investigating the Neural Circuits for Social Attention in Humans

Selective visual attention is one of the most fundamental cognitive functions in humans. Although a large body of neuroimaging work has probed the neural mechanisms of visual attention, very few studies have investigated visual attention at the single-neuron level in humans.

Our lab is conducting one of the first studies to investigate visual attention at the single-unit level in humans. Importantly, we are investigating a network of brain regions critical for selective visual attention, especially attention to social stimuli (i.e., social attention), and we are studying both goal-driven and stimulus-driven attention. The primary objectives of our research are twofold: (1) to characterize social attention signals in the medial temporal lobe (amygdala and hippocampus) and prefrontal cortex (in particular the orbitofrontal cortex, a relatively unexplored region for social attention); and (2) to relate attention signals across brain regions using functional connectivity analysis. Our research will provide the most comprehensive analysis to date of the neural circuits underlying social attention in humans. The outcomes will shed light on the neural mechanisms of impaired visual attention in patients with psychiatric and neurological disorders, such as autism spectrum disorder (ASD) and attention-deficit/hyperactivity disorder (ADHD).
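
As a concrete illustration of objective (2), below is a minimal sketch of one simple form of functional connectivity analysis: trial-by-trial spike-count ("noise") correlation between simultaneously recorded neurons in two regions. The data and variable names are simulated and illustrative, not our actual recordings or analysis pipeline.

```python
import numpy as np
from scipy.stats import pearsonr

def spike_count_correlation(counts_a, counts_b):
    """Trial-by-trial (noise) correlation between two simultaneously
    recorded neurons; counts_a and counts_b are 1-D arrays of per-trial
    spike counts over the same set of trials."""
    return pearsonr(counts_a, counts_b)

# Simulated example: an amygdala neuron and an OFC neuron whose
# trial-to-trial variability is partially shared.
rng = np.random.default_rng(0)
shared = rng.normal(size=200)                 # shared drive across 200 trials
amygdala = rng.poisson(lam=np.exp(1.5 + 0.3 * shared))
ofc = rng.poisson(lam=np.exp(2.0 + 0.3 * shared))
r, p = spike_count_correlation(amygdala, ofc)
print(f"noise correlation: r = {r:.2f}, p = {p:.3g}")
```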

In our recent publications (Wang et al., Curr Biol, 2018; Wang et al., Brain, 2019), we have shown a distinct population of target-selective neurons in both the medial temporal lobe (MTL) and medial frontal cortex (MFC) whose response signals whether the currently fixated stimulus is a target or a distractor. This target-selective response is invariant to visual category and predicts whether a target is detected or missed behaviorally during a given fixation. The response latencies of MFC target-selective neurons, relative to fixation onset, precede those of MTL neurons by 200 ms, suggesting a frontal origin for the target signal. The human MTL thus represents not only the identity of the fixated stimulus, but also its task-specified relevance conferred by top-down goals.

Frontal target neurons respond before MTL target neurons. (A, B) Example target neurons from the pre-SMA. (C, D) Two target neurons simultaneously recorded in the pre-SMA and MTL. (E) Cumulative firing rate for target neurons from the pre-SMA (dotted lines; n=31 neurons) and MTL (solid lines; n=27 neurons). Shaded area denotes ± SEM across neurons. Red: fixations on targets. Blue: fixations on distractors. Top bars show clusters of time points with a significant difference. Arrows indicate the first time point of the significant cluster. Magenta: MTL neurons. Black: pre-SMA neurons. (F) Difference in cumulative firing rate (calculated from [E]).
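
To make the logic of panels (E) and (F) concrete, here is a minimal sketch of computing fixation-aligned cumulative firing rates and a differential latency. The fixed threshold is a placeholder for the cluster-based significance testing used in the actual analysis, and all parameters are illustrative.

```python
import numpy as np

def mean_cumulative_count(spike_times, t_max=0.5, dt=0.001):
    """Mean cumulative spike count as a function of time from fixation
    onset. spike_times is a list of 1-D arrays, one per fixation, of
    spike times in seconds relative to fixation onset."""
    t = np.arange(0.0, t_max, dt)
    counts = np.array([np.searchsorted(np.sort(s), t) for s in spike_times])
    return t, counts.mean(axis=0)

def differential_latency(t, cum_target, cum_distractor, threshold):
    """First time at which the target-minus-distractor cumulative count
    exceeds a threshold (a stand-in for the permutation-cluster test)."""
    above = np.nonzero(cum_target - cum_distractor > threshold)[0]
    return t[above[0]] if above.size else None
```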

Investigating the Single-Neuron Mechanisms of Face Coding and Social Perception

How the brain encodes different face identities is one of the most fundamental and intriguing questions in neuroscience. There are currently two extreme hypotheses: (1) the exemplar-based model proposes that neurons respond in a remarkably selective and abstract manner to particular persons or objects, whereas (2) the axis-based model (a.k.a. the feature-based model) posits that neurons encode facial features along specific axes (e.g., shape and skin color) in face space. However, a third, under-explored coding scheme, manifold-based coding, may also exist, in which neurons encode the perceptual distance (i.e., similarity) between faces at a macro level, regardless of the individual features that distinguish them at a micro level.

Our lab aims to conduct one of the first studies to investigate face representation and coding in the human medial temporal lobe (MTL) at the single-neuron level. To the best of our knowledge, this will be the first study to directly compare the hypothesized neural coding schemes in the human MTL at the single-neuron level, and also the first to employ deep learning to study single-neuron responses in humans. Our single-neuron recordings will enable us to construct, validate, and explain neural face models in order to derive a general neural representation of faces. We will then use a deep neural network to evaluate the neural face models listed above and thereby identify the predominant neural coding scheme in the human MTL. Together, our state-of-the-art human single-neuron recordings, powered by the latest image-processing tools, will provide the most comprehensive and detailed analysis of neural face representations in humans, with the highest spatial and temporal resolution available to date.
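
As a sketch of how a deep network can be brought to bear on single-neuron responses, the snippet below regresses a simulated neuron's firing rates onto simulated deep face features, the core of an axis-based read-out test. The dimensionality reduction and ridge model are illustrative choices, not our finalized methods.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

# Simulated stand-ins: low-dimensional face structure mixed into
# high-dimensional network activations for 500 face images, and one
# neuron tuned to a single latent axis of that structure.
rng = np.random.default_rng(1)
latent = rng.normal(size=(500, 50))
deep_features = latent @ rng.normal(size=(50, 2048))
firing_rates = latent[:, 0] + 0.5 * rng.normal(size=500)

# Axis-based read-out: reduce the deep feature space, then test a
# cross-validated linear mapping from features to firing rate.
X = PCA(n_components=50).fit_transform(deep_features)
model = RidgeCV(alphas=np.logspace(-2, 4, 13))
scores = cross_val_score(model, X, firing_rates, cv=5, scoring="r2")
print(f"cross-validated R^2 = {scores.mean():.2f}")
```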

Our recent data show that some MTL neurons are selective for multiple face identities on the basis of shared visual features, which form clusters in the representation of a deep neural network trained to recognize faces. Contrary to prevailing views, we find that these neurons represent an individual's face through feature-based encoding rather than through association with concepts. The response of feature neurons did not depend on face identity, race, gender, or familiarity, and the region of feature space to which a neuron was tuned predicted its response to new face stimuli. Our results provide critical evidence bridging the perception-driven representation of facial features in higher visual cortex and the memory-driven representation of semantics in the MTL, which may form the basis for declarative memory.

We have also found a neuronal social trait space for first impressions in the human amygdala and hippocampus, which may have behavioral consequences and is likely involved in the atypical processing of social information in autism. Our results suggest that a neuronal population code for a comprehensive social trait representation exists in the human amygdala and hippocampus and underlies spontaneous first impressions.

Feature-based neuronal coding of face identities. (A) Task. We employed a one-back task in which patients responded whenever an identical famous face was repeated. Each face was presented for 1 s, followed by a jittered inter-stimulus interval (ISI) of 0.5 to 0.75 s. (B) Percentage of single-identity (SI) and multiple-identity (MI) neurons in the entire neuronal population. The stacked bar shows MI neurons that encoded visually similar identities (i.e., demonstrating feature-based coding; red) or not (blue). (C, D) Population decoding of face identity. (C) Decoding performance was primarily driven by identity neurons. Shaded area denotes ±SEM across bootstraps. The horizontal dotted gray line indicates the chance level (2%). The top bars illustrate the time points with significant above-chance decoding performance (bootstrap, P < 0.05, corrected by FDR for Q < 0.05). (D) MI neurons had significantly better decoding performance than SI neurons. The top bar illustrates the time points with a significant difference between MI and SI neurons (bootstrap, P < 0.05, corrected by FDR for Q < 0.05). (E) Web-association score for MI neurons. For each neuron, we calculated a mean association score between pairs of stimuli that the neuron was selective (S) to (S-S), and between pairs of stimuli where the neuron was selective to one but not selective (NS) to the other (S-NS). Error bars denote ±SEM across neurons. Left: MI neurons that encoded visually similar identities (i.e., with feature-based coding). Right: MI neurons that did not show feature-based coding. In neither case did MI neurons encode conceptually related identities. (F-M) Two example neurons that encoded visually similar identities. (F, J) Neuronal responses to 500 faces (50 identities). Trials are aligned to face stimulus onset (gray line) and grouped by identity. (G, K) Projection of the firing rate onto the feature space. Each color represents a different identity (names shown in the legend). The size of each dot indicates the firing rate. (H, L) Estimate of the spike density in the feature space. By comparing observed (upper) vs. permuted (lower) responses, we identified the region of feature space where the observed neuronal response was significantly higher than expected by chance; this region was defined as the neuron's tuning region. (I, M) The tuning region of each neuron in the feature space (delineated by the red outline).
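
The tuning-region estimation in panels (H, L) and (I, M) can be sketched as follows, assuming each face has already been assigned a 2-D feature-space coordinate. The rate-weighted kernel density estimate and the pointwise permutation threshold here are simplified stand-ins for the exact statistics in the published analysis.

```python
import numpy as np
from scipy.stats import gaussian_kde

def tuning_region(coords, rates, n_perm=1000, grid_n=50, alpha=0.05):
    """Grid cells of a 2-D feature space where a neuron's rate-weighted
    response density exceeds a permutation-based threshold.
    coords: (n_faces, 2) feature-space positions; rates: (n_faces,) rates."""
    xs = np.linspace(coords[:, 0].min(), coords[:, 0].max(), grid_n)
    ys = np.linspace(coords[:, 1].min(), coords[:, 1].max(), grid_n)
    grid = np.stack(np.meshgrid(xs, ys)).reshape(2, -1)

    def density(r):
        # Kernel density of faces in feature space, weighted by firing rate.
        return gaussian_kde(coords.T, weights=r)(grid)

    observed = density(rates)
    rng = np.random.default_rng(0)
    null = np.stack([density(rng.permutation(rates)) for _ in range(n_perm)])
    # Keep grid cells where the observed density beats the permuted one.
    threshold = np.quantile(null, 1 - alpha, axis=0)
    return (observed > threshold).reshape(grid_n, grid_n)
```
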
Neuronal social trait space. (A) Task. We employed a simple one-back task in which patients responded whenever an identical face stimulus was repeated. Each face was presented for 1 s, followed by a jittered inter-stimulus interval (ISI) of 0.5 to 0.75 s. (B) Distribution of face images in the social trait space based on their consensus social trait ratings, after dimension reduction using t-distributed stochastic neighbor embedding (t-SNE). (C) Correlation between dissimilarity matrices (DMs). The social trait DM (left matrix) was correlated with the neural response DM (right matrix). Color coding shows dissimilarity values. (D-H) Observed vs. permuted correlation coefficients between DMs. The correspondence between DMs was assessed using permutation tests: the magenta line indicates the observed correlation coefficient between DMs, and the null distribution of correlation coefficients (gray histogram) was obtained by shuffling the face identities (1000 runs). (D) All face-responsive neurons (n = 74). (E) Amygdala face-responsive neurons (n = 36). (F) Hippocampal face-responsive neurons (n = 38). (G) Social trait space constructed using Caucasian faces only (n = 74). (H) Social trait space constructed using African American faces only (n = 74). (I) Temporal dynamics of the correlation between DMs. Bin size is 500 ms and step size is 50 ms. The first bin spans −500 ms to 0 ms (bin center: −250 ms) relative to stimulus onset, and the last bin spans 1000 ms to 1500 ms (bin center: 1250 ms) after stimulus onset. Dotted horizontal lines indicate the chance level and dashed horizontal lines indicate ±standard deviation (SD) of the null distribution. The top asterisks illustrate time points with a significant correlation between DMs (permutation test against the null distribution, P < 0.05, corrected by false discovery rate [FDR] Q < 0.05).
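
A minimal sketch of the DM correlation and permutation test in panels (C)-(H) follows; the distance metric (Euclidean) and correlation measure (Spearman) are illustrative assumptions standing in for the exact choices in our analysis.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rsa_permutation(trait_ratings, neural_responses, n_perm=1000, seed=0):
    """Correlate a social-trait DM with a neural-response DM and assess
    significance by shuffling face identities.
    trait_ratings: (n_faces, n_traits); neural_responses: (n_faces, n_neurons)."""
    trait_dm = pdist(trait_ratings)            # condensed pairwise dissimilarity
    observed, _ = spearmanr(trait_dm, pdist(neural_responses))

    rng = np.random.default_rng(seed)
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Shuffle which face each neural response belongs to.
        shuffled = neural_responses[rng.permutation(len(neural_responses))]
        null[i], _ = spearmanr(trait_dm, pdist(shuffled))
    p = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, p, null
```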

Investigating the Neural Basis for Saliency and Memory Using Natural Scene Stimuli

One key research project in our lab uses complex natural scene images to study saliency, attention, learning, and memory. In our previous studies, we annotated more than 5,000 objects in 700 well-characterized images and recorded eye movements while participants viewed these images (Wang et al., Neuron, 2015).

We will extend the same task to single-neuron recordings to investigate the neural correlates of saliency. We will also add three important components to this free-viewing task: (1) we will repeat the images once or twice to explore a repetition effect (cf. Jutras et al., PNAS, 2013), (2) we will ask neurosurgical patients to memorize the images during a first session (learning session) and test memory on the next day (recognition session) to explore a memory effect, and (3) we will explore memory encoding with overnight recording. Moreover, to probe the neural basis for altered saliency representation in autism (cf. Wang et al., Neuron, 2015), we will analyze whether neurons are tuned to different saliency values and whether Autism-Spectrum Quotient (AQ) and Social Responsiveness Scale (SRS) scores correlate with the firing rate during fixations on semantic attributes such as faces.

One highlight of this project is the construction of a “neuronal saliency map”. A saliency model can be constructed for each single neuron and for populations of neurons. By replacing the eye-movement fixation density map with a neuronal firing-rate density map, saliency weights can be computed for neurons. This neuronal saliency map will reflect the tuning of a single neuron, or a population of neurons, when multiple saliency factors are considered simultaneously. Notably, the saliency weights can be easily compared between brain areas (e.g., OFC vs. amygdala) and between groups (e.g., ASD vs. controls; see [10] as an example for fixation saliency weights). The distribution of saliency weights across neurons can reveal the population coding scheme in a brain area. Lastly, the utility of such neuronal saliency maps can be readily validated: we can use the map to predict the location of the next fixation and thereby quantify prediction accuracy. A real-time decoder can also be constructed.
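
As a minimal sketch of the construction (with hypothetical image dimensions and smoothing width, not our actual settings): the fixation density map is replaced by a firing-rate-weighted density map, which can then be used to predict the next fixation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def neuronal_saliency_map(fix_xy, rates, shape=(768, 1024), sigma=30):
    """Firing-rate-weighted fixation density map for one neuron.
    fix_xy: (n_fix, 2) pixel (x, y) coordinates of fixations within the
    image; rates: (n_fix,) firing rate of the neuron at each fixation."""
    sal = np.zeros(shape)
    for (x, y), r in zip(fix_xy.astype(int), rates):
        sal[y, x] += r                        # deposit the rate at the fixation
    return gaussian_filter(sal, sigma)        # smooth into a density map

def predict_next_fixation(saliency, current_xy=None, radius=60):
    """Predicted next fixation = saliency peak, optionally after
    suppressing a disk around the current fixation (inhibition of return)."""
    sal = saliency.copy()
    if current_xy is not None:
        yy, xx = np.ogrid[:sal.shape[0], :sal.shape[1]]
        mask = (xx - current_xy[0]) ** 2 + (yy - current_xy[1]) ** 2 < radius ** 2
        sal[mask] = -np.inf
    y, x = np.unravel_index(np.argmax(sal), sal.shape)
    return x, y
```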

Model-based eye tracking. We applied a linear support vector machine (SVM) classifier to evaluate the contribution of five general factors to gaze allocation. Feature maps are extracted from the input images and include three levels of features (pixel-, object-, and semantic-level) together with the image center and the background. We randomly sample image locations to collect training data and train on ground-truth fixation data. The classifier outputs are the saliency weights, which show the relative importance of each feature in predicting gaze allocation.
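
Below is a minimal sketch of this procedure, assuming the per-image feature maps have already been computed and stacked; the sampling scheme and preprocessing are illustrative assumptions rather than our exact published settings.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

def saliency_weights(feature_maps, fixations, seed=0):
    """Linear-SVM saliency weights for one image.
    feature_maps: (n_channels, H, W) stack of feature maps (e.g., pixel-,
    object-, and semantic-level features plus center and background);
    fixations: (n_fix, 2) array of fixated (x, y) pixel coordinates."""
    n_channels, H, W = feature_maps.shape
    rng = np.random.default_rng(seed)

    # Positive examples: feature values at fixated locations.
    pos = feature_maps[:, fixations[:, 1], fixations[:, 0]].T
    # Negative examples: feature values at randomly sampled locations.
    neg = feature_maps[:, rng.integers(0, H, len(pos)),
                          rng.integers(0, W, len(pos))].T

    X = np.vstack([pos, neg])
    y = np.r_[np.ones(len(pos)), np.zeros(len(neg))]
    clf = make_pipeline(StandardScaler(), LinearSVC()).fit(X, y)
    return clf[-1].coef_.ravel()              # one weight per feature channel
```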

Investigating the Behavioral and Neural Underpinnings for Aberrant Social Behavior in Autism

People with autism spectrum disorder (ASD) are characterized by impairments in social and communicative behavior and a restricted range of interests and behaviors (DSM-5, 2013). An overarching hypothesis is that the brains of people with ASD have abnormal representations of what is salient, with consequences for attention that are reflected in eye movements, learning, and behavior. In our prior studies, we have shown that people with autism have atypical face processing and atypical social attention. On the one hand, people with autism show reduced specificity in emotion judgment (Wang and Adolphs, Neuropsychologia, 2017), which might be due to abnormal amygdala responses to the eyes vs. the mouth (Rutishauser et al., Neuron, 2013). On the other hand, people with autism show atypical bottom-up attention to social stimuli during free viewing of natural scene images (Wang et al., Neuron, 2015), impaired top-down attention to social targets during visual search (Wang et al., Neuropsychologia, 2014), and atypical photo-taking of other people, a behavior that combines bottom-up and top-down attention (Wang et al., Curr Biol, 2016). However, the underlying mechanisms of these profound social dysfunctions in autism remain largely unknown.

Our lab focuses on two core social dysfunctions in autism: impaired face processing and impaired visual attention. The central hypothesis is that people with autism have an altered saliency representation compared to controls. A key neural structure hypothesized to underlie these deficits is the amygdala. Using a powerful combination of neuroscience techniques that converge on the amygdala, including single-neuron recordings, functional magnetic resonance imaging (fMRI), and studies of patients with amygdala lesions, together with high-resolution eye tracking and computational modeling approaches suited to big-data analysis, we investigate the following questions:

(1) What are the neural underpinnings for aberrant neural face representation in autism?

(2) What are the individual differences among people with autism when viewing natural scenes and faces? What are the underlying psychological and personality factors? Can such individual differences support large-scale screening for autism?

(3) What are the neural mechanisms of goal-directed and stimulus-driven social attention, and do they differ in people with autism?

(4) Can we use deep learning and computer vision to fully capture autism-related behavior? Can we design an efficient tool to facilitate early diagnosis of autism?