Prof. Dr. Pia Knoeferle
Profile
Research topics (8)
Architectures and mechanisms of language processing
Source ↗ · 409-02-A Softwaretechnik · Funder: DFG, other programmes · Period: 09/2018 - 03/2019 · Project lead: Prof. Dr. Pia Knoeferle
Effekte von Lebenszeit- und Faktenwissen im Sprachverstehen
Source ↗ · Funder: DFG research grant (Sachbeihilfe) · Period: 12/2019 - 12/2025 · Project lead: Prof. Dr. Pia Knoeferle
Embodied and situated language processing / The Attentive listener in the visual world 2019
Source ↗ · Funder: DFG, other programmes · Period: 08/2019 - 08/2019 · Project lead: Prof. Dr. Pia Knoeferle
Labor Know-How als gemeinsame Ressource
Source ↗ · Funder: Berlin University Alliance (BUA) · Period: 02/2021 - 07/2026 · Project leads: Prof. Dr. Pia Knoeferle, Prof. Dr. Elke Greifeneder
SFB 1412/1: Situation-Register-Kongruenz, Morphosyntax und Verb-Argument-Verletzungen: Echtzeit- und Post-Sentence-Verstehen (TP C03)
Source ↗ · Funder: DFG Collaborative Research Centre · Period: 01/2020 - 12/2023 · Project leads: Prof. Dr. Pia Knoeferle, Dr. Katja Maquate
SFB 1412/2: Registerverständnis in Echtzeit bei mehrsprachigen Jugendlichen (TP C03)
Source ↗ · Funder: DFG Collaborative Research Centre · Period: 01/2024 - 12/2027 · Project leads: Prof. Dr. Pia Knoeferle, Dr. Katja Maquate, PD Dr. Natalia Gagarina
SPP 1727: Fokus und thematische Rollenzuweisung: Ein Vergleich zwischen Ungarisch und Deutsch im kindlichen Sprachverstehen
Source ↗ · Funder: DFG Priority Programme · Period: 11/2015 - 05/2019 · Project lead: Prof. Dr. Pia Knoeferle
UniSoMedSci: Uniting laboratory procedures across the social and medical sciences
Source ↗ · Funder: Berlin University Alliance (BUA) · Period: 07/2022 - 06/2023 · Project leads: Prof. Dr. Pia Knoeferle, Prof. Dr. Christine Mooshammer, Prof. Dr. Agnes Kristina Villwock
Potential industry partners (10)
As of: 26.4.2026, 19:48:44 (Top-K=20, Min-Cosine=0.4)
- Embodied Audition for RobotS · 47 hits · 60.3%
- Realizing Leibniz's Dream: Child Languages as a Mirror of the Mind (LeibnizDream) · 107 hits · 60.0%
- EU: Context Sensitive Multisensory Object Recognition (HBP) · 114 hits · 59.0%
- EU: Context Sensitive Multisensory Object Recognition (HBP) · 109 hits · 59.0%
- EU: Context Sensitive Multisensory Object Recognition (HBP) · 108 hits · 59.0%
- EU: Context Sensitive Multisensory Object Recognition (HBP) · 112 hits · 59.0%
- EU: Context Sensitive Multisensory Object Recognition (HBP) · 102 hits · 59.0%
- EU: Context Sensitive Multisensory Object Recognition (HBP) · 109 hits · 59.0%
- EU: Context Sensitive Multisensory Object Recognition (HBP) · 115 hits · 59.0%
- Translation for Massive Open Online Courses · 30 hits · 58.9%
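The list above comes from embedding-based retrieval: profile and candidate texts are embedded (with BAAI/bge-m3 in this pipeline) and candidates are kept only if their cosine similarity clears the minimum threshold, with at most Top-K results returned. A minimal sketch of that ranking step, with toy vectors standing in for real embeddings:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k_matches(query_vec, candidates, k=20, min_cosine=0.4):
    # candidates: list of (name, vector) pairs.
    # Keep only scores >= min_cosine, best first, at most k results.
    scored = [(name, cosine(query_vec, vec)) for name, vec in candidates]
    kept = [(name, s) for name, s in scored if s >= min_cosine]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]

profile = [0.9, 0.1, 0.3]
candidates = [
    ("EARS", [0.8, 0.2, 0.4]),        # similar direction -> high score
    ("Unrelated", [-0.5, 0.9, 0.0]),  # dissimilar -> filtered out
]
print(top_k_matches(profile, candidates))
```

Real bge-m3 embeddings are unit-normalized 1024-dimensional vectors, so in practice the cosine reduces to a dot product; the threshold-then-truncate logic is the same.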
Publications (25)
Top 25 by citations · Source: OpenAlex (matched via BAAI/bge-m3 embeddings).
Cognitive Science · 281 citations · DOI
Two studies investigated the interaction between utterance and scene processing by monitoring eye movements in agent-action-patient events, while participants listened to related utterances. The aim of Experiment 1 was to determine if and when depicted events are used for thematic role assignment and structural disambiguation of temporarily ambiguous English sentences. Shortly after the verb identified relevant depicted actions, eye movements in the event scenes revealed disambiguation. Experiment 2 investigated the relative importance of linguistic/world knowledge and scene information. When the verb identified either only the stereotypical agent of a (nondepicted) action, or the (nonstereotypical) agent of a depicted action as relevant, verb-based thematic knowledge and depicted action each rapidly influenced comprehension. In contrast, when the verb identified both of these agents as relevant, the gaze pattern suggested a preferred reliance of comprehension on depicted events over stereotypical thematic knowledge for thematic interpretation. We relate our findings to language comprehension and acquisition theories.
Cognition · 273 citations · DOI
Journal of Memory and Language · 150 citations · DOI
Behavior Research Methods · 91 citations · DOI
In this paper, we discuss key characteristics and typical experimental designs of the visual-world paradigm and compare different methods of analysing eye-movement data. We discuss the nature of the eye-movement data from a visual-world study and provide data analysis tutorials on ANOVA, t-tests, linear mixed-effects model, growth curve analysis, cluster-based permutation analysis, bootstrapped differences of timeseries, generalised additive modelling, and divergence point analysis to enable psycholinguists to apply each analytical method to their own data. We discuss advantages and disadvantages of each method and offer recommendations about how to select an appropriate method depending on the research question and the experimental design.
Cognitive Science · 91 citations · DOI
Evidence from numerous studies using the visual world paradigm has revealed both that spoken language can rapidly guide attention in a related visual scene and that scene information can immediately influence comprehension processes. These findings motivated the coordinated interplay account (Knoeferle & Crocker, 2006) of situated comprehension, which claims that utterance-mediated attention crucially underlies this closely coordinated interaction of language and scene processing. We present a recurrent sigma-pi neural network that models the rapid use of scene information, exploiting an utterance-mediated attentional mechanism that directly instantiates the CIA. The model is shown to achieve high levels of performance (both with and without scene contexts), while also exhibiting hallmark behaviors of situated comprehension, such as incremental processing, anticipation of appropriate role fillers, as well as the immediate use, and priority, of depicted event information through the coordinated use of utterance-mediated attention to the scene.
Frontiers in Psychology · 74 citations · DOI
During comprehension, a listener can rapidly follow a frontally seated speaker's gaze to an object before its mention, a behavior which can shorten latencies in speeded sentence verification. However, the robustness of gaze-following, its interaction with core comprehension processes such as syntactic structuring, and the persistence of its effects are unclear. In two "visual-world" eye-tracking experiments participants watched a video of a speaker, seated at an angle, describing transitive (non-depicted) actions between two of three Second Life characters on a computer screen. Sentences were in German and had either subject(NP1)-verb-object(NP2) or object(NP1)-verb-subject(NP2) structure; the speaker either shifted gaze to the NP2 character or was obscured. Several seconds later, participants verified either the sentence referents or their role relations. When participants had seen the speaker's gaze shift, they anticipated the NP2 character before its mention and earlier than when the speaker was obscured. This effect was more pronounced for SVO than OVS sentences in both tasks. Interactions of speaker gaze and sentence structure were more pervasive in role-relations verification: participants verified the role relations faster for SVO than OVS sentences, and faster when they had seen the speaker shift gaze than when the speaker was obscured. When sentence and template role-relations matched, gaze-following even eliminated the SVO-OVS response-time differences. Thus, gaze-following is robust even when the speaker is seated at an angle to the listener; it varies depending on the syntactic structure and thematic role relations conveyed by a sentence; and its effects can extend to delayed post-sentence comprehension processes. These results suggest that speaker gaze effects contribute pervasively to visual attention and comprehension processes and should thus be accommodated by accounts of situated language comprehension.
Journal of Experimental Psychology: Applied · 70 citations · DOI
Building on models of crossmodal attention, the present research proposes that brand search is inherently multisensory, in that the consumers' visual search for a specific brand can be facilitated by semantically related stimuli that are presented in another sensory modality. A series of 5 experiments demonstrates that the presentation of spatially nonpredictive auditory stimuli associated with products (e.g., usage sounds or product-related jingles) can crossmodally facilitate consumers' visual search for, and selection of, products. Eye-tracking data (Experiment 2) revealed that the crossmodal effect of auditory cues on visual search manifested itself not only in RTs, but also in the earliest stages of visual attentional processing, thus suggesting that the semantic information embedded within sounds can modulate the perceptual saliency of the target products' visual representations. Crossmodal facilitation was even observed for newly learnt associations between unfamiliar brands and sonic logos, implicating multisensory short-term learning in establishing audiovisual semantic associations. The facilitation effect was stronger when searching complex rather than simple visual displays, thus suggesting a modulatory role of perceptual load.
Psychophysiology · 66 citations · DOI
To re-establish picture-sentence verification, discredited possibly for its over-reliance on post-sentence response time (RT) measures, as a task for situated comprehension, we collected event-related brain potentials (ERPs) as participants read a subject-verb-object sentence, and RTs indicating whether or not the verb matched a previously depicted action. For mismatches (vs. matches), speeded RTs were longer, verb N400s over centro-parietal scalp larger, and ERPs to the object noun more negative. RTs (congruence effect) correlated inversely with the centro-parietal verb N400s, and positively with the object ERP congruence effects. Verb N400s, object ERPs, and verbal working memory scores predicted more variance in RT effects (50%) than N400s alone. Thus, (1) verification processing is not all post-sentence; (2) simple priming cannot account for these results; and (3) verification tasks can inform studies of situated comprehension.
Brain and Language · 66 citations · DOI
Cerebral Cortex · 45 citations · DOI
A central topic in sentence comprehension research is the kinds of information and mechanisms involved in resolving temporary ambiguity regarding the syntactic structure of a sentence. Gaze patterns in scenes during spoken sentence comprehension have provided strong evidence that visual scenes trigger rapid syntactic reanalysis. However, they have also been interpreted as reflecting nonlinguistic, visual processes. Furthermore, little is known as to whether similar processes of syntactic revision are triggered by linguistic versus scene cues. To better understand how scenes influence comprehension and its time course, we recorded event-related potentials (ERPs) during the comprehension of spoken sentences that relate to depicted events. Prior electrophysiological research has observed a P600 when structural disambiguation toward a noncanonical structure occurred during reading and in the absence of scenes. We observed an ERP component with a similar latency, polarity, and distribution when depicted events disambiguated toward a noncanonical structure. The distributional similarities further suggest that scenes are on a par with linguistic contexts in triggering syntactic revision. Our findings confirm the interpretation of previous eye movement studies and highlight the benefits of combining ERP and eye-tracking measures to ascertain the neuronal processes enabled by, and the locus of attention in, visual contexts.
Quarterly Journal of Experimental Psychology · 43 citations · DOI
Reading times for the second conjunct of and-coordinated clauses are faster when the second conjunct parallels the first conjunct in its syntactic or semantic (animacy) structure than when its structure differs (Frazier, Munn, & Clifton, 2000; Frazier, Taft, Roeper, & Clifton, 1984). What remains unclear, however, is the time course of parallelism effects, their scope, and the kinds of linguistic information to which they are sensitive. Findings from the first two eye-tracking experiments revealed incremental constituent order parallelism across the board-both during structural disambiguation (Experiment 1) and in sentences with unambiguously case-marked constituent order (Experiment 2), as well as for both marked and unmarked constituent orders (Experiments 1 and 2). Findings from Experiment 3 revealed effects of both constituent order and subtle semantic (noun phrase similarity) parallelism. Together our findings provide evidence for an across-the-board account of parallelism for processing and-coordinated clauses, in which both constituent order and semantic aspects of representations contribute towards incremental parallelism effects. We discuss our findings in the context of existing findings on parallelism and priming, as well as mechanisms of sentence processing.
Frontiers in Psychology · 38 citations · DOI
More and more findings suggest a tight temporal coupling between (non-linguistic) socially-interpreted context and language processing. Still, real-time language processing accounts remain largely elusive with respect to the influence of biological (e.g., sex or age) and experiential (i.e., world and moral knowledge) comprehender characteristics and the influence of the ‘socially-interpreted’ context, as for instance provided by the speaker. This context could include actions, facial expressions, a speaker’s voice or gaze, and gestures among others. We review findings from social psychology, sociolinguistics and psycholinguistics to highlight the relevance of (the interplay between) the socially-interpreted context and comprehender characteristics for language processing. The review informs the extension of an extant real-time processing account (already featuring a coordinated interplay between language comprehension and the non-linguistic visual context) with a variable (‘ProCom’) that captures characteristics of the language user and with the comprehender’s speaker representation. Extending the CIA to the sCIA (social Coordinated Interplay Account) is the first step towards a real-time language comprehension account which might eventually accommodate the socially situated communicative interplay between comprehenders and speakers or actors.
38 citations · DOI
Empathy can be defined as the ability to perceive and understand others' emotional states. Neuropsychological evidence has shown that humans empathize with each other to different degrees depending on factors such as their mood, personality, and social relationships. Although artificial agents have been endowed with features such as affect, personality, and the ability to build social relationships, little attention has been devoted to the role of such features as factors that can modulate their empathic behavior. In this paper, we present and discuss the results of an empirical evaluation of a computational model of empathy which allows a virtual human to exhibit different degrees of empathy. Our model is supported by psychological models of empathy and is applied and evaluated in the context of a conversational agent scenario.
Effects of Speaker Emotional Facial Expression and Listener Age on Incremental Sentence Processing
PLoS ONE (2013) · 33 citations · DOI
We report two visual-world eye-tracking experiments that investigated how and with which time course emotional information from a speaker's face affects younger (N = 32, Mean age = 23) and older (N = 32, Mean age = 64) listeners' visual attention and language comprehension as they processed emotional sentences in a visual context. The age manipulation tested predictions by socio-emotional selectivity theory of a positivity effect in older adults. After viewing the emotional face of a speaker (happy or sad) on a computer display, participants were presented simultaneously with two pictures depicting opposite-valence events (positive and negative; IAPS database) while they listened to a sentence referring to one of the events. Participants' eye fixations on the pictures while processing the sentence were increased when the speaker's face was (vs. wasn't) emotionally congruent with the sentence. The enhancement occurred from the early stages of referential disambiguation and was modulated by age. For the older adults it was more pronounced with positive faces, and for the younger ones with negative faces. These findings demonstrate for the first time that emotional facial expressions, similarly to previously-studied speaker cues such as eye gaze and gestures, are rapidly integrated into sentence processing. They also provide new evidence for positivity effects in older adults during situated sentence processing.
Frontiers in Psychology · 30 citations · DOI
Eye-tracking findings suggest people prefer to ground their spoken language comprehension by focusing on recently seen events more than anticipating future events: When the verb in NP1-VERB-ADV-NP2 sentences was referentially ambiguous between a recently depicted and an equally plausible future clipart action, listeners fixated the target of the recent action more often at the verb than the object that hadn't yet been acted upon. We examined whether this inspection preference generalizes to real-world events, and whether it is (vs. isn't) modulated by how often people see recent and future events acted out. In a first eye-tracking study, the experimenter performed an action (e.g., sugaring pancakes), and then a spoken sentence either referred to that action or to an equally plausible future action (e.g., sugaring strawberries). At the verb, people more often inspected the pancakes (the recent target) than the strawberries (the future target), thus replicating the recent-event preference with these real-world actions. Adverb tense, indicating a future versus past event, had no effect on participants' visual attention. In a second study we increased the frequency of future actions such that participants saw 50/50 future and recent actions. During the verb people mostly inspected the recent action target, but subsequently they began to rely on tense, and anticipated the future target more often for future than past tense adverbs. A corpus study showed that the verbs and adverbs indicating past versus future actions were equally frequent, suggesting long-term frequency biases did not cause the recent-event preference. Thus, (a) recent real-world actions can rapidly influence comprehension (as indexed by eye gaze to objects), and (b) people prefer to first inspect a recent action target (vs. an object that will soon be acted upon), even when past and future actions occur with equal frequency. A simple frequency-of-experience account cannot accommodate these findings.
Acta Psychologica · 26 citations · DOI
Language and Linguistics Compass · 25 citations · DOI
Abstract Over the past two decades, ‘visually situated’ language comprehension (the interplay between language comprehension, attention, and non‐linguistic visual context) has emerged as an increasingly active area of research. One important result in this area is that both linguistic and world knowledge, as well as visual cues, can rapidly inform the unfolding interpretation as reflected by comprehenders' eye movements to objects during spoken language comprehension. However, upon closer inspection, temporal delays of object‐directed gaze are not infrequent and emerge for the processing of non‐canonical (vs. canonical) structures, for scalar implicatures and for recently learned world–language associations. While it may further be tempting to assume that the different knowledge sources and visual cues are on a par in guiding visual attention, comprehenders' eye movements in many instances reveal a robust referential priority (more looks go to the referent of a word than to other objects). Should this priority be taken as a trivial observation? In the present article, we argue that the tension between this referential priority and other world–language relations constitutes an important constraint on the linking hypotheses and mechanisms implicated in situated language comprehension and should be considered when conceptualizing models and accounts of visually situated language comprehension.
PLoS ONE · 20 citations · DOI
Spatial terms such as "above", "in front of", and "on the left of" are all essential for describing the location of one object relative to another object in everyday communication. Apprehending such spatial relations involves relating linguistic to object representations by means of attention. This requires at least one attentional shift, and models such as the Attentional Vector Sum (AVS) predict the direction of that attention shift, from the sausage to the box for spatial utterances such as "The box is above the sausage". To the extent that this prediction generalizes to overt gaze shifts, a listener's visual attention should shift from the sausage to the box. However, listeners tend to rapidly look at referents in their order of mention and even anticipate them based on linguistic cues, a behavior that predicts a converse attentional shift from the box to the sausage. Four eye-tracking experiments assessed the role of overt attention in spatial language comprehension by examining to which extent visual attention is guided by words in the utterance and to which extent it also shifts "against the grain" of the unfolding sentence. The outcome suggests that comprehenders' visual attention is predominantly guided by their interpretation of the spatial description. Visual shifts against the grain occurred only when comprehenders had some extra time, and their absence did not affect comprehension accuracy. However, the timing of this reverse gaze shift on a trial correlated with that trial's verification time. Thus, while the timing of these gaze shifts is subtly related to the verification time, their presence is not necessary for successful verification of spatial relations.
Elsevier eBooks · 20 citations · DOI
Journal of Cultural Cognitive Science · 18 citations · DOI
Predicting variability in context effects is a timely enterprise considering that psycho- and neurolinguistic research has assessed how language processing depends on the perceived context, the body, and long-term linguistic knowledge of the language user. The current evidence suggests that some context effects may be systematically more robust than others and that language user characteristics are an influential modulator of context-sensitive comprehension. Reviewing psycholinguistic evidence, I argue for constrained contextual variability. Variability in context effects is predicted by characteristics of the language user and world-language relations. But extant findings also suggest generalizability beyond such variation, thus imposing constraint on theoretical prediction of context effects via relative (not absolute) processing preferences.
Cognition · 18 citations · DOI
eScholarship (California Digital Library) · 16 citations
Prior research has shown that adults can make rapid use of visual context information (e.g., visual referential contrast and depicted agent-action-patient events) for syntactic structuring and disambiguation. By contrast, little is known about how visual context influences children's language comprehension, and some results even suggest children cannot use visual referential context for syntactic structuring (e.g., Trueswell et al., 1999). We examined whether children (unlike adults) also struggle to use other kinds of information in visual context (e.g., depicted events) for real-time language comprehension. In two eye-tracking studies we directly compared real-time effects of depicted events on children's (Exp1) vs. adults' (Exp2) processing of spoken German subject-verb-object (SVO) and object-verb-subject (OVS) sentences. Both of these word orders are grammatical, but OVS is a non-canonical structure. Five-year-olds are at chance in understanding even unambiguous OVS sentences in the absence of visual context (Dittmar et al., 2008). If children can use depicted events rapidly for syntactic structuring, we should find similar visual context effects for them as have been reported for adults (Knoeferle et al., 2005), and similar gaze patterns as for the adults in the present studies. Gaze patterns in the present studies suggested that events depicting who-does-what-to-whom incrementally influenced both adults' and 5-year-olds' visual attention and thematic role assignment. Depicted-event information helped children to get rid of their initial preference for the preferred SVO structure when interpreting OVS sentences. However, visual context effects were subtly delayed in children (vs. adults), and varied as a function of their accuracy and cognitive capacity.
Bilingualism: Language and Cognition · 13 citations · DOI
Abstract To test effects of German on anticipation in Vietnamese, we recorded eye-movements during comprehension and manipulated i) verb constraints (different vs. similar in German and Vietnamese) and ii) classifier constraints (absent in German). In each of two experiments, participants listened to Vietnamese sentences like “Mai mặc một chiếc áo.” (‘Mai wears a [classifier] shirt.’), while viewing four objects. Between experiments, we contrasted bilingual background: L1 Vietnamese–L2 German late bilinguals (Experiment 1) and heritage speakers of Vietnamese in Germany (Experiment 2). Both groups anticipated verb-compatible and classifier-compatible objects upon hearing the verb/classifier. However, when the (verb) constraints differed (e.g., Vietnamese: mặc ‘wear (a shirt/#earrings)’ – German: tragen ‘wear (a shirt/earrings)’), the heritage speakers were distracted by the object (earrings) compatible with the German (but not the Vietnamese) verb constraints. These results demonstrate that competing information in the two languages can interfere with anticipation in heritage speakers.
Acta Psychologica · 13 citations · DOI
Cambridge University Press eBooks · 12 citations · DOI
The present chapter reviews the literature on visually situated language comprehension against the background that most theories of real-time sentence comprehension have ignored rich non-linguistic contexts. However, listeners' eye movements to objects during spoken language comprehension, as well as their event-related brain potentials (ERPs), have revealed that non-linguistic cues play an important role for real-time comprehension. In fact, referential processes are rapid and central in visually situated spoken language comprehension and even abstract words are rapidly grounded in objects through semantic associations. Similar ERP responses for non-linguistic and linguistic effects on comprehension suggest these two information sources are on a par in informing language comprehension. ERPs further revealed that non-linguistic cues affect lexical‒semantic as well as compositional processes, thus further cementing the role of rich non-linguistic context in language comprehension. However, there is also considerable ambiguity in the linking between comprehension processes and each of these two measures (eye movements and ERPs). Combining eye-tracking and event-related brain potentials would improve the interpretation of individual measures and thus insights into visually situated language comprehension.
Collaborations (3)
Confirmed researcher↔partner pairs from HU-FIS · gold-standard positives for the matching.
UniSoMedSci: Uniting laboratory procedures across the social and medical sciences
university
SFB 1412/2: Registerverständnis in Echtzeit bei mehrsprachigen Jugendlichen (TP C03)
other
SPP 1727: Fokus und thematische Rollenzuweisung: Ein Vergleich zwischen Ungarisch und Deutsch im kindlichen Sprachverstehen
university
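Confirmed pairs like the three above can serve as gold-standard positives for evaluating the matcher: check how many of the top k ranked suggestions are confirmed pairs. A minimal precision@k sketch (the project identifiers below are abbreviated, illustrative stand-ins, and the function name is hypothetical):

```python
def precision_at_k(ranked, gold, k):
    # Fraction of the top-k ranked suggestions that are gold-standard positives.
    top = ranked[:k]
    hits = sum(1 for item in top if item in gold)
    return hits / k

# Illustrative labels, not the full official project titles.
gold = {"UniSoMedSci", "SFB 1412/2 C03", "SPP 1727"}
ranked = ["UniSoMedSci", "EARS", "SPP 1727", "HBP"]
print(precision_at_k(ranked, gold, k=4))  # 2 of the top 4 are confirmed -> 0.5
```

Recall@k works the same way but divides hits by the number of gold positives rather than by k; with only three confirmed pairs, both metrics are coarse and are best read as sanity checks.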
Master data
Identity, organisation, and contact details from HU-FIS.
- Name
- Prof. Dr. Pia Knoeferle
- Title
- Prof. Dr.
- Faculty
- Sprach- und literaturwissenschaftliche Fakultät
- Institute
- Institut für deutsche Sprache und Linguistik
- Research group
- Sprachwissenschaft des Deutschen: Psycholinguistik
- Phone
- +49 30 2093-85010
- HU-FIS profile
- Source ↗
- Last scraped
- 26.4.2026, 01:07:41