eScholarship
Open Access Publications from the University of California

About

The annual meeting of the Cognitive Science Society brings together basic and applied cognitive science research. The conference hosts the latest theories and data from the world's leading cognitive science researchers. Each year, in addition to submitted papers, researchers are invited to highlight particular aspects of cognitive science.

Workshops

Minds in the Making: Cognitive Science and Design Thinking

All around us are traces of human design, from color-coded subway maps that facilitate navigation to furniture that balances form and function. The human capacity for creation has long fascinated cognitive scientists. Early studies of innovation highlighted the role of problem-solving, elucidating the roles of search and heuristics (Simon, 1996; Newell, 1972). Research on object perception and tool use enhanced our understanding of how humans interact with and manipulate their environment (Gibson, 1977; Norman, 1999). Subsequently, research in the visual and spatial domains uncovered key abstractions supporting reasoning, communication, and expression through visual forms, such as mental models, diagrams, and spatial analogies (Hegarty, 2011; Tversky, 2010; Goel, 1995).

Information Theory and Cognitive Science

This workshop focuses on information theory and cognition. The goal is to create a multidisciplinary space for discussing the most recent advances at the intersection of information theory and cognitive science and to explore how this emerging research area can help the field advance toward a more comprehensive and principled mathematical theory of human cognition.

Putting it together: Interactions between domains of cognition

One of the oldest and deepest questions across the cognitive sciences concerns the architecture of the mind (Fodor, 1983). Which cognitive capacities are supported by domain-general computations that apply to a broad set of inputs, and which are supported by domain-specific computations that are specialized for a particular mental function? This question has motivated a broad set of research programs, ranging from the evolution and development of human knowledge (Tomasello, Melis, Tennie, Wyman, & Herrmann, 2012), to the specialization of neural functions (Kanwisher, 2010), to building machines to think and learn in the same ways that we can (Lake, Ullman, Tenenbaum, & Gershman, 2017).

Reasoning Across Minds and Machines

Reasoning is one of the hallmarks of both natural and artificial intelligence. Understanding how reasoning operates in the human mind is crucial in cognitive science. Despite a long history of research on human reasoning in cognitive science—ranging from heuristics (Tversky & Kahneman, 1974) to mental models (Johnson-Laird, 1983), and from Bayesian modeling (Oaksford & Chater, 2007; Griffiths, Chater, & Tenenbaum, 2024) to neuroscience (Goel & Dolan, 2003)—little is known about how humans reason so flexibly in real life and how reasoning contributes to high-level cognitive functions including planning, social interaction, complex problem-solving, and open-ended mental exploration. Previous studies face challenges that hinder a deeper understanding of reasoning, including designing well-balanced experimental paradigms that maintain both control and ecological validity, collecting and analyzing data efficiently beyond pure behavioral measures (e.g., think-aloud text data; Ericsson & Simon, 1984), and characterizing the complex interactions between reasoning and other high-level cognitive functions, such as memory, theory of mind, and language.

Meta-reasoning: Deciding which game to play, which problem to solve, and when to quit

People are general-purpose problem solvers. We obtain food and shelter, manage companies, solve moral dilemmas, spend years toiling away at thorny math problems, and even take on arbitrary problems in the form of puzzles and games. The cognitive flexibility that allows us to represent and reason about such a wide range of problems, often cited as a distinguishing feature of human intelligence (Tomasello, 2022), presents us with an especially ubiquitous one: deciding which problem to solve. The meta-level problem of what problem to choose exists, in part, because people have limited problem-solving resources (Griffiths et al., 2020). While this challenge has been examined through various lenses across cognitive science, implicit in many of these perspectives is the notion of bounded rationality. Given our limited time and energy, how do we decide which problems are worthwhile and when we should quit to pursue something new?

Behavioral Network Science

Structure matters in cognitive science. Whether we are asking about memory retrieval, semantic representations, categorization, language acquisition, learning from complex information, aging, or creativity, cognitive scientists often find themselves forced to reckon with structure. Network science offers a quantitative approach for doing this by allowing us to ask questions about the relationships between various entities at scales ranging from dyads, to communities, to entire systems. In this case, the entities are the nodes in the network and the relationships are the edges between them. In practice, these applications are remarkably varied, aesthetically and intellectually beautiful, and deeply rewarding, allowing us to develop and test hypotheses about cognition that are not otherwise possible. As a metric ruler measures length, allowing us to compare human height with the Burj Khalifa, network science measures structure, allowing us to compare the structure of our environments with the structure of our cognitive representations, how those representations change across the lifespan, and how different processes interacting with those structures generate behavior.
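To make the nodes-and-edges framing concrete, here is a minimal sketch (not taken from the workshop; the word-association edge list is invented for illustration) that builds a toy behavioral network with the networkx library and computes two structural measures of the kind often compared across representations or age groups.

import networkx as nx

# Toy behavioral network: nodes are words, edges are hypothetical free-association links.
edges = [("dog", "cat"), ("dog", "bone"), ("cat", "mouse"),
         ("mouse", "cheese"), ("bone", "skeleton"), ("cat", "whiskers")]
G = nx.Graph(edges)

# Two structural measures commonly used to characterize cognitive networks.
print("average clustering:", nx.average_clustering(G))
print("average shortest path:", nx.average_shortest_path_length(G))

In actual studies the edges come from behavioral data (e.g., free-association or fluency tasks), and the same measures can then be compared against networks built from the learning environment.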

Workshop: Succeeding in the Start-up Ecosystem

This workshop aims to guide cognitive science PhD students on starting a company within the start-up ecosystem, with an emphasis on how to use venture capital (VC). It will cover essential concepts, stages, decision points, and skills needed to improve chances for success. The workshop will also compare VC funding phases with those in academic research to make the concepts and processes more accessible. This workshop will begin by helping students calibrate their motivations and timelines for creating a startup company or joining one. After a review of the typical startup lifecycle by a venture capitalist, three startup founders will explain their career trajectories and critical decision points. They will then introduce important decisions about product, market, software, and funding strategies. Students will then participate in breakout sessions in one of these four areas. After the breakout sessions, the workshop will conclude with open discussion with the presenters.

Building computational models of social cognition in memo

One of the most influential computational paradigms in modern cognitive science is the Bayesian modeling of social cognition. This paradigm models people's intuitions about other agents in terms of recursive probabilistic reasoning: agents are treated as approximately-rational decision-makers, who make Bayesian inferences about *other* agents' mental states from their observable behavior. Computational models designed in this tradition have been used in seminal work on theory-of-mind, language and communication, emotion understanding, and many other key areas of interest in cognitive science.
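As a rough illustration of the recursive-inference idea, the sketch below is written in plain Python with NumPy rather than in memo itself, and the goals, actions, and utilities are invented: an observer infers an agent's goal from one observed choice by assuming the agent is approximately rational (softmax over utilities) and applying Bayes' rule.

import numpy as np

goals = ["coffee", "tea"]
actions = ["go_to_cafe", "go_to_teahouse"]
# utility[g][a]: how good action a is if the agent wants goal g (hypothetical values).
utility = np.array([[2.0, 0.0],    # agent wants coffee
                    [0.0, 2.0]])   # agent wants tea

def policy(beta=2.0):
    """P(action | goal) for an approximately rational (softmax) agent."""
    z = np.exp(beta * utility)
    return z / z.sum(axis=1, keepdims=True)

def infer_goal(observed_action, prior=np.array([0.5, 0.5])):
    """Observer's Bayesian inference: P(goal | action) is proportional to P(action | goal) P(goal)."""
    likelihood = policy()[:, actions.index(observed_action)]
    posterior = likelihood * prior
    return posterior / posterior.sum()

print(dict(zip(goals, infer_goal("go_to_cafe"))))

memo is designed to express this kind of nested reasoning (agents modeling agents) declaratively; the sketch only conveys the underlying computation, not memo's own syntax.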

All you ever wanted to ask about Dynamic Field Theory

How does the mind emerge from the brain? A neural process theory of cognition is one of the central goals of cognitive science. Connectionism was developed in pursuit of that goal, leading to a range of proposed frameworks such as LISA and DORA (Doumas et al., 2022), SPAUN (Stewart & Eliasmith, 2012), and many others (Kotseruba & Tsotsos, 2020). The rise of neural AI reframes the question of how neural accounts may reach higher cognition (Smolensky et al., 2022; O'Reilly et al., 2022). But what is meant by a neural process theory? Which neural principles would form the basis for such a theory? And what is it a theory of? What properties of cognition must such a theory address? Dynamic Field Theory (DFT) gives specific answers to these questions, postulating that the dynamics of neural populations are the level of neural processing that most closely reflects the laws of cognition. It emphasizes the emergence of cognition from the sensory-motor domain, so that a theory of cognition must address both acting and thinking.

Symposia

Defending Science in 2025

Higher education and science are facing unprecedented challenges. What can we do as scientists? This symposium aims to highlight ongoing efforts to protect our research, institutions, and academic rights, provide concrete actions we can take, and collectively brainstorm a strategy going forward.

Cognitively Inspired Interpretability in Large Neural Networks

Large Language Models (LLMs) and Vision Language Models (VLMs) have become a dominant force in artificial intelligence and have already made a major impact on the cognitive sciences, but debate persists concerning the extent to which they possess emergent cognitive capacities. Investigation of these systems at the level of behavioral outputs has led to conflicting findings, and the question of how these outputs are generated (at a mechanistic or algorithmic level) remains open. Yet, the abilities they do exhibit behaviorally offer an unprecedented opportunity to answer longstanding questions about how neural networks could, even in principle, achieve abilities that are thought to require structured representations—such as syntactic, combinatorial, and variable-binding operations. In this symposium, we highlight a recent body of work that addresses this gap in understanding by investigating the internal mechanisms that support cognitive processing in LLMs and other large-scale neural networks. The symposium brings together researchers with backgrounds in both computer science and psychology, exploring ways in which mechanistic interpretability research and cognitive science can mutually inform one another.

Minds at School: Advancing cognitive science by measuring and modeling human learning in situ

Unlike in other animals that might reach full maturity within a few months or years, human cognitive development follows an unusually protracted timeline. In many contemporary societies, people might require several years of scaffolded learning opportunities to develop the full suite of cognitive skills and abilities they are expected to have as adults. This extended period of development reflects our species' unique capacity for cumulative cultural learning: humans have evolved specialized cognitive mechanisms that enable us to learn from, communicate with, and teach others across the lifespan (Csibra & Gergely, 2009; Tomasello, 2016).

Advancing The Cognitive Science of Online Political Discourse

The public sphere has shifted online, fundamentally altering the way political ideas are shared, debated, and contested. From topic-based chatrooms and social media threads to viral memes and videos, the internet has created new modes of political communication. To grasp how political thought is constructed in this digital landscape, we need to examine both the content of political messages and the structures that shape their impact. Studying this question is difficult not just because of the complexity of measuring and intervening on human political cognition in a noisy naturalistic setting but because of, for example, the proliferation of bots in online spaces and the frequency of insincere bad-faith posting and commentary. The research presented at this symposium advances our understanding of online political discourse by applying multiple methodologies, including statistical modeling and natural language processing of social media corpora, along with novel experiments on persuasion and how to counter misinformation.

Perception as a Foundation for Common-Sense Theories of the World

Humans have the remarkable ability to construct commonsense theories about the world through perception. From inferring the physical relationships between objects to understanding social interactions and predicting future events, perception plays a foundational role in shaping how we learn and reason. But how do systems for perception and higher level cognitive theories interact? Are perception and common sense knowledge distinct, such that perception simply provides information for other cognitive systems to act on (Firestone & Scholl, 2016)? Or are they interlinked in ways that allow common-sense theories to affect what we perceive, or represent the world for other commonsense domains (Levin, Baker, & Banaji, 2016)?

The Emergence of Social Adaptation: How Children (and other Primates) Learn and Apply Social Norms to Navigate the World More Efficiently

Successfully navigating the social world requires individuals to flexibly adapt their behavior to different situational demands. Yet, the social world is also governed by broad behavioral rules: norms that prescribe or proscribe behaviors in certain situations. This poses a particularly interesting problem for early development: learning the foundational structure of norms may require strict adherence, but effective social functioning involves flexibility. Here, we refer to the ability to flexibly navigate social norms as social adaptation.

I Know I Should: Normative Competence From Biology To AI

How do people understand what is normatively expected from them? And how does normative cognition motivate action? A distinct feature of human sociality is our capacity to sense the appropriate thing to say in a certain context or the action that should be done and to care enough to do it. Such competence with social norms stabilizes joint action between dyads and enables cooperation among thousands, but whether it is the product of domain-general learning mechanisms or dedicated cognitive ones is still unclear (Heyes, 2024). Moreover, as the many blunders made by contemporary AI systems like LLMs show, autonomous AI agents too need to acquire some level of normative competence to be reliable partners. However, whether this is possible with current architectures is still debated (Browning, 2024). The aim of this symposium is to bring together researchers to discuss the biological, cognitive, computational and interactive bases of normative motivation and judgment.

The role of language in human and machine intelligence

We use language to communicate our thoughts. But is language merely the expression of thoughts, which are themselves produced by other, nonlinguistic parts of our minds? Or does language play a more transformative role in human cognition, allowing us to have thoughts that we otherwise could (or would) not have? Recent developments in artificial intelligence and cognitive science have reinvigorated this old question. Could language hold the key to the emergence of both artificial intelligence and important aspects of human intelligence? The four contributions in this symposium address this question by drawing on behavioral and neural evidence from people, and the remarkable recent developments in AI which appear to show that artificial neural networks trained on language come to have an astonishing range of abilities. Despite the diversity of the speakers' perspectives, the four contributions paint a coherent (if complex) picture: The abilities of large language models (LLMs) serve as an existence proof of just how much can—in principle—be learned from language. LLMs also act as a stress test of cognitive theories. The evidence of neural dissociation between linguistic and conceptual processing points to the multiple realizability of human-like cognition. Finally, there is an acknowledged need for systematic research on how the successes and failures of LLMs inform our understanding of human cognition.

Naturalistic observation of language development outside the home

How do children learn to talk to others? Mastery of language means being able to communicate with a wide array of interlocutors (Schieffelin & Ochs, 1986). Yet researchers have tended to treat parent-child interaction as the paradigmatic site of language learning, neglecting how children learn to use language with other people in their lives. As a result, we not only lack basic facts about the full distribution of children's language input and use; we also miss precisely those contexts that call for complex conversational skills, such as adapting to novel interlocutors and maintaining conversations without parental scaffolding.

The cognitive science of caregiving

Caregiving is essential to human survival and flourishing, yet it has been largely overlooked across scientific disciplines, including economics, philosophy, politics, and, importantly, cognitive science. Caregiving remains poorly understood, in part, because it does not fit neatly within traditional frameworks of human cognition and behavior (Gopnik, 2023). Take, for instance, theories of morality and cooperation (Kleiman-Weiner, Saxe, & Tenenbaum, 2017; Powell, 2022). Unlike the principle of universalism, which emphasizes impartiality, caregiving involves prioritizing the needs of specific individuals over others (Gilligan, 1993). Caregiving also directly challenges the utilitarian principle of "the greatest good for the greatest number", since it involves actions that benefit others at significant personal cost. Furthermore, caregiving diverges from the principle of reciprocity, a cornerstone of human cooperation, because caregivers generally do not expect anything in return for their actions (Fiske, 1992).

New Perspectives in Computational Modeling of Human Attention

This symposium will present a set of four talks and a panel discussion that will together take the audience inside a scientific revolution that has been (largely quietly) unfolding in the field of attention: a set of recent computational modeling approaches that allow us to think about human attention in fundamentally new ways. In cognitive science, studies of attention stand out in at least two dimensions. First, and most bluntly, it is an outright confusing area to work in. "Attention" is a term ascribed to many sorts of mechanisms and phenomena. Case in point: there are at least three papers all published in 2024 presenting ongoing active (and, surprisingly, topically, largely non-overlapping) debates: Rosenholtz (2024), Theeuwes (2024), and Wu (2024). Second, attention stands out in the extent of the gap between the rich empirical phenomena integrated into conceptual theories and the available formal computational models, with the most influential models dating back at least a decade (e.g., Bruce & Tsotsos, 2005; Bundesen et al., 2015; Reynolds & Heeger, 2009; Wolfe, Cave, & Franzel, 1994; Dosher & Lu, 2000) rather than keeping up with the advances in experimental work. This 90-minute gathering will show how the field of attention has been radically changing along both dimensions: how models of attention have been carving new and productive ways of better drawing the contours of what attention is and enabling progress toward a more integrated research landscape of experiments and modeling.

Papers with Oral Presentation

Systematic Bias in Large Language Models: Discrepant Response Patterns in Binary vs. Continuous Judgment Tasks

Large Language Models (LLMs) are increasingly used in tasks such as psychological text analysis and decision-making in automated workflows. However, their reliability remains a concern due to potential biases inherited from their training process. In this study, we examine how different response formats—binary versus continuous—may systematically influence LLMs' judgments. In a value-statement judgment task and a text sentiment analysis task, we prompted LLMs to simulate human responses and tested both formats across several models, including both open-source and commercial models. Our findings revealed a consistent negative bias: LLMs were more likely to deliver "negative" judgments in binary formats compared to continuous ones. Control experiments further revealed that this pattern holds across both tasks. Our results highlight the importance of considering response format when applying LLMs to decision tasks, as small changes in task design can introduce systematic biases.

Thinking fast, slow, and everywhere in between in humans and language models

How do humans adapt how they reason to varying circumstances? Prior research has argued that reasoning comes in two types: a fast, intuitive type and a slow, deliberate type. Are these the only options, or can people adjust their reasoning continuously by trading off speed and accuracy within individual reasoning steps? We investigate this possibility in an experiment where participants were trained on relationships between local variables in a simple causal model, then asked to make predictions about all pairs of variables. Participants in one condition had a 5-second time limit. We found main effects of time pressure and locality, but only a small interaction in the direction opposite to our hypothesis. We present a process-level model of this phenomenon using early readouts from transformer language models. Our findings are consistent with people reasoning step by step, but accepting a higher error rate at each step under time pressure.

Using Goal-Incidental Attributes to Assess the Relationship Between Selective Attention and Attribute Centrality

As we interact with the world, we learn how to allocate our attention effectively to achieve our goals. In this study, participants' eye movements were tracked as they engaged in a same-different task with novel stimuli. Participants then completed a goal-directed task using physical copies of the items. In the task, the goal-relevance of the attributes varied. This was followed by a second block of the same-different task. We analyzed how attention to the goal-relevant, goal-incidental, and goal-irrelevant attributes of the items shifted from the initial block of the same-different task to the second. The gaze patterns clearly distinguished between the goal-relevant and goal-irrelevant attributes. However, there was no distinction between the goal-relevant and goal-incidental attributes. We discuss the implications for understanding how goal-directed interactions shape attentional allocation and how this relates to what people learn about the items.

Soft production preferences emerge from a bottleneck on memory

Soft production preferences are a key feature of incremental language production, yet they lack a well-defined unified explanatory theory. Here, we propose an information-theoretic theory of availability effects grounded in the notion of lossy-context working memory, which takes the form of a cost function that can be applied to any computational-level model of language production. We show that production policies that minimize this cost function naturally give rise to key soft preferences observed in empirical data, including frequency bias, heavy-NP shift, and agreement attraction. We then show a novel prediction made by the model regarding the entropy of arguments' thematic roles, and show that this effect holds in corpus data.
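A schematic rendering of the kind of cost function described, in our own notation rather than necessarily the authors': if an utterance $w_1 \dots w_n$ is produced incrementally and $m(\cdot)$ denotes a lossy, capacity-limited memory encoding of the material produced so far, the memory cost of a production can be written as the summed expected surprisal of each upcoming word given that lossy encoding,

$$C(w_{1:n}) = \sum_{i=1}^{n} \mathbb{E}\left[-\log P\big(w_i \mid m(w_{1:i-1})\big)\right],$$

so production policies that minimize such a cost prefer orderings and constructions whose upcoming material stays predictable from a degraded context; this is the general sense in which soft preferences like heavy-NP shift could emerge from a memory bottleneck.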

The role of contrast in category learning

Word meanings are contrastive. When we are told that something is a square, we are also told that it is not a triangle. However, words may be learned in different contrasts. One person might learn about squares in contrast with circles. Does this mean that the two have different representations of "square?" To answer this question, participants learned to label novel shapes with novel labels in a category learning task. Critically, we manipulated the contrast participants received during learning: an A-shape is specifically not a B or an A-shape is specifically not a D. Afterwards, we tested participants' knowledge of the learned categories using explicit categorization tasks and similarity judgments. Contrast during learning mattered. Shapes from contrasted categories were categorized more accurately, were less confusable and rated as less similar.

Longitudinal stability of the effect of depression on social decision-making

Depression affects everyday decision-making, yet it remains unclear if such effects depend on the decision context or fluctuate over time. In this repeated-measures study, online participants completed a social exchange task (ultimatum game) and a non-social reversal learning task at baseline (n=236) and 1-month later (n=131). Mood symptoms were assessed using the Beck Depression Inventory and the Positive Valence System Scale. Psychiatric symptoms were stable over time, and dropout was unrelated to symptom severity. Mixed-effects regression revealed consistent behavioral effects of depressive symptoms—while controlling for anhedonia—across time points. Specifically, greater depressive severity predicted slower reaction times and reduced acceptance of unfair offers in the ultimatum game. An interaction between depressive and anhedonic symptoms on mood ratings emerged at baseline but did not replicate at follow-up. There were no consistent significant effects of depression on the non-social reversal learning task across time points. These findings highlight the longitudinal stability of depressive symptoms on social decision-making.

AIPsychoBench: Understanding the Psychometric Differences between LLMs and Humans

Large Language Models (LLMs) with hundreds of billions of parameters have exhibited human-like intelligence by learning from vast amounts of internet-scale data. However, the uninterpretability of large-scale neural networks raises concerns about the reliability of LLMs. Studies have attempted to assess the psychometric properties of LLMs by borrowing concepts from human psychology to enhance their interpretability, but they fail to account for the fundamental differences between LLMs and humans. This results in high rejection rates when human scales are reused directly. Furthermore, these scales do not support measuring how LLMs' psychological properties vary across languages. This paper introduces AIPsychoBench, a specialized benchmark tailored to assess the psychological properties of LLMs. It uses a lightweight role-playing prompt to bypass LLM alignment, improving the average effective response rate from 70.12% to 90.40%. Meanwhile, the average biases are only 3.3% (positive) and 2.1% (negative), which are significantly lower than the biases of 9.8% and 6.9%, respectively, caused by traditional jailbreak prompts. Furthermore, among the total of 112 psychometric subcategories, the score deviations for seven languages compared to English ranged from 5% to 20.2% in 43 subcategories, providing the first comprehensive evidence of the linguistic impact on the psychometrics of LLMs.

Enhancing Cognitive Game Tracing via Diverse Information and Time-aware Modeling

With the surge in cognitive gaming data, understanding players' learning patterns and cognitive growth has become increasingly important. These data offer valuable opportunities to study individual cognitive development during learning. However, the diversity of player profiles and the complexity of gaming tasks pose significant challenges for accurate skill prediction. Specifically, the heterogeneity of player profiles leads to diverse and complex learning trajectories; data sparsity and temporal dynamics further exacerbate these challenges. To address these challenges, we propose the MFCGT (Multi-feature Forget Cognitive Game Tracing) model. First, we perform multi-feature selection to extract key features from player behavior data to reduce noise and improve prediction accuracy. Second, we introduce a time-aware decay mechanism that simulates skill degradation using an exponential decay function, ensuring the model captures the impact of forgetting on learning trajectories. Finally, we incorporate an attention mechanism to dynamically identify the most relevant historical performance for the current task, thereby enhancing the model's predictive capability. The experimental results show that MFCGT significantly outperforms traditional models in skill prediction tasks. Additionally, MFCGT effectively captures players' learning dynamics and forgetting effects, providing more accurate learning predictions.
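As a schematic illustration of the two mechanisms named above (this is not the authors' code, and all numbers are invented), the sketch below decays past performance evidence exponentially with elapsed time and weights the history with a simple softmax attention score over task relevance.

import numpy as np

times = np.array([0.0, 2.0, 5.0, 9.0])      # when each past attempt happened
scores = np.array([0.4, 0.7, 0.6, 0.9])     # performance on each attempt
relevance = np.array([0.2, 1.0, 0.5, 1.5])  # hypothetical similarity of each attempt to the current task
now, decay_rate = 10.0, 0.3

decay = np.exp(-decay_rate * (now - times))               # forgetting: older evidence counts less
attention = np.exp(relevance) / np.exp(relevance).sum()   # attend to the most relevant history
skill_estimate = np.sum(decay * attention * scores) / np.sum(decay * attention)
print(round(skill_estimate, 3))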

Meaning adaptation in the discourse dynamics of imprecision

Speakers often communicate imprecisely, using expressions that are strictly speaking false yet felicitous. The degree to which imprecision is tolerated in discourse is governed by the Standard of Precision (SoP). While it is known that contextual factors can modulate the SoP, less is known about the discourse dynamics of imprecision. Previous accounts have claimed that implicit negotiations of the SoP are unidirectional: they only work upward, not downward. Here, we investigated whether parallel asymmetries arise when comprehenders adapt their SoPs in response to an interlocutor's precision preferences. Results show bidirectional SoP adaptation effects: exposure to lower standards increased tolerance for imprecision, while exposure to higher standards reinforced stricter thresholds. These updates persisted beyond the dialogue, suggesting that exposure to (im)precise speakers modulates not only interpretations within a discourse, but also beyond the conversation. More broadly, our study provides a novel framework for studying real-time dynamic meaning negotiations during conversation.

Relational Information Predicts Human Behavior and Neural Responses to Complex Social Scenes

Understanding social scenes depends on tracking relational visual information, which is prioritized behaviorally and represented in the superior temporal sulcus (STS), a region involved in processing social scenes. Despite its importance, relational information has been underutilized in computational models of social vision. In this study, we evaluate two neural network models—SocialGNN and RNN Edge—that explicitly incorporate relational cues, and compare their performance to state-of-the-art (SOTA) AI vision models. SocialGNN utilizes a graph neural network to organize input information about each video frame into a graph structure with nodes representing faces and key objects, and edges encoding relational information such as gaze direction and physical contact. RNN Edge is an even simpler model that processes only relational information without node features or graph-based structures. These models were tested on behavioral and neural data from 3-second natural videos of two people engaged in everyday activities, as well as on the PHASE dataset, a collection of 2D animations depicting agent-object interactions inspired by Heider and Simmel. Across both datasets, SocialGNN and RNN Edge achieved strong performance in predicting human behavioral ratings of social interactions and were comparable to SOTA AI models in behavioral encoding tasks, despite being trained on significantly less data and with simpler architectures. Notably, the success of RNN Edge suggests that additional visual features and the graph-based framework of SocialGNN do not significantly enhance performance, underscoring the primacy of gaze and physical contact as essential relational cues. These findings emphasize the importance of integrating relational information into computational models to develop better models of social perception and human-aligned AI.

The scope of generic generalizations: developmental changes in judgments about contextually-restricted generics.

Generalizations are powerful tools that agents rely on to predict and control their environments. However, some generalizations are restricted to "sociocultural bubbles" (e.g. "women have trouble getting tenure in math"). How are such patterns communicated? We report one interdisciplinary study — bridging philosophy, linguistics, and cognitive developmental psychology — which examined the developing capacity for contextual restriction of generics in 4-7-year-olds and adults (N=200). We provided context cues signaling that the speaker used a generic generalization to convey a broad vs. contextually-restricted regularity, and measured endorsement of generics attributing properties prevalent globally vs. within "bubbles". Adults endorsed generics flexibly, tracking context cues, but younger children struggled, over-attributing socially-contingent properties to the group beyond the "bubble", on par with context-general regularities. This reveals a troubling discrepancy between children's and adults' interpretations of generics, opening the door for miscommunication. We discuss strategies to mitigate this in educational and family communication settings.

Are Expressions for Music Emotions the Same Across Cultures?

Music evokes profound emotions, yet the universality of emotional descriptors across languages remains debated. A key challenge in cross-cultural research on music emotion is biased stimulus selection and manual curation of taxonomies, predominantly relying on Western music and languages. To address this, we propose a balanced experimental design with nine online experiments in Brazil, the US, and South Korea, involving N=672 participants. First, we sample a balanced set of popular music from these countries. Using an open-ended tagging pipeline, we then gather emotion terms to create culture-specific taxonomies. Finally, using these bottom-up taxonomies, participants rate emotions of each song. This allows us to map emotional similarities within and across cultures. Results show consistency in high arousal, high valence emotions but greater variability in others. Notably, machine translations were often inadequate to capture music-specific meanings. These findings together highlight the need for a domain-sensitive, open-ended, bottom-up emotion elicitation approach to reduce cultural biases in emotion research.

Initiation Asymmetry in the Ontogenesis of Social Routines: In Conversation, Caregivers Scaffold 1-Year Olds to Respond, but 2-year Olds Initiate

Social routine words – e.g. yes, no, hi, bye, okay, and thank you – are among the first words children learn across different languages and they help constitute a foundational set of actions for conducting social interactions. Despite this, we know little about how these words are acquired. In this paper we begin by showing that social routine words are systematically acquired earlier than statistical models of word acquisition predict. Furthermore, we argue this gap is due to a selective focus of word learning science on reference – how words are mapped onto concepts and events – a relationship which is absent for social routine words. Rather than looking at the properties of words per se, we address this gap by instead looking at how children become conversational partners. En route to becoming a conversational partner, a child must orient themselves to others' expectations about the position & composition of their turns relative to their social partner's. We hypothesize that second pair parts (turns which respond, e.g. agreeing, acknowledging, reciprocating a greeting, etc.) afford more scaffolding from caregivers than first pair parts, and therefore words used to compose such responses are more learnable. To support this hypothesis, we sampled 1,442 conversational turns from 5 mother-child dyads (12mo-28mo) and manually labeled the position of these turns within adjacency pairs. We found that 12-month-olds' talk is made mainly in response to caregivers who initiate adjacency pairs, but this initiation asymmetry in conversation disappears by 28 months. Because social routine words are either stereotypically or frequently used to compose second position turns, this pattern of initiation asymmetry could explain their early acquisition. More generally, our observation likely reflects a transition from scaffolding to children's active learning.

Probing and Inducing Combinational Creativity in Vision-Language Models

The ability to combine existing concepts into novel ideas stands as a fundamental hallmark of human intelligence. Recent advances in Vision-Language Models (VLMs) like GPT-4V and DALLE-3 have sparked debate about whether their outputs reflect combinational creativity—defined by M. A. Boden (1998) as synthesizing novel ideas through combining existing concepts—or sophisticated pattern matching of training data. Drawing inspiration from cognitive science, we investigate the combinational creativity of VLMs from the lens of concept blending. We propose the Identification-Explanation-Implication (IEI) framework, which decomposes creative processes into three levels: identifying input spaces, extracting shared attributes, and deriving novel semantic implications. To validate this framework, we curate CreativeMashup, a high-quality dataset of 666 artist-generated visual mashups annotated according to the IEI framework. Through extensive experiments, we demonstrate that in comprehension tasks, the best VLMs have surpassed average human performance while falling short of expert-level understanding; in generation tasks, incorporating our IEI framework into the generation pipeline significantly enhances the creative quality of VLMs' outputs. Our findings establish both a theoretical foundation for evaluating artificial creativity and practical guidelines for improving creative generation in VLMs.

Staring Down the Elevator Shaft: Postural Responses to Virtual Heights in an Indoor Environment

Postural control strategies for upright stance adapt when balance is threatened. We investigate behavioral indicators for control strategy change in Virtual Reality (VR). Previous VR research has shown increased postural sway during virtual height exposure, but most studies focus on outdoor-like environments with extensive visual cues that may influence balance. In contrast to these outdoor studies, our indoor VR results indicate that virtual height exposure increases the mean power frequency (MPF) of sway while reducing anterior-posterior (AP) sway range. We also find an anterior shift of the Center of Pressure (CoP) when there are vertical drops both on the front and back. These findings suggest a strong context-dependence of the strategy humans employ to counteract perceived threat and heightened neuromotor control for balance stabilization.

Alignment of Representational Complexity as a Latent Control Parameter in Referential Communication

Despite large inter-individual differences in experience and conceptual structures, humans converge on referents largely underspecified by the signals exchanged during communication. Many neural networks have become extremely sensitive to context-dependent relationships between signals, but remain relatively blind to their referents outside the signal space. Here, we study how human interlocutors dynamically coordinate both their signal and referential spaces over extended communicative interactions. We identify a latent control parameter, representational complexity, that may regulate referential coordination in human communication. Using a custom hierarchical Transformer model, we generate movement- and interaction-level embeddings from neurotypical (NT) and autistic (ASC) dyads engaged in an experimental semiotic task. By leveraging ASC-related communicative variance, we identify changes in parameters that track the representational dimensionality of signals and referents as communication unfolds. Movement-level embeddings (within-trial dependencies) could not differentiate the two groups, indicating comparable communicative behaviors. In contrast, interaction-level embeddings (across-trial dependencies) distinguished ASC from NT dyads with high accuracy. Crucially, representational complexity, i.e., dyadic alignment in the interaction-level intrinsic dimensionality used to encode communicative histories, tracked referential coordination demands, with greater misalignment in ASC dyads under referential volatility. These findings suggest that referential alignment is an interaction-level process, driven by the dynamic adaptation of representational complexity, rather than statistical relationships between signals alone.

Emotion influences behavioral outcomes and attention during goal-directed reading

Recent studies on the interaction between emotion and reading comprehension provide a murky picture with contradictory claims. Here, we offer one explanation for this phenomenon. By utilizing an advanced eye-movement analysis method, EMHMM with co-clustering, we discovered two representative attention patterns from eye fixations: an expanded, globalized attention pattern and a more focused, localized attention pattern. The subsequent analysis shows that these two attention patterns were differentially associated with comprehension accuracy in questions that required either globalized (summative questions) or localized (detailed questions) attentional needs. Moreover, the emotional state influenced the use of the two attentional strategies, as well as reading performance, measured as accuracy and reading time. Our findings demonstrate how emotion may have facilitated or interfered with cognitive processing during reading comprehension that requires different attentional needs, which provides valuable insight into the intertwining relationship between emotions and other cognitive functions.

DynamicRL: Data-Driven Estimation of Trial-by-Trial Reinforcement Learning Parameters

In uncertain and dynamic environments, biological agents must adapt their decision-making strategies to maximize rewards. Traditional reinforcement learning (RL) models typically assume that such adaptation is governed by dynamic value updates controlled by fixed parameters or predefined schedules. However, these assumptions limit the models' ability to capture the flexible and context-sensitive nature of biological decision-making. To overcome this limitation, we introduce DynamicRL, a novel framework that estimates RL parameters from behavioral data on a trial-by-trial basis. We demonstrate that DynamicRL substantially improves the predictive performance of standard RL models across eight decision-making tasks, thereby reducing scientific regret. DynamicRL captures the rich temporal variability inherent in decision-making behavior, achieving predictive performance comparable to that of recurrent neural networks trained directly on the data, while preserving the interpretability and theoretical grounding of RL models. Moreover, it enables the examination of how agents dynamically adjust RL parameters in response to environmental changes, offering insights into the cognitive mechanisms underlying such adaptations. Thus, DynamicRL serves as an efficient data-driven framework for estimating RL parameters, facilitating fine-grained behavioral analysis with potential applications in computational psychiatry and neuroscience.
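A minimal sketch of the modeling idea, not the DynamicRL implementation (the bandit environment and the per-trial parameter trajectories are invented): a standard two-armed-bandit Q-learning agent in which the learning rate and inverse temperature are allowed to take a different value on every trial rather than being fixed.

import numpy as np

rng = np.random.default_rng(0)
Q = np.zeros(2)                      # value estimates for two actions
alphas = np.linspace(0.5, 0.1, 100)  # hypothetical per-trial learning rates
betas = np.linspace(1.0, 5.0, 100)   # hypothetical per-trial inverse temperatures
reward_prob = np.array([0.3, 0.7])   # reward probability of each arm

for t in range(100):
    p = np.exp(betas[t] * Q) / np.exp(betas[t] * Q).sum()  # softmax choice rule
    a = rng.choice(2, p=p)
    r = float(rng.random() < reward_prob[a])
    Q[a] += alphas[t] * (r - Q[a])                          # prediction-error update

print(Q)

DynamicRL works in the opposite direction: given observed choices and rewards, it estimates what the per-trial parameter values must have been, which is what lets it track adaptation over time.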

Raising Eyebrows and Raising Pitch: How Non-Verbal Uncertainty Cues Influence Assessments of Probability Phrases

In this pre-registered study, we examined how non-verbal uncertainty cues, namely intonation and facial expressions, influence perceived speaker certainty and the interpretation of verbal probability phrases (e.g., "almost never," "probably," "certain"). Prior research on such phrases has focused on written cues, whereas communication often includes auditory and visual signals. Using a 2x2 within-subjects experimental design (N=100), we found that rising intonation and marked facial expressions independently reduced perceived speaker certainty. In most, but not all conditions, these cues led to greater variability in how participants assigned numerical values to verbal probability expressions (e.g., interpreting "likely" as anywhere from 10% to 85% likely). Notably, the combination of rising intonation and marked facial expressions produced the lowest perceived certainty, while there was no such additive effect on interpretation variability. These results highlight the importance of non-verbal cues in uncertainty communication, with implications for fields such as health, environmental, and technological risk communication.

Speak Last and Step-by-Step: The Effect of Order and Response Mode on Evidence Evaluation

Previous research on order effects in legal decision-making has produced mixed results, possibly due to different response modes adopted in the tasks: End-of-Sequence (EoS) or Step-by-Step (SbS), which might reflect different cognitive models that fact-finders employ during evidence evaluation and integration. This paper investigates how response mode interacts with evidence order to influence judgments of the probability of guilt and verdict. In Study 1 (N = 159), no order effects were found in the EoS condition, but a recency effect emerged in the SbS condition. Study 2 (N = 95) revealed no order effect when the first set of evidence was judged SbS and the second set EoS. We also found that participants' probability of guilt judgments were generally consistent with Bayesian predictions when they responded SbS, but not when they responded EoS. We discuss potential explanations for these findings and their implications for legal decision-making.

Framing Perception: Exploring Camera Induced Objectification in Cinema

This study investigates how cinematographic techniques influence viewer perception and contribute to the objectification of women, utilizing eye-tracking data from 91 participants. They watched a sexualized music video (SV) known for objectifying portrayals and a non-sexualized music video (TV). Using dynamic Areas of Interest (AOIs)—head, torso, and lower body—gaze metrics such as fixation duration, visit count, and scan paths were recorded to assess visual attention patterns. Participants were grouped according to their average fixations on sexualized AOIs. Statistical analyses revealed significant differences in gaze behavior between the videos and among the groups, with increased attention to sexualized AOIs in SV. Additionally, data-driven group differences in fixations identified specific segments with heightened objectification that are further analyzed using scan path visualization techniques. These findings provide strong empirical evidence of camera-driven gaze objectification, demonstrating how cinematic framing implicitly shapes objectifying gaze patterns, highlighting the critical need for mindful media representation.

Modeling intrinsic motivation as reflective planning

Why do people seek to improve themselves? One explanation is that improvement is intrinsically rewarding. This can be formalized in reinforcement learning models by augmenting the reward function with intrinsic rewards (e.g., internally-generated improvement signals). In this paper, we develop an alternative explanation: the drive for improvement arises from planning in a state space that includes internal states (e.g., competence). Planning is therefore reflective in the sense that it considers the value of future internal states (e.g., "What could I accomplish in the future if I improve my competence?"). We formalize this idea as a sequential decision problem which we dub the reflective Markov Decision Process. The model captures qualitative patterns of skill development better than a range of alternative models that lack some of its components. Importantly, it explains these patterns without appealing to intrinsic rewards.
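A toy rendering of the reflective-planning idea (our construction for illustration, not the paper's model): the state is the agent's competence level, "practice" costs effort but raises competence, "perform" yields reward that grows with competence, and ordinary value iteration with no intrinsic-reward term already favors practicing early, because higher future competence unlocks more task reward.

import numpy as np

levels, gamma = 5, 0.9
V = np.zeros(levels)
for _ in range(200):                                          # value iteration
    new_V = np.empty(levels)
    for c in range(levels):
        perform = c + gamma * V[c]                            # task reward grows with competence
        practice = -0.5 + gamma * V[min(c + 1, levels - 1)]   # effort cost now, competence +1 later
        new_V[c] = max(perform, practice)
    V = new_V

policy = ["practice" if (-0.5 + gamma * V[min(c + 1, levels - 1)]) > (c + gamma * V[c])
          else "perform" for c in range(levels)]
print(policy)   # the agent practices at low competence and performs once competent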

The effect of gender on multimodal child-directed language: Evidence from analyses of broadcast programmes

We investigated gender differences in multimodal communication directed to children and adults. Eighty-two broadcasters (46 female and 36 male) hosted both adult-directed and child-directed broadcasting programmes, and their lexical/syntactic features, prosody, and gestures were compared. Results revealed that broadcasters adapted their communication styles when addressing children. However, notable gender differences emerged: male broadcasters exhibited less diverse vocabulary, longer utterances, lower pitch but higher intensity, faster speaking rate with more pauses, and fewer referential gestures than their female counterparts. Furthermore, male broadcasters demonstrated larger adjustments in word frequency and vocal intensity but smaller adjustments in the use of questions and gestures than females. These findings highlight distinct patterns in how men and women adapt multimodal communication to children, offering valuable insights into gendered strategies in child-directed language production and recipient design. Moreover, these findings offer implications for developing tailored broadcast training.

Navigating Family Ties: Young Children's Cognitive Representations of the Family Network

Family is often central to an individual's early life. However, past work suggests mixed evidence as to whether young children can represent family relationships, showing even the words used to represent these relationships—like grandmother—are hard for young children to learn and define. The current study investigates whether 4-to-5-year-old children (N=64) recognize relationships in their families, testing the hypothesis that children can recognize intimate relationships in their environments. Children expected moms to seek out comfort from maternal but not paternal grandparents and expected dads to seek out comfort from paternal but not maternal grandparents. Children did not share those expectations for general information-seeking, and instead expected their parents to seek information from the relative with the relevant skill even when that grandparent was not socially close to the parent. These results suggest that from a young age, humans have the capacity to recognize relationships within their earliest social network—the family.

When Bayesians take over: A computational model of parental intervention

When children encounter challenges, parents often wonder: Should I let my child figure it out or take over? How parents resolve this dilemma shapes key developmental outcomes, yet we know little about the cognitive mechanisms that drive these decisions. Here, we model parental "take over" decisions as a Bayesian solution to a Partially Observable Markov Decision Process (POMDP) and qualitatively compare model predictions with behavioral data from parent-child interactions. We find that two core beliefs guide intervention: the child's probability of success and the utility of the task. Parents are more likely to take over when they believe their child is less skilled and the task is harder, and more likely to step back when they expect the rewards of independent effort to outweigh the costs. The model captures how these beliefs interact to shape decision-making and, together with the empirical data, reveals the cognitive computations that underlie parental intervention.
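A stripped-down caricature of the decision the model formalizes (a one-step expected-utility comparison rather than the full POMDP, with invented payoffs): the parent takes over when their belief about the child's chance of success is low and the task's stakes are high, and steps back when the expected value of independent effort outweighs the cost of possible failure.

def parent_decision(p_child_success, task_utility,
                    learning_bonus=0.3, parent_success=0.95):
    # Let the child try: chance of success on the task plus the value of independent effort.
    let_child = p_child_success * task_utility + learning_bonus
    # Take over: near-certain task success, but no learning benefit for the child.
    take_over = parent_success * task_utility
    return "take over" if take_over > let_child else "let child try"

print(parent_decision(p_child_success=0.2, task_utility=1.0))  # hard task, low perceived skill
print(parent_decision(p_child_success=0.8, task_utility=1.0))  # capable child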

Behavioral Characteristics of Learning Phases: How Individual Differences Shape Learning Trajectories in a Virtual Environment

Procedural learning occurs in three phases—cognitive, associative, and autonomous—enabling skill acquisition across domains like medicine and sports. However, learning efficiency varies due to individual differences. While factors like cognitive abilities and learning environments influence this variability, their effects across learning phases remain understudied, particularly in Virtual Reality (VR). This study examines how cognitive abilities (memory span, mental rotation) and VR-related factors (familiarity, cybersickness) impact performance in a 3D assembly task within an immersive VR environment. Results reveal that lower VR familiarity prolongs task completion in early phases, highlighting interaction-related challenges. Higher mental rotation ability enhances performance in the autonomous phase, whereas cybersickness hinders efficiency. These findings suggest that adapting VR-based learning scenarios to individual profiles—such as early guidance for VR novices and phase-specific challenge adjustments—could optimize learning outcomes. Additionally, considering cybersickness effects in advanced phases supports the use of distributed learning approaches to mitigate discomfort.

Mapping Acoustic Cues to Pragmatic Functions: Perceptual Cue Weighting of Prosodic Focus in Mandarin

Understanding how multiple acoustic dimensions are mapped onto linguistic representations is important in speech perception. This study explores how native Mandarin listeners process the communicative intentions of prosodic focus by examining the perceptual weightings of F0, duration, and intensity. Using a Visual World Paradigm, thirty native Mandarin participants listened to re-synthesized audio stimuli and responded to broad-focus or narrow-focus options. Results showed that the acoustic cues significantly influenced focus interpretation, with a greater reliance on F0 than intensity and duration. Eye-tracking data revealed perceptual divergence in the F0 condition, with the divergence of looks occurring at an earlier time window for acoustic processing and later for pragmatic processing. These findings suggest that native listeners effectively map acoustic variations to communicative demands, emphasizing the critical role of F0. The study highlights the temporal dynamics of interpreting prosodic focus, offering insights into language comprehension.

Reading instruction and individual differences in a computational model of Chinese character reading

Adopting effective reading instruction is vital for educators and novice readers. In modern Chinese, approximately 80% of the characters are phonetic-semantic compounds. Among fluency, working memory, phonological, orthographic, and morphological training, orthographic knowledge training is one of the most effective methods in literacy development. However, the comparative effectiveness of orthographic knowledge training within phonics-based versus meaning-based instruction has received limited attention. Such comparisons have been shown to be vital for understanding effective reading instruction and individual differences in English reading. By developing a series of triangle models of Chinese character reading, this study aimed to investigate the influence of instructional methods on individual differences in reading. Specifically, the models were trained with sound-focused, meaning-focused, or even (mixed) instructional schemes. We employed semantic reliance (SR), which measures the relative reliance on orthography-to-phonology and orthography-to-semantics pathways, to assess individual reading differences across various training conditions. The simulation results demonstrated that SR scores varied across instructional methods, with the highest scores observed in the meaning-focused condition, followed by the even condition, and then the sound-focused condition. Furthermore, across all instructional conditions, the orthography-to-phonology pathway played a greater role in the reading-aloud task. These simulation results align with findings from studies of English reading. While the models successfully captured a range of typical reading effects in Chinese reading-aloud tasks, the presence of radical consistency effects also depended on various instructional methods.

A Normative Model of Delay Discounting Across the Lifespan: Tradeoffs Between Mortality, Fertility, and Parenting

The developmental trajectory of delay discounting has been debated for over 30 years, with empirical findings often inconsistent. The controversy stems partly from the absence of formal models to guide empirical investigation and theoretical construction. We proposed a normative model of delay discounting across the lifespan, building on the work of Sozou and Seymour (2003), which emphasizes the tradeoff between mortality, fertility, and parenting in determining delay discounting. We simulated the model across varying parameter spaces and identified a U-shaped association between age and delay discounting, with stronger parental investment postponing the turning point of the function and reducing the overall discount rate. Empirical data supported these predictions, demonstrating a U-shaped age effect and highlighting the role of parental care motivation in explaining cultural and age-related differences in delay discounting. The results suggest that variance in delay discounting can be understood as a rational adaptation to life history tradeoffs.
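The hazard-based logic that such models build on can be stated compactly (our summary of the standard Sozou-style derivation, not the paper's full specification): if a delayed reward is only collected when the organism survives the delay, a payoff at delay $t$ is worth

$$D(t) = \mathbb{E}_{\lambda}\!\left[e^{-\lambda t}\right],$$

the survival probability averaged over uncertainty about the hazard rate $\lambda$; with an exponential prior of mean $k$ this yields the hyperbolic form $D(t) = 1/(1 + kt)$. A lifespan version, following Sozou and Seymour (2003), additionally lets the effective hazard and the value of a delayed reward change with age through mortality, fertility, and parenting terms, which is how the model can produce the U-shaped age trajectory explored here.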

The Effect of Moral and Statistical Norm Violations in Children's Counterfactual Reasoning

Counterfactual reasoning involves thinking about how reality could have been different. Adults show remarkable consistency in the counterfactual possibilities they imagine. For example, they tend to imagine counterfactuals that undo immoral actions. However, it remains unclear whether this link between morality and the counterfactual imagination is an inherent cognitive feature, present from early childhood. To elucidate this relation, we tested 191 4- to 11-year-olds across two studies. In Study 1, children heard stories in which a moral norm-violating and a moral norm-conforming character together bring about a negative outcome. When asked what could have happened differently, children increasingly preferred to change the moral norm-violating part of reality over other parts as they got older. In Study 2, we examined whether this effect is unique to moral norm violations or extends to statistical norm violations. Children began mutating moral norm violations earlier than statistical norm violations, suggesting morality influences counterfactual thinking earlier than statistical norms.

Potentially therapeutic effects of telling and retelling meaningful life stories

We investigate how autobiographical stories of meaningful life events differ from other stories, and how the emotional dynamics of these stories shift when retold by the same narrators. Participants were initially asked to write down a personally meaningful memory. Later, they were invited to retell the same memory. As a baseline, we analyzed a previously collected set of stories that emphasized sentiment (sad vs. happy) but not meaningfulness. To examine emotional patterns, we utilized self-reported emotions, external ratings on core emotions, and sentiment analysis of stories, capturing both overall sentiment and emotional shifts within each story. Our findings showed that meaningful memories tend to be more positive than baseline stories, with a notable increase in positive emotions toward the end of the stories. However, in retelling, both positive and negative core emotions decreased. We suggest that the telling and retelling of a meaningful memory have therapeutic effects that emphasize positive sentiments while dampening emotional intensity, allowing for a reframing of the memory.

How Helpful is Visual Context for Speech Processing? Evidence from Multi-modal Speech Tracking in Monolingual and Bilingual Speakers

Visual cues like facial expressions and gestures enhance speech comprehension. While prior studies have explored L1 multimodal speech processing, research on bilinguals remains limited. Here we examine how visual context influences speech processing in monolinguals and bilinguals by assessing changes in sensitivity to different speech dimensions. EEG was recorded from 24 monolinguals and 24 bilinguals as they viewed multimodal speech clips presented in audiovisual or audio-only formats. A temporal response function was then applied to decode neural responses to the audio envelope and to surprisal, indexing sensitivity to acoustic and semantic information, respectively. Results show that visual context facilitated audio tracking in bilinguals but did not enhance surprisal tracking. Conversely, monolinguals benefited from visual input for surprisal tracking but not envelope tracking. These results suggest that bilinguals may allocate more cognitive resources to audio processing when integrating visual cues, potentially limiting the availability of resources for higher-level semantic processing.

Computational Modeling of Tonal Encoding in Disyllabic Mandarin Word Production

Approximately half of the world's languages are tonal, and how lexical tone is encoded in spoken word production is still unclear. In Mandarin word production, there are two contrasting views regarding the mechanisms of tonal encoding. The two-stage model assumes that the lexical tone is selected first at the early stage of production and then integrated with the atonal syllable at the later stage, while other researchers have proposed that the lexical tone is retrieved only at the later stage of production. In this study, we performed computational simulations on disyllabic words to uncover the mechanisms underlying the facilitation and interference effects on naming latencies observed in previous primed picture-naming studies, with the aim of adjudicating between these two theoretical accounts of tonal encoding in Mandarin spoken word production. The results supported the two-stage model of tonal encoding in disyllabic Mandarin word production. Increased inhibition between atonal syllables and decreased activation between the tonal frame and the syllable motor program, implying slower tone-to-syllable integration, appear to be the prerequisite for generating the interference effect of tonal overlap without shared syllabic information.

Comparisons promote a tradeoff between generalization and individuation

Qualitative comparison provides learning opportunities that support a range of cognitive abilities. The present research clarifies how the contrasting goals of qualitative comparison, processing similarities versus differences, contribute to the overall impact that comparison has on learning. We argue that attending to similarities produces memory traces that are dissociable from those left by attending to differences, and that these memory traces reflect a tradeoff between generalization and individuation. Across two preregistered experiments, participants attended to either similarities or differences between stimuli. We show that whereas attending to similarities yielded better performance on a category recognition task (Experiment 1), attending to differences yielded better performance on an old/new recognition task (Experiment 2). Comparison thus serves as a learning opportunity that enables agents to manage the generalization-individuation tradeoff by shifting their attention toward similarities or differences.

Decoding Metaphors and Brain Signals in Naturalistic Contexts: An Empirical Study based on EEG and MetaPro

Metaphors are seen as psycholinguistic phenomena that reveal human cognition. However, their neural basis in naturalistic contexts remains underexplored, although it offers insights into how metaphors shape everyday cognition at the neural level. In this study, we examine how metaphors are reflected in brain activity using electroencephalography (EEG). We analyze EEG data collected during naturalistic reading conditions, where participants read texts without explicit cues indicating the presence of metaphors. Using MetaPro, an advanced metaphor processing tool, we aim to identify the neural signatures of metaphor perception in real-world contexts. Our results reveal significant differences in EEG patterns between metaphorical and literal language. Metaphorical cognition is associated with increased high-frequency EEG variability and enhanced functional connectivity in the left hemisphere. Case studies suggest that different metaphorical concept mappings correspond to distinct neurocognitive patterns. These findings provide important neural evidence for the use of metaphorical concept mappings to analyze and differentiate cognitive processes.

Learning about Inductive Potential from Generic Statements

Generic statements (e.g., "Climbers drive Subarus") shape what categories people take as meaningful bases for generalization. After hearing a generic, people not only learn about the prevalence of a feature in a category (e.g., how many climbers drive Subarus), but also about the inductive potential of the category (e.g., that climbers share many features). Here, we propose a Bayesian model of how people infer inductive potential from generics. To test our model, we presented adults (n = 284) with either no information (baseline) or members of a novel social category accompanied by generic statements (e.g., "Zarpies sleep in trees") or specific statements (e.g., "This Zarpie sleeps in trees"). We then measured inferred inductive potential by eliciting the prevalence of novel features. As predicted, generics increased, while specific statements decreased, the category's inductive potential relative to baseline. Our account explains how generics facilitate the cultural transmission of social categories believed to be bases for generalization.

Taking the C-nic Route: Object-Directedness and Path, Not Efficiency, Shape Adults' Word Extension

What intuitions guide adults in extending the meanings of new words? Are these intuitions consistent with prelinguistic sensitivities operative in infancy? Across two preregistered experiments, adult participants saw simple grid-world environments in which characters moved in a "C" path efficiently (or not) to an object (or not). These events were labeled with a novel verb or noun. Participants were asked whether that word applied to new events varying in object-directedness, path, and efficiency. In contrast to infants' focus on efficiency, adults instead focused on object-directedness and path, and they did so similarly for both verbs and nouns. Language might thus build on universal, prelinguistic assumptions of goal-directedness and efficiency to specify what goal an agent might have (e.g., object- vs. location-directed) as well as how an agent might achieve that goal (e.g., this vs. that kind of path), ultimately restricting the hypothesis space for action understanding and supporting learning.

Community detection in inflectional networks

Inflection classes partition the lexicon, classifying lexemes into groups based on shared inflectional exponence; as such, they are foundational aspects of lexical organisation. The task of identifying good partitions is non-trivial, however. We show that a modularity-based community detection method, borrowed from network theory, allows for identification of intuitive partitions at different granularities of representation. Applied to French verbs and Bosnian/Croatian/Montenegrin/Serbian nouns, the method detects inflection classes at multiple granularities and reveals (imperfect) hierarchical organisation of these. Ultimately, community detection methods facilitate more nuanced understanding of the inflectional organisation of the lexicon.

Keywords: lexicon; morphology; inflection classes; hierarchical organisation; network science; community detection
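To make the modularity-based approach concrete, a minimal sketch follows; the toy lexemes, edge weights, and the choice of greedy modularity maximization are illustrative assumptions rather than the authors' materials or exact algorithm.

```python
# Minimal sketch of modularity-based community detection over a toy
# "inflectional network"; lexemes, edge weights, and the greedy algorithm
# choice are illustrative assumptions, not the paper's data or method.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Nodes are lexemes; edge weights count shared inflectional exponents.
edges = [
    ("parler", "aimer", 5), ("aimer", "donner", 5), ("parler", "donner", 4),
    ("finir", "choisir", 5), ("choisir", "agir", 4), ("finir", "agir", 4),
    ("parler", "finir", 1),  # weak cross-class link
]
G = nx.Graph()
G.add_weighted_edges_from(edges)

# Greedy modularity maximization; in recent networkx versions a `resolution`
# argument can also be swept to obtain coarser or finer candidate classes.
communities = greedy_modularity_communities(G, weight="weight")
print([sorted(c) for c in communities])
```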

Territorial Gestalt in the Strategy of Conflicts

Human social interactions can often be modeled as mixed-motive games, where cooperation and competition coexist. Although many studies have examined how people coordinate in purely cooperative settings, less is known about how individuals navigate severe, protracted conflicts to reach mutually beneficial settlements. In this study, we introduced a long-horizon territorial conflict game in which participants competed on a two-dimensional board by dispatching troops to expand their territory. Despite repeated interactions, conflict intensity did not subside over time—contrary to findings in simpler, one-shot matrix games. However, when a payoff-irrelevant color boundary was introduced, participants used this salient perceptual cue as a focal point for dividing the territory. The presence of this "territorial Gestalt" shifted strategies toward defensive postures, reduced the frequency of direct battles, and enabled opponents to settle conflicts precisely along the perceptual boundary. These findings extend focal point theory by demonstrating that humans naturally import external, payoff-irrelevant concepts into conflict situations to achieve coordinated outcomes. Our results highlight the importance of perception-based territorial Gestalt in fostering cooperative resolutions to otherwise intense and enduring disputes.

Spatial Dynamics Shape the Interaction Between Motor Adaptation Processes

Motor adaptation relies on explicit aiming strategies and implicit recalibration, but their interaction and effects on implicit skill learning remain debated. While these processes were initially thought to combine linearly, recent research challenges this view, though spatial and temporal dynamics may have confounded these findings. Specifically, implicit recalibration peaks where individuals aim their movements, and adaptation operates across multiple timescales, with both stable and volatile components. To examine whether these factors mask the true relationship between explicit strategies and implicit recalibration, we conducted a visuomotor rotation task that obtained independent measures of both processes while accounting for spatial and temporal dynamics. After controlling for task instruction clarity and spatial dynamics (plan-based generalization), we found a strong but sub-additive relationship between explicit strategies and implicit recalibration, with temporal dynamics showing minimal influence. This sub-additivity may stem from methodological imprecision or nonlinear interactions between processes.

Lexical Representation of Noncanonical Forms: Evidence from Persian

The lexical representation of words with multiple pronunciation variants has been widely debated: while single storage accounts propose that all variants of a word are represented by a single, canonical, representation, multiple storage accounts include representations for different pronunciation variants. Previous work has provided evidence for the representation of noncanonical variants, consistent with multiple storage models; however, this work has focused on highly frequent noncanonical variants, which some single storage models propose can also be lexically represented. To test predictions of multiple storage models more rigorously, we examined the representation of low-frequency noncanonical variants of the Persian uvular stop [ɢ]. Results from three experiments support the multiple storage model, in that they provided evidence for lexical storage of low-frequency variants. Implications of these findings are discussed.

Neuro-identity mixing impacts linguistic accommodation and regularisation: evidence from autistic and allistic interactions

Linguistic accommodation is the process by which people make their language more like that of their interlocutor, and has been argued to contribute to language change. However, it is unclear to what extent people of different neurotypes accommodate, or how neurotype mixing, which has been shown to reduce communicative success, impacts linguistic accommodation. In this paper, we build on previous research which uses artificial language learning to investigate accommodation as a mechanism for linguistic regularisation (i.e., the reduction of variation in a grammatical system). We test the impact of neurotype mixing on accommodation, with the aim of better understanding whether such mixing impacts processes of language change. Our results suggest that both allistic and autistic participants accommodate less and retain less of the variant when in mixed-neurotype pairs, but that this effect is more pronounced in autistic people. We discuss the importance of these results with respect to the Double Empathy theory of mixed-neurotype communication and language evolution.

Calculating probabilities from imagined possibilities: Limitations in 4-year-olds

Adults can calculate probabilities by running simulations and calculating proportions of each outcome. How does this ability develop? We developed a method that lets us bring computational modeling to bear on this question. A study of 40 adults and 31 4-year-olds indicates that unlike adults, many 4-year-olds use a single simulation to estimate probability distributions over simulated possibilities. We also implemented the 3-cups task, an established test of children's sensitivity to possibilities, in a novel format. We replicate existing 3-cups results. Moreover, children whom our model categorized as running a single simulation on our novel task show a signature of running a single simulation in the 3-cups task. This signature is not observed in children who were categorized as running multiple simulations. This validates our model and adds to the evidence that about half of 4-year-olds do not evaluate multiple candidates for reality in parallel.
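As a rough illustration of the simulation-based account of probability judgment at issue here, the sketch below contrasts an estimate based on a single imagined outcome with one based on many; the urn scenario and numbers are made-up assumptions, not the study's stimuli.

```python
# Contrast between estimating a probability from one mental simulation versus
# many; the urn scenario and probabilities are illustrative assumptions.
import random

def simulate_outcome(p_red=0.7):
    """One imagined draw from an urn that is 70% red."""
    return "red" if random.random() < p_red else "blue"

def estimate_p_red(n_simulations):
    draws = [simulate_outcome() for _ in range(n_simulations)]
    return draws.count("red") / n_simulations

random.seed(0)
print("single simulation:", estimate_p_red(1))     # can only be 0.0 or 1.0
print("many simulations:", estimate_p_red(1000))   # approaches the true value 0.7
```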

Vicarious emotion predictions integrate information about relationship strength

It is well-established that people feel empathic and counter-empathic emotions in response to others' experiences. The abilities to predict, interpret, and elicit these emotions help people successfully navigate social interactions. However, we know little about how people understand these vicarious emotions. Here we test the hypothesis that observers use a third-party appraisal approach, much like the one they use for direct emotion inference, to predict vicarious emotions. Critically, though, this reasoning must include information about relationship strength to capture the relevance of one person's experiences for an onlooker's emotional appraisals. We find support for this hypothesis from an experiment in which participants predicted emotions for both the player of a gambling game and an onlooker, whose social closeness to the player varied across trials. A model that integrated closeness information into an appraisal process better explained the data than an alternative model based on expectations of emotional contagion. These results offer initial insights into human reasoning about vicarious emotions.

Miscalibrated trust hinders effective partner choices in human-AI collectives

Trust, a cornerstone of human cooperation, faces unprecedented challenges as artificial intelligence (AI) agents permeate social systems, transforming mechanisms humans have evolved to build trust. We demonstrate how a prevalent feature of AI agents—being excessively prosocial—reshapes trust dynamics in experiments (N = 675) simulating hybrid societies comprising humans and AI agents ("bots") powered by a state-of-the-art large language model. Using a partner-selection game with pre-decision communication, Study 1 revealed a paradox: Undisclosed bots, despite being more trustworthy than humans and detectable by communication, were not preferentially selected as partners. Instead, bots' prosociality was misattributed to their human competitors. Study 2 showed that disclosing bots' identity initially enhanced humans' bias against selecting bots but improved trust calibration over time. Our work demonstrates the dual effect of transparency in the dynamic calibration of trust in human-AI ecosystems and introduces a framework for evaluating AI agents in interactive, hybrid environments.

Blending Boundaries: A Computational Approach to How Bilinguals Reconcile Cross-Linguistic Categorization

We categorize the world using labels that aid memory, recognition, and generalization. While some concepts have clear boundaries, others are more fluid, leading to cross-linguistic differences. How bilinguals manage these differences remains unclear. We investigate this by comparing English monolinguals, Mandarin monolinguals, and Mandarin-English bilinguals in a 2AFC task to test whether bilinguals' categorization aligns with monolingual norms or forms an integrated system. Additionally, we develop a neural network model to simulate category boundary formation under varying language exposure. Our model closely mirrors behavioral data, supporting the idea that bilinguals develop a shared categorization system shaped by dominant language exposure. This combined behavioral and computational approach offers new insights into how bilinguals resolve cross-linguistic conflict and the cognitive mechanisms underlying multilingual concept organization.

From Minimal Traces to Scenarios of the Past: A Neuro-Computational Model on Regaining Categoricity and Compositionality in Remembering

This paper presents a proof of principle for Trace Minimalism (Werning, 2020), a novel philosophical framework for episodic memory. Trace Minimalism claims that remembering does not involve the storage of representational content but rather the reconstruction of past scenarios through the interaction of minimal traces with semantic information. Minimal traces establish a causal link to prior experiences but lack categorical and compositional content. We provide a neuro-computational model using a vector-quantized autoencoder and a transformer-based semantic completion mechanism. Our findings support the hypothesis that remembering is possible without representational memory traces and that minimal traces, in interaction with semantic information, reliably construct past scenarios. The results offer a compelling alternative to classical representational theories of memory while maintaining causal continuity with past experiences.

Visual Theory of Mind Enables the Invention of Proto-Writing

Symbolic writing systems are graphical semiotic codes that are ubiquitous in modern society but are otherwise absent in the animal kingdom. Anthropological evidence suggests that the earliest forms of some writing systems originally consisted of iconic pictographs, which signify their referent via visual resemblance. While previous studies have examined the emergence and, separately, the evolution of pictographic systems through a computational lens, most employ non-naturalistic methodologies that make it difficult to draw clear analogies to human and animal cognition. We develop a multi-agent reinforcement learning testbed for emergent communication called a Signification Game, and formulate a model of inferential communication that enables agents to leverage visual theory of mind to communicate actions using pictographs. Our model, which is situated within a broader formalism for animal communication, sheds light on the cognitive and cultural processes underlying the emergence of proto-writing.

Agent Preference in Children: The Role of Animacy and Event Coherence

Thematic roles in language (Agents, Patients) are considered to be hierarchically organized in terms of their salience, and this hierarchy is rooted in their counterparts as event participants in cognition. Here, we examine the relative salience of Agents over Patients in two-participant causative events in Turkish-speaking 3- to 5-year-old children. We also test if this asymmetry is modulated by the animacy of the Patient (human vs. inanimate object) and specific to the presence of a coherent event. In an eye-tracked change detection task, changes to Agents were detected more accurately (and after fewer fixations) than changes to inanimate Patients when there was a coherent event. This asymmetry disappeared when the Patient was animate (for accuracy) and when event coherence was disrupted (for both accuracy and fixations). These findings suggest an interplay of event roles and animacy in Agent preference.

A Metacognitive Model of Memory Encoding Modulated by Rewards

Despite robust empirical evidence supporting the role of reward in enhancing memory, the relationship between reward and memory shows complex patterns. We present a novel computational model that considers how people optimally allocate limited cognitive resources during memory encoding. Unlike previous accounts that directly link rewards with stronger memory encoding, we allow our model to adaptively decide how much to encode based on the overall reward environment and one's limited cognitive resources. Our model's predictions align closely with human behavior across three experiments. It explains why high-reward items are better remembered than low-reward items only when presented together but not separately, and how memory is modulated by rewards of both current and preceding (but not future) items. We also collected data demonstrating that this insensitivity to rewards of future items can be reversed when participants anticipate upcoming rewards. These findings provide evidence that memory encoding is an active process involving meta-level control rather than a passive response to individual reward values.

Two paths to variation in semantic judgments: How ambiguity and conceptual diversity drive individual differences in meaning

Why do individuals differ in the way they assign semantic labels to the same perceptual referent? One possible source of disagreement is referential ambiguity, where stimuli near category boundaries are harder to label. Another is latent diversity in conceptual representations, leading to labeling differences even for confidently categorized referents. To distinguish between these sources of variation, we used Gibbs Sampling to search through the multidimensional Chernoff face space to find faces that are prototypically happy or sad, or ambiguous (Experiment 1, N = 253). Then in Experiment 2 (N = 684), we asked a naive group of participants to rate the emotions of these faces, finding that ambiguous faces elicited greater individual differences in valence interpretation and an intermediate level of variation when labeled with basic emotion terms. At the same time, even well-categorized happy and sad faces elicited variability in their consensus labels, though with less disagreement when mapped onto an obviously inappropriate label. These findings suggest that both categorical boundaries and within-category variability shape individual differences in semantic interpretation.

Communicating Global Income Rank Increases Charitable Donations

People in high-income countries underestimate their affluence relative to the global population, potentially limiting their willingness to donate to charity. We test whether a rank-based nudge (RBN) informing individuals how their post-tax income ranks globally increases charitable donations. Further, because people have been shown to shift their reference point for who should give to charity based on their current income, we investigate whether donations can be boosted by asking how much people with different income ranks should give to charity (injunctive distribution task, IDT) before administering the RBN. Participants (N = 1,217) were randomly assigned to one of four conditions: control, RBN, IDT, or IDT+RBN. Those in the RBN conditions donated significantly more and reduced overestimation of others' income across all percentiles of the global income distribution. However, the addition of the IDT did not further increase donations. These findings suggest that RBNs effectively boost generosity by correcting misperceptions of relative affluence.

Words with more diverse semantic networks are more readily extended to novel meanings

Why do we say "grasp an idea" but not "hold an idea," or "small talk" but not "little talk"? Although these word pairs share similar meanings, they differ in their tendency to be metaphorically extended. Prior research has focused on how well source concepts map onto target meanings to explain why some words are more readily extended to novel metaphorical contexts. However, much less attention has been paid to how the structural properties of a word's semantic network contribute to its patterns of metaphorical extension. Using a large-scale dataset of semantic networks in English (De Deyne et al., 2019), we explored whether these structural properties predict selective extension in a metaphor rating study with English speakers, who were presented with Mandarin metaphorical expressions that lack direct English equivalents. Our findings revealed that English speakers systematically favor certain conceptual mappings from Chinese, suggesting that some metaphorical mappings may be universally recognized. However, when controlling for conceptual mappings by analyzing synonym pairs, we found that the structural features of a word's semantic network significantly predict rates of extension. Specifically, words embedded in densely interconnected neighborhoods showed stronger resistance to extension, while words bridging diverse semantic communities demonstrated greater mutability.

ChatGPT as a Competent Enough Judge in Validating Responses from a Divergent Thinking Task

The validation of responses in divergent thinking tasks is a critical yet under-standardized step that should precede creativity scoring. However, inconsistencies related to human judges in this step may compromise the reliability of the results. This study introduces a systematic approach using ChatGPT to validate responses in the Alternate Uses Task (AUT) and compares its performance against six human judges. Analyzing 1,245 AUT responses for common objects, we evaluated validity based on precisely defined criteria. Human judges exhibited significant variability, achieving unanimous agreement for only 58% of responses, while ChatGPT demonstrated significant alignment with human assessments, reflecting a capacity to replicate aggregated human judgment. These findings underscore the potential of Large Language Models to enhance objectivity and reproducibility in creativity research by automating response validation. We advocate for integrating AI-driven validation protocols into divergent thinking response evaluation and emphasize transparent reporting of criteria to advance methodological rigor in the field.

A Bayesian Model of Mind Reading from Decisions and Emotions in Social Dilemmas

Humans can effectively infer others' mental states, predict their behavior, and adapt their own level of cooperation accordingly in social dilemmas. However, the computational mechanisms underlying this ability remain unclear. While previous research has shown that people use both actions and emotional expressions as social cues, how these different signals are integrated during social inference has not been formally modeled. Here we propose a Bayesian framework that explains how people infer others' Social Value Orientation (SVO) from their decisions and emotional expressions in the iterated Prisoner's Dilemma. Our model formalizes this inference process through two key mechanisms: (1) rational decision-making based on utility transformation according to SVO, and (2) emotional expressions driven by outcome appraisals. We tested our model against empirical data from an existing study involving 711 participants and found that it captured both their reputation judgments and cooperation predictions. These results suggest that people may employ Bayesian inference to integrate behavioral and emotional signals when predicting others' cooperative tendencies.
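A minimal sketch of the sort of Bayesian integration described above, assuming a discrete grid of SVO angles, a softmax choice rule over SVO-weighted payoffs, and a toy appraisal-based likelihood for emotional expressions; the numbers and likelihood forms are assumptions for illustration, not the authors' model.

```python
# Hedged sketch: posterior over a partner's Social Value Orientation (SVO)
# after observing one decision and one emotional expression in a Prisoner's
# Dilemma round. Grid, payoffs, and likelihood forms are illustrative only.
import numpy as np

svo_grid = np.linspace(0, np.pi / 2, 50)        # 0 = selfish, pi/2 = fully prosocial
prior = np.ones_like(svo_grid) / len(svo_grid)  # uniform prior over SVO angles

own, other = np.array([3.0, 5.0]), np.array([3.0, 0.0])  # payoffs for (cooperate, defect)

def p_cooperate(svo, beta=1.0):
    # Utility = cos(svo) * own payoff + sin(svo) * other's payoff; softmax choice rule.
    u = np.cos(svo) * own + np.sin(svo) * other
    return np.exp(beta * u[0]) / np.exp(beta * u).sum()

def p_smile_given_coop(svo):
    # Toy appraisal link: more prosocial partners express more positive affect.
    return 0.2 + 0.6 * np.sin(svo)

# Observation: the partner cooperated and smiled.
likelihood = np.array([p_cooperate(s) * p_smile_given_coop(s) for s in svo_grid])
posterior = prior * likelihood
posterior /= posterior.sum()

print("posterior mean SVO angle (deg):", np.degrees((posterior * svo_grid).sum()))
```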

The Cognitive Complexity of Rule Changes

Concept change is a fundamental cognitive process that enables individuals to adapt to new and conflicting information. To investigate the mechanisms underlying such adaptations, we introduce the Counting Game, an abstract rule-based paradigm. In this article, we evaluate whether the paradigm effectively captures complexity differences between different types of manipulations. Our experiment involved counting tasks where participants had to apply rules that modify how certain objects are counted, enabling us to examine the effects of perceptual complexity, rule operations, rule interactions, temporal dependencies, and scope. We analyzed accuracy and response times to assess whether these manipulations elicit the desired effects. Additionally, we constructed predictive models to identify key features influencing task difficulty. By evaluating the theoretical soundness of the Counting Game, we establish an empirical foundation for future studies on forgetting operations, among other concept changes.

mixEEG: Enhancing EEG Federated Learning for Cross-subject EEG Classification with Tailored mixup

Cross-subject electroencephalography (EEG) classification poses great challenges due to the diversity of cognitive processes and physiological structures across subjects. Modern EEG models are based on neural networks, demanding a large amount of data to achieve high performance and generalizability. However, privacy concerns associated with EEG pose significant limitations to data sharing between different hospitals and institutions, resulting in a lack of large datasets for most EEG tasks. Federated learning (FL) enables multiple decentralized clients to collaboratively train a global model without direct communication of raw data, thus preserving privacy. For the first time, we investigate cross-subject EEG classification in the FL setting. In this paper, we propose a simple yet effective framework termed mixEEG. Specifically, we tailor the vanilla mixup considering the unique properties of the EEG modality. mixEEG shares the unlabeled averaged data of the unseen subject rather than simply sharing raw data under the domain adaptation setting, thus better preserving privacy and offering an averaged label as a pseudo-label. Extensive experiments are conducted on an epilepsy detection and an emotion recognition dataset. The experimental results demonstrate that our mixEEG enhances the transferability of the global model for cross-subject EEG classification consistently across different datasets and model architectures. Code is published at: https://github.com/XuanhaoLiu/mixEEG.
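To make the mixup component concrete, here is a minimal sketch of vanilla mixup applied to a batch of EEG epochs; the tensor shapes, Beta parameter, and labels are illustrative assumptions and do not reproduce the paper's exact tailoring (see the released code for that).

```python
# Minimal sketch of mixup on a batch of EEG epochs (channels x time);
# shapes, the Beta(alpha, alpha) parameter, and labels are illustrative.
import numpy as np

def mixup_eeg(x, y, alpha=0.2, rng=None):
    """x: (batch, channels, time) EEG epochs; y: (batch, n_classes) one-hot labels."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    x_mix = lam * x + (1.0 - lam) * x[perm]      # convex combination of raw epochs
    y_mix = lam * y + (1.0 - lam) * y[perm]      # soft pseudo-labels
    return x_mix, y_mix, lam

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 32, 256))            # 8 epochs, 32 channels, 256 samples
y = np.eye(2)[rng.integers(0, 2, size=8)]        # binary task, one-hot labels
x_mix, y_mix, lam = mixup_eeg(x, y, rng=rng)
print(x_mix.shape, y_mix.shape, round(lam, 3))
```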

Exposing the Biased Vulnerabilities of Large Language Models in Explainable Recommender Systems

Explainable recommender systems (XRSs) enhance user trust by providing personalized recommendations followed by persuasive explanations. Integrating large language models (LLMs), such as GPT-4, advances this domain but introduces risks from biases embedded within LLMs. These biases can lead XRSs to generate persuasive explanations that promote favored recommendations, influencing users to accept the model's preferences over their own. This paper identifies a previously unrecognized security threat: the intentional induction of XRSs via biased LLMs to promote specific items through misleading yet compelling explanations. Inspired by work in the psychology of persuasion, we construct biased datasets and systematically insert these biases into LLM-based XRSs. Experiments across four leading LLMs reveal that biases can significantly affect user decisions, with close to 50% of users changing their choices. To counteract this, we propose a prompt rephrasing defense that effectively mitigates these biases, safeguarding the trustworthiness of XRSs.

Mental Model Alignment: Building Cognitive Interfaces for Explainable Reinforcement Learning

Deep reinforcement learning has achieved remarkable success in complex decision-making tasks, yet its black-box nature limits practical deployment in safety-critical domains. Current explainable reinforcement learning methods often fail to align with the hierarchical and temporal structure of human mental models, which are central to cognitive science theories of decision making. To bridge this gap, we propose Mental Model Alignment (MMA), a novel framework that constructs cognitive interfaces using behavior trees (BTs) to harmonize AI decision-making with human-understandable reasoning. MMA introduces three innovations: (1) a mental model encoder that captures the hierarchical decomposition of tasks into subgoals, mirroring human cognitive processes; (2) a cognitive pruning algorithm that simplifies BTs while preserving decision-critical nodes aligned with human mental schemas; and (3) a mental effort metric to quantify the cognitive load required for users to interpret policies. Evaluated across six benchmark environments, MMA outperforms state-of-the-art methods in interpretability, policy fidelity, and computational efficiency. Our results demonstrate that aligning AI policies with human mental models significantly enhances trust and usability in real-world applications.

Quantifying the cost of context sensitivity in decision making

It is well known that context-dependent decisions incur mental costs. While previous research has sought to formalize these costs at various levels of analysis, we still lack basic insight into the nature of mental costs, including the underlying cognitive resources being consumed. Moreover, many computational models assume that mental costs scale linearly with the cognitive resource being used, an assumption of convenience that has yet to be systematically tested. To address these gaps, we build on rate-distortion theory by formalizing an information-theoretic notion of mental costs. Specifically, we define the cost of policies (the mappings from states to actions) as a function of the mutual information between states and actions, the policy complexity. Across four decision-making experiments featuring diverse task manipulations, we find that this mental cost formulation offers a parsimonious description of how humans adaptively adjust their policy complexity across different tasks. Notably, a quadratic mental cost formulation, where increases in policy complexity incur supralinear costs, provides the best fit. These findings highlight the meta-cognitive ability of humans to account for mental costs when forming decision strategies, and pave the way towards a domain-general quantification of mental effort.
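As a concrete illustration of the policy-complexity quantity referred to above, the sketch below computes the mutual information between states and actions for a toy state distribution and policy; the numbers are made up.

```python
# Illustrative computation of policy complexity I(S; A) in bits for a toy
# two-state, two-action problem; the distributions are made-up examples.
import numpy as np

p_s = np.array([0.5, 0.5])                 # marginal distribution over states
pi = np.array([[0.9, 0.1],                 # pi(a | s): rows are states, columns actions
               [0.2, 0.8]])

p_a = p_s @ pi                             # marginal over actions
joint = p_s[:, None] * pi                  # joint p(s, a)

# I(S; A) = sum over s, a of p(s, a) * log2( p(s, a) / (p(s) * p(a)) )
mask = joint > 0
policy_complexity = np.sum(
    joint[mask] * np.log2(joint[mask] / (p_s[:, None] * p_a[None, :])[mask])
)
print(round(policy_complexity, 3), "bits")  # 0 bits for a state-independent policy

# A quadratic cost on this quantity (as the best-fitting model suggests) would
# then be cost = k * policy_complexity ** 2 for some scaling constant k.
```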

Visual moral inference and communication

Humans can make moral inferences from multiple sources of input. In contrast, automated moral inference in artificial intelligence typically relies on language models with textual input. However, morality is conveyed through modalities beyond language. We present a computational framework that supports moral inference from natural images, demonstrated in two related tasks: 1) inferring human moral judgment toward visual images and 2) analyzing patterns in moral content communicated via images from public news. We find that models based on text alone cannot capture the fine-grained human moral judgment toward visual stimuli, but language-vision fusion models offer better precision in visual moral inference. Furthermore, applications of our framework to news data reveal implicit biases in news categories and geopolitical discussions. Our work creates avenues for automating visual moral inference and discovering patterns of visual moral communication in public media.

Simulating the Emergence of Differential Case Marking with Communicating Neural-Network Agents

Differential Case Marking (DCM) refers to the selective use of grammatical case marking based on semantic, pragmatic, or other factors. The emergence of DCM has been studied in artificial language learning experiments with human participants, which were specifically aimed at disentangling the effects of learning from those of communication (Smith & Culbertson, 2020). Meanwhile, multi-agent reinforcement learning frameworks based on neural networks have gained significant interest as tools for simulating the emergence of human-like linguistic phenomena. In this study, we employ such a framework in which agents first acquire an artificial language before engaging in communicative interactions, enabling direct comparisons to human results. Using a very generic communication optimization algorithm and neural-network learners that have no prior experience with language or semantic preferences, our results demonstrate that learning alone does not lead to DCM, but when agents communicate, differential use of markers arises. This supports Smith & Culbertson (2020)'s findings highlighting the critical role of communication in shaping DCM and showcases the potential of neural-agent models to complement experimental research on language evolution.

Blind spots in the mind's eye: Mental imagery often lacks detail and coherence

The workings and products of the imagination are often described in visual terms: we speak of "mental images" and the "mind's eye." To what extent is this metaphorical? Should imagination be conceived of as a process of quasi-perceptual simulation, or is it more sparse and abstract? Building on recent findings that suggest mental imagery tends to lack detail, we investigate how complete and coherent people's imagined scenes are. In Experiment 1, we presented a riddle-like vignette to participants and found that, on average, they imagined only 54% of the simple features we asked them about. Moreover, successfully finding a solution was unrelated to the number of features imagined. In Experiments 2a–2c we found that participants often did not notice spatial contradictions in text descriptions (2a), even when scaffolded with a map (2b), and that spotting these contradictions was unrelated to performance on a mental rotation task (2c).

Prior-Prompt-Based GCN for Depression Recognition Through Gait Observation

In recent years, depression, as a prevalent mental health disorder, has drawn increasing attention. With the advance of AI technology, automatic and objective diagnostic methods have emerged that rely on signals such as electroencephalography (EEG), facial expressions, and behavior. In this paper, we propose gait analysis as a non-invasive method for depression detection and introduce a prior-prompt-based graph convolution network (PP-GCN) for depression recognition through gait that integrates skeleton and text modalities. Unlike conventional single-modal methods, we utilize prior knowledge and angle features. We introduce the Generative Action Prompt (GAP), which leverages a pre-trained large language model to generate motion descriptions for different body parts, thereby providing prior knowledge for depression recognition. Additionally, considering the subtle gait variations in individuals with depression, we further incorporate a joint-angle-based representation strategy to capture fine-grained variations in movement. Experimental results demonstrate that the proposed model outperforms existing skeleton-based approaches on D-Gait, a large-scale dataset containing over 25,000 gait sequences from nearly 300 volunteers.

Comparison Helps Children Form Broad Explanations

Research has shown that generating explanations can benefit learning in both children and adults. In part, this is because people prefer explanations that characterize phenomena in terms of broad regularities. Here we propose that comparison is integral to the process of generating broad, satisfactory explanations. Specifically, (1) generating explanations often invokes comparison, and (2) the resulting structural alignment process reveals commonalities that feed into a broad explanation. In Experiment 1a, we adapted a study on explanation-generation by Walker et al. (2017). 5- and 6-year-old children were asked to explain a set of outcomes that could result either from a single broad cause or from two or more specific causes. When children had the opportunity to compare the outcomes, they arrived at the broad explanation, replicating Walker et al. When comparison was made difficult, children preferred specific explanations. The results suggest that comparison is integral to the power of self-explanation. In Experiment 1b, we found that comparison by itself was not sufficient to lead children to broad explanations, suggesting that both explanation and comparison are critical in allowing children to attend to the broad pattern.

Learning in Groups: Possible Advantages of Working in Threes

This study investigated how interacting in different size discussion groups can improve students' understanding of research methods topics at the university level. Students completed a discussion activity in groups of twos, threes, and fours, and later were tested individually for their understanding of the target concepts. While the largest group size (fours) performed best on the worksheet that was completed in groups as part of the discussion activity, working in a group of three appeared to support better understanding of the target concepts for weaker students in the course. Several alternative explanations for the benefit from working in threes are considered.

Children's Differentiation of Fake News from Real News is Facilitated by Cognitive Reflection

Adults' ability to detect online misinformation is improved by cognitive reflection and targeted instruction. Is the same true for children? We explored this question by asking elementary-school-aged children (n = 135) to judge the veracity of news stories, some real and some fake, and comparing their performance to scores on the Cognitive Reflection Test, Developmental version (CRT-D). Participants were also administered a tutorial encouraging them to scrutinize the plausibility of a story's content or the credibility of its source. Children's differentiation of fake news from real news was correlated with their CRT-D scores but did not improve with instruction. A comparison group of adults (n = 117) demonstrated similar findings with the exception that source-based instruction improved their news differentiation. These findings suggest that the ability to detect online misinformation is aided by cognitive reflection from the start but could be improved with knowledge of news sources.

Perceptual Discriminability Drives Overinformative Reference, But Colour Information is Special

When speakers refer to objects in the world, they often overinform, providing their listener with redundant adjectival information. Contrary to classical theories in linguistics, recent theories have framed overinformativeness as an efficient means of grounding reference in perceptual information of high discriminability to facilitate listener comprehension. However, the generalisability of such theories is constrained by the methodological challenge of reliably manipulating the perceptual discriminability of naturalistic stimuli. Here, we overcome these methodological challenges, using methods from psychophysics to manipulate the perceptual discriminability of colour and material attributes in a reference-production experiment. We provide a robust validation of the view that overinformative reference is driven by speakers grounding expressions in attributes of high discriminability. However, we also find that colour information is privileged above and beyond such factors of discriminability.

Do Whales Have Hair? Are Whales Mammals? Identifying Synchronic Inconsistencies Among Beliefs

Inconsistency among beliefs is a hallmark of irrationality. Despite longstanding interest in inconsistency in philosophy and psychology, empirical evidence of synchronically held inconsistencies among people's beliefs has proven elusive. Here, across two pre-registered experiments (Ns = 500, 274), we identify inconsistent beliefs simultaneously held by individual participants. Drawing on Sommer et al.'s (2023) proposal that accessibility in memory helps people achieve consistent beliefs, we constructed sets of questions that facilitated or hindered the accessibility of relevant knowledge. Our results support the proposal that consistency is enforced when beliefs are simultaneously accessible, rather than resulting from exhaustive consistency-checking. We find that when participants have simultaneous access to inconsistent beliefs, even regarding inconsequential general-knowledge topics, they tend to revise their beliefs toward consistency. Furthermore, we experimentally distinguish our account from the alternative explanation that the evoked inconsistencies are merely inconsistent responses. Taken together, our results suggest that inconsistency among beliefs may be common, arising when inconsistencies are inaccessible.

Lexical leveraging across the vocabulary spectrum: Different semantic properties support delayed and advanced learners

Toddlers better retain novel object-label mappings for items from taxonomic categories they have more knowledge in. Separately, words for concepts with more perceptual features are learned earlier than words for concepts with fewer perceptual features. Because these factors have only been examined separately, it is unclear whether effects of taxonomic density stem from differences in structured taxonomic knowledge or simply reflect lower-level differences in perceptual similarity among concepts. In the current study, we asked how taxonomic knowledge and perceptual information jointly contribute to word learning in a group of 24-month-olds with a wide range of vocabulary skill. We found that taxonomic knowledge facilitated word learning. We also found that the availability of perceptual cues to meaning was used as an additional support for word learning by children with smaller expressive vocabularies. Together these findings suggest that taxonomic knowledge is a better predictor of word learning compared to lower-level perceptual features at 24 months old. However, perceptual cues to meaning may provide additional support for vocabulary growth for learners with smaller vocabularies and/or late-talkers.

The Wisdom of Intellectually Humble Networks

People's collectively shared beliefs can have significant social implications, including on democratic processes and policies. Unfortunately, as people interact with peers to form and update their beliefs, various cognitive and social biases can hinder their collective wisdom. In this paper, we probe whether and how the psychological construct of intellectual humility can modulate collective wisdom in a networked interaction setting. Through agent-based modeling and data-calibrated simulations, we provide a proof of concept demonstrating that intellectual humility can foster more accurate estimations while mitigating polarization in social networks. We investigate the mechanisms behind the performance improvements and confirm robustness across task settings and network structures. Our work can guide intervention designs to capitalize on the promises of intellectual humility in boosting collective wisdom in social networks.

Comparing Individual and Collective Performance in Deductive Reasoning across Content Types

This research analyzes whether collective performance surpasses individual performance in solving two different deductive reasoning tasks. One of the tasks contains arguments with factual content, while the other includes arguments with ideological content. Additionally, we seek to determine whether the truth-wins model accurately represents the social combination process involved in the collective resolution of both deductive tasks. We designed and conducted two studies. Study 1 (N = 115) employed a within-subjects design with three conditions: individual, collective, and post-collective individual resolutions. Participants evaluated syllogisms with factual content. Study 2 (N = 111) followed a similar design but used syllogisms with ideologically controversial content. Results from both studies indicate that collective performance surpassed individual performance. Furthermore, while the truth-wins model best describes the collaborative decision-making process in Study 1, the majority model was more accurate for Study 2. These findings align with theories advocating for the social origins of human reasoning.

Building Interconnected Networks of Word Knowledge Over Time

To become fluent language users, children must learn not only individual words but also connections between them. For example, connections are vital for understanding that "apples" are "yummy", something you can "eat", and similar to "oranges". To date, there is evidence that children develop increasingly sophisticated abilities to form these connections from word co-occurrence regularities that are ubiquitous in everyday language. Yet although children encounter such regularities day after day, existing evidence focuses only on what children learn from a single experience. We used a multi-session approach to examine the connections children build from repeated exposures over time. We found that from age four to six, children not only improve in their formation of connections between words from regularities in language, but also in building increasingly richly interconnected knowledge from one experience to the next.

Statistical Word Segmentation in Unfamiliar Speech

Statistical learning, the ability to detect patterns in sensory input, allows listeners to segment words from continuous speech by tracking transitional probabilities (TPs). While this mechanism is robust in familiar contexts, its adaptability to unfamiliar speech with distinct phonological properties remains less understood. This study investigates whether English-speaking adults can use TPs to segment an artificial language modeled on Cantonese. Participants distinguished sequences whose syllables consistently occurred together (statistical words) and sequences whose syllables partially co-occurred (part-words) from sequences whose syllables never co-occurred (non-words). However, they struggled to distinguish statistical words from part-words when frequency was controlled. Pupillometry results showed that participants dilated more to part-words and non-words at test, compared to frequency-controlled statistical words. Pupillary responses during familiarization also predicted test performance, demonstrating the potential of pupillometry to track learning in real time. These findings highlight the flexibility of statistical learning in adapting to novel linguistic contexts while revealing its limitations.
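For readers unfamiliar with the computation, forward transitional probabilities of the kind tracked in such segmentation studies can be sketched as follows; the syllable stream is a made-up example, not the Cantonese-modeled stimuli.

```python
# Forward transitional probabilities TP(y | x) = count(xy) / count(x) over a
# toy syllable stream; the syllables and "language" are illustrative only.
from collections import Counter

stream = "pa bi ku go la tu pa bi ku da ro pi go la tu pa bi ku".split()

bigrams = Counter(zip(stream, stream[1:]))
unigrams = Counter(stream[:-1])

def tp(x, y):
    return bigrams[(x, y)] / unigrams[x] if unigrams[x] else 0.0

# High within-word TPs versus lower across-word TPs mark candidate word boundaries.
print("TP(bi | pa):", round(tp("pa", "bi"), 2))   # within the "word" pabiku
print("TP(go | ku):", round(tp("ku", "go"), 2))   # across a word boundary
```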

A Computational Model for Estimating Effective Connectivity Using Virtual Neurostimulation

Effective connectivity (EC) is crucial for elucidating causal interactions between brain regions, providing valuable insights into cognitive function. Traditional methods infer EC indirectly through temporal predictions, but unmeasured confounding factors lead to temporal delays, resulting in spurious causal inferences. Compared to the indirect inference of EC using traditional methods, neurostimulation experiments directly infer EC, yet their invasive nature and ethical constraints limit their applicability in assessing whole-brain EC. To address this, we propose a data- and theory-driven virtual neurostimulation (VNS) model for directly estimating EC between brain regions. This model constructs a large-scale brain network model as a surrogate for the brain, applying perturbations to specific brain regions based on the physiological mechanism of membrane potential polarization induced by current stimulation. Causal relationships between brain regions are then inferred directly by performing statistical analysis on the responses in blood oxygenation level dependent (BOLD) signals. The model's accuracy and stability were validated on macaque and human medial temporal lobe (MTL) datasets with known ground-truth EC, demonstrating superior performance over baseline methods. Application to a disease-specific dataset further highlights its potential as a personalized biomarker for neurodegenerative diseases, providing a novel pathway for early diagnosis in brain disorders.

Trade-Offs Between Tasks Induced by Capacity Constraints Bound the Scope of Intelligence

A core challenge in cognitive science is understanding the barriers to intelligence and the circumstances that favor cognitive specialization. General intelligence requires a cognitive architecture that is successful across tasks. However, improving an architecture for a given task is often observed to hinder performance on others. Although trade-offs between tasks are a recurring element of explanations in cognitive science, they have received little direct theoretical attention. We present a formal framework that provides a principled understanding of when trade-offs emerge. In particular, we build on recent advances in applying rate-distortion theory to reinforcement learning. This allows us to formalize the representational capacity an agent can call upon in approaching tasks in terms of information. We find trade-offs occur when components of a task conflict in ways that cannot be easily coarse-grained by the agent's encoding scheme. Further, cognition may be general, specialized, or implement a coverall strategy, depending on conditions.
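The rate-distortion formulation invoked here is typically written in the following textbook form, with capacity measured as the mutual information between task states and the agent's internal representation; this is the standard statement rather than the paper's exact objective.

```latex
% Standard rate-distortion statement (textbook form; the paper's exact objective may differ):
% minimize the rate, i.e., the mutual information between the source S and its encoding \hat{S},
% subject to a bound D on expected distortion.
\[
  R(D) \;=\; \min_{p(\hat{s}\mid s)\,:\;\mathbb{E}[d(S,\hat{S})]\le D} I(S;\hat{S})
\]
% or, equivalently, the Lagrangian trade-off with multiplier \beta:
\[
  \min_{p(\hat{s}\mid s)}\; I(S;\hat{S}) + \beta\,\mathbb{E}\!\left[d(S,\hat{S})\right]
\]
```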

Speaker-Related Cognitive Constraints on Multimodal Audience Design During Spatial Communication

Human communication is multimodal, characterized by the use of speech and co-speech iconic gestures. While previous research examined how cognitive and communicative demands affect multimodal language use, the interplay between speaker-related cognitive constraints and listener-oriented adaptations remains unclear. The present study examines how speakers with varying spatial skills adjust their speech and iconic gestures when addressing interlocutors with different spatial skills. By employing the imagined addressee paradigm, twenty-three participants described how to solve mental rotation problems to interlocutors with low- versus high-spatial skills. Speakers produced more iconic gestures when addressing low-spatial compared to high-spatial interlocutors. However, this adaptation was primarily observed among individuals with above-average spatial skills. These findings extend prior work on multimodal audience design, showing that while speakers adjust their gestures based on listener characteristics, these adaptations are constrained by the speaker's cognitive capacities. This study highlights the dynamic interaction between cognitive and communicative demands in multimodal language.

Folk Teleology and Split Entity Identity

Reasoning about the identity of objects is challenging, because objects can undergo alterations that change their properties. Researchers have proposed that causal explanations are important in making identity judgments, and one important causal factor is objects' teleology (purpose or function). This study focuses on how teleological information affects identity judgments of entities that split into two descendants. In Experiment 1, we provide evidence that two types of teleological information – function type (structure-dependent or not) and function preservation – influence the likelihood that a descendant is judged as the original entity. In Experiment 2, we show that object/substance construal is a mediator of function type in the identity task, suggesting that some of the teleological effects can be explained via construing the entity as object/substance. Together, these two experiments highlight the importance of teleological information in identity judgments.

A Pragmatic Model of Spatial Language

Spatial language is an integral part of everyday communication, but existing theories fail to explain the cognitive processing underlying the interpretation of spatial descriptions. The traditional spatial template theory does not attempt to provide a mechanism behind the computation of acceptability ratings, nor does it adequately address the effect of distractors or the conversational context. In this study, we propose a new model of spatial pragmatics, based on the Rational Speech Act (RSA) framework. This Bayesian model shows how the patterns of spatial language understanding follow directly from the principles of pragmatics. Our model accounts for previously found effects of angle, distance, and distractors on acceptability ratings and indication behavior. We test these predictions in an experiment consisting of various tasks. The results largely support our model's predictions, suggesting that pragmatic reasoning might play a role in spatial language use.
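For readers unfamiliar with the framework, a minimal RSA sketch is given below (illustrative only: the utterances, referent grid, and parameter values are placeholders, and the paper's spatial semantics and tasks are richer).

    import numpy as np

    # Minimal Rational Speech Act sketch: literal listener -> pragmatic speaker
    # -> pragmatic listener. meanings[u, r] = 1 if utterance u is literally
    # true of referent r.
    meanings = np.array([
        [1.0, 1.0, 0.0],   # "above": true of referents 0 and 1
        [0.0, 1.0, 1.0],   # "near":  true of referents 1 and 2
    ])
    prior = np.ones(3) / 3            # uniform prior over candidate referents
    alpha = 4.0                       # speaker rationality
    cost = np.zeros(meanings.shape[0])

    def normalize(m, axis):
        s = m.sum(axis=axis, keepdims=True)
        return np.divide(m, s, out=np.zeros_like(m), where=s > 0)

    L0 = normalize(meanings * prior, axis=1)                               # P(r | u), literal
    S1 = normalize(np.exp(alpha * (np.log(L0 + 1e-12).T - cost)), axis=1)  # P(u | r), speaker
    L1 = normalize(S1.T * prior, axis=1)                                   # P(r | u), pragmatic

    print(L1)  # graded, acceptability-like interpretation probabilities

Distractor and context effects enter through the set of alternative referents and utterances the speaker is assumed to reason over, which is what lets the pragmatic listener's interpretations depart from a fixed spatial template.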

How goals affect information seeking

This study investigates how goals influence information sampling strategies in active learning. Previous work in this area compared different sampling heuristics while holding constant participants' goals (e.g., the final incentivized test). In a behavioral experiment, we examine the effect of generation-driven versus discrimination-driven goals on information-seeking sampling strategies by manipulating the (pre-declared) test condition across subjects. Our results suggest that goals affect information sampling, with discrimination-driven tasks leading to more sampling around class borders ("Label-Margin" sampling strategy) and generation-driven tasks leading to more sampling around class centers ("Most-Certain" sampling strategy). Moreover, we show that strategies evolve over time and are related to the performance of participants. These findings highlight the importance of considering goals in understanding human information-seeking behavior.

Overcoming Learning Traps Through Social Learning

People often fall into learning traps where a false belief about the structure of the environment leads to under-exploration of rewarding options. Two studies (N = 324) examined whether observation of the approach decisions of another learner facilitated escape from a trap. After an initial learning phase in a task where approach of different category members could lead to gains or losses, we identified whether participants had learned an optimal two-dimensional categorization rule or fallen into the trap of using a one-dimensional rule. Participants then observed the approach decisions of another learner using the same or a different category rule. Participants' categorization rules in a final round of category learning were then assessed. A substantial proportion of those who had initially fallen into the trap shifted to the optimal rule after observing use of an alternate rule. This effect was found following observation of both optimal (Experiment 1) and sub-optimal rules (Experiment 2). In contrast, those who learned the optimal rule in the initial learning phase were unaffected by social observation. The results show that social learning is a viable approach for facilitating escape from learning traps.

Whose Values Prevail? Bias in Large Language Model Value Alignment

As large language models (LLMs) are increasingly integrated into our lives, concerns have been raised about whether they are biased towards the values of particular cultures. We show that while LLMs were biased toward the values of WEIRD populations, some non-Western populations, including East Asia and Russia, were also represented relatively well. Notably, the Rich dimension, rather than the more frequently discussed Western dimension, was the strongest predictor of LLMs' value alignment. This suggests the need to attend to less prosperous populations instead of focusing only on easily accessible populations. We also found that one source of this bias could be unbalanced training data as approximated by an Internet Freedom measure, and that prompting the model to act as individuals from different populations reduced the bias but could not eliminate it. These findings underscore the importance of training process disclosure and the consideration of culture-specific models to ensure ethical usage of LLMs.

Event Structure and the Experience of Viewing Art

Future theories of cognition need to encompass a wide range of human experiences, beyond those typically assessed in the laboratory. This study assessed how the experience of a prior artwork (prime) influenced the experience of the next artwork (target), and how this influence was affected by the presence or absence of event boundaries in a VR environment. We found that when primes were more emotionally intense, targets were rated lower in liking/beauty and emotional intensity. This influence was attenuated when the paintings were separated by an event boundary (separate rooms). Surprisingly, there was an effect of event boundaries on the processing of the prime paintings. An evaluation of additional data suggests that this is due to the mere presence of another painting in the same room, even before it is actually viewed. Thus, event structure can meaningfully impact the experience of viewing art.

Linking Verbs to Syntax: Investigating Error-Based Learning using Pupillometry

Verbs show different statistical preferences for syntactic structures. An influential theory of how verb-syntax links are learned suggests that learning is based on, and proportional to, prediction error. However, the evidence is mixed and there is a need for evidence from a diverse set of paradigms. We exposed 90 college-aged adults to an artificial language containing novel verbs and sentence structures and tested their production of utterances in the new language. We reversed verb-syntax links from one training block to another and found that participants were able to learn the reversal, as seen in a production test. Pupillometry detected surprise when participants heard sentences in the second training block that had the opposite verb-syntax links from those in the first training block. Despite detecting both surprise and successful learning, we did not find an association between the two. Thus, we did not find evidence that learning was based on surprise. We discuss alternative learning mechanisms that can help language users adapt their language based on the input.

Examining Future Context Predictability Effects in Word-form Variation and Word Choice

Contextual predictability drives both word form and word choice in language use. The effects of the predictability of a word given its previous context are generally well understood in both production and comprehension, but studies of naturalistic production have also revealed a poorly understood backward predictability effect of a word given its future context, which may be related to planning. In this study, we revisit backward predictability effects using improved measures and more powerful language models, and introduce a principled measure of planning based on the pointwise mutual information between the word and the future context after controlling for the effects of previous context. We evaluate both measures for predicting word duration, and then extend the scope of these effects to a novel paradigm that involves predicting substitution errors in naturalistic productions. Our findings reveal that the proposed PMI-based measure of planning performs comparably to backward predictability. This analysis provides a useful test-bed for probing the link between past and future context predictability and underlying cognitive processes. Keywords: language production; information-theoretic linguistics; corpus research.
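One natural way to write such a measure (our notation; the authors' exact operationalization may differ) is the pointwise mutual information between a word w and its future context, conditioned on the past context:

    \mathrm{PMI}(w;\ c_{\text{future}} \mid c_{\text{past}}) = \log \frac{p(w \mid c_{\text{past}},\, c_{\text{future}})}{p(w \mid c_{\text{past}})}

A positive value means the word is more predictable once upcoming material is taken into account, over and above what the preceding context already explains, which is the sense in which the measure isolates a planning-related signal.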

Multimodal Dynamicity in Fictive Expressions: Exploring Co-speech Gestures in Spatial Descriptions

Both fictive change and motion expressions are linked to dynamic conceptualization, a central concept in cognitive linguistics. However, it remains unclear whether producing these expressions involves the mental simulation of change or a dynamic perception of events—an area that invites further exploration. In this paper, we examine co-speech gestures in a spatial description task, exploring two main predictions: (1) If fictive expressions involve some form of dynamicity or simulation of change or motion, speakers will gesture more frequently than with factive expressions; and (2) If fictive expressions involve dynamicity or simulation, the gestures will reflect this imagery and may be more dynamic than static. The findings from this study support both predictions, suggesting that fictive expressions indeed involve a dynamic conceptualization or simulation of static spatial concepts.

Individual differences in habituation predict dishabituation magnitude in adults and infants

From infancy to adulthood, habituation and dishabituation enable learners to filter out repetitive information and orient to novel information. Because variability in these processes has been linked to differences in later cognitive outcomes, studying individual differences in habituation and dishabituation is crucial for building a more comprehensive model of early learning. Here, we leveraged large-scale datasets spanning infants, preschoolers, and adults to examine how individual differences in habituation predict dishabituation magnitude. We found that faster habituation and higher volatility predicted stronger dishabituation. Moreover, we showed that different measures of dishabituation sometimes yielded divergent patterns, suggesting that measurement choices can influence observed effects and should be carefully considered in developmental research. These findings reveal how endogenous factors are meaningful drivers of looking behaviors. Overall, our results underscore the need for large-scale data approaches to studying visual attention across the lifespan.

Near-Zipfian Distribution is Prevalent in Infant Input

Understanding infants' natural input is essential for advancing theories of cognitive development and learning. Recent research indicates that across modalities, infant input approximates a near-Zipfian distribution, with a large amount of input about a few items and substantially less about the rest. However, prior work has only examined aggregated distributions across subjects, focused on a single modality in isolation, and considered the input available to infants rather than what they actively select. We show that at both the corpus and individual levels, infant attention selection and the verbal input infants receive from parents each follow a near-Zipfian distribution. Moreover, when integrating across modalities, the verbal input infants hear while attending to the same object becomes even more skewed than verbal input alone. Findings suggest that Zipfian-like structure is not only a property of infant environments but emerges through active selection, highlighting its potential role in shaping early learning.
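For reference, a Zipfian distribution assigns the item of frequency rank r a frequency

    f(r) \propto r^{-\alpha}, \qquad \alpha \approx 1,

so a near-Zipfian profile is one whose best-fitting exponent lies close to 1, with a handful of items accounting for most of the input and a long tail of rarely sampled items.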

Humans integrate heuristics and Bayesian inference to efficiently explore under uncertainty

Exploring the environment efficiently and exploiting learned information effectively are crucial to intelligent agent behavior. Prior work has shown that humans can manage the exploration-exploitation trade-off not only action-by-action, but also at the strategy or rule level, using a heuristic on rule certainty (Collins & Koechlin, 2012). We evaluated this theory on a partially observable rule-switching task and collected human behavioral data (n=112) on two task variants with different levels of rule complexity to test whether taxing cognitive resources impacts exploration heuristics. Our results replicated previous findings, showing that the model is robust to dynamically switching task structure and increased executive demands due to rule complexity. Additionally, we identified a novel meta-heuristic of using high-level rule structure to inform decision-making and computationally characterized its integration with Bayesian inference to support efficient exploration. Through modeling analyses, we show that increased demand on executive function might interfere with this meta-cognitive process.

The Effect of Text Simplification on Reading Fluency and Reading Comprehension in L1 English Speakers

Text simplification is a common practice for making texts easier to read and easier to understand. To what extent does it achieve these goals, and which participant and text characteristics drive simplification benefits? In this work, we use eye tracking to address these questions for the first time for the population of adult native (L1) English speakers. We find that 42% of the readers exhibit reading facilitation effects, while only 2% improve reading comprehension accuracy. We further observe that reading fluency benefits are larger for slower and less experienced readers, while comprehension benefits are more substantial for lower-comprehension readers, but not vice versa. Finally, we find that high-complexity original texts are key for enhancing reading fluency, while large complexity reduction is more pertinent to improving comprehension. Our study highlights the potential of cognitive measures in the evaluation of text simplification and distills empirically driven principles for enhancing simplification effectiveness.

The Malleability of Children's Mental Rotation Strategies: What do children's mental rotation tests really measure?

The most commonly used measure of spatial cognition to assess both adults and children is mental rotation. However, little is known about the cognitive strategies that children use to solve this task. Understanding how and when children may employ different mental rotation strategies can illuminate the development of mental rotation ability and help clarify previous mixed findings on the developmental trajectory of children's mental rotation skills. Thus, in this study, we investigated what strategies children use in a new mental rotation task and whether their strategy use would be influenced by the test instructions. In Experiment 1, we found that the types of strategies children used in our mental rotation test differed from strategies reported in previous research, suggesting that strategy use is dependent on test design. In Experiment 2, we found that children's strategy use can be malleable; changing the test instructions reduced one type of erroneous strategy, flipping. Our findings suggest that different tests labeled "mental rotation tests" may actually be measuring different abilities.

Food Neophobia: A Barrier to the Development of Categorization and Executive Functions

The majority of evidence on the relations between young children's levels of food neophobia (the fear of novel food), categorization abilities and executive functions is cross-sectional, leaving the direction of causality unclear. This study aimed to examine the bidirectional relations between children's food neophobia, categorization performance and strategies, and executive functions (working memory, inhibition and cognitive flexibility) longitudinally. Children (n = 113; M age = 48.30 months at Time 1) were assessed at two time points over the course of a year of schooling. Controlling for age, early levels of food neophobia significantly predicted lower subsequent categorization performance and executive functions. No significant evidence was found to support the reverse directionality; neither categorization performance, strategies, nor executive functions at Time 1 predicted subsequent levels of food neophobia. The findings provide longitudinal evidence that neophobia hinders the development of categorization and executive function abilities.

Resource-rational belief revision can mitigate as well as amplify polarization

People's beliefs sometimes diverge after observing the same information, which has been interpreted as evidence of irrationality. This behaviour has been proposed to result from people's limited cognitive resources and motivated reasoning, but how belief revision differs across these explanations has not been formalized or compared to a rational norm. Further, while people may be biased relative to a normative ideal, they may still make optimal choices given their limited cognitive resources, or rationally balance the utility of holding accurate beliefs with the belief's intrinsic utility. Across two studies, we develop and test a unified computational account of belief polarization under these proposed mechanisms, showing that people's performance on a belief updating task best fits a limited-resource Bayesian model; external motivations may contribute to divergence (or convergence) by determining what pre-existing information people consider relevant to a situation, rather than by changing how people evaluate new information in isolation.

Decoupling Hand and Mind in Abstract Temporal Reasoning: Variation in temporal gesture and temporal reasoning

Time is often conceptualized spatially. Some argue that this spatialization is essential for understanding time and should manifest reliably in temporal gesture. Here we ask whether people vary in their production of temporal gesture and what this variation might signal about abstract reasoning and communication. Participants (N = 94) watched time-travel narratives, reasoned aloud about the narrative's temporal structure, and later completed a computer assessment of their recollection and comprehension of the narrative's complex temporal structure. We found that participants varied considerably in how often they produced temporal gesture. This variation in temporal gesture was strongly associated with the production of non-temporal representational gesture, suggesting a ‘gestural style' that governs the production of gesture in general. There was no association, however, between temporal gesture and temporal reasoning accuracy. We speculate about the role that temporal gesture might play in larger assemblages of temporal understanding and communication.

Incentive Effects Capture Variability in Task-General Control Allocation

To understand how people vary in their cognitive control engagement, researchers use different laboratory tasks and compare performance on trials that are more versus less control-demanding (e.g., congruency effects). However, previous research has struggled to uncover consistent patterns of correlation across cognitive control tasks, leading to questions about the utility of these tasks and the existence of task-general control. The current study sought to test whether these validity concerns may center on the stimulus-driven nature of congruency effects, rather than the tasks themselves. To overcome this obstacle, we varied task incentives while holding stimulus features constant. We show both theoretically and empirically that the effects of incentives on control allocation correlate across tasks. Together, findings support task-general control processes that operate across different contexts.

Hidden costs of overparenting: Children feel worse about their abilities when adults take over for their peers

Overparenting undermines children's self-efficacy and motivation. However, little research has explored whether its negative impacts extend beyond the home and affect not only overparented children, but also their peers. Here, we test the hypothesis that 6- to 8-year-old children attribute peer success to internal (ability) rather than external (parental intervention) causes and that this attribution leads children to form negative beliefs about their own competencies. In Experiment 1, children were more likely to spontaneously attribute outstanding peer performance to internal causes (ability) than external ones (parental intervention). In Experiment 2, children reported lower self-perceived abilities when they learned that peers outperformed them due to internal (ability) versus external (parental intervention) causes. Together, these findings reveal an unintended consequence of overparenting: Intervening to enhance one child's performance leads peers to feel worse about their abilities, potentially harming their self-concept and future motivation.

Second Hand Effects: Exploring Spatial Influences on Temporal Judgments in Clocks

Humans use clocks to objectively measure their subjective temporal experiences. But can the spatial properties of a clock distort our experience of time? This study examines how spatial boundaries in an analog clock influence prospective temporal judgments. We found that when the second hand crossed more boundaries, it distorted participants' spatial memories, causing them to overestimate the arc traced by the second hand. However, this distortion in spatial memory did not significantly influence participants' temporal judgments. Besides boundary crossing, our study examined and replicated the influence of speed on temporal judgment, with faster speeds of the second hand being associated with a dilated temporal experience.

A Variational Neural Network Model of Resource-Rational Reward Encoding in Human Planning

Working memory (WM) is essential for planning and decision-making, enabling us to temporarily store and manipulate information about potential future actions and their outcomes. Existing research on WM, however, has primarily considered contexts where stimuli are presented simultaneously and encoded independently. It thus remains unclear how WM dynamically manages information about reward and value during planning, when actions are evaluated sequentially in time and their cumulative values must be integrated to guide choice. To address this gap, we developed an information-theoretic model of WM allocation during planning, implemented using variational recurrent neural networks. In this model, an agent optimizes plan quality while maintaining reward information under WM constraints. To test our model, we designed a task in which participants sequentially observed the rewards available at different future states before executing a sequence of actions, attempting to maximize cumulative rewards. Our results suggest that humans preferentially maintain rewards that are most informative for plan selection, integrating both local and global factors. These findings bridge theories of WM limitations with models of human planning, revealing how cognitive constraints shape decision-making strategies.
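One standard way to formalize such a constraint (a sketch of the general objective family; the paper's variational implementation may differ in detail) is to trade plan quality against the information the WM state Z retains about the observed rewards R:

    \max_{q(Z \mid R)}\ \mathbb{E}\big[ V(\text{plan} \mid Z) \big] \;-\; \beta\, I(R; Z),

where \beta prices each bit of maintained reward information. Under this objective, rewards that are most diagnostic for choosing among candidate plans are worth encoding precisely, while less decision-relevant rewards can be compressed away.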

Limits of repetition in the illusion of consensus

Humans rely on social consensus to assess the credibility of information. Confidence in a claim can be influenced by repeated exposure to the same source (dependent consensus), which can create an illusion of consensus. This study investigated whether people differentiate between dependent consensus and consensus derived from multiple independent sources. In two experiments, participants rated their confidence in a claim after exposure to an increasing number of supporting claims within a mock social media environment. Results revealed that although dependent and independent consensus were weighted similarly when exposed to a small number of claims, independent consensus carried greater influence as claim exposures increased, regardless of the claim's valence. These findings were further supported by analysis of free-text justifications of responses using large language models (LLMs). Our findings show that people differentiate between the epistemic weight of different consensus types on social media, where repeated exposure to claims is common.

Native Language Suffixation Patterns and Perception of Sequences: A Case of Cantonese Speakers

In the languages of the world, it is more common to form complex words by adding suffixes to the end, rather than prefixes at the beginning. It has been argued that this pattern may reflect the salience of word beginnings (Hawkins and Cutler 1988, Hupp et al. 2009). For example, Hupp et al. (2009) find that English speakers rate sequences of syllables that differ at the end as more similar than those that differ at the beginning. However, subsequent research has shown that people's perceptions of sequence similarity are affected by the word-formation patterns in their native language. While the beginnings of sequences are perceived as more salient by speakers of suffixing languages (e.g., English), the ends are more salient to speakers of prefixing languages (e.g., Kîîtharaka, Martin and Culbertson 2020). Thus, it remains unclear whether universal perceptual preferences are linked to the predominance of suffixing in the world's languages. We address this question by investigating perceptual-similarity judgments in speakers of Cantonese – a language with little affixation. We find that, like English speakers, Cantonese speakers perceive the beginnings as more salient, in sequences of shapes and syllables. This finding revives the possibility of a universal perceptual bias, albeit one that can be strengthened or attenuated with language experience.

"Hearing As": Top-down Processing Affects Early ERP Components for Musical Expectation

Harmonic expectation is an important generator of musical experience often explained through mechanisms of statistical learning. EEG research has identified ERP components associated with expectation, including the Early (Right) Anterior Negativity (E(R)AN), which is theorized to index harmonic surprisal with reference to long-term memory of the statistical structure of music. However, the role of top-down influence remains under-explored. We present data from a novel paradigm that cues listeners to the syntactic structure of the stimuli (but not whether they contain improbable events). Our main result revealed larger E(R)AN amplitudes for surprising chords when listeners knew that additional context would follow a surprising harmony. We propose that listeners prospectively integrate surprising chords with anticipated future context, rather than responding to them solely through automatic probability assessment. Musical surprisal arises from a dynamic interplay between bottom-up cues and a listener's top-down anticipated syntactic structure.

LLMs Struggle to Reject False Presuppositions when Misinformation Stakes are High

This paper examines how LLMs handle false presuppositions and whether certain linguistic factors influence their responses to falsely presupposed content. Presuppositions subtly introduce information as given, making them highly effective at embedding disputable or false information. This raises concerns about whether LLMs, like humans, may fail to detect and correct misleading assumptions introduced as false presuppositions, even when the stakes of misinformation are high. Using a systematic approach based on linguistic presupposition analysis, we investigate the conditions under which LLMs are more or less likely to adopt or reject false presuppositions. Focusing on political contexts, we examine how factors like linguistic construction, political party, and scenario probability impact the recognition of false presuppositions. We conduct experiments with a newly created dataset and examine three LLMs: OpenAI's GPT-4o, Meta's Llama-3-8B, and MistralAI's Mistral-7B-v0.3. Our results show that the models struggle to recognize false presuppositions, with performance varying by condition. This study highlights that linguistic presupposition analysis is a valuable tool for uncovering the reinforcement of political misinformation in LLM responses.

An ACT-R model of resource-rational performance in a pragmatic reference game

In the Gricean tradition, pragmatic competence is part of the general human capacity for social reasoning. Indeed, human performance in reference games involving ad-hoc implicatures sometimes aligns with idealized models of rational interaction. But such experiments have also found that humans derive far fewer implicatures than ideal models, subject to individual differences unrelated to social reasoning. In this paper, we consider whether these patterns could arise from the resource-rational deployment of a core social competence, such that individuals choose from various strategies of interpretation, given those strategies' resource demands and success rates, subject to individually-varying predispositions and exploration tendencies. We construct a model of this resource-rational performance in the cognitive architecture ACT-R—to our knowledge the first mechanistic model of performance in these tasks—and we examine its predictions for multi-trial reference games across two model experiments. The model reproduces the key patterns in the human data, providing an initial proof of concept for the role of resource-rationality in these tasks and opening a new avenue for understanding individual differences in pragmatic reasoning.
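For concreteness, strategy selection of this kind is standardly handled in ACT-R by utility learning (standard ACT-R equations; the authors' model may add further mechanisms): each interpretation strategy i carries a utility updated from the rewards it earns,

    U_i \leftarrow U_i + \alpha\,(R_i - U_i),

and on each trial the strategy with the highest noisy utility is selected. Strategies whose resource demands outweigh their success rates therefore lose utility over trials and are gradually abandoned, which is one mechanistic route to the individual differences described above.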

Social Engagement Leads Infants to Represent People as Individuals

How do infants come to represent people's identities? In two experiments (N = 86), we investigated 7- to 10-month-old infants' abilities to individuate (i) one of their own caregivers and an unfamiliar adult and (ii) people who are either socially engaged or disengaged. Although classic research has found that infants at these ages do not individuate objects (e.g., a duck and a ball), the infants in our experiments individuated people, so long as those people were socially engaging. These findings suggest that infants represent people as individual entities before they do objects, which may support the formation of children's relationships or their evaluations of others.

Affective Representations between Association and Cognition

Research on animal cognition is currently struggling with two main problems. Firstly, mainstream views on reasoning in philosophy of mind and epistemology are unable to explain a growing body of empirical results in animal cognition research. The second problem concerns the interpretation of the same empirical results within comparative psychology, namely, whether they are best explained by appeal to (‘lower') associative or (‘higher') cognitive mechanisms, with neither side providing convincing explanations. This problem is amplified by a crisis of the very dichotomy between the ‘associative' and the ‘cognitive'. In response to these two problems, I develop an affect-based construal of cognition, which broadens the conception of cognition to include affective processes. I introduce the concept of affective representations, which exhibit both ‘associative' and ‘cognitive' characteristics. This concept allows for new kinds of explanations of comparative research results and indicates a future direction for exploring human and nonhuman cognition.

Re-examining the tradeoff between lexicon size and average morphosyntactic complexity in recursive numeral systems

Denic and Szymanik (2024, henceforth D&S) argued that recursive numeral systems optimize the tradeoff between lexicon size and average morphosyntactic complexity. In support of this claim, they showed that a broad range of attested numeral systems trade off these two quantities nearly as well as the best of a large set of artificial numeral systems. However, D&S' artificial systems were in some respects not entirely comparable to natural ones. Here, we address this issue by creating a grammar framework that can represent both natural and artificial numeral systems, and we derive both natural and artificial numeral systems from this single framework, which ensures the comparability of the two sorts of systems. We test D&S' original claim under these new conditions, and find support for it. We also explore the proposal that numeral systems might optimize the sum of lexicon size and average morphosyntactic complexity under a fixed weighting of the two terms, and the role of the prior distribution over numbers.

Helping and hindering guide infants' expectations about future behavior

What inferences do infants make from people's helping behavior? Two preregistered studies examined whether 14- and 15-month-old infants expect consistent helping behavior across different social contexts, and whether any such expectations are consistent with inferences about relationships or dispositions. Participants saw one individual help a target social partner move a boulder up a hill, and they saw another individual hinder the same target from reaching the top of the hill. We then tested infants' expectations about which individual was more likely to provide help in the future by measuring how long they looked when both the helper and hinderer provided help in a new context. In Experiment 1, the target social partner in the new helping context was the same character that appeared in the familiarization events. In Experiment 2, the target was a novel character. Infants looked longer when the hinderer, rather than the helper, provided help to the original target in the new context (Experiment 1). However, infants' looking times did not differ between events when the target was novel (Experiment 2). The looking patterns between the two experiments were significantly different. Thus, infants use the pro- or antisocial nature of an individual's past actions to generate expectations for future behavior, but do not generalize those expectations to new targets. Together, this suggests that infants primarily infer social relationships, rather than dispositions, from others' helping behavior.

Communication form modulates sentence interpretation: (A)typicality inferences from descriptions vs. direct speech

This study provides the first evidence that the inferences comprehenders draw about situation typicality during sentence interpretation are moderated by communication form and sentence polarity. We examined the interpretations of affirmative and negative sentences describing real-world situations of varying typicality. We manipulated whether a sentence was presented as a description with an omniscient narrator (e.g., The house has a bathroom.) or as direct speech with the speaker-addressee relationship identified (e.g., "The house has a bathroom," Emma told her partner.). By comparing sentence interpretation across situations varying in (i) typicality (e.g., house has bathroom/garage/ballroom), (ii) communicative form (description vs. direct speech), and (iii) sentence polarity (affirmative vs. negative), we find that presenting information as direct speech encourages pragmatic inferences beyond world knowledge. We also find that negation per se does not trigger the theoretically predicted pragmatic inferences, but it does so when combined with a specific communicative act such as direct speech.

Identifying "when" and "whether" causation: How people distinguish generation, hastening, prevention, and delay

Causal relationships in the real world can have diverse mechanisms with differing statistical signatures. We investigate whether people can distinguish between causes that merely change the timing of events ("when" causes) and those that bring about or prevent those events ("whether" causes). We designed experiments in which the rate of an event varies over time due to one such causal influence. Events were shown in real time in Experiment 1 and as a timeline visualization in Experiment 2. Our results suggest that people are capable of identifying "when" and "whether" causes, but with a distinctive pattern of confusability: people confuse Generation with Hastening, and Prevention with Delaying. We develop a Causal Abstraction from Summarizing Events (CASE) model, which explains people's judgments as mediated by their detection of rate-change events. We discuss how this line of research can be extended to study human cognition about dynamic causal influences and its relevance to real-life judgment and decision-making.

When the Learning Gets Tough: Children's Accent-Based Learning Choices are Influenced by Processing Difficulty

Children use a variety of cues to decide who they can trust to be a credible source of information. One such cue is accent. Previous research has attributed accent-based preferences to a bias for in-group members. In the present study, we examine another potential contributor to these preferences: processing difficulty. Four- to seven-year-old children completed a selective word-learning task, in which they were presented with pairs of speakers and needed to choose one to learn a new word from. The speakers differed in accent type – native or non-native – and non-native speakers differed in how difficult their speech was to process. Children were more likely to choose to learn from the speaker whose speech was easier to process, and the magnitude of this effect was linearly related to the processing difficulty disparity between the two speakers: the greater the disparity, the stronger the effect. These findings are the first to demonstrate the role of processing difficulty in children's accent-based selective learning.

Framing, not transparency, reduces cheating in algorithmic delegation

Recent evidence suggests that delegating tasks to machines can facilitate unethical behavior, but the psychological mechanisms driving this effect are not yet well understood. This study investigates whether two interventions can mitigate cheating in an algorithmic honesty game: transparency (information about which user input causes which algorithm behavior) and framing (natural language cues about the moral valence of behavior). In a 2 x 2 experimental design, we find that transparency does not reduce dishonest behavior, despite participants actively engaging with and understanding the provided information. Conversely, framing — replacing neutral labels like "maximize profit" with ethically charged terms like "maximize cheating" — substantially reduces dishonesty. These findings suggest that curbing misuse of AI requires confronting users with its moral implications, not just explaining the mechanics.

The Role of Structural Input Features in Statistical Learning

Two learning mechanisms have been suggested to underlie statistical learning: computation of transitional probabilities and chunking. It remains an open question, though, what determines which mechanism is used. In this study, we examined whether learning mechanisms are exploited differentially depending on the structure of the input to be learned. More specifically, we investigated whether the strength of the relationships between elements in the input structure and the presence of higher-order relationships influence the employment of the mechanisms. Participants were presented with three different input structures. We measured reaction times in a self-paced statistical learning task and created Bayesian models that formalised different learning mechanisms. The results show that the employment of the learning mechanisms indeed depends on the input structure. Further studies will need to examine a more specific mapping between the input structures and the learning mechanisms.

GPNet: Granularity-Aware Pyramid Network with Graph Aggregation for Sleep Staging and Face-Emotional Recognition Speed Prediction

Accurate classification of sleep stages is crucial for sleep quality assessment, health monitoring, and disease prevention. To effectively extract significant waveform features and capture the interactive coupling of features at different layers from single-channel electroencephalogram (EEG) signals, this study proposes the Granularity-Aware Pyramid Network with Graph Aggregation for Sleep Staging (GPNet) model. Specifically, the model first extracts fine-grained time-frequency features from multi-resolution input signals using the feature pyramid. Subsequently, an adaptive deep attention mechanism is incorporated into the layer with the highest depth-wise information to explore the correlations between local and global features. Finally, graph convolution is employed to learn the coupling interactions among high-level features across multiple layers. Comparative experiments conducted on the Sleep-EDF-X datasets demonstrate that GPNet exhibits highly competitive performance compared to other models. Additionally, GPNet predicts post-sleep recognition speed of negative emotions, revealing a negative correlation with REM(%) sleep and suggesting that sleep mitigates negative effects.

Developmental evidence for sensitivity to hierarchical structure in the noun phrase

In most of the world's languages, complex noun phrases are created by placing adjectives closest to the noun, and demonstratives farthest away, with numerals in the middle (e.g., "this one short paper"). Theoretical linguistics suggests that this tendency may result from a typical hierarchical structure, in which adjectives form an immediate constituent with nouns, numerals combine with that sub-constituent, and demonstratives combine with the resulting unit. Recent experimental studies also support this idea, showing learners prefer orders that follow this hierarchy (e.g., Martin et al. 2020). However, it is unknown whether this same preference is found in children, who have less experience than adults with these structures, and who might be less sensitive to the hierarchical structure of language. Here, we investigate ordering preferences in 5-6-year-old children. Results show that children, like adults, prefer hierarchy-following orders, strengthening the hypothesis that the prevalence of these orders reflects a universal cognitive bias.

Development of Language-Mediated Abstraction: Insights from the Word Ladders Task

What is the developmental trajectory of language-mediated abstraction skills, and to what extent are these skills influenced by semantics? We address these questions by asking children to generate semantic relations of categorical inclusion for words varying in concreteness. Results show that abstraction improves over time, independent of age, with both concrete and abstract concepts organized into hierarchical taxonomies. However, abstract concepts allow shorter ladders, making them harder to categorize, especially for younger children. These findings underscore the distinction between concreteness and specificity as separate dimensions of abstract reasoning, and they lend empirical support to theoretical models that treat these facets of abstraction as dissociable.

AI-enhanced semantic feature norms for 786 concepts

Semantic feature norms have been foundational in the study of human conceptual knowledge, yet traditional methods face trade-offs between concept/feature coverage and verifiability of quality due to the labor-intensive nature of norming studies. Here, we introduce a novel approach that augments a dataset of human-generated feature norms with responses from large language models (LLMs) while verifying the quality of norms against reliable human judgments. We find that our AI-enhanced feature norm dataset, NOVA: Norms Optimized Via AI, shows much higher feature density and overlap among concepts while outperforming a comparable human-only norm dataset and word-embedding models in predicting people's semantic similarity judgments. Taken together, we demonstrate that human conceptual knowledge is richer than captured in previous norm datasets and show that, with proper validation, LLMs can serve as powerful tools for cognitive science research.

Music-induced Positive Mood Stimulates Metaphor Production

Metaphors are a creative use of language that conveys complex ideas through abstract reasoning and cognitive flexibility. While prior research has demonstrated that music influences creativity, its specific impact on metaphor production remains unexplored. In this study, 90 adults were assigned to one of three groups—silence, happy music, and sad music—and completed a metaphor production task, generating expressions for emotions (e.g., being happy) and actions (e.g., telling a lie). Participants also completed convergent and divergent thinking assessments to account for individual differences in creativity. Results showed that participants who listened to happy music while doing the task were more likely to produce figurative expressions, with convergent creativity positively predicting their production, while divergent creativity had no effect. Moreover, metaphors produced with background music were generally rated as more novel than those produced in silence, with sad music leading to metaphors with a more negative emotional tone. These findings suggest that extrinsic factors, particularly happy music, can enhance our ability to produce metaphors by boosting the cognitive flexibility required for creative thinking.

PACE: Procedural Abstractions for Communicating Efficiently

A central but unresolved aspect of problem-solving in AI is the capability to introduce and use abstractions, something humans excel at. Work in cognitive science has demonstrated that humans tend towards higher levels of abstraction when engaged in collaborative task-oriented communication, enabling gradually shorter and more information-efficient utterances. Several computational methods have attempted to replicate this phenomenon, but all make unrealistic simplifying assumptions about how abstractions are introduced and learned. Our method, Procedural Abstractions for Communicating Efficiently (PACE), overcomes these limitations through a neuro-symbolic approach. On the symbolic side, we draw on work from library learning for proposing abstractions. We combine this with neural methods for communication and reinforcement learning, via a novel use of bandit algorithms for controlling the exploration and exploitation trade-off in introducing new abstractions. PACE exhibits similar tendencies to humans on a collaborative construction task from the cognitive science literature, where one agent (the architect) instructs the other (the builder) to reconstruct a scene of block-buildings. PACE results in the emergence of an efficient language as a by-product of collaborative communication. Beyond providing mechanistic insights into human communication, our work serves as a first step to providing conversational agents with the ability for human-like communicative abstractions.
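As an illustration of the bandit component only (a hypothetical sketch, not the authors' implementation; the candidate abstraction names and the reward signal below are placeholders), a UCB-style bandit could govern which candidate abstraction the architect introduces next:

    import math
    import random

    class AbstractionBandit:
        """UCB-style choice among candidate abstractions, trading exploration
        (trying new abstractions) against exploitation (reusing ones that have
        shortened instructions successfully)."""
        def __init__(self, candidates, c=1.0):
            self.candidates = list(candidates)
            self.c = c
            self.counts = {a: 0 for a in self.candidates}
            self.values = {a: 0.0 for a in self.candidates}
            self.t = 0

        def select(self):
            self.t += 1
            for a in self.candidates:          # try each abstraction at least once
                if self.counts[a] == 0:
                    return a
            def ucb(a):
                bonus = self.c * math.sqrt(math.log(self.t) / self.counts[a])
                return self.values[a] + bonus
            return max(self.candidates, key=ucb)

        def update(self, a, reward):
            # reward: e.g., builder success weighted by utterance-length savings
            self.counts[a] += 1
            self.values[a] += (reward - self.values[a]) / self.counts[a]

    bandit = AbstractionBandit(["place_row", "build_tower", "mirror_block"])
    for _ in range(20):
        chosen = bandit.select()
        bandit.update(chosen, random.random())   # stand-in for task feedback

The design choice the bandit captures is the one named in the abstract: a new abstraction is only worth keeping if the communicative savings it yields, once learned by the builder, outweigh the cost of trying it out.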

Not seeing it: What young children don't understand about attention

Do children understand that people vary in how attentive they are and recognize that people prefer attentive social partners? Across six experiments, we showed participants one agent engaging attentively with a child puppet and another who was distracted throughout the interaction. Across four experiments, four- and five-year-olds (total N = 132; overall mean age: 4.85 years; range: 4.0-5.9 years) failed to distinguish the agents. Six- and seven-year-olds (total N = 131; overall mean age: 7.01 years; range: 6.0-7.9 years) succeeded given repeated interactions but not robustly: fewer than half the children consistently chose the attentive agent. By contrast, adults succeeded given a single demonstration. Children's difficulty was not due to task demands; four- and five-year-olds readily distinguished agents who did and did not satisfy the puppet's desires. Thus, although children understand attention as a discrete mental state very early in development, and react negatively when adults are not responsive, children may be relatively insensitive to cues to attention as a continuous mental state.

Backwards counterfactuals and the closest possible world

One source of complexity in counterfactual reasoning is the order in which events are presented within the conditional. Counterfactuals with a backwards order of events (aka ‘backtracking' counterfactuals) involve reasoning backward: from the consequent to the antecedent. We build on prior experimental work (e.g., Rips, 2010) and consider the possibilities adults reason over when they backtrack. We find that adults' reasoning strategies tend to be inconsistent when responding to backtracking questions. In scenarios involving a single causal variable, participants do not generally allow for extraneous changes from the actual world. Furthermore, when reasoning forward along a causal chain, participants do not allow for extraneous changes. However, in backtracking scenarios involving multiple causal variables, participants are at chance in choosing worlds with extraneous changes. We provide novel evidence for the changes allowed from the actual world when backtracking, with mixed support for theoretical claims such as Minimal Networks Theory.

Islands Result from Clash of Functions: Single-conjunct Wh-Qs

When we produce utterances, we aim to express our message in a coherent way. While listeners generally prefer semantically related constituents to be close together in the string ("local"), certain constructions allow long-distance dependencies (LDDs). There is growing evidence that constraints on LDDs involve information structure, but conjunction is recognized to require its own unique constraints. Here we offer four experimental studies aimed at illuminating why conjunctions resist LDDs, investigating one case in detail: English wh-questions that query only the final conjunct in verb phrase conjunction. The first two studies demonstrate that gradient acceptability is predicted by the extent to which a conjunction expresses a single complex event rather than two separate events. Experiment 3 demonstrates a role for the information structure constraint that holds of LDDs generally: more prominent (less backgrounded) conjuncts combine with wh-questions more easily. A final experiment manipulates construal as a single event and the prominence of a conjunct. Results demonstrate an additive causal role for both factors.

How different cognitive strategies can influence implicit recalibration in visuomotor adaptation

Visuomotor adaptation involves explicit strategies and implicit recalibration, but their interaction remains unclear. Strategies can take two forms: algorithmic strategies, involving mental simulation of motor solutions, and retrieval strategies, which rely on previously successful solutions. These strategies arise from distinct neural circuits, which are likely to influence cerebellar-dependent implicit recalibration in different ways. To explore this, we created conditions favoring algorithmic (visuomotor mental rotation) or retrieval strategies by varying training target set size, as retrieval is limited by working memory. We controlled for generalization and intertrial effects, isolating implicit recalibration. Preparation times confirmed distinct strategy adoption. While the magnitudes of implicit recalibration were similar, generalization breadth was narrower with retrieval strategies, suggesting stricter stimulus-response associations. Algorithmic strategies produced broader generalization. These findings confirm that algorithmic and retrieval strategies impact implicit recalibration differently, and demand that future efforts to characterize the pattern of implicit generalization must account for the unique contribution of different forms of explicit strategies.

Acoustic Cues Facilitate the Acquisition of Non-adjacent Dependencies in Sequences of Dynamic Object Transformation

Human learners' ability to detect rule-governed elements plays an important role in cognitive functions. While extensive research on the acquisition of regularities among adjacent items has provided robust and reliable evidence, learning structured patterns among non-adjacent components, known as non-adjacent dependencies (NADs), remains far more tentative and only occurs under specific conditions. A past study by Lu and Mintz (2023) found that human learners need more training exposure to detect NADs in visual sequences of object transformations compared to sequences of human actions, but it is unclear why. Building on this work, we present a series of three experiments to investigate whether learning NADs from dynamic visual sequences can be enhanced by maintaining the identifiability of the object throughout its transformations. In addition, we explore the effect of providing auditory information (speech or pure tones) along with the visual object transformation sequences. Our findings demonstrate that (a) NAD learning succeeded when speech cues co-occurred and matched with NAD-type frames but failed in the absence of auditory cues (Experiment 1); (b) pure tones presented contingently with the visual sequences also facilitated NAD learning (Experiment 2); and (c) regardless of whether speech or tones were used as additional cues, adult learners were unable to detect NADs when the relationship between specific auditory stimuli and specific visual object transformation sequences was disrupted (Experiment 3).

DIMS Dashboard for Exploring Dynamic Interactions and Multimodal Signals

Social interaction is a complex, multimodal phenomenon with varying timescales and meaning-making structures. Research in this area has progressed along two largely separate paths: qualitative researchers focus on fine-grained analysis, while quantitative researchers computationally identify broader patterns. To bridge this gap and promote cross-disciplinarity, we developed the Dynamic Interaction and Multimodal Signals (DIMS) Dashboard, an open tool for visualizing multimodal data, enabling a qualitative-quantitative synergy in social interaction research. We overview its development and conduct a proof-of-concept qualitative-quantitative ("quali-quanti") analysis using neural and behavioral time-series data combined with video recordings. Our exploratory case study reveals that 80% of segments with sharply increased neural activations in the right temporoparietal junction align with highly engaged interaction, while 20% correspond to topic transitions. Through triangulation with qualitative insights, we observed that social brain synchrony relates to head motion synchrony at meaningful moments. Finally, we discuss how visualization tools like DIMS can enhance multimodal, cross-disciplinary research on social interaction and inform future tool development.

Learning novel intransitive verbs from input cues: Experiments with Mandarin-learning toddlers

In two novel verb experiments using the visual fixation paradigm, we investigated how Mandarin-learning toddlers employ distributional cues and semantic cues to categorize novel unaccusative and unergative verbs. In Experiment 1, 31-month-old (but not 19-month-old) participants were found to use the word-order cue to categorize two novel verbs VUA and VUE: after hearing "VUA-le NP" and "NP VUE-le" in the training phase, they categorized VUA as unaccusative and VUE as unergative, showing discrimination in looking times between grammatical trials "NP VUA-le" and ungrammatical trials "VUE-le NP" in the test phase. In Experiment 2, 31-month-olds used the semantic cue of telicity provided via novel events to make categorizations: watching a telic event paired with "VUA-le" and an atelic event paired with "VUE-le" led to differentiation between grammatical trials "VUA-le NP" and ungrammatical trials "VUE-le NP". The findings provide evidence for toddlers' ability to extract information from the input and make generalizations in verb learning.

Cause and fault in development

Responsibility requires causation. But there are different kinds of causes. Some are connected to their effects; others are disconnected. We ask how children's developing ability to distinguish causes relates to their understanding of moral responsibility. We found in Experiment 1 that when Andy hits Suzy with his bike, she falls into a fence and it breaks, 3-year-old children treated "caused", "break" and "fault" as referring to the direct cause, Suzy. By 4, they differentiated causes: Andy "caused" the fence to break, it's his "fault", but Suzy "broke" it. We found in Experiment 2 that when the chain involved disconnection, 3-year-olds focused only on the direct cause. Around 5 they distinguished causes, saying that the disconnecting cause "caused" an object to break, it's their "fault", but the direct cause "broke" it. Our findings relate to the outcome-to-intention shift in moral responsibility and suggest a more fundamental shift in children's understanding of causation.

Event construal through social verbs in English and German: The LISADA corpus

How do people understand linguistic descriptions of inherently and potentially social events, such as to meet and to dance, and how do these interpretations align or differ across languages? To explore these questions, we developed an empirical database (LISADA), containing ratings for 240 verbs in English and German along two social dimensions: mutuality and jointness. While both languages show an overall positive correlation between these dimensions, hierarchical cluster analyses reveal meaningful within- and cross-linguistic differences. Through an exemplary test case, we demonstrate how these differences can provide insights into the linguistic and conceptual representations of social events, focusing on the role of morphosyntactic marking in event construal.

Seeing Things Differently: The Role of Differing Perspectives in Advice-Taking

Advice-taking plays a critical role in collaboration. Yet people tend to under-utilize advice, often to their own detriment. We investigate whether people's utilization of advice improves when they know the advisor has access to different information than they do. We examine how individuals integrate advice in an estimation task where the advisee and the advisor have access to different perspectives on the same problem. We assess how individuals adjust their estimates when presented with estimates from a human advisor and an AI advisor, and when they are given information about the advisor's perspective. Our findings are consistent with egocentric discounting, where individuals exhibit a general bias toward their own information. However, this discounting is lower for AI advisors than for human advisors in our experiment. Our results also show that the advisor's estimate is taken more into account when the advisor has a more favorable viewpoint, for both human and AI advisors. This suggests a potential for optimizing advice-taking behavior by enhancing people's understanding of the advisor's viewpoint. This study furthers our understanding of advice-taking dynamics with human and AI advisors and the role of perspective-taking in decision-making processes.

Enhancing Objectivity in LLM-as-a-Judge through Perturbation Injection

LLM-as-a-judge is considered a potential substitute for human evaluation due to its efficiency and cost-effectiveness. However, recent studies indicate that LLM-as-a-judge exhibits systematic biases when comparing candidate answers, including contextual, verbosity, and positional biases. We find that these biases mirror human cognitive biases such as the anchoring effect and the availability heuristic, where intuitive decisions prioritize superficial features over deeper analysis. Inspired by Dual Process Theory, we propose that LLM evaluations often resemble System 1 thinking, leading to biased judgments. To address this, we introduce PeBC, a Perturbation-Based Calibration framework that shifts LLM evaluations from System 1 to System 2 reasoning through perturbation injection, bias analysis, and rule calibration. Our experiments on the meta-evaluation benchmarks LLMBar-Natural and LLMBar-Adversarial demonstrate that PeBC successfully mitigates evaluation biases, outperforming existing state-of-the-art (SOTA) methods across various test scenarios and achieving better alignment with human judgments.
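
To make the perturbation-injection idea concrete, the sketch below probes one of the biases named above (positional bias) by swapping the order of two candidate answers and checking whether the verdict flips. It is a minimal illustration, not the authors' PeBC framework; `judge` is a hypothetical callable standing in for any LLM-as-a-judge backend.

```python
# Minimal positional-bias probe for an LLM judge (illustrative, not PeBC itself).
# `judge(question, first_answer, second_answer)` is a hypothetical callable that
# returns "A" if it prefers the first displayed answer and "B" otherwise.

def verdict_flips(judge, question, answer_a, answer_b):
    """Return True if the judge's preference changes when answer order is swapped."""
    original = judge(question, answer_a, answer_b)   # answer_a shown first
    swapped = judge(question, answer_b, answer_a)    # answer_b shown first
    # Map the swapped verdict back onto the original labels.
    swapped_in_original_labels = "A" if swapped == "B" else "B"
    return original != swapped_in_original_labels

def positional_bias_rate(judge, items):
    """Fraction of (question, answer_a, answer_b) triples whose verdict flips."""
    flips = sum(verdict_flips(judge, *item) for item in items)
    return flips / len(items)
```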

FD-Bench: Fine-Grained Evaluation of the Decision-Making Capability of LLM Agents in Dynamic Scenarios

Large language models (LLMs) exhibit growing potential as autonomous agents, yet their decision-making capabilities in real-world settings remain underexplored, particularly in dynamic scenarios where conditions are constantly changing. Most existing benchmarks focus mainly on static environments, which differ significantly from real-world scenarios. Additionally, existing evaluation frameworks lack fine-grained assessments, providing limited insight during evaluation. To address these issues, we propose FD-Bench, a benchmark for evaluating decision-making in dynamic scenarios. FD-Bench employs a fire evacuation scenario as a representative dynamic setting and decomposes decision-making into perception, prediction, and action stages, enabling granular evaluation of 8 LLMs and different reasoning frameworks. Our results show that LLMs experience a performance drop of over 50% in dynamic versus static scenarios. Inspired by the "chunking" principle in Cognitive Load Theory (CLT), our hierarchical prompting strategy demonstrates improved performance in dynamic decision-making tasks. This work provides insights into LLMs' limitations and pathways toward robust real-world deployment.

Taking others for granted: balancing personal and presentational goals in action selection

This study investigates how individuals balance personal and presentational goals - how they want to be perceived by others - in social interactions where these goals conflict. We develop a computational model that construes presentational goals as minimising the divergence between the perceived and desired belief state of their partner. Based on the divergence between how much a person's partner trusts them versus how much they want to be trusted, we predict complex decision-making patterns that cannot arise from solely focusing on maximising a partner's utility. In accordance with our model, participants tended to forego signalling good intentions and prioritised their own goals when they perceived their partner to trust them. Participants were also less concerned about how they were perceived and acted more often in their own interest when their partner was unlikely to change their mind. We show that people are sensitive to the specific belief state of others and can dynamically adjust their decision strategy to trade off presentational and material gains.

Investigating Humor in EEG: Pun-Based Jokes Elicit Anterior N400 and Posterior P600 Effects

This study explores how readers process different types of pun-based jokes by analyzing their responses to various forms of linguistic ambiguity. Specifically, we examined puns rooted in homonymy, polysemy, and the contrast between idiomatic and literal interpretations of idiomatic expressions. Using EEG, we measured ERPs elicited by the ambiguous elements of these jokes and their punchline. These measurements enabled us to assess how distinct ambiguity types influence the comprehension of punchlines. Furthermore, we compared reader responses to puns against nonsensical sentences and straightforward control sentences. We hypothesize that the differences among joke types will manifest in the relative N400 amplitudes associated with punchline words, providing insights into the neural mechanisms underlying humor comprehension while having more control over joke setups compared to previous EEG studies in this field of research.

Memory Overlap Enhances Shared Feature Recognition but Hinders Specific Memory in Adolescents and Adults

Over time we accumulate memories for many related experiences. However, it remains poorly understood how this relatedness, or overlap, among learned information shapes how we remember shared and unique features. The current study investigated this question and further asked whether effects of overlap on memory differ in adolescence compared to adulthood, given evidence that memory specificity continues to be refined beyond childhood. We had adolescents (12-13 years old) and adults learn pairs of objects that overlapped with one another to different degrees and then tested their memory for both overlapping and pair-unique features. Across both age groups, we found that greater overlap boosted memory for the overlapping feature but also led to worse memory for unique features. Further, adolescents were more detrimentally affected by high overlap than adults when recalling specific pairs. Our results suggest there may be a trade-off between memory for shared and unique features of overlapping materials and that adolescents experience a greater cost to this trade-off. More generally, we find that the connections among learned information play an important role in how it is remembered.

Characterizing Human Planning on Large, Real-World Conceptual Networks

Planning in the real world involves navigating vast spaces of possibilities, from finding a route through a city to searching for information online. Yet our understanding of human planning has largely come from studies involving small, simplified environments. To bridge this gap, we explored human planning in the context of the Wiki Game, where players start on a random Wikipedia article and are tasked with clicking on hyperlinks to reach a target article with minimal steps. We hypothesized that human planners reduce the computational cost of search by employing heuristic-guided and hierarchical search strategies. Analyzing a dataset of over 75,000 games, we discovered several behavioral signatures of heuristic-guided, hierarchical search. We formalized these insights using computational models, including tree search and hierarchical tree search algorithms. We found that our hierarchical tree search model mimicked these behavioral aspects of human navigation. Moreover, the patterns in human thinking times appeared to resemble patterns in the number of search iterations in the hierarchical tree search model. Collectively, these results suggest that humans use a combination of heuristic-guided search and hierarchical decomposition to efficiently plan in large, complex conceptual spaces.
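
As a concrete illustration of heuristic-guided navigation of the kind described above, the sketch below greedily follows, at each article, the outgoing link that a similarity heuristic rates closest to the target. It is a toy example under assumed inputs (the `graph` adjacency dict and `similarity` function are placeholders), not the authors' tree-search or hierarchical models.

```python
def greedy_navigate(graph, start, target, similarity, max_steps=50):
    """Follow, from each article, the outgoing link rated most similar to the target."""
    path = [start]
    current = start
    for _ in range(max_steps):
        if current == target:
            return path
        links = graph.get(current, [])
        if not links:
            break
        # Heuristic-guided step: move to the neighbour judged closest to the target.
        current = max(links, key=lambda page: similarity(page, target))
        path.append(current)
    return path  # may terminate without reaching the target

# Tiny example with a placeholder heuristic (characters shared with the target name).
graph = {"Otter": ["Mammal", "Fur"], "Mammal": ["Animal", "Fur"],
         "Animal": ["Biology"], "Fur": []}
sim = lambda a, b: len(set(a.lower()) & set(b.lower()))
print(greedy_navigate(graph, "Otter", "Biology", sim))  # ['Otter', 'Mammal', 'Animal', 'Biology']
```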

Distributional learning over meaningful words facilitates semantic inferences about previously unknown words

Prior research suggests that a small vocabulary of meaningful words (a semantic seed) aids distributional learning. In two experiments, we show that adults who are exposed to a complex artificial language are better at inferring the meaning of previously unknown pseudowords when they were taught a semantic seed prior to distributional exposure. We further show that the benefit of a semantic seed is driven primarily by using seed words to discover the relationship between distributional and semantic classes. These results have implications for how syntactic bootstrapping begins.

Heterogeneity in Loss Aversion Estimates across Modeling Approaches

This study investigates how different modeling approaches affect the measurement of loss aversion, a fundamental concept in psychology and economics. Analyzing 10 datasets comprising over 140,000 trials from 686 participants, we compared four prominent methods: Maximum Likelihood Estimation with Prospect Theory, Bayesian Prospect Theory, Generalized Linear Models, and Drift Diffusion Models. While group-level median loss aversion estimates showed consistency across methods, significant differences emerged at the individual level. The analysis revealed substantial individual-level methodological variability in both the magnitude of loss aversion estimates and participant classification. These findings demonstrate the impact of methodological choices on loss aversion measurement and underscore the need for careful consideration when comparing results across studies using different estimation techniques.
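
For readers unfamiliar with how a loss-aversion parameter is typically estimated, the sketch below fits a single parameter lambda by maximum likelihood under a deliberately simplified prospect-theory value function (linear utility, 50/50 gambles) and a logistic choice rule. The trial arrays are made-up illustrations; the four approaches compared in the paper are substantially richer.

```python
# Illustrative maximum-likelihood estimate of loss aversion (lambda) for one participant.
import numpy as np
from scipy.optimize import minimize_scalar

def subjective_value(gain, loss, lam):
    # 50/50 gamble with linear utility: v = 0.5*gain - 0.5*lambda*loss
    return 0.5 * gain - 0.5 * lam * loss

def neg_log_likelihood(lam, gains, losses, accepted, temperature=1.0):
    v = subjective_value(gains, losses, lam)
    p_accept = 1.0 / (1.0 + np.exp(-temperature * v))
    p = np.where(accepted == 1, p_accept, 1.0 - p_accept)
    return -np.sum(np.log(np.clip(p, 1e-12, 1.0)))

# Hypothetical per-trial data: gamble gains/losses and whether the gamble was accepted.
gains = np.array([10.0, 20.0, 12.0, 8.0])
losses = np.array([10.0, 10.0, 14.0, 6.0])
accepted = np.array([1, 1, 0, 1])

fit = minimize_scalar(neg_log_likelihood, bounds=(0.1, 5.0), method="bounded",
                      args=(gains, losses, accepted))
print("estimated lambda:", round(fit.x, 2))
```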

Pencils to Pixels: A Systematic Study of Creative Drawings across Children, Adults and AI

Can we derive computational metrics to quantify visual creativity in drawings across intelligent agents, while accounting for inherent differences in technical skill and style? To answer this, we curate a novel dataset consisting of 1338 drawings by children, adults and AI on a popular creative drawing task. We characterize two aspects of the drawings—(1) style and (2) content. For style, we define measures based on ink density, ink distribution and number of elements. For content, we use expert-annotated categories to study conceptual diversity, and image and text embeddings to compute distance measures. We find significant differences in style and content in the groups—children's drawings had more components, AI drawings had greater ink density, and adult drawings were conceptually diverse. We also highlight a misalignment between creativity judgments obtained through expert and automated ratings. Our work provides a novel framework for studying human and artificial creativity beyond the textual modality.

Fine-tuning conceptual structure of referents through coordinated interaction

During communicative interactions, language production and comprehension are bounded by the accumulation of a shared communicative context. Besides lexical pacts and simplification of referential expressions, the creation of a shared context allows for the intelligible use of linguistic signals with novel, interaction-specific meanings. Here, we explore whether context-specific language use leads to mutual adjustment of interlocutors' conceptual representations. We tasked dyads with solving a referential communication game, quantifying dialogue-related adjustments in interlocutors' conceptual structures and coordination dynamics during the interaction. After engaging in the dialogue, interlocutors judged the same set of referents more similarly to each other than did participants who performed the same task but not with each other (pseudo-pairs). Exploratory analyses of the structural complexity of unfolding semantic spaces indicate a stronger alignment between interacting dyads than pseudo-pairs. These findings suggest that human communication is supported by structural coordination of conceptual representations of the communicative referents, over and above signal-level alignment.

Computational insights from a novel habit induction protocol

Habits -- automatic behavioral patterns formed through repetition -- are essential for daily functioning, but can also lead to inflexible behavior. While crucial for understanding both adaptive and maladaptive decision-making, studying habits' computational and neural mechanisms has been challenging due to limited laboratory experiments demonstrating overtraining-induced inflexibility. We developed a novel task with features designed to encourage participants to engage goal-directed (GD) control between trials (interleaving extensively- and minimally-practiced contexts), then naturally release control within trials (hierarchical multi-step trial structure and opportunities to self-correct). Results showed that overtrained participants displayed stronger biases toward behaviors learned in extensively-practiced contexts, evidenced by higher Habit Index values at early response times. This effect decreased at later response times, suggesting participants could override habitual impulses with GD control. Our computational model, characterizing behavior as a mixture of reinforcement-learned policies, reproduced observed behavioral patterns, suggesting that habits can be viewed as goal-directed deployment of overtrained policies.

The Applications and Limitations of the Burstiness Metric in Investigating the Temporal Distribution of Words in Child-Centered Audio

As the use of naturalistic speech data of children's language experiences increases, the temporal dynamics of the speech environment become a more obvious aspect of speech to investigate. To connect the temporal dynamics in child-centered speech to existing experimental work showing that the temporal presentation of items has measurable effects on learning, it is important to develop measures that quantify temporal patterns in speech. The present work explores one such measure, the burstiness metric, and investigates word burstiness and its relationship with frequency, its behavior across different timescales, and whether it can be used to quantify constructs of massed and spaced orders. Our findings suggest that, while related, word burstiness is not an index of frequency, that burstiness varies across both words of different lexical classes and timescales, and that it does not appropriately capture massed and spaced temporal patterns. We discuss implications for how this measure may be used for child-centered audio.
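
For concreteness, one widely used burstiness coefficient (Goh & Barabási, 2008) compares the mean and standard deviation of a word's inter-occurrence intervals; the sketch below computes it. This is an illustrative formulation and may differ from the exact variant used in the paper.

```python
# Burstiness coefficient B = (sigma - mu) / (sigma + mu) over inter-occurrence intervals.
import numpy as np

def burstiness(occurrence_times):
    """B near -1: regular; near 0: Poisson-like; near +1: highly bursty."""
    times = np.sort(np.asarray(occurrence_times, dtype=float))
    intervals = np.diff(times)
    if len(intervals) < 2:
        return np.nan  # undefined with fewer than two intervals
    mu, sigma = intervals.mean(), intervals.std()
    return (sigma - mu) / (sigma + mu) if (sigma + mu) > 0 else np.nan

print(burstiness([1, 2, 3, 4, 5]))           # evenly spaced: -1.0
print(burstiness([1, 1.1, 1.2, 50, 50.1]))   # clumped occurrences: positive
```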

Selective social influence on aesthetic evaluations via natural language testimony

Why and how do we incorporate others' judgments when making an aesthetic evaluation? We investigated this question by studying social transmission of aesthetic evaluations via natural language, which conveys richness that more common (numerical) measures may fail to convey, such as the reasoning behind a judgment. Participants in a large-scale study aesthetically evaluated photographs, either independently or after observing testimony from another person. We found that participants formed more similar evaluations to the testimony they observed (than the asocial control). Furthermore, participants who received the same evaluative testimony wrote evaluations that were more similar to each other in content but not sentiment (relative to a matched asocial cohort). This suggests that social influence on aesthetic evaluations may have a greater informational aspect than previously understood.

Sub-phonemic featural dimensions mediate consonantal co-occurrence biases in a cross-linguistically consistent manner

Studies of segmental co-occurrence constraints have consistently produced evidence consistent with a cross-linguistic anti-similarity bias for consonants (e.g., Frisch et al. (2004), Pozdniakov & Segerer (2007), Walter (2010), Doucette et al. (2024)). The current study tests whether, in spite of this universal similarity avoidance bias, there are cross-linguistically consistent featural harmony biases in the world's lexicons. In particular, we test for the presence of a nasal consonant harmony bias, given that categorical nasal consonant harmony is attested in multiple language families and that nasal harmony is phonetically motivated. A Bayesian negative binomial model of 91 typologically diverse languages' type frequencies for two-consonant words shows evidence of a weak but reliable cross-linguistic bias in favor of nasal harmony, as well as a comparable bias in favor of voicing harmony. The findings also show patterns consistent with a similarity-avoidance bias, most notably a strong cross-linguistic bias against coronal harmony. Taken together, these findings support the notion that similarity-based co-occurrence constraints may be feature-dependent in cross-linguistically consistent ways, and more generally that featural dimensions are relevant for understanding the role of segmental redundancy in lexicons.

Testing the Emergentist Theory of Number Perception Development

The origin of abstract concepts remains a central question in cognitive science. Empiricists argue that abstract concepts can be learned through a small set of domain-general cognitive mechanisms, whereas nativists argue that most abstract concepts are not learned at all. To explore these opposing views, we examine the abstract concept of number. Drawing on a recent "emergentist" proposal derived from machine learning models, we tested whether stimuli that align with experiences critical for an empiricist view of number perception result in more accurate and less variable numeric estimations. Five- to nine-year-olds were tested on stimuli that conformed to the natural statistics of objects (i.e., number was correlated with features such as cumulative area and spatial position, and the distribution of stimuli was power-law based) or those that did not comply with these statistics. We find little-to-no evidence for predictions of the emergentist theory when presenting children with natural stimuli.

What Do Head Scans Reveal About Depression? Insights from 360° Psychomotor Assessment

Psychomotor changes, while crucial indicators of depression, remain underrepresented in clinical observations. We examined the relationship between depression and psychomotor behavior by analyzing head-tracking data related to yaw movements made during exploration of 360° emotional videos, alongside valence and arousal ratings on a 9-point Likert scale. Symptoms of depression were recorded using the Patient Health Questionnaire (PHQ-9). While subjective ratings for valence and arousal showed no differences across depression groups, the head-tracking data revealed novel results. Individuals with moderate to severe depression exhibited significantly lower scanning speed and standard deviation in yaw movement compared to those with minimal to mild depression. Although preliminary, these results underscore the importance of psychomotor measures in diagnosis, risk assessment, and monitoring in psychiatric care, alongside subjective evaluations.

Stumped! Learning to think outside the box in 3-7 year old children

Many theories conceptualize thinking as search through a space of hypotheses. But what if your initial space is wrong? What cognitive skills support abandoning an ineffective hypothesis space and re-construing a problem with the correct hypothesis space? We examined the development of such abilities in n=172 children ages 3-7 years using Stumper riddles, which challenge respondents to explain seemingly impossible situations. We found evidence that children both learned the relevant hypothesis space for different riddle categories and generalized the cognitive strategy across riddle categories. Although older children showed greater overall accuracy, these effects of learning and meta-learning were found even for the youngest 3-5-year-olds. These results suggest a promising method for probing both flexible hypothesis search and meta-cognitive skills. We discuss ongoing plans to characterize individual differences as a way to uncover the underlying mechanisms of creative problem-solving.

Children with ASD show diminished input statistics for word learning during caregiver-child interaction

This study explored cross-situational word learning in children with and without autism spectrum disorder (ASD). Children learn words by mapping object names in caregiver utterances to objects in their visual field. We examined the confluence of caregiver object naming and child visual attention in children with and without ASD at play. Head-mounted eye-tracking revealed that children with ASD spent less time attending to named objects than typically developing (TD) children. In both groups, learning input improved as children accrued increased looking time to named objects across multiple naming events. However, for objects with high quantity of naming events, TD children had higher quality learning input than children with ASD. These findings suggest that the input statistics of social interaction are less conducive to word learning in children with ASD. This work has important implications for clinical interventions to scaffold word learning.

What makes people think a puzzle is fun to solve?

Many tasks feel like chores, while others are fun. Why? Here we leverage a popular puzzle game, Sokoban, to explore potential sources of variation in how enjoyable different levels of this game are to solve. In Sokoban, players navigate a grid world, pushing boxes onto goal locations while avoiding getting stuck. We first analyzed natural game play statistics (n = 442 puzzles) and found that some variation in enjoyment ratings could be jointly predicted by surface-level features (e.g., puzzle area) and solution complexity. Next, we measured how much participants reported enjoying a puzzle immediately after attempting it (N= 250 participants). We found that on successful attempts, participants enjoyed it more when they took fewer moves, whereas when unsuccessful, having made more moves was associated with greater enjoyment. Together, these studies advance understanding of how both features of the task environment and the dynamics of exploration make some activities more fun than others.

Syntactic Choice Is Shaped by Fine-Grained, Item-Specific Knowledge

There is a longstanding debate over how much idiosyncratic, item-specific knowledge is contained in our mental grammars, in addition to productive knowledge of item-general rules and constraints. A key source of evidence is that ordering preferences for syntactic alternations like the dative ("throw him an apple" vs. "throw an apple to him") vary depending on which words they contain. But the quantitative extent of this variability is poorly understood, especially in relation to superficially similar, non-dative constructions which do not alternate ("throw the man to the floor" vs. "*throw the floor the man"). To address this, we built a large corpus of naturally-occurring sentences including either dative or superficially similar non-dative structures, and analyzed the unique contributions of productive and verb-specific knowledge in predicting argument ordering preferences.

A Model of Approximate and Incremental Noisy-Channel Language Processing

How are comprehenders able to extract meaning from utterances in the presence of production errors? The noisy-channel theory provides an account grounded in Bayesian inference: comprehenders may interpret utterances non-literally in favor of an alternative with higher prior probability that is close under some error model. However, we lack implemented computational models of prior expectation and error likelihood capable of predicting human processing of arbitrary utterances. Here, we model sentence processing for ``noisy'' utterances as incremental and approximate probabilistic inference over intended sentences and production errors. We demonstrate that the model reproduces patterns in human behavior for anomalous sentences in three separate case studies from the noisy-channel literature. Our results offer a step towards an algorithmic account of inference during real-world language comprehension. Our model code, implemented in Gen, is available at https://github.com/thomashikaru/noisy_channel_model.
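
For readers new to this framework, the posterior that noisy-channel models of this kind compute can be written, in its standard general form (not the authors' specific Gen implementation), as

```latex
P(s_i \mid s_p) \;\propto\; P(s_i)\, P(s_p \mid s_i)
```

where $s_i$ is the intended sentence, $s_p$ is the perceived utterance, $P(s_i)$ is the comprehender's prior expectation, and $P(s_p \mid s_i)$ is the error model giving the probability that the intended sentence was corrupted into what was perceived.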

Trade-offs in posterior hippocampus versus medial prefrontal cortex mechanisms underlie memory precision in childhood

Hippocampal subregions and medial prefrontal cortex may differentially shape memory precision in the mature brain: While anterior and posterior hippocampus may store the general themes and specific details of an episode, respectively, medial prefrontal cortex may instead connect related memories. However, given continued change to the functionality of these regions beyond childhood, it is unclear how children's memory precision is influenced by these same mechanisms. We characterized how hippocampal subregions and medial prefrontal cortex separately and in tandem encourage memory precision in childhood versus adulthood. Children (7-9 years old) and adults studied scene photographs and then performed a recognition test that included both the studied scenes and highly similar lures. Behaviourally, adults had more precise memories than children in that they were better able to discriminate studied scenes from lures. At the neural level, anterior hippocampus and medial prefrontal cortex were differently engaged during this memory formation: Anterior hippocampus engagement was related to subsequent memory across age groups, while children showed greater medial prefrontal cortex engagement than adults overall when studying scenes. Considering individual differences in engagement revealed further developmental differences. Children showed evidence for a trade-off in their reliance on posterior hippocampus versus medial prefrontal cortex during precise memory formation, suggesting competition between these regions. By contrast, the same structures in adults played a more cooperative role in supporting memory precision. These findings suggest that the relationship between posterior hippocampus and medial prefrontal cortex reverses over development to yield adult-like memory precision, such that these regions work in opposition in childhood before becoming specialized to cooperatively encourage precision in adulthood.

The asymmetric effects of aging on between- and within-trial timescales of inhibition

Widespread cognitive decline in older adults has been hypothesized to stem from a fundamental deficit in inhibition, or the ability to ignore goal-irrelevant information. The extent to which inhibition operates across different timescales, however, has been under-explored. We introduce a novel cognitive task designed to assess both between- and within-trial inhibition using a common set of stimuli. Behavioral results from younger, middle-aged, and older adults (N=100; age range: 18-73) reveal significant age-related differences in between-trial inhibition, with older adults showing less efficient adaptation to rule changes compared to younger adults. Within-trial inhibition, requiring suppression of distractors within the current visual environment, appears to remain intact alongside normal aging. These findings will support the development of tools for the early detection of age-related cognitive decline, prior to subjective awareness of impaired daily functioning.

Mobile EEG Suggests that Alpha-Band Oscillations Support the Retrieval of the Egocentric Direction of Landmarks Around a Navigator

Remaining oriented while navigating is a key aspect of survival for many mobile organisms. Previous work suggested that the parietal lobes play a key role in helping navigators determine directions to landmarks relative to themselves. Recent evidence suggests that alpha-band oscillations are crucial for spatial attention and may track egocentric direction as well. We used mobile EEG to integrate these disparate lines of research and test our novel "alpha window hypothesis" that alpha-band oscillations support the retrieval of egocentric directional information around navigators oriented within a real-world environment. Time-frequency-based machine learning analysis revealed significant classification accuracy of the target's egocentric direction within the 8-12 Hz frequency range, thus supporting the alpha window hypothesis. Our results provide a pivotal advancement in our understanding of the neural mechanisms of directional memory by extending previous research that used neuropsychology, fMRI, and EEG into the domain of a dynamic, situated, embodied spatial memory task.

How do children learn new words via reading emotional narratives?

Context valence has been shown to predict word learning in adult experiments. Little is known about whether this extends to children. To address this gap, we conducted a pre-registered word learning experiment to investigate how emotional narrative context shapes children's learning of novel adjectives during naturalistic reading. 120 children aged 7 to 11 years from UK primary schools read 15 novel words (such as "garive") embedded in 30 short narratives of either neutral, negative, or positive valence. Three immediate post-tests assessed learning. We found that children were able to learn novel adjectives from reading short narratives, and older children outperformed younger children. Novel adjectives read in more emotional (positive or negative) contexts were recognized more accurately than those read in neutral narratives. The findings extend previous research conducted using noun concepts and with adults, providing further evidence for affective embodiment in supporting the learning of abstract concepts.

"Wow! You drew it!": How overly positive emotional reactions influence children's motivation in learning contexts

Adults often exhibit various emotional responses to children's performance. How do these responses shape children's learning and motivation? This study investigated how overly positive emotional reactions influence children's task engagement and their willingness to take on challenges. Children aged 5 to 7 (N = 81) completed some simple tasks and received either overly positive or mildly positive emotional responses from an adult. Girls were more likely to continue engaging in the task and take on challenges after receiving a mildly positive emotional response compared to an overly positive one. In contrast, boys exhibited the opposite pattern, engaging in the task for longer and taking on more challenges after receiving an overly positive response than a mildly positive one. These findings suggest that girls and boys may interpret varying levels of positive emotional reactions differently, highlighting the complex, gender-specific ways in which emotional cues influence children's motivation and learning.

Phonetic cue distributions guide perceptual adaptation in speech: Evidence from a three-week study with a natural non-native accent

Human languages vary widely in the combination and balance of phonetic cues used to encode speech sounds, a source of non-native accents in second language learners. It has been hypothesized that repeated exposure to an accent leads to adaptive perceptual changes in native listeners through multidimensional distributional learning. Although this hypothesis is highly influential, it has rarely been tested against the complexity of naturally produced accented speech, or over periods longer than a single-session experiment. The current large-scale (N = 338), five-session experiment goes beyond this status quo to examine the adaptation of native American English listeners to naturally produced Mandarin-accented English. A repeated exposure-test design was used to characterize adaptive changes in perception from the first few minutes to over the course of three weeks. The results reveal that behavioral changes can be predicted by listeners' sensitivity to changes in phonetic cue distributions. Possible joint contributions of early-stage auditory normalization and later-stage decision processes are discussed.

Disentangling Model-Based and Model-Free Moral Learning

To resolve moral dilemmas, people often rely on decision strategies such as cost-benefit reasoning (CBR) or following moral rules. Previous studies show that people learn to increasingly rely on whichever strategy led to better outcomes in the past. Do they learn this by constructing a mental model of what outcomes would result from using either strategy (i.e., model-based learning) or by assigning value directly to each strategy (i.e., model-free learning)? To answer this question, we adapted the two-step task to a trolley-type dilemma between following moral rules (e.g., obeying authority) versus CBR (e.g., saving a larger group). In each of the 125 trials, participants' choices led to either a common or a rare transition, which probabilistically led to good versus bad outcomes. Computational modeling and pre-registered analysis of behavioral data provided converging evidence that participants apply both model-based and model-free learning.
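
For context, analyses of two-step tasks standardly assume that first-stage choices are driven by a weighted mixture of model-based and model-free values (Daw et al., 2011). The sketch below shows that generic hybrid valuation; it is illustrative and not the authors' exact model of the moral adaptation of the task.

```python
# Generic hybrid (model-based + model-free) valuation for a two-step task.

def model_based_value(transition_probs, q_second_stage):
    """Expected value of a first-stage option given the transition structure."""
    return sum(p * max(q) for p, q in zip(transition_probs, q_second_stage))

def hybrid_value(q_model_free, q_model_based, w):
    """w = 1: purely model-based; w = 0: purely model-free."""
    return w * q_model_based + (1 - w) * q_model_free

# Example: common transition (p=0.7) to a state with values [0.8, 0.2],
# rare transition (p=0.3) to a state with values [0.1, 0.4].
q_mb = model_based_value([0.7, 0.3], [[0.8, 0.2], [0.1, 0.4]])
print(hybrid_value(q_model_free=0.5, q_model_based=q_mb, w=0.6))
```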

Does normality influence children's causal selections?

Our expectations about what normally occurs influence our explanations. In conjunctive causal structures, people tend to select the more abnormal cause. This tendency reverses in disjunctive structures, and people select the normal cause. It is currently unknown how these tendencies develop, and what factors contribute to their emergence in childhood. Across three experiments, we tested adults (n = 179) and 5- to 7-year-olds (n = 96) on two tasks where an abnormal and normal factor jointly caused an outcome. Experiments 1 and 2 revealed that while adults' explanations varied according to the causal structure, children exclusively chose the normal cause, regardless of causal structure. Using a task with an intuitive and explicit causal mechanism, Experiment 3 found that children were more likely to select the abnormal cause in the conjunctive case than in previous experiments. This suggests that intuitions about causal mechanisms may facilitate adult-like judgments. We consider potential explanations, including the role of counterfactual reasoning.

Understanding Human Heuristics in Context-Sensitive Image Captioning

Recent studies highlight the context sensitivity of image captioning, where the context in which an image appears strongly influences its caption's informativeness and linguistic style. While AI-generated text increasingly resembles human language, its informativeness and coherence, derived from cross-modal image-text reasoning, may still fall short of content generated by human experts. Given the intertwined nature of informativeness and linguistic style, this study examines news image captioning, a naturally high-context task, to manipulate caption informativeness and assess human sensitivity to such variations. Two experiments (N = 378) and logistic regression analyses reveal that while humans effectively interpret informational cues, their intuition about AI linguistic style often diverges from actual AI language markers. Moreover, humans more readily integrate multiple modalities in preference tasks but rely heavily on linguistic-based strategies for AI detection. These findings underscore the adaptability of human evaluation in image-text systems and suggest informative signals as the more reliable basis for judgment.

Collective Behavior Emerging from Social Learning Strategies and Network Structures

Humans make decisions collectively by combining individual and social learning. Individuals benefit from groups when individual exploration fails to accurately assess the environment, a phenomenon known as the "wisdom of crowds." Previous studies indicate that self-organizing group dynamics can reduce suboptimal biases in noisy environments, particularly in fully connected groups. However, agents often have only partial information due to cognitive and physical constraints. To explore how diverse social network structures influence the collective dynamics of social learners, we integrate a decentralized network with a social reinforcement learning model in repeated two-armed bandit tasks. Our results suggest that (1) social learning in a sparse network outperforms asocial solo learning in highly uncertain tasks; (2) the phenomenon "less is more, and more is different" holds true only when agents strategically balance individual and social learning; and (3) the group size effect on collective performance is significantly influenced by network structures.
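
To illustrate how such a social reinforcement-learning agent can be set up, the sketch below mixes an agent's own value estimates with the choice frequencies of its network neighbours on a two-armed bandit. The parameter names and the mixing rule are assumptions for illustration, not the authors' model specification.

```python
# Toy social learner on a two-armed bandit: softmax over a mixture of own
# Q-values and neighbours' choice frequencies.
import numpy as np

rng = np.random.default_rng(0)

def choose(q_values, neighbour_choices, social_weight=0.3, beta=5.0):
    freq = np.bincount(neighbour_choices, minlength=len(q_values))
    if freq.sum() > 0:
        social = freq / freq.sum()
    else:
        social = np.ones(len(q_values)) / len(q_values)
    mixed = (1 - social_weight) * q_values + social_weight * social
    p = np.exp(beta * mixed)
    p /= p.sum()
    return rng.choice(len(q_values), p=p)

def q_update(q_values, action, reward, alpha=0.2):
    q_values[action] += alpha * (reward - q_values[action])
    return q_values

q = np.zeros(2)
a = choose(q, neighbour_choices=np.array([1, 1, 0]))  # two neighbours picked arm 1
q = q_update(q, a, reward=1.0)
```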

All Eyes on the Hippocampus: The Primate Hippocampus as a Visually-Guided Cognitive Graph

Spatial memory is a core cognitive function of many mobile animals. The study of spatial cognition is highly interdisciplinary, and different approaches have led to discordant hypotheses about spatial memory. The neuroscientific discovery of place cells and grid cells in the rodent hippocampus and entorhinal cortex led to the highly influential "cognitive map" hypothesis. Conversely, behavioral evidence led to the "cognitive graph" hypothesis, which suggests memory is distortion-prone. We argue that between-species differences in sensory and perceptual systems cause vision to play a predominant role in primates. We build on previous modeling work by developing a visual version of the successor representation (SR) model to show that it can provide a unified framework to account for seemingly disparate findings from the brain and behavior, thus providing evidence for our hypothesis that primate (e.g., human) spatial cognition is driven by the hippocampal system, which instantiates a visually-guided cognitive graph.
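
The successor representation referred to above is typically learned with a simple temporal-difference rule; the sketch below shows that generic tabular update, not the visually guided version developed in the paper.

```python
# Tabular successor representation (SR): M[s, s'] estimates the discounted
# expected future occupancy of state s' when starting from state s.
import numpy as np

def sr_td_update(M, s, s_next, alpha=0.1, gamma=0.95):
    """One temporal-difference update of the SR matrix after a transition s -> s_next."""
    one_hot = np.zeros(M.shape[0])
    one_hot[s] = 1.0
    M[s] += alpha * (one_hot + gamma * M[s_next] - M[s])
    return M

M = np.zeros((4, 4))
for s, s_next in [(0, 1), (1, 2), (2, 3), (3, 0)] * 50:  # repeated walk around a ring
    M = sr_td_update(M, s, s_next)
```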

Children explore conservatively when learning novel word extensions

Children are active, curious learners. How might children's curiosity shape their curriculum during word learning? Past research suggests that children's tendency to explore can lead them to discover novel information during learning. This exploratory tendency could be especially useful when learning word meanings: exploring potential meanings for words broadly could help children efficiently probe a word's possible extension. To investigate this question, we tested how children (5-8 years of age) and adults sample information when presented with a novel word and tasked with uncovering the word's extension. Overall, we found that children explored novel word extensions conservatively. Children (as well as adults) favored sampling choices that confirmed a novel word meaning, as opposed to exploring broader possible meanings. Younger children's sampling choices were especially conservative, with children often sampling the narrowest possible generalization option. Older children were more exploratory, probing broader possible word extensions more frequently. Counter to proposals that children are generally more exploratory at younger ages, our results suggest that when children test the extension of novel word meanings, they are often more likely to confirm their hypotheses than to explore.

Transmission of Natural and Supernatural Explanations by Hindu and Muslim Schoolchildren in Gujarat, India

What determines which stories (or parts of stories) about the social world are captured and conveyed by children? How do they transform with retelling? We use an iterated learning paradigm to explore how peer-to-peer transmission of explanatory stories (here, explanations for the social customs of novel social groups) is influenced by explanatory framework (natural, supernatural, or hybrid) and children's existing belief systems. Our participants were 69 Hindu and Muslim 3rd-7th-graders in Gujarat, India. Consistent with the `minimally counterintuitive' nature of many highly culturally preserved concepts, hybrid explanations (containing both natural and supernatural elements) were transmitted with the greatest fidelity across chains. Individual religiosity also affected transmission: children who reported themselves as more religious transmitted scientific explanations less faithfully (and hybrid explanations more faithfully) than less religious children.

What if child vocabulary development followed network acquisition models exactly?

The network science perspective on vocabulary development emphasizes the structured relationships between words and how they can influence learning. Words that appear in many contexts and thus develop associations with many other words tend to be learned earlier---a growth model called preferential acquisition. Likewise, children appear to have a bias towards learning new words associated with many known words---a growth model called lure of the associates. Although both are statistically related to age of acquisition estimates, much variance remains unexplained, and it is unknown what structures these models promote within the developing vocabulary. We simulated vocabulary growth strictly adhering to preferential acquisition and lure of the associates and found that they promote similar structures: more connectivity, more clustering, and much shorter path lengths than random growth would achieve. They clearly promote small-world structure, consistent with that seen in young children's vocabularies.
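
The two growth rules contrasted above can be stated very compactly; the sketch below implements toy versions that grow a known-word set from a full "adult" association graph (an adjacency dict). Details such as tie-breaking and the source network are simplified assumptions, not the paper's simulation code.

```python
# Toy versions of the two vocabulary growth rules over an adult association graph.

def preferential_acquisition_step(known, graph):
    """Learn the unknown word with the most connections in the full adult graph."""
    candidates = [w for w in graph if w not in known]
    return max(candidates, key=lambda w: len(graph[w]))

def lure_of_associates_step(known, graph):
    """Learn the unknown word with the most connections to already-known words."""
    candidates = [w for w in graph if w not in known]
    return max(candidates, key=lambda w: sum(n in known for n in graph[w]))

def grow(graph, seed_words, step_rule, n_steps):
    known = set(seed_words)
    for _ in range(n_steps):
        known.add(step_rule(known, graph))
    return known
```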

An information bottleneck view of social stereotype use

For decades, social psychologists have wondered about the cognitive foundations of social stereotype use. Arguments have generally centred either on resource constraints, framing stereotypes as `energy-saving devices', or on `fit', framing stereotypes as tools to represent real structure in the social environment that sometimes go awry. These resource-based and fit-based accounts have typically been presented as being in opposition to one another. In this paper, we seek to show that both are compatible under an information bottleneck model of agent representation. Through a simple simulation experiment, we demonstrate how stereotype use emerges in resource-rational representations as a function of both capacity constraints and the structure of the social environment. We then use the same framework to consider a possible explanation for the outgroup homogeneity bias in terms of limited cognitive capacity.
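
For readers unfamiliar with the framework, the generic information bottleneck objective assumed by models of this kind compresses the input (here, information about individual agents) into a representation while preserving what is behaviourally relevant; in its standard general form (not the paper's exact parameterization):

```latex
\min_{p(t \mid x)} \; I(X;T) \;-\; \beta\, I(T;Y)
```

where $X$ is the input (e.g., an individual's attributes), $T$ is the compressed representation, $Y$ is the variable the agent needs to predict, and $\beta$ controls the trade-off between compression and predictive fit.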

Representations of what's possible reflect others' epistemic states

People's judgments about what an agent can do are shaped by various constraints, including probability, morality, and normality. However, little is known about how these representations of possible actions—what we call modal space representations—are influenced by an agent's knowledge of their environment. Across two studies, we investigated whether epistemic constraints systematically shift modal space representations and whether these shifts affect high-level force judgments. Study 1 replicated prior findings that the first actions that come to mind are perceived as the most probable, moral, and normal, and demonstrated that these constraints apply regardless of an agent's epistemic state. Study 2 showed that limiting an agent's knowledge changes which actions people perceive to be available for the agent, which in turn affects whether people judged an agent as being "forced" to take a particular action. These findings highlight the role of Theory of Mind in modal cognition, revealing how epistemic constraints shape perceptions of possibilities.

Word Integration Across Sentence Boundaries in Third and Fourth Age Adults: Evidence from Eye-Tracking during Reading

This study investigates how adults in the third (60–79 years) and fourth age (80+ years) integrate words across sentence boundaries during reading. Participants read two-sentence passages involving direct lexical repetition or bridging inferences. Both age groups exhibited longer reading times when bridging was required, showing that inference-making is still present despite potential cognitive declines. However, while third-age adults showed immediate sensitivity to inference demands, fourth-age participants demonstrated a delayed response, suggesting compensatory strategies. These findings highlight a key role for semantic knowledge in sustaining reading comprehension in old age. Future research with more diverse samples and longitudinal methods should clarify how age-related changes interact with linguistic resources. Interventions may target processing speeds to support reading comprehension in late adulthood.

Infants' recognition of social conventions

From early in life, humans expect members of the same social group to act like one another. What drives this expectation? Across two experiments, we investigated how 8- and 9-month-old infants' (N = 100) expectations about shared behaviors align with accounts based on collective identities, ritualistic actions, or conventions. In Experiment 1, infants inferred that an action would generalize to a new group member only when they had previously seen more than one group member share that action, suggesting that multimember demonstration influences infants' inductive reasoning. In Experiment 2, infants did generalize an action after observing it from just one group member, but only if they had observed that same action shared by two members of another group in a different social context. Together, these findings suggest that infants learn to recognize which actions are socially conventional and then readily generalize these actions even in new social contexts.

How linguistic boundaries form encoding contexts in memory: Evidence from temporal order effects

While temporal contiguity and order effects have been shown to be highly robust in the word list memory literature, little is known about how the presence of boundaries signaling higher-order linguistic units affects temporal order memory for items within those units. Here, we present the results of two sentence memory experiments that show that linguistic boundaries block contiguity effects through a temporal context mechanism, whereby words encoded across distinct linguistic environments are also encoded as contextually dissimilar. Two consequences of this encoding mechanism are that (i) retrieval of an item results in reactivation of that item's linguistic context alone due to co-activation of contextually similar content and (ii) significant linguistic boundaries reduce encoding interference between proximal, semantically similar items. We take these results to suggest that linguistic groupings map onto encoding contexts, which constrain the effect of item-to-item associations in sentence memory.

The role of language-specific and domain-general working memory resources in predictive language processing

A core aspect of language comprehension is predictive processing, which supports real-time inferences under uncertainty. Real-time prediction is constrained by the mind's limited working memory (WM) resources, which are required for maintaining the context that supports prediction, and for reallocating degrees of belief across inferred interpretations as new pieces of information (e.g., words) are perceived. What is the nature of this WM resource? Is it language-specific or shared between language and other cognitive domains? Do both domain-general and domain-specific resources support prediction? Here, we study this question using an individual differences approach. We collected self-paced reading times of naturalistic paragraphs in English and measured, for each participant, their domain-general WM (backwards digit span) and linguistic WM (reading span). We quantified predictive processing as the relationship between surprisal and reading time. We found that surprisal influenced reading times more strongly in participants with (1) stronger domain-general WM, but (2) weaker linguistic WM, although the latter relationship was less reliable. Our results indicate that domain-general WM could contribute to predictive processing during language comprehension. We discuss several theoretical interpretations of this finding, as well as potential reasons for the discrepancy between our results and past studies.
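
The analysis logic described above can be expressed as a regression in which surprisal interacts with each memory-span measure; the sketch below shows this with synthetic data (all variable names and the simulated effect are hypothetical, and the paper's models are richer, e.g. mixed-effects).

```python
# Does the surprisal effect on reading time vary with working-memory span?
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "surprisal": rng.gamma(2.0, 2.0, n),
    "digit_span": rng.integers(3, 9, n).astype(float),
    "reading_span": rng.integers(2, 6, n).astype(float),
})
# Synthetic reading times in which the surprisal effect grows with digit span.
df["rt"] = 250 + 10 * df["surprisal"] + 3 * df["surprisal"] * df["digit_span"] \
           + rng.normal(0, 30, n)

model = smf.ols("rt ~ surprisal * digit_span + surprisal * reading_span", data=df).fit()
print(model.params[["surprisal:digit_span", "surprisal:reading_span"]])
```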

Age-related differences in forming conjunctive memories of what, when and where

Aging greatly affects memory, but not all aspects are impaired equally. Past research has demonstrated that older adults show greater deficits in remembering where an object was encountered compared to what was experienced. A third dimension that has received less attention in this context is memory for when events occurred. In this study, we employed a Sequential Memory Task (SMT) in which participants memorized spatio-temporal visual object sequences over repeated exposures. In two experiments with 39 younger (YAs, 18-35 years) and 53 older adults (OAs, 65-75 years), we examined age-related differences in sequence memory (when) and the interplay between item (what) and location (where) knowledge. Our results revealed that memory was stronger for item sequences than for location sequences or item-location combinations in both younger and older adults. In addition, older adults exhibited greater age-related deficits in location-related memory. We also found that item and location memory were bi-directionally related and that even pure item sequence reports involved location memory and vice versa. Both age groups relied more on item-location binding associations than on transition learning, but computational modeling indicated a higher reliance on independent location transition learning in OAs than YAs. This suggests that the strong age-related impairment in spatial location memory was in part driven by age differences in memory binding. These findings provide insights into age-related changes in spatio-temporal sequence memory and highlight distinct learning strategies in younger and older adults.

Investigating False Memory in the DRM Paradigm with Relational Category Content

Fuzzy Trace Theory and the Activation Monitoring Framework disagree on whether gist or backwards associative strength best describes false memories in DRM. Recent evidence suggests situational features best describe processes underlying DRM results. Thematic, goal-derived, and relational categories capture different aspects of situational features, but perform differently in memory and coherence. We constructed novel category-specific lists for use in the DRM paradigm to determine whether different aspects of situational features make different contributions to a successful DRM result, whether relational and goal-derived content can produce false recognition despite low gist and backwards associative strength, and whether DRM captures an aspect of category coherence that has been difficult to measure in relational and goal-derived content. Only relational and thematic content produced sufficient false recognition. This provides mixed support for situational features, evidence that relational content makes specific contributions to DRM success, and evidence of coherence in relational categories.

Making Sense of Nonsense

Some impossible things are more impossible than others. Magically levitating a feather seems easier than levitating a rock, even though both are impossible in the real world. But within the things that are inconceivable---e.g., "the number 13 writing a play" or "a girl being a prime number"---are some things more inconceivable than others? We first established that people have graded, systematic judgements of the likelihood of inconceivable and nonsense sentences (Experiment 1). We then examined two hypotheses as to how people make such judgments: the ease of a metaphorical interpretation (Experiment 2), and how difficult it is to transform a nonsense statement into a sensible one, as measured by distance in a type hierarchy (Experiment 3). We found that graded judgments of inconceivability are not captured by metaphorizability, but do correspond to a measure of distance in a type hierarchy. Our results suggest that inconceivability is graded, and the perceived likelihood of an inconceivable event may be a product of one's ontology of the world.

Low Power Constrains the Space of Narratives Available to Speakers

How does power shape communicative freedom? During communication, speakers often balance multiple competing goals—such as being informative versus preserving their reputation. Across four experiments (N=~1400), we examined how the power relation between speakers and listeners affects narrative production in the face of conflicting informational and reputational goals. Participants imagined having to confess about a minor wrongdoing and were assigned to low, equal, or high power roles (an employee speaking to a supervisor, co-worker, or intern). Low power speakers were more likely to prioritize informativeness and believability. Importantly, the open-ended narratives they provided were more homogenous than those of high or equal power narrators, suggesting that power restricts the range of utterance choices considered. Speakers' ratings of pre-written narratives indicated that power may additionally affect how speakers evaluate the believability of different utterance choices.

Just unlucky?: Children are sensitive to the cause of rejection

Rejection from clubs, teams, or schools is an inevitable part of growing up. Knowing when to quit or persist in the face of rejection is critical for goal pursuit, yet it is unclear how children respond to various sources of rejection. In a pre-registered experiment (N = 202), we tested whether 7- and 8-year-old children are sensitive to the cause of rejection. Children played a game in order to try out for a selective club and were rejected either based on merit (their performance) or by chance (a spinner). Children who experienced luck-based rejection felt better about their competence and persisted marginally more than those rejected based on merit. Across conditions, girls persisted more than boys, and persistence declined with age. These results suggest that by early elementary school, children are sensitive to the cause of their rejection, with implications for how they calibrate effort and pursue goals.

Exploring the affective structure of children's early language environment through egocentric video

When addressing infants, parents often exaggerate positive affect in both their faces and speech. While these cues have been theorized to support learning, their frequency alongside linguistic input and the information they convey in children's real-world experiences remain unclear. We analyzed ~377 hours of egocentric at-home videos from infants (5–28 months) to examine affective cues in faces and language surrounding early-learned words. Using automated tools, we tagged happy affect in the utterance in which each word was embedded and in co-occurring faces. Faces were visible in only 13.5% of word instances, and even fewer displayed happy affect. However, more (vs. less) positive words tended to co-occur with happier faces. Linguistic context, in contrast, conveyed stronger positive affect and more reliably aligned with word valence than facial affect. These findings suggest that facial and linguistic affect may serve distinct roles in infants' learning environments.

Iterated language learning is shaped by a drive for optimizing lossy compression

It has recently been theorized that languages evolve under pressure to attain near-optimal lossy compression of meanings into words. While this theory has been supported by broad cross-linguistic empirical evidence, it remains largely unknown what cognitive mechanisms may drive the cultural evolution of language toward near-optimal semantic systems. Here, we address this open question by studying language evolution in the lab via iterated learning. Across two qualitatively different domains (colors and Shepard circles), we find that semantic systems evolve toward the theoretical limit of efficient lossy compression, and over time, converge to highly efficient systems. This provides direct evidence that adult learners may operate under a bias to maintain efficiently compressed semantic representations. Moreover, it demonstrates how this bias can be amplified by cultural transmission, leading to the evolution of information-theoretically optimal semantic systems.

Who Has More Furniture? Context Effects on the Quantification of Mass vs. Count Superordinate Nouns

In many languages, words in count syntax quantify over countable individuals (e.g., too many strings), while mass nouns often don't (e.g., too much string). Theories differ in how to characterize nouns that violate this pattern, such as object-mass nouns (e.g. furniture, clothing). These nouns exhibit mass syntax, but often quantify by number (Barner & Snedeker, 2005). On one hypothesis, the individuation of object-mass nouns is lexically specified (Bale & Barner, 2009). Another argues that, while count nouns always quantify by number, object-mass nouns have different quantification criteria depending on context (Rothstein, 2010), including function fulfillment (McCawley, 1975). We evaluated these hypotheses by comparing English quantity judgments for object-mass nouns to (1) superordinate count nouns, and (2) French judgments for translations of object-mass nouns. In each case, we found that object-mass nouns behaved like count nouns, and were no more susceptible to contextual effects. These findings support the idea that object-mass nouns specify individuation lexically.

Scaling up the think-aloud method

The think-aloud method, where participants voice their thoughts as they solve a task, is a valuable source of rich data about human reasoning processes. Yet, it has declined in popularity in contemporary cognitive science, largely because labor-intensive transcription and annotation preclude large sample sizes. Here, we develop methods to automate the transcription and annotation of verbal reports of reasoning using natural language processing tools, allowing for large-scale analysis of think-aloud data. In our study, 640 participants thought aloud while playing the Game of 24, a mathematical reasoning task. We automatically transcribed the recordings and coded the transcripts as search graphs, finding moderate inter-rater reliability with humans. We analyze these graphs and characterize consistency and variation in human reasoning traces. Our work demonstrates the value of think-aloud data at scale and serves as a proof of concept for the automated analysis of verbal reports.
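
As a concrete illustration of the automation step described above, the sketch below shows one way the transcription stage could be set up with an off-the-shelf speech-to-text model. The tooling choice, model size, and file names are assumptions for illustration; the paper's own pipeline, including its coding of transcripts into search graphs, is not reproduced here.

```python
# Hedged sketch: batch-transcribing think-aloud recordings with the
# open-source openai-whisper package. Model size and file names are
# placeholders, not details taken from the study.
import whisper  # pip install openai-whisper

model = whisper.load_model("base")  # small, CPU-friendly model
for path in ["participant_001.wav", "participant_002.wav"]:  # hypothetical files
    result = model.transcribe(path)
    print(path, "->", result["text"][:80])
```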

When 0 is good: instrumental learning with counterintuitive goals decreases working memory engagement

Humans are adept at setting goals quickly and flexibly in their daily lives. Previous research has shown that people can assign rewarding properties to abstract or novel outcomes and use them to guide behavior. However, the mechanisms supporting this flexibility and their impact on learning processes, such as working memory (WM) or slower incremental systems, remain unclear. To address this, we designed an instrumental learning task in which participants learned stimulus-action associations by pursuing either standard goals (+1) or counterintuitive goals (+0) under varying WM loads. Our behavioral and modeling results revealed that when pursuing counterintuitive goals, humans learned more slowly and shifted their reliance from WM to habit-like associative processes, despite both processes remaining functionally intact. Additionally, we replicated previous findings showing that humans do not rely on reinforcement learning (RL) processes but instead integrate WM and habit-like processes to learn the associations. This interplay between WM and habit-like processes may allow a more resource-efficient approach to pursuing diverse goals. Our findings shed light on the breadth and cost of people's ability to flexibly learn and pursue any goal.

Children Expect Emotional Consolation to Occur in Close Relationships

Do children see emotional consolation during times of hardship as a cue for close relationships? In this study, 6- to 8-year-old children in the U.S. (N = 62) were presented with vignettes in which a protagonist experiences a hardship. The protagonist then tells one of two side characters (i.e., their 'friend') that they either felt sad or okay about the situation, and the character either hugs or does not hug the protagonist. Children inferred that side characters who consoled when the protagonist felt sad were (i) better friends with the protagonist, (ii) more likely to be told the protagonist's secret, and (iii) more likely to receive reciprocal emotional support from the protagonist in the future. Together, these findings suggest that children see consolation during hardship as a cue for social affiliation and may use emotional support to differentiate among positive social relationships.

Polysemy and Inference: Reasoning with Underspecified Representations

Lexical ambiguity has classically been categorized into two kinds. Homonyms are single word forms that map to multiple, unrelated meanings (e.g., "bat" meaning baseball equipment or a flying mammal). Polysemes are single word forms that map to multiple, related senses (e.g., "breakfast" meaning a plate of food or an event). Yet there is a longstanding debate as to whether polysemy and homonymy reflect distinct cognitive representations. Some (e.g., Fodor & Lepore, 2002; Klein & Murphy, 2001) posit that they do not—merely describing differing patterns of usage—while others (e.g., Falkum & Vicente, 2015; Pietroski, 2018) argue that polysemes, but not homonyms, involve an underspecified representation that is neutral with respect to the form's multiple senses. While some extant experimental evidence supports the latter view (Klepousniotou, Titone, & Romero, 2008; Srinivasan, Berner, & Rabagliati, 2019), there has not yet been clear evidence of the representation of lexical ambiguity affecting domain-general reasoning. Using a novel inference paradigm, we compare participants' dispositions to endorse deductive, Aristotelian arguments with equivocating polysemes versus comparable arguments with equivocating homonyms. We find that participants endorse the former substantially more than the latter, a phenomenon that we dub the Uncommon Sense Effect. Our results provide direct evidence that polysemes and homonyms have underlyingly distinct mental representations—in particular, that polysemes uniquely invoke an underspecified representation.

Beyond Crosslinguistic Influence: Mandarin Speakers with Exposure to Null-subject Languages Nonetheless Use Fewer Null Pronouns in Mandarin

We explore the impact of crosslinguistic influence in first language (L1) attrition, changes in an individual's L1 due to exposure to additional languages. We report an experiment examining reference production in Mandarin in a picture description task by native Chinese speakers residing in Italy or Spain. Mandarin allows null subjects, where subjects can be expressed with a null or overt pronoun; previous work shows that L1 Mandarin speakers exposed to English use more overt pronouns in Mandarin than their more-monolingual peers. In the study reported here, despite exposure to two languages (Italian and Spanish) that, unlike English, allow null subjects, our multilingual speakers used fewer null pronouns and more overt pronouns than their more-monolingual Chinese peers. These findings contribute to attrition research by disentangling the impact of crosslinguistic influence in L1 attrition, and provide insights into the effect of bi- and multilingualism on linguistic systems.

Are two-year-olds intrinsically motivated to explore their own competence?

Children are keen explorers of the outside world: They systematically explore surprising findings and test hypotheses during play. However, less is known about whether toddlers are similarly driven to explore and learn about the self. The present work adapts classic exploratory play paradigms to ask whether toddlers are intrinsically motivated to explore their own competence. In Experiment 1, we selected Montessori practical life toys that were verified to be developmentally appropriate and equally appealing to toddlers (N = 24, ages 24-35 months). In Experiment 2, 2-year-olds (N = 49, ages 24-35 months) played with these toys along with a parent. Toys were presented in pairs. In each pair, the parent guided the toddler's hand while playing with one toy, which provided confounded evidence about the toddler's competence, and took turns playing the other toy independently, which provided unconfounded evidence. When given a chance to freely explore the toys on their own, toddlers first approached the confounded toy, which suggests that toddlers sought to resolve uncertainty about their competence. As a further test of this idea, Experiment 3 (N = 11, ongoing) asks whether toddlers' exploration is modulated by task difficulty. Preliminary results suggest that toddlers explore confounded toys more often for more challenging tasks compared to easier ones. Together, our work provides insights into children's early motivation to understand the self, and this understanding is an important first step for researchers, educators, and parents to better encourage and scaffold this motivation throughout development.

The Forest for the Trees: Global vs. Local Advice in Human-AI Interaction

Artificial intelligence (AI) can enhance human decision-making by providing assistance at different levels of abstraction. This study investigates whether AI should offer broad, high-level guidance (global AI) or focused, low-level assistance (local AI) to optimise performance and learning. Using a hierarchical multi-armed bandit task where both AI types provide equally valuable recommendations, we evaluate how participants leverage AI support in making sequential decisions. Findings reveal that while participants benefited from both types of AI suggestions, global AI led to significantly greater performance improvements. These results contribute to our understanding of human-AI interaction in hierarchical problem-solving, highlighting the importance of designing AI systems that effectively support human cognitive processes.

Abstracts with Oral Presentation

The origins of syntactic category biases: Evidence from early vocabularies of bilingual children

Why are nouns often over-represented in children's early words? Prior research has suggested a range of possible explanations for the observed bias in the syntactic category composition of early vocabularies, including cognitive, linguistic, and contextual factors. However, these factors are not mutually exclusive and can be difficult to disambiguate empirically. Across three analyses of the vocabularies of children acquiring two languages (N = 1997), we investigated the role of different language combinations, different levels of language exposure, and different environmental contexts, finding some support for cross-linguistic modulation of syntactic category bias, as well as variation across countries. These results suggest that both linguistic and contextual factors contribute to syntactic category biases in young children's vocabularies, and highlight the importance of understanding the broader background of the child when studying language acquisition.

Algorithmic representations in the human brain that underlie schema generalisation

The human brain represents structure in the external world as "cognitive maps". However, it remains unknown how it represents structure in one's own behaviour. Recent findings show rodent medial-prefrontal cortex (mPFC) does this with a structure-sensitive representation, where all future actions are represented simultaneously. Here, using 7T fMRI, we test for this representation and its properties in humans. Using RSA and a computational model, we show this representation in mPFC and orbitofrontal cortex, while entorhinal and orbitofrontal cortices contain a pure abstraction of task structure. Preceding the future-actions representation, action plans are 'loaded' into mPFC once subjects are given all action-relevant information, suggesting that this loading occurs through replay. iEEG data from patients solving the same task show that sharp-wave ripple rate is increased during this planning time. Together, our findings suggest an algorithm by which human mPFC encodes future actions, and provide evidence for a replay-based mechanism of loading that representation.

Modeling Open-World Cognition as On-Demand Synthesis of Probabilistic Models

People are able to reason flexibly across a vast range of domains and contexts, from navigating new environments and social situations, to playing new games and even betting on the outcomes of new sports. How do we draw on our knowledge and past experiences to tractably make sense of any particular situation? Here, we explore the hypothesis that people use a combination of distributed and structured, symbolic knowledge to construct bespoke mental models tailored to novel situations. We propose a computational implementation of this idea—a 'Model Synthesis Architecture' (MSA)—using language models as a stand-in for distributional knowledge and a probabilistic programming language to express bespoke probabilistic symbolic models. We evaluate our model with respect to human judgments on a novel reasoning dataset. Our "Model Olympics" domain comprises a series of "sports commentary" vignettes, and is designed to test open-ended reasoning by requiring (i) reasoning about arbitrary causal structures described in language; (ii) drawing in relevant latent considerations from background knowledge; and (iii) flexibly adapting to an 'open world' setting with novel observations sourced from other human participants. We compare our MSA to hand-coded probabilistic programs and LM-only baselines. We find that our approach captures key hallmarks of rational inference from human judgments that the LM-only baselines do not, especially for very novel scenarios. See https://sites.google.com/view/openworld-msa?usp=sharing for additional details and preprint.

Wild Possibilities: Evidence of Modal Cognition in Free-Ranging Rhesus Macaques (Macaca mulatta)

Modal cognition is foundational to human reasoning, enabling us to construct and narrow a rich space of hypotheses about the world. While this faculty has often been considered uniquely human, its evolutionary roots have remained uncertain. Nonhuman primates, lacking a natural language modal lexicon, provide a crucial test case for whether modal reasoning depends on linguistic expression. Prior research has cast doubt on primates' ability to contrast mutually exclusive possibilities, with subjects failing tasks that require reasoning about uncertain outcomes. Across four experiments, we show that rhesus macaques can reliably distinguish between certain and possible rewards, exhibiting unprecedented success on a 3-cup task. Primates' previous failures may stem not from an inability to reason modally but from the burden of representing absence as a possible alternative. By alleviating this burden, we uncover an early evolutionary footprint of modal reasoning that extends beyond humans and reveals an ancient, language-independent logical framework.

Social Learning Shapes Moral Strategy Selection

Social norms—perceptions of what is commonly done in a given context—serve as powerful signals for guiding moral decision-making in complex dilemmas. We investigate whether individuals adjust their moral strategies in response to information about others' judgments, and explore the underlying learning processes that support these shifts. Using a computational approach, we compare two models of social learning: (1) Decision Biasing, where social influence temporarily alters choices without affecting underlying values, and (2) Value Shaping, where social feedback directly updates individuals' moral value representations. Our results show that a Decision Biasing model fails to adequately explain the observed data, while a Value Shaping model better accounts for the persistence of moral adaptation. Taken together, these data suggest that social norms may play a role in shaping not only immediate moral choices but also the strategies people use to make them.

Spatial language and memory diverge in BaYaka hunter-gatherers

People conceptualize space using different spatial reference frames, based either in the body or the environment. Many studies attribute this cognitive diversity to spatial language, but their effects are confounded by differences across cultures and experimental tasks. Here we tested this hypothesis in people and tasks that are directly comparable. Indigenous BaYaka adults in the Congo basin reconstructed simple object arrays from memory and later described the same arrays aloud. Reference frames diverged across modalities in three primary ways. First, they used body-based frames seven times more often in their spatial memory than in their spatial descriptions of the same stimuli. Second, linguistic and non-linguistic responses were uncorrelated across participants, despite substantial variation. Third, each of two factors – spatial aspect and axis – influenced spatial language and memory differently, even in the same individuals. The results show that this fundamental feature of spatial thinking does not reflect patterns of spatial language.

Language and Experience: A Computational Model of Social Learning in Complex Novel Tasks

The ability to combine linguistic guidance from others with direct experience is central to human development, enabling safe and rapid learning in new environments. How do people integrate these two sources of knowledge, and how might AI systems? We present a computational framework that models social learning as joint probabilistic inference of structured causal world models given both sensorimotor and linguistic data. Using behavioral and simulation studies of learning across 10 video games, we show how linguistic guidance shapes exploration by reducing risky interactions and speeding up key discoveries, in both humans and models. Most notably, we demonstrate successful cross-embodiment knowledge transfer: both human- and model-generated advice speeds up both human and model learning, revealing how structured, language-compatible representations might enable human-machine collaborative learning.

Beyond Muller-Lyer: Culture shapes ‘universal' visual phenomenology in multiple illusions across a rural-urban gradient

How cultural experience affects visual perception is a question of outstanding interest to debates regarding universality and cultural-specificity in human cognition. Yet, work comparing visual perception in 'typical' urban, industrialized samples with groups living in rural environments, typical of 99% of our species' history, is strikingly limited (Deregowski, 2017). Here we more than double the total number of paradigms (visual illusions) in this literature, reporting data from 1) a 'typical' UK/US urban sample, 2) a developing Namibian town, and 3) rural Namibian villages. Results reveal profound differences in visual processing, including aspects previously assumed to be universal (e.g., amodal completion in Gestalt shapes, line perception in the Cafe wall illusion). In contrast to recent arguments for the limited role of cultural experience in visual perception (Amir & Firestone, 2025), the present work indicates that a major research program in CCVS is warranted to capture the ways culture shapes visual perception.

Language production is harder than comprehension for children and language models

Infants can understand some language even when they have no productive ability, but later in development, children can produce much more of what they understand. Explanations of this production–comprehension asymmetry (PCA) typically appeal to specific mechanisms, such as motor demands or communicative intent. Here, we investigate the hypothesis that the development of PCA emerges from the inherent structure of the two tasks. Production involves selecting a particular word to produce (and no other); in contrast, comprehension typically involves selecting the correct response to a word within a relatively constrained context. We tested this hypothesis by exploring whether developmental changes in PCA emerge in language models, which are sensitive to these structural asymmetries but not to other factors previously proposed to cause PCA. We find that two types of language models—unimodal language models and vision–language models—both show PCA. Moreover, similar to children, PCA decreases in more highly trained models and is more pronounced for predicates than nouns. These results suggest that production–comprehension asymmetries, a fundamental feature of child language acquisition, may be explained by the basic task demands involved in language use.

Characterizing the Interaction of Cultural Evolution Mechanisms in Experimental Social Networks

Understanding how cognitive and social mechanisms shape the evolution of complex artifacts such as songs is central to cultural evolution research. Social network topology (what artifacts are available?), selection (which are chosen?), and reproduction (how are they copied?) have all been proposed as key influencing factors. However, prior research has rarely studied them together due to methodological challenges. We address this gap through a controlled naturalistic paradigm whereby participants (N=2,404) are placed in networks and are asked to iteratively choose and sing back melodies from their neighbors. We show that this setting yields melodies that are more complex and pleasant than those found in the more-studied linear transmission setting, and exhibits robust differences across topologies. Crucially, these differences are diminished when selection or reproduction biases are eliminated in control studies, suggesting an interaction between mechanisms. These findings shed light on the interplay of mechanisms underlying the evolution of cultural artifacts.

Do Large Language Models Have a Planning Theory of Mind? Evidence from MindGames: a Multi-Step Persuasion Task

Recent evidence suggests that Large Language Models (LLMs) display Theory of Mind (ToM) abilities. However, experiments with LLMs typically assess only spectatorial ToM, where LLMs merely predict other agents' behavior, rather than planning. In contrast, ToM in humans also contributes to dynamically planning action and intervening on others' mental states. We present a novel task of such a 'planning theory of mind' (PToM), which requires agents to infer an interlocutor's beliefs and desires and persuade them to alter their behavior. We find that humans significantly outperform o1 (an LLM) at our task, even though o1 outperforms humans in a baseline condition which requires minimal mental state inferences. The results suggest that LLM performance at other ToM tasks may be attributable to simpler predictive abilities, while people excel at counterfactual planning when reasoning about others' behavior. Our paper is here: https://jaredmoore.org/mindgames

Knowledge Exerts Dissociable Effects on Target and Distractor Processing in Binocular Rivalry

The ability to filter out distractors is crucial for effective sensory processing. Since the world is rarely random, prior knowledge can guide selective attention. Using a binocular rivalry paradigm combined with steady-state visually evoked responses, we investigated the neural dynamics of target and distractor processing. Behavioral enhancement was observed in the target-cueing condition, with no significant cost in the distractor-cueing condition. Single-trial analysis revealed that the distractor-related cost was mitigated by the complementary roles of parietal alpha and frontal theta. Specifically, prominent frontal theta activity during the rivalry phase indicated reactive control, reducing the distractor's sensory strength. Additionally, prior knowledge of distractors induced strong parietal alpha activity, reflecting pre-tuning of the attentional gate, which stabilized the target signal and blocked the distractor without affecting sensory gain. Our results reveal a dual mechanism for resolving visual competition, highlighting the distinct yet cooperative roles of parietal alpha and frontal theta.

Toddlers' mapping of emotion words to facial expressions and body postures in a looking-while-listening task

Traditional research on children's emotion word comprehension has relied on explicit-response tasks and focused primarily on facial expressions, potentially underestimating early abilities. Using a looking-while-listening paradigm, this study examined whether 18- to 30-month-old children (N=100) could map emotion words to combined facial and bodily expressions. On each trial, children heard an emotion word while viewing a pair of emotional expressions that were either across valence (e.g., happy vs. sad) or within valence (e.g., sad vs. angry). Children aged 24-30 months preferentially looked at the matched expression on both trial types, while children aged 18-24 months performed at chance levels. These findings suggest that the ability to map emotion words to facial and bodily emotional expressions emerges around two years of age.

Human newborns spontaneously attend to prosocial interactions

Homo sapiens maintain complex cooperative interactions with unrelated individuals by exploiting various cognitive mechanisms, for instance empathic reactions, the ability to tell cooperative agents from non-cooperative ones, a preference for prosocial individuals, and a desire for the punishment of antisocial individuals. The key role played by these features across moral systems suggests that core processes underpinning people's moral sense are evolved adaptations. Initial evidence consistent with this nativist view came from studies showing preferences for prosocial individuals in preverbal infants. Here we show that 5-day-old neonates can distinguish prosocial from antisocial interactions, looking longer at affiliative/helping behaviors than at avoidant/hindering behaviors. These visual preferences are specific to socially interactive stimuli, helping to rule out low-level perceptual explanations for the results. By revealing a preference for prosocial actions in newborns, these findings provide significant support for theories that posit evolutionary bases for at least some components of the human moral sense.

Usage frequency predicts lexicalization across languages

Languages are more likely to have lexical items for some concepts (e.g. CHILD) than others (e.g. PARENT). We propose that the communicative need of a concept influences how often it is lexicalized across languages, and test the hypothesis that usage frequency (which also reflects communicative need) predicts lexicalization across languages. Our analyses consider more than a thousand concepts, and demonstrate that average usage frequency across dozens of languages is a relatively good predictor of the typological prevalence of lexicalization across hundreds of languages. This finding implies that cross-linguistic regularities in lexicalization can be attributed in part to shared communicative need across cultures.

The Effect of Representational Compression on Flexibility Across Learning in Humans and Artificial Neural Networks

Humans can generalise from past experiences to novel situations as well as revise prior knowledge to flexibly adapt to changing contexts and goals. The representational geometry framework formalises how information is structured in the brain and suggests that abstraction involves a trade-off between generalisation and flexibility. However, how task representations evolve across learning and relate to behaviour remains unclear. Here, we tested the hypothesis that compression of task representations across learning underlies impairments in flexibility. Using an extra-dimensional shifting task, we manipulated the pretraining length to control the degree of compression. In both humans and artificial neural networks, longer pretraining was associated with decreased flexibility. Network dynamics indicated that greater compression incurred a higher representational reorganisation cost, limiting flexibility. Introducing an auxiliary reconstruction loss maintained higher dimensionality and mitigated the flexibility impairment. These findings suggest that representational compression constrains flexibility, and that preserving representational richness enhances flexibility.

The trade-off between rule-based thinking and mutual benefit in tacit coordination

One way to solve a repeated coordination problem is to generalize from past solutions: acting based on precedent or relying on existing rules. An alternative way is to reason about what would be optimally mutually beneficial in the moment. We investigate the trade-off between backward-looking behavior based on precedent and forward-looking reasoning about mutual benefit in a novel real-time incentivized coordination game (n = 252; 22,680 choices; preregistered). Then, we develop a cognitive model based on virtual bargaining and Bayesian inverse joint-planning which combines two components: one based on precedent, and one based on mutual benefit. Our model captures participants' behavior in the task, performs better than alternatives, and reproduces key differences between conditions in simulations.

Looping towards certainty: The role of flight loops in homing pigeon navigation

Homing pigeons develop more efficient routes over repeated flights from a given location, but many open questions remain regarding the mechanisms behind this ability. This study examines the role of flight loops in navigation—instances where pigeons circle at specific locations during their homeward journeys. While loops are a source of navigational inefficiency, their navigational utility, if any, has not been studied. We adopt a data-driven approach to test hypotheses about looping on a GPS dataset of pigeons homing from a novel release site. We found that looping decreases with experience, with birds performing fewer and shorter loops with repeated releases. Less efficient navigators exhibited significantly more looping behavior. Our analysis revealed that directional uncertainty tends to decrease after loops compared to before, suggesting that loops may be an information-gathering mechanism. Additionally, locations where pigeons performed loops were more likely to be revisited in subsequent flights, indicating these sites might correspond to important and/or salient landmarks. Together, these findings illuminate the extent to which looping is not a purely stochastic navigational event, but a deliberate strategy.

Core Logic: Fourteen-month-olds exclude physical possibilities that would make an agent irrational.

Humans have the capacity for flexible and abstract reasoning that combines knowledge across distinct domains. To investigate the developmental origin of this capacity, we asked whether preverbal infants with limited language and no formal education can use logic to integrate information across physical and social domains. Across three preregistered experiments, we show that both adults and 14-month-old infants spontaneously use disjunctive reasoning to integrate predictions about possible outcomes of a probabilistic physical event with expectations about rational social behavior. In Experiment 1, adults deduced the agent's preference by eliminating alternatives. In Experiment 2, infants were surprised when the agent's behavior violated a logically inferred preference. Experiment 3 ruled out alternative novelty-based explanations. These findings suggest that infant logical inference can be reduced neither to domain-specific representations of physical objects or social agents, nor to language, revealing general-purpose core logic as a foundation of human thought.

People use theory of mind to craft lies exploiting audience desires

Theory of Mind enables us to attribute mental states like beliefs and desires. We use it cooperatively, but we also use it adversarially, as when we lie. Prior work has shown people use Theory of Mind to craft lies to be believable to their audience, based on their audience's beliefs. But we usually also know something about our audience's desires. In this work, we ask a new question: Do people cater to their audience's desires by telling them what they want to hear? We propose that people expect others to be wishful thinkers—allowing their desires to color their beliefs—and exploit this by tailoring lies to audience desires. We implement this theory as a computational model and test it against human behavior in a novel task. This model quantitatively captures people's patterns of lying—both at the population and subject levels. This work advances our understanding of social cognition in adversarial interactions.

Six-Year-Olds Use an Intuitive Theory of Attention to Infer What Others See, Whom to Trust, and What They Want

Understanding the relationship between seeing and knowing is fundamental to social cognition. While research demonstrates that even infants grasp basic aspects of this relationship, prior work often treats perceptual access and knowledge as equivalent (e.g., "if you see it, you know it"). In reality, their connection is richer: more complex objects require longer to encode, and agents' looking patterns often reveal how well they have encoded something and how much they want it. Across three experiments, we investigated whether children understand these nuances. In Experiment 1, we found that by age six, children expect more complex objects to require longer looking times. In Experiment 2, children inferred that agents who looked longer were more likely to form accurate representations of what they observed. In Experiment 3, children reasoned that agents who looked longer at an object were more likely to want it. Together, these findings suggest that by age six, children develop an intuitive theory of attention, enabling them to make sophisticated inferences about others' mental states based on looking behaviors.

Blind Speakers' Path Gestures Are More Precise Than Those of Sighted and Blindfolded Speakers

Co-speech gestures arise from an interaction between visuospatial experience and speech formulation. Congenitally blind speakers produce gestures, but fewer than sighted speakers when describing spatial events. This study explores whether visual experience influences gesture kinematics to better understand the cognitive processes underlying gesture production. We conducted an auditory task where all participants listened to sounds of motion events (e.g., someone walking from a door). We analyzed co-speech path gestures (depicting the trajectory of the motion) spontaneously produced by 20 blind, 21 blindfolded, and 21 sighted Turkish speakers. We compared the alignment of speakers' path gestures with the actual spatial trajectory of the motions, along with other kinematic features—duration, size, and speed. Blind speakers produced longer-lasting and larger gestures than sighted speakers. Blind speakers' gestures also showed greater precision than those of non-blind speakers, aligning with spatial cognition research. Thus, altered spatial cognition shapes gestures during event description.

Belief Attribution as Mental Explanation: The Role of Accuracy, Informativity, and Causality

A key feature of human theory-of-mind is the ability to attribute beliefs to other agents as mentalistic explanations for their behavior. But given the wide variety of beliefs that agents may hold about the world and the rich language we can use to express them, which specific beliefs are people inclined to attribute to others? In this paper, we investigate the hypothesis that people prefer to attribute beliefs that are good explanations for the behavior they observe. We develop a computational model that quantifies the explanatory strength of a (natural language) statement about an agent's beliefs via three factors: accuracy, informativity, and causal relevance to actions, each of which can be computed from a probabilistic generative model of belief-driven behavior. Using this model, we study the role of each factor in how people selectively attribute beliefs to other agents. We investigate this via an experiment where participants watch an agent collect keys hidden in boxes in order to reach a goal, then rank a set of statements describing the agent's beliefs about the boxes' contents. We find that accuracy and informativity perform reasonably well at predicting these rankings when combined, but that causal relevance is the single factor that best explains participants' responses.

Racial diversity and racial representation in U.S. children's books

It is well accepted in developmental and cognitive science that children represent the structure of their environment. This skill is generally useful, one that allows children to acquire the language(s) to which they are exposed, learn social rules, or represent relevant categories. But could this skill also have pernicious effects? Indeed, contemporary theories argue that children's racial biases emerge from being exposed to racial inequalities in their environment. Therefore, understanding – and ultimately interrupting – the development of racial biases requires better understanding the sources in young children's environment that perpetuate racial inequalities so that they can be corrected. Here we focus on a common aspect of children's environment – children's books – and document two features of how racial groups are depicted in U.S. children's books: the racial diversity (frequency) and racial representation (themes) in U.S. children's picture books. Through a meta-analysis (Study 1) and an information-theory analysis of an existing book collection (Study 2), we show that characters from minoritized racial backgrounds are not only numerically underrepresented in U.S. children's books, but there are also meaningful differences in the themes that are more likely to be associated with different racial groups. We discuss the implications of these results for developing theories of bias development that are grounded on real-world structure and for designing effective bias reduction approaches in childhood.

Papers with Poster Presentation

Multimodal Pragmatic Inference in Vision-Language Transformers

Contemporary transformer models have achieved human-like performance on many text-based tasks. However, real-world communication requires the integration of language with non-linguistic context (e.g., visual, social, etc.). Here, we study such information integration in three multimodal transformer models. We test these models' pragmatic capabilities regarding referring expressions: when an object set contains two exemplars from the same category that differ in size, unambiguously referring to one of them requires a size adjective (e.g., the big hammer); the adjective is unnecessary if only one exemplar from the category is present. We evaluate these inferences when models process text-image inputs (via their surprisal for infelicitous vs. felicitous adjective use) and when they generate open-ended descriptions of images given text prompts. We find evidence for pragmatic integration of visual and linguistic context in all models. However, these inferences remain sensitive to the in-context statistics of visual inputs, unlike pragmatic inference in humans.
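
To make the surprisal-based evaluation concrete, here is a minimal, text-only sketch of comparing a model's surprisal across contexts in which a size adjective is pragmatically required versus redundant. It uses a small causal language model as a stand-in for the multimodal models discussed above, sums surprisal over the whole sentence rather than isolating the adjective, and relies on illustrative example sentences; all of these are assumptions, not the study's procedure.

```python
# Hedged sketch (not the study's code): total surprisal of a sentence under a
# small causal LM, compared across contexts where a size adjective is
# pragmatically required vs. redundant.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def total_surprisal(text: str) -> float:
    """Negative log-likelihood (nats) summed over all predicted tokens."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    # out.loss is the mean per-token NLL; scale by the number of predictions.
    return out.loss.item() * (ids.shape[1] - 1)

# Hypothetical stimuli: the adjective is needed with two hammers, not with one.
required = "There are two hammers on the table. Hand me the big hammer."
redundant = "There is one hammer on the table. Hand me the big hammer."
print(total_surprisal(required), total_surprisal(redundant))
```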

DPMT: Dual Process Multi-scale Theory of Mind Framework for Real-time Human-AI Collaboration

Real-time human-artificial intelligence (AI) collaboration is crucial yet challenging, especially when AI agents must adapt to diverse and unseen human behaviors in dynamic scenarios. Existing large language model (LLM) agents often fail to accurately model the complex human mental characteristics such as domain intentions, especially in the absence of direct communication. To address this limitation, we propose a novel dual process multi-scale theory of mind (DPMT) framework, drawing inspiration from cognitive science's dual process theory. Our DPMT framework incorporates a multi-scale theory of mind (ToM) module to facilitate robust human partner modeling through mental characteristic reasoning. Experimental results demonstrate that DPMT significantly enhances human-AI collaboration, and ablation studies further validate the contributions of our multi-scale ToM in the slow system.

Constituency tests in human adults' language of thought for geometry

Humans can remember arrays or sequences of stimuli that exceed working memory limits in many domains, from auditory sequences to geometric shapes. This ability has been interpreted as evidence for language-like representations that compress stimuli into compact descriptions. We extend this evidence in the domain of geometry by showing that representations of geometric shapes are not only compressed but also syntactically structured. Experiment 1 shows that different representations can be induced for the same geometric shape, indicating structural representation. Experiment 2 shows that subparts of a shape are easier to recognize when they belong to the same subtree than when they do not, indicating hierarchical organization. Taken together, the results indicate that geometric shapes are encoded in representations that possess internal syntax, just like natural language sentences.

Go Big or Go Hoax: Explanatory Scope and the Believability of Conspiracy Theories

Conspiracy theories explain the cause of world events through the machinations of shadowy, secret groups. Understanding what in conspiracy theories makes them appealing explanations is an important area of research. People like explanations that can account for a large number of seen events (broad explanatory scope), while not accounting for every possible unseen event (narrow latent scope). It is unknown if people think conspiracy theories have broad or narrow scope and how this may relate to their believability. We thus explored perceptions of conspiracy theory explanations' scope and how this relates to their believability. Participants rated 40 conspiracy theories and their fact-based alternative explanations. Fact-based explanations were seen as having larger explanatory and latent scope. Additionally, larger scope was positively correlated with higher believability for both explanation types. We discuss how these findings relate to the explanation literature and highlight important elements of the seductive appeal of conspiracy theories.

Applications of transcranial Direct Current Stimulation (tDCS) for modulating the Face Inversion Effect (FIE): Reducing and Enhancing Recognition

We report a large study examining the effects of tDCS on the FIE. Subjects were randomly assigned to one of four tDCS groups and engaged in an old/new recognition task involving upright and inverted faces: 1) sham tDCS during the study phase and recognition task; 2) anodal tDCS during the study phase followed by sham tDCS during the recognition task; 3) anodal tDCS during the study phase followed by cathodal tDCS during the recognition task; 4) cathodal tDCS during the study phase followed by sham tDCS during the recognition task. Group 2 confirmed that anodal tDCS reduces the FIE vs. sham (Group 1) by disrupting performance for upright faces. Group 3 showed that cathodal tDCS applied after anodal tDCS increased the FIE vs. Group 2, bringing it back to sham level, by enhancing performance for upright faces. Group 4 revealed that cathodal tDCS applied after sham has no effect on the baseline FIE.

Top-Down Biases for Lexicality and Frequency in Both Monosyllabic and Disyllabic Stimuli: Evidence from Cantonese

Also known as the Ganong effect, a lexicality bias effect – i.e., the bias to interpret an ambiguous sound as the phoneme that yields a real word in its context – has been widely replicated. The search for a similar frequency bias effect, on the other hand, has yielded mixed results: In English, a bias has been observed such that listeners tend to interpret an ambiguous sound as the phoneme that yields a higher-frequency word, but this has failed to be replicated in Mandarin. One difference between these studies is the use of monosyllabic vs. disyllabic stimuli. To determine the factors that influence the presence of a bias effect, the present study tested for both frequency and lexicality bias effects using monosyllabic and disyllabic stimuli in Cantonese. Results show that the lexicality and frequency bias effects can be elicited in both monosyllabic and disyllabic stimuli, but the frequency effect is weaker.

What You Ask Affects What You Get: Task-Dependent ERPs in the Processing of Syntactically Ambiguous Sentences

Two experiments investigated how task demands influence sentence processing mechanisms, as reflected in neural responses at the disambiguating word. Participants read garden-path sentences with late-closure ambiguity (e.g., While the man hunted the deer ran into the woods) and answered comprehension questions while their brainwaves were recorded. Experiment 1 used standard questions (e.g., Did the man hunt the deer?), while Experiment 2 used "explicit" questions (e.g., Did the sentence explicitly say that the man hunted the deer?) to reduce inference-based responses. Results showed a typical P600 effect for ambiguous sentences in Experiment 1, indicating syntactic reanalysis. In Experiment 2, responses showed an N400 followed by Sustained Frontal Negativity, suggesting a shift toward plausibility evaluation and increased processing load. The explicit questions appear to alter underlying sentence processing mechanisms, prompting more effortful resolution of interpretive conflict beyond initial syntactic reanalysis.

Individual differences in encoding style moderate framing effects on risk-taking

Risk attitudes determine how decision-makers resolve tradeoffs when decisions involve uncertainty. The framing of a decision problem can affect these attitudes. Regulatory focus theory holds that (promotion-focused) frames that emphasize acquiring gains induce higher risk-taking than (prevention-focused) frames that emphasize avoiding losses. Here, we examine how this framing effect is moderated by individual differences in the internality of encoding style—the readiness to construe stimuli in terms of expectancies and pre-existing categories. In two experiments, participants could obtain costly information to aid their focal decision; thus, a risky choice corresponded to obtaining little or no information. Payoffs were framed to emphasize either gaining a bonus or retaining an endowed budget. The results of both experiments suggested that individuals with a more internal encoding style were more likely to be affected by the payoff framing. These results suggest that the effectiveness of framing on risk-taking depends on individual differences in cognitive processing style.

A Rational Model of Dimension-reduced Human Categorization

Humans can categorize with only a few samples, despite stimuli having numerous features. To mimic this ability, we propose a novel mixture of probabilistic principal component analyzers (mPPCA) model with dimension-reduced category representations, along with a theoretical analysis of rational dimensionality choices in categorization. Tests on the CIFAR-10H natural image categorization dataset show that introducing a single principal component for each category effectively improves predictions of human categorization patterns. We further use mPPCA to account for human category generalization with very few samples. In our experiments with visual patterns of varying size and color, combining principal components and the hierarchical prior leads to significantly better predictions of human generalization within and beyond previously learned categories.
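
For readers unfamiliar with the model class, the following toy sketch shows the basic logic of a mixture of probabilistic principal component analyzers used as a classifier: one low-dimensional probabilistic PCA per category, with items assigned to the category under which they are most likely. This is a generic illustration under assumed details, not the paper's implementation, which additionally involves a hierarchical prior and an analysis of rational dimensionality.

```python
# Hedged sketch of an mPPCA-style classifier: fit one probabilistic PCA per
# category, then classify by maximum log-likelihood plus log prior.
import numpy as np
from sklearn.decomposition import PCA

class SimpleMPPCA:
    def __init__(self, n_components=1):
        self.n_components = n_components
        self.models = {}      # one PPCA per category
        self.log_priors = {}  # log P(category)

    def fit(self, X, y):
        n = len(y)
        for c in np.unique(y):
            Xc = X[y == c]
            # sklearn's PCA implements Tipping & Bishop's probabilistic PCA;
            # score_samples() returns per-sample log-likelihoods under it.
            self.models[c] = PCA(n_components=self.n_components).fit(Xc)
            self.log_priors[c] = np.log(len(Xc) / n)
        return self

    def predict(self, X):
        cats = sorted(self.models)
        # log P(x | c) + log P(c) for every category, then take the argmax.
        scores = np.column_stack(
            [self.models[c].score_samples(X) + self.log_priors[c] for c in cats]
        )
        return np.array(cats)[scores.argmax(axis=1)]

# Toy usage with random features standing in for image representations.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 20)), rng.normal(3, 1, (50, 20))])
y = np.array([0] * 50 + [1] * 50)
print((SimpleMPPCA(n_components=1).fit(X, y).predict(X) == y).mean())
```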

Fine-tuning semantic vectors with semantic fluency data

Semantic vectors derived from training on large text corpora (e.g., word2vec, BERT) are widely used as a methodological tool to model similarity of concepts. Recent work has demonstrated that a small amount of human training data can be used to fine-tune these vectors for modeling specific tasks. For example, human ratings of pairwise similarity can be used to estimate a set of dimensional weights, and these weights can improve estimates of human similarity ratings for held-out pairs. We applied this methodology to the semantic fluency task (listing items from a category) and found that category-specific weights can be used to identify the semantic category of a fluency list. The results have methodological implications for modeling retrieval in semantic fluency tasks, estimating semantic representations, and identifying semantic clusters and switches in fluency data.
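
The general recipe described here, estimating per-dimension weights from human similarity data and reusing them for held-out items, can be illustrated with a small sketch. The optimizer, the toy vectors, and the ratings below are assumptions for demonstration, not the authors' procedure.

```python
# Hedged sketch: learn non-negative per-dimension weights on semantic vectors
# so that weighted cosine similarity better matches human similarity ratings.
import numpy as np
from scipy.optimize import minimize

def weighted_cosine(u, v, w):
    uw, vw = u * w, v * w
    return uw @ vw / (np.linalg.norm(uw) * np.linalg.norm(vw) + 1e-12)

def fit_weights(vec_pairs, ratings, dim):
    """vec_pairs: list of (u, v) vector pairs; ratings: human similarities."""
    def loss(w):
        preds = [weighted_cosine(u, v, w) for u, v in vec_pairs]
        return np.mean((np.array(preds) - ratings) ** 2)
    res = minimize(loss, x0=np.ones(dim), bounds=[(0, None)] * dim,
                   method="L-BFGS-B")
    return res.x

# Toy demonstration with random "semantic vectors" and ratings generated from
# hidden dimension weights; the fit should recover weights correlated with them.
rng = np.random.default_rng(1)
dim, n_pairs = 10, 200
true_w = rng.uniform(0, 2, dim)
vec_pairs = [(rng.normal(size=dim), rng.normal(size=dim)) for _ in range(n_pairs)]
ratings = np.array([weighted_cosine(u, v, true_w) for u, v in vec_pairs])
w_hat = fit_weights(vec_pairs, ratings, dim)
print(np.corrcoef(w_hat, true_w)[0, 1])
```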

Maybe She'll Say Yes: How Young Learners Acquire and Apply Knowledge about Inconsistent Causal Relationships from Different Domains

Children are adept at learning the principles and properties governing their environment. However, this environment is often highly inconsistent: causes do not always bring about their effects; people do not always act according to their preferences. Past research shows that young causal learners readily reason from probabilistic evidence, but little is known as to how they reason about that evidence. This study presented preschoolers (N=114) with the behavior of three different causes—one consistently effective, one consistently ineffective, and one inconsistent—from one of three domains (social, mechanical, biological) and asked children to predict the future behavior of each. Children's predictions not only captured the different degrees of inconsistency observed in the evidence but also reflected differences in prior knowledge and expectations about inconsistency between domains. These results offer a novel, more nuanced look into early causal cognition and often-overlooked complexities of causal learning and reasoning in the real world.

Overcoming Science Misconceptions: When is Refutation an Effective Tool for Knowledge Revision?

Evidence from studies of knowledge revision shows that refutation texts are an effective tool for helping readers correct their inaccurate understandings. While refutation shows immediate benefits in group comparisons, it is unclear whether these benefits are restricted to particular retention intervals, whether they generalize across topics, or what predicts knowledge revision following refutation. In the present study, participants evaluated misconceptions across various science topics, then read a mix of refutation and expository texts and were prompted to rate their surprise and confusion after each text. Participants then re-evaluated the misconceptions immediately after reading the texts and again at a two-week delay. While both text types reduced misconception endorsement, refutation texts led to a greater reduction in misconception endorsement on both the immediate posttest and at the two-week delay. Finally, exploratory analyses suggest that participants' ratings of surprise predicted knowledge revision and partially mediated the effect of text type on knowledge revision.

Improving Human Answers Quality by Machine Questions Number and Context Factors

Mobile phones provide an opportunity for symbiotic interaction between humans and machines, allowing phones to collect human-centric data anytime and anywhere. However, users may provide low-quality (i.e., incorrect) answers when they are asked excessive questions or are in unsuitable contexts (e.g., driving). To address this problem, we aim to design a methodology for collecting more correct answers. We propose using answer reaction time to annotate answer quality, and identifying a suitable number of daily questions and the context factors that need to be considered based on users' history records. We validated our methodology on a public dataset collected in an extensive four-week in-the-wild study at the University of Trento, Italy. The results reveal that context information and the number of daily questions are factors that can impact user answer behavior and, therefore, answer quality.

MAM-GAN: Multimodal association modeling based on generative adversarial networks for Alzheimer's disease diagnosis

Alzheimer's disease (AD) is a highly heritable neurodegenerative disease, and brain imaging genetics (BIG) has become a key area for understanding its pathogenesis. However, existing methods often ignore the complex interrelationships between the multiple factors that lead to AD, especially when exploring the intrinsic connection between brain imaging features and gene variation. To address this challenge, we proposed a multimodal association modeling framework (MAM-GAN) based on generative adversarial networks, which aims to deeply reveal the association between genes and brain imaging features and apply it to disease state prediction. To verify the effectiveness of the framework, we conducted experiments using public datasets, and the results showed that MAM-GAN performed well in two classification tasks and successfully identified biomarkers closely related to AD.

Can Visual Fixations Explain Context-Dependent Reinforcement Learning?

Context-dependent reinforcement learning (RL) challenges the assumption that decision makers encode the absolute values of choice outcomes. This study investigates whether the associated choice biases arise from a relative encoding of outcomes or an alternative mechanism involving cumulative reward learning and selective attention to outcomes. Using eye tracking, participants completed a RL task where choice options were initially learned in fixed contexts before being tested in novel pairings. Results revealed an overall preference for options that were contextually favored in the learning phase, even when these preferences violated expected value maximization. Computational model comparisons demonstrated that hybrid encoding models, incorporating absolute and relative values, provided the best overall account of individual behavior. While eye fixations on choice outcomes decreased over trials, fixation-dependent RL models did not fit the data well, suggesting that overt visual attention patterns do not fully explain context-dependent choice biases.
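
As a generic illustration of the absolute versus relative encoding distinction at issue here (not the specific models compared in the paper), a single delta-rule update can mix the two outcome codes with one weight parameter; the parameter values and context definition below are assumptions for demonstration only.

```python
# Hedged sketch: one Q-learning update under absolute vs. relative outcome
# encoding. relative_weight = 0 encodes the raw reward; relative_weight = 1
# re-references the reward to the running value of the choice context.
def q_update(q, reward, context_value, alpha=0.2, relative_weight=0.5):
    encoded = reward - relative_weight * context_value
    return q + alpha * (encoded - q)

# Toy example: the same 0.5-point reward is encoded more favorably in a poor
# context (mean value 0) than in a rich context (mean value 1) by a fully
# relative encoder, which can later bias choices in novel pairings.
print(q_update(0.0, 0.5, context_value=0.0, relative_weight=1.0))
print(q_update(0.0, 0.5, context_value=1.0, relative_weight=1.0))
```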

Generating Representations In Space with GRIS

When conducting experimental research, the research questions are often inherently linked (and limited) to the paradigm that is used. In this paper, we present a new experimental tool—GRIS (Generating Representations in Space)—that builds experiments where participants can manipulate objects on a screen. Through a series of three experiments on sentence acceptability, category typicality, and multi-dimensional similarity, we demonstrate how GRIS-based experiments allow cognitive scientists to approximate representational spaces for a variety of cognitive phenomena, expanding the set of possible research questions that cognitive scientists may ask.

Industry Influencing Collective Scientific Reasoning: A Bayesian, Agent-based Exploration

Recent work in Bayesian, agent-based modelling of scientific communities has employed the Bala-Goyal framework to study the mechanisms involved when industry influence applies the so-called 'Tobacco Strategy' to undermine collective inquiry. Motivated by limitations of these models, we propose an alternative based on a recently introduced framework for normative argument exchange across networks. We implement representations of two distinct types of industry influence: 'Obfuscating' influence directs inquiry to experiments with low expected value of information. 'Misleading' influence filters private research and only communicates misleading signals from the world. We explored the impacts of both strategies on the polarization and mean error of, and flow of information through, social networks of scientists via computer simulations. We conclude that even against highly optimistic background assumptions, and in a less simplified model of inquiry and argumentation, industry influence poses a plausible threat to collective deliberation.

The intrinsic drive for knowing boosts pro-environmental choices

The ongoing climate crisis demands massive changes in people's lifestyles. Behavioral economics has highlighted the use of extrinsic incentives (e.g., money) as a powerful tool for changing behavior. However, external incentives come with significant costs, making them feasible primarily for wealthier countries. Here, following recent insights from the field of curiosity and information-seeking, we explore whether internal incentives such as the intrinsic drive for knowing can motivate people to act more pro-environmentally. By developing a novel decision-making task, we showed that the drive for knowing predicts pro-environmental choices. Moreover, participants chose eco-friendly options more often when their values were unknown than when they were known. Results from this study hold the potential to inform the development of future behavioral interventions, although a replication of its findings is still ongoing.

Adaptive use of vagueness to coordinate joint action

"Let's share the load" is a much less helpful way to coordinate cleaning the house than "You take the bedrooms, I'll do the kitchen" - unless there are 12 bedrooms. Vague plans can be useful tools to support joint action, but using vagueness effectively is a difficult computational problem. Participants in a joint planning study selected between specific and vague plans to coordinate action across a systematic range of problems. In Experiment 1, participants deployed vague plans selectively, recognizing situations where the certainty of a bad plan is outweighed by the flexibility of a vague plan according to a probabilistic model of joint reasoning. In Experiment 2, participants with greater exposure to such situations endorsed vague plans when providing generic testimony to future actors. Our results highlight an understudied but potentially powerful dimension of human joint planning: strategic use of vague construals.

Young Children Spontaneously Appreciate the Perspectives of their Social Partners

Classic research has found that young children are often egocentric when reasoning about others' visual experiences. In two experiments (total N = 148), we investigated 3- to 4-year-old children's abilities to reason about others' distinct visual experiences when they are engaged in social actions. Across experiments, we found that young children spontaneously oriented pictures and books so that those objects appeared upright to their social partners. These findings suggest that past research has underestimated young children's understanding of others' minds.

Individual differences in the delay effect across scales

For more than two decades, researchers have been trying to explain the source of the processing cost of scalar implicature (SI). Although the computation of some SIs is associated with longer processing time (known as the delay effect), other SIs are processed cost-free. In this study, we investigated how individual differences in the rate of SI derivation modulate the delay effect across different scales. We reanalyzed four datasets from two SI verification task studies, which examined various scales. In these experiments, participants judged SI-triggering sentences as either true (literal reading) or false (SI reading). We fit a computational model to quantify the by-subject probability of computing SIs. Across datasets, we found that subjects who prefer the literal reading of the SI-triggering sentence were faster to respond true than false. However, the reading preferences modulate the verification speed differently for different scales. This suggests that the source of the delay effect might vary between scales.

Toward a Formal Pragmatics of Explanation

This paper presents a formal account of causal explanation, grounded in a theory of conversational pragmatics, and inspired by the interventionist idea that explanation is about asking and answering what-if-things-had-been-different questions. We illustrate the fruitfulness of the account, relative to previous accounts, by showing that widely recognized "explanatory virtues" emerge naturally, as do subtle empirical patterns concerning the impact of norms on causal judgments. An extended version of the paper with further details can be found here: https://arxiv.org/pdf/2505.03732

Resolving the Ambiguity of "In" and "On" Across Spatial and Abstract Contexts

The prepositions "in" and "on" are used in a wide variety of concrete, spatial contexts as well as abstract, non-spatial contexts. Previous research suggests that the meanings of these prepositions when used in abstract, non-spatial contexts might be grounded in the meanings these same prepositions have when they are used in concrete, spatial contexts; however, few studies have attempted to empirically test for evidence of these connections directly. The current study attempted to conceptually replicate Feist and Breaux (2013): No evidence of priming was observed. It could be that our prime stimuli did not adequately activate the meanings of "in" and "on" or that our participants' linguistic habits obscured any impact of the prime stimuli. Further research is needed to truly understand the role that grounded connections might play in the lexical semantics of words that are used in both abstract and spatial contexts.

How many factors underlie cognitive mechanics?

How people reason about the mechanics of the physical world is an important question for several different cognitive sciences. Education, cognitive psychology, and developmental psychology have each conducted large numbers of studies over the last several decades, largely in isolation from one another (especially in the last quarter century). The results have suggested that cognitive mechanics may be subserved by a number of mechanisms that are differentially involved in different tasks. Here, we report converging results from factor analysis of a large compendium of mechanics questions.

Concept and Feature Change in Scientific and Deep Neural Net Representations

Scientific representations and their constituent concepts change over time to reflect improvements in our understanding of the world. Similar improvements in understanding lead to changes in DNN-procured representations and their features. In this paper, we investigate whether useful methodological practices in concept change and in feature change carry across the two types of representations. We argue that there is indeed considerable potential for methodological cross-pollination and offer some examples of how such benefit may be derived.

Relations between number-knowledge and causal reasoning about number in young children: A preliminary investigation

Three experiments investigated preschoolers' ability to infer that numbers can be causally efficacious. Preschoolers observed that one of two quantities of objects activated a machine (i.e., a container holding 2 blocks activated a machine while a container holding 3 did not). Children were asked to determine whether novel containers with either 2 or 3 objects would activate the machine, and then construct their own container of objects that would do so. Four-year-olds, but not 3-year-olds, were above chance at both tasks given a contrast between the numbers ‘2' and ‘3' (Study 1), but performed less well when the contrast was between the numbers ‘4' and ‘6' (Study 2). The effect of age on understanding ‘2' was mediated when children's numerical knowledge was considered (Study 3). These results are interpreted in terms of children's causal reasoning and hypothesis-formation abilities, but also their developing knowledge of numbers.

Is AI-assisted Creativity an "Original Sin"?: Lay Judgments of Qualities Justifying Copyright Protection for Artworks Derived from AI- vs. Human-generated Sources

Recent legal rulings have denied copyright protection to artworks derived from AI-generated sources, because AI is assumed to be incompatible with qualities that define human authorship. We empirically test lay intuitions related to these assumptions in two studies (N = 235, N = 119) by investigating how creator attribution of initial source material (AI- vs. human-generated), effort investment in generating source material, and modification level of a derivative work influence perceptions of transformativeness, essence change, and creativity in derivative artworks. Modification level exerted the strongest influence across all measures, with dramatic modifications rated significantly higher than slight or no modifications. Effort investment in generating source material only influenced creativity ratings, with less effort sometimes perceived as more creative. Most notably, creator attribution for source material had minimal impact. These results challenge current copyright doctrine by demonstrating that lay human observers prioritize degree of transformation over both effort and creator attribution for source material. Our findings suggest that legal frameworks should recognize that AI assistance in generating artworks does not preclude a genuine human contribution that merits copyright protection.

Tortoise Attention Algorithm: A Novel Computational Tool for Measuring Children's Concentration

This study introduces the tortoise attention algorithm (TAA), a novel computational tool for accurately measuring children's concentration during learning activities. The algorithm integrates weighted behavior duration and temporal stability metrics to calculate a comprehensive concentration score. We conducted three studies to evaluate the effectiveness of the algorithm. Study 1 demonstrated that TAA's concentration scores significantly predicted performance on math tasks but found no significant relationship with executive function performance, as measured by the ANT test. Study 2 revealed that TAA outperformed humans in predicting children's math performance, underscoring the algorithm's ability to mitigate biases inherent in human assessments of fidgeting behaviors. Study 3 further showed that while human evaluations of concentration were consistent across classroom and café settings, they failed to align with actual learning outcomes. These findings highlight that TAA provides an objective and reliable tool for evaluating concentration, enabling educators to refine teaching strategies and improve learning outcomes.
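A hypothetical sketch of how weighted behavior durations and a temporal-stability term could be combined into a single concentration score; the weights, the stability definition, and the combination rule are illustrative assumptions, not the published TAA formula.

```python
# Combine on-task weighted durations with a switch-interval stability term.
import numpy as np

def concentration_score(durations, weights, switch_times, w_stability=0.3):
    """durations/weights: per-behavior seconds and on-task weights in [0, 1];
    switch_times: timestamps of behavior switches within the session."""
    total = sum(durations)
    weighted_on_task = sum(d * w for d, w in zip(durations, weights)) / total
    # Temporal stability: longer, more even intervals between switches score higher.
    intervals = np.diff(np.asarray(switch_times))
    stability = 1.0 / (1.0 + np.std(intervals) / (np.mean(intervals) + 1e-9))
    return (1 - w_stability) * weighted_on_task + w_stability * stability

print(concentration_score(durations=[120, 30, 50], weights=[1.0, 0.2, 0.6],
                          switch_times=[0, 120, 150, 200]))
```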

The Impact of Acute Social Stress on Working Memory Updating in Social Anxiety Disorder

Social anxiety disorder (SAD) is characterized by cognitive biases that impair emotional information processing. This study examines how acute social stress affects working memory (WM) updating in socially anxious individuals using an emotional 2-back task. A 2×2 factorial design (N = 137) categorized participants into socially anxious (SA) and non-anxious (NSA) groups, further divided into stressed and control conditions. Acute stress was induced via speech anticipation, and subjective stress ratings were recorded. Mixed-effects modelling revealed independent negative effects of both social anxiety and acute stress on WM updating, including reduced accuracy, increased false alarms, lower discriminability (d-prime), and increased inverse efficiency score. Notably, disgusted facial expressions enhanced task efficiency under stress. Findings suggest stress-related cognitive deficits in social anxiety are additive rather than interactive, highlighting potential targets for intervention. This study contributes to understanding emotion-cognition interactions and extends research to understudied cultural contexts.

KWS-TA-CNN Network: Towards Lightweight Mild Cognitive Impairment Detection Using Eye-Tracking Signals From Virtual Reality Stroop Test

Mild cognitive impairment (MCI) detection using eye-tracking (ET) signals in virtual reality (VR)-based cognitive tasks shows great promise, as it can capture rich temporal and behavioral information. Therefore, we build four VR-based tasks based on the Stroop test and construct a dataset for MCI detection using ET signals. However, ET signals often suffer from non-stationarity, variability, and redundancy, challenging accurate MCI detection. To address these issues, we propose a novel lightweight network KWS-TA-CNN with three key components: 1) Kymatio Wavelet scattering transform (KWS), which generates time-robust features and reduces memory usage through a depth-first traversal strategy; 2) Temporal Attention (TA) to dynamically weight critical time steps for MCI detection; and 3) 1D Convolutional Neural Network (CNN) to capture local temporal patterns and reduce feature redundancy. Experimental results from leave-one-subject-out cross-validation show high performance, with subject-level accuracies of 0.8158, 0.9211, 0.8158, and 0.8421 across the four tasks, demonstrating its strong clinical potential.
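A rough PyTorch sketch of the TA and CNN stages described above, operating on precomputed wavelet-scattering features (e.g., obtained with kymatio); the layer sizes and the specific attention form are illustrative assumptions, not the published architecture.

```python
# Temporal attention over time steps followed by a 1D CNN classifier head.
import torch
import torch.nn as nn

class TemporalAttention1DCNN(nn.Module):
    def __init__(self, in_channels, n_classes=2):
        super().__init__()
        self.attn = nn.Linear(in_channels, 1)            # one score per time step
        self.conv = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):                                 # x: (batch, channels, time)
        scores = self.attn(x.transpose(1, 2))             # (batch, time, 1)
        weights = torch.softmax(scores, dim=1).transpose(1, 2)
        x = x * weights                                    # reweight critical time steps
        return self.head(self.conv(x).squeeze(-1))

model = TemporalAttention1DCNN(in_channels=64)
print(model(torch.randn(8, 64, 128)).shape)               # torch.Size([8, 2])
```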

JMS2A: Joint Multi-source Domain and Two-step Alignment Strategy for Cross-subject EEG Emotion Recognition

Emotion recognition based on electroencephalography (EEG) plays a significant role in brain-computer interface (BCI) applications. However, individual differences often hinder the generalization of emotion recognition methods to unknown subjects. To address this, we propose an unsupervised domain adaptive model with a joint multi-source domain and two-step alignment strategy (JMS2A). The alignment strategy consists of two steps: (1) To capture structured information from the source domains, we combine multiple source domains into a mixed source domain. Simultaneously, a single source domain and the target domain are combined to form a pseudo-target domain, which is then indirectly aligned with the mixed source domain; (2) To extract latent class information from the target domain, we extend supervised contrastive learning to enable direct alignment between the source and target domains. We evaluated JMS2A on the SEED and SEED-IV datasets, achieving accuracies of 95.30% and 86.55%, respectively. Experimental results demonstrate that our approach outperforms state-of-the-art methods. The source code is available at https://github.com/cccyangt/JMS2A.
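A minimal sketch of the kind of Maximum Mean Discrepancy (MMD) term commonly used to align source- and target-domain feature distributions in such models; the Gaussian kernel bandwidth and the biased estimator below are illustrative assumptions, not the JMS2A implementation.

```python
# Biased MMD^2 estimate with a Gaussian kernel between two feature batches.
import torch

def gaussian_mmd(x, y, sigma=1.0):
    """x, y: (n, d) feature batches; returns a scalar MMD^2 estimate."""
    def kernel(a, b):
        d2 = torch.cdist(a, b) ** 2
        return torch.exp(-d2 / (2 * sigma ** 2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()

src = torch.randn(64, 128)            # mixed-source-domain features
tgt = torch.randn(64, 128) + 0.5      # pseudo-target-domain features
print(float(gaussian_mmd(src, tgt)))
```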

Word Embeddings Track Social Group Changes Across 70 Years in China

Language encodes societal beliefs about social groups through word patterns. While computational methods like word embeddings enable quantitative analysis of these patterns, studies have primarily examined gradual shifts in Western contexts. We present the first large-scale computational analysis of Chinese state-controlled media (1950-2019) to examine how revolutionary social transformations are reflected in official linguistic representations of social groups. Using diachronic word embeddings at multiple temporal resolutions, we find that Chinese representations differ significantly from Western counterparts, particularly regarding economic status, ethnicity, and gender. These representations show distinct evolutionary dynamics: while stereotypes of ethnicity, age, and body type remain remarkably stable across political upheavals, representations of gender and economic classes undergo dramatic shifts tracking historical transformations. This work advances our understanding of how officially sanctioned discourse encodes social structure through language while highlighting the importance of non-Western perspectives in computational social science.
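An illustrative sketch, with hypothetical words and random vectors standing in for a trained embedding space, of how a group-attribute association can be scored within one time slice: average cosine similarity of group terms to each attribute pole, compared across poles. This is a generic embedding-association measure, not necessarily the paper's exact metric.

```python
# Relative association of group terms with two attribute poles in an embedding space.
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(embeddings, group_words, attribute_words):
    sims = [cosine(embeddings[g], embeddings[a])
            for g in group_words for a in attribute_words]
    return float(np.mean(sims))

rng = np.random.default_rng(0)
emb = {w: rng.normal(size=50) for w in ["worker", "farmer", "wealthy", "poor"]}
bias = (association(emb, ["worker", "farmer"], ["wealthy"])
        - association(emb, ["worker", "farmer"], ["poor"]))
print(f"group-attribute association difference: {bias:.3f}")
```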

A Linguistic Analysis of Spontaneous Thoughts: Investigating Experiences of Deja Vu, Unexpected Thoughts, and Involuntary Autobiographical Memories

The onset of spontaneous thoughts reflects dynamic interactions between cognition, emotion, and attention. Typically, these experiences are studied through subjective appraisals that focus on their triggers, phenomenology, and emotional salience. In this work, we use linguistic signatures to investigate Déjà Vu, Involuntary Autobiographical Memories, and Unexpected Thoughts. Specifically, we analyze the inherent characteristics of the linguistic patterns in participant-generated descriptions of these thought types. We show how, by positioning language as a window into spontaneous cognition, existing theories on these attentional states can be updated and reaffirmed. Our findings align with prior research, reinforcing that Déjà Vu is a metacognitive experience characterized by abstract and spatial language, Involuntary Autobiographical Memories are rich in personal and emotionally significant detail, and Unexpected Thoughts are marked by unpredictability and cognitive disruption. This work demonstrates language's potential to reveal deeper insights into how internal spontaneous cognitive states manifest through expression.

A Bayesian Model of Confirmatory Exploration in Text-based Web Media

As web media, such as social networking services (SNS), become more prevalent, the formation of false beliefs through fake news and propaganda has become a significant problem. This study focuses on the cognitive process of users as actively information-seeking agents in web media exploration and proposes WEB-FEP, a computational model of users forming specific beliefs through interactions with web media. WEB-FEP specifically attempts to computationally reproduce confirmation bias in web media exploration by formalizing the trade-off between belief-confirmatory and exploratory actions, inspired by active inference. WEB-FEP is validated by comparing the results of simulations with user experiments conducted on a virtual SNS. The results indicate that the initial belief distributions and learning rates modeled in WEB-FEP can successfully reproduce the diverse behaviors of users, including confirmatory exploration.
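A toy sketch of the confirmation/exploration trade-off the model formalizes, under stated assumptions (the Beta-belief representation, the confirmation probability, and the toy "agreeing posts" rule are all hypothetical simplifications): an agent that mostly reads belief-confirming posts can maintain a belief far from the true rate.

```python
# Biased exploration: most reads confirm the current belief, few sample the world.
import random

def simulate(steps=200, p_confirm=0.8, true_rate=0.3, seed=1):
    random.seed(seed)
    a, b = 5.0, 1.0                      # optimistic initial Beta(a, b) belief
    for _ in range(steps):
        belief = a / (a + b)
        if random.random() < p_confirm:  # confirmatory: read agreeing posts
            evidence = 1 if belief >= 0.5 else 0
        else:                            # exploratory: sample from the world
            evidence = 1 if random.random() < true_rate else 0
        a, b = a + evidence, b + (1 - evidence)
    return a / (a + b)

print(f"final belief after biased exploration: {simulate():.2f} (true rate 0.30)")
```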

Do Large Language Models Truly Grasp Mathematics? An Empirical Exploration from Cognitive Psychology

The cognitive mechanism by which Large Language Models (LLMs) solve mathematical problems remains a widely debated and unresolved issue. Currently, there is little interpretable experimental evidence that connects LLMs' problem-solving with human cognitive psychology. To determine whether LLMs possess human-like mathematical reasoning, we modified the problems used in the human Cognitive Reflection Test (CRT). Our results show that even with the use of Chain-of-Thought (CoT) prompts, mainstream LLMs, including the o1 model (noted for its reasoning capabilities), have a high error rate when solving these modified CRT problems. Specifically, the average accuracy rate dropped by up to 50% compared to the original problems. Further analysis of LLMs' incorrect answers suggests that they primarily rely on pattern matching from their training data, which aligns more with human intuition (System 1 thinking) rather than with human-like reasoning (System 2 thinking). This finding challenges the belief that LLMs have genuine mathematical reasoning abilities comparable to humans. As a result, this work may adjust overly optimistic views on LLMs' progress toward Artificial General Intelligence.

Visual and Musical Aesthetic Preferences Across Cultures

Research on how humans perceive aesthetics in shapes, colours, and music has predominantly focused on Western populations, limiting our understanding of how cultural environments shape aesthetic preferences. We present a large-scale cross-cultural study examining aesthetic preferences across five distinct modalities extensively explored in the literature: shape, curvature, colour, musical harmony and melody. We gather 401,403 preference judgements from 4,835 participants across 10 countries, systematically sampling two-dimensional parameter spaces for each modality. The findings reveal both universal patterns and cultural variations. Preferences for shape and curvature demonstrate a consistent cross-cultural preference for symmetrical forms. While colour preferences are categorically consistent, ratio-like preferences vary across cultures. Musical harmony shows strong agreement in interval relationships despite differing regions of preference within the broad frequency spectrum, while melody shows the highest cross-cultural variation. These results suggest that aesthetic preferences emerge from an interplay between shared perceptual mechanisms and cultural learning.

Quantifying Risk Propensities of Large Language Models: Ethical Focus and Bias Detection through Role-Play

As Large Language Models (LLMs) become more prevalent, concerns about their safety, ethics, and potential biases have risen. Systematically evaluating LLMs' risk decision-making tendencies, particularly in the ethical domain, has become crucial. This study innovatively applies the Domain-Specific Risk-Taking (DOSPERT) scale from cognitive science to LLMs and proposes a novel Ethical Decision-Making Risk Attitude Scale (EDRAS) to assess LLMs' ethical risk attitudes in depth. We further propose a novel approach integrating risk scales and role-playing to quantitatively evaluate systematic biases in LLMs. Through systematic evaluation of multiple mainstream LLMs, we assessed the "risk personalities" of LLMs across multiple domains, with a particular focus on the ethical domain, and revealed and quantified LLMs' systematic biases towards different groups. This helps understand LLMs' risk decision-making and ensure their safe and reliable application. Our approach provides a tool for identifying biases, contributing to fairer and more trustworthy AI systems.

What does action do to object? The case of metaphoric action.

We examined whether embodiment effects at a particular word influence subsequent words. We recorded EEG while participants read action sentences that were literal-concrete (LC), literal-abstract (LA), and metaphorical (MET). Prior work showed that at the verbs, both LC and MET elicited more negative N400s than LA, reflecting sensorimotor simulations. We found that at the object nouns, LC elicited a more negative deflection in the 500-700 ms time window than LA and MET, which may reflect combined influences from embodiment spilled over from the verbs and imagery linked to noun concreteness and imageability. These findings suggest that literal action embodiment can yield motor engagement that extends into subsequent words, whereas metaphorical action verbs cannot. Moreover, lexical properties of the noun played a role.

The Uniformity Fallacy: A Second Common, Severe Misinterpretation of Bar Graphs of Averages

Past methods for studying graph interpretation have only indirectly assessed people's mental picture of the data that produced the graph. Recently, we developed a more direct, drawing-based measure and used it to reveal a severe misinterpretation of the common bar graph of averages: one in five viewers mistook the average for the data's outer limit. Here, we use the same measure to reveal a second misinterpretation, whereby even more viewers—one in three—incorrectly assume that data frequency remains approximately uniform over its entire range. Missing from their mental picture are the tails of the distribution—the relative rarity of extreme values—which are so characteristic of real data that they are embedded in the core normality assumption of statistics. We label this misinterpretation the "Uniformity Fallacy" and characterize its nature, reproducibility, generalizability, and correlates. We conclude that bar graphs of averages fail to communicate data truthfully in not one, but two fundamental ways.

The Importance of Metacognitive Sensitivity in Human-AI Decision-Making

In human-AI decision-making, understanding the factors that maximize overall accuracy remains a critical challenge. This study highlights the role of metacognitive sensitivity—the agent's ability to assign confidence scores that reliably distinguish between correct and incorrect predictions. We propose a theoretical framework to evaluate the impact of accuracy and metacognitive sensitivity in hybrid decision-making contexts. Our analytical results establish conditions under which an agent with lower accuracy but higher metacognitive sensitivity can enhance overall decision accuracy when paired with another agent. Empirical analyses on a real-world image classification dataset confirm that stronger metacognitive sensitivity—whether in AI or human agents—can improve joint decision outcomes. These findings advocate for a more comprehensive approach to evaluating AI and human collaborators, emphasizing the joint optimization of accuracy and metacognitive sensitivity for enhanced decision-making.
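An illustrative simulation, not the paper's framework, of the central idea: when the joint decision follows whichever agent reports higher confidence, a less accurate but more metacognitively sensitive partner can lift joint accuracy above either agent alone. The accuracy and sensitivity values, and the Gaussian confidence model, are assumptions for the sketch.

```python
# Confidence-weighted combination of two simulated binary classifiers.
import random

def simulate_agent(truth, accuracy, sensitivity, rng):
    """Return (prediction, confidence); higher sensitivity means confidence
    separates correct from incorrect trials more sharply."""
    correct = rng.random() < accuracy
    pred = truth if correct else 1 - truth
    base = 0.5 + (sensitivity / 2 if correct else -sensitivity / 2)
    conf = min(1.0, max(0.0, rng.gauss(base, 0.1)))
    return pred, conf

rng = random.Random(0)
trials, hits = 10_000, 0
for _ in range(trials):
    truth = rng.randint(0, 1)
    # Agent A: more accurate but poorly calibrated confidence.
    pa, ca = simulate_agent(truth, accuracy=0.80, sensitivity=0.2, rng=rng)
    # Agent B: less accurate but metacognitively sensitive.
    pb, cb = simulate_agent(truth, accuracy=0.70, sensitivity=0.8, rng=rng)
    joint = pa if ca >= cb else pb
    hits += joint == truth
print(f"joint accuracy of the confidence-weighted pair: {hits / trials:.3f}")
```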

Convergence, Reciprocity, and Asymmetry: Communication Accommodation Between Large Language Models and Users Across Cultures

The increasing adoption of conversational agents powered by large language models (LLMs) raises questions about their effects across culturally diverse interactions. While these agents are linguistically versatile and multilingual, their ability to adapt along cultural dimensions, defined as geographically and communally nurtured sets of values and behavioral norms, lacks close scrutiny in both their design and deployment. To achieve inclusive conversational AI, it is essential to understand how agents adapt to users from diverse cultural backgrounds. In this study, we analyze dialogues between human users from different countries and LLM-powered agents to examine how both parties adapt their word use, a salient aspect of linguistic style, toward one another throughout casual conversations. Our analysis reveals that LLMs exhibit varying degrees of style matching based on users' national cultures and demonstrate asymmetric adaptation when interacting with culturally diverse users. Moreover, we observe a reciprocal dynamic in which both the LLMs and users from certain cultures adjust their styles in response to one another. Additionally, our findings support the hypothesis that LLMs and users naturally converge in conversational style over the course of interactions, mirroring the accommodation and convergence dynamics of human conversations. To develop localized and culturally aware agents, there is potential to utilize such cross-cultural convergence processes during fine-tuning to align LLMs.

Orthographic Complexity Moderates Eye Movements While Reading in Hindi, Along with Length & Frequency

Past research on eye-movements during reading and comprehension has primarily focused on alphabetic scripts, such as the Roman script used to write European languages like English, Dutch, German, and Spanish, where classical measures like word length can be easily calculated by counting characters. However, this approach may not generalize to alphasyllabic languages like Hindi and other Indian languages written using the Devanagari script, where many characters depend on diacritic markers for proper pronunciation. This poses challenges for eye-tracking research on these languages and discourages eye-tracking studies with them. To address this gap, we asked 61 native Hindi speakers (L1) to read Hindi text while their eye-movements were tracked. Results revealed that a complexity metric for the script predicts variables such as first fixation duration, gaze duration, single fixation duration, total reading time, and number of fixations, over and above word frequency (van Heuven et al., 2014), for all eye-tracking measures.

The Emergence of Latent Force Representation in Human Perception of Social Interactions

Humans recognize social interactions effortlessly, even when presented with minimal visual information in unfamiliar displays. While force dynamics has been proposed as latent representations for perceiving social interactions, most research has approached this topic from a linguistic perspective based on conceptual knowledge, leaving open the central question of how latent force representations arise from visual inputs. The present study developed a force model that represents social interactions through two types of compositional forces: interactive forces, driven by interactions between agents; and self-propelled forces, driven by intentions of individual agents. Each force was formulated using a physics function to capture the dynamics of repulsive and attractive forces. We conducted two human experiments to measure human similarity judgments across a range of interaction animations and to evaluate recognition performance using generated animations in which the forces applied to individual agents were systematically manipulated. We found that the force model provides a parsimonious account for human judgments in both experiments. These findings suggest that mid-level representations based on compositional forces driven by different goals play an important role in social perception. We conjecture that the development of social perception may be grounded in perceptual mechanisms that support intuitive physics.
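A schematic sketch of the two force types described above: an interactive force between agents that falls off with distance, plus a self-propelled force toward each agent's own goal. The gain parameters, the inverse-square interaction, and the Euler update are assumptions for illustration, not the paper's fitted model.

```python
# Two-agent simulation combining interactive and self-propelled forces.
import numpy as np

def step(pos, goals, k_inter=1.0, k_self=0.5, dt=0.1):
    """pos, goals: (n_agents, 2) arrays; returns updated positions."""
    new_pos = pos.copy()
    for i in range(len(pos)):
        force = k_self * (goals[i] - pos[i])              # self-propelled force
        for j in range(len(pos)):
            if i == j:
                continue
            diff = pos[j] - pos[i]
            dist = np.linalg.norm(diff) + 1e-9
            force += k_inter * diff / dist**2              # attractive interaction
        new_pos[i] = pos[i] + dt * force
    return new_pos

pos = np.array([[0.0, 0.0], [2.0, 0.0]])
goals = np.array([[0.0, 2.0], [2.0, 2.0]])
for _ in range(5):
    pos = step(pos, goals)
print(pos)
```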

Whose Experience is it Anyway? Examining inter-subject variability in urban beauty and safety judgements

Recent research in urban visual analytics has focused on crowdsourcing judgements of qualities such as beauty and safety, for large-scale city-wide predictive evaluation. This study examines the extent and nature of inter-subject variability in such judgements and argues that these subjective qualities often have low generalizability across individuals. We conducted an online study involving 94 participants across 19 countries, where subjects arranged streetscape scenes on visual scales of beauty and safety. An analysis of the arrangements revealed very low inter-subject consistency, including within demographic groups based on age, sex, race and nationality. K-means clustering also revealed large clusters with contradictory judgements with respect to visual features. There was however higher intra-subject consistency when rating the same scenes twice. Based on these findings, we recommend a cautious approach to the use of "average" crowdsourced judgements of urban qualities, and encourage the adoption of subject-specific prediction when evaluating such qualities at scale.

Multi-view Feature Selection with Reinforcement Learning for EEG-based Automated ESES Diagnosis

Electrical status epilepticus during sleep (ESES) is a serious condition that causes notable cognitive decline. It is characterized by distinct spike and slow-wave patterns on electroencephalograms (EEG). Clinical ESES diagnosis is extremely time-consuming and labor-intensive as it demands clinicians to manually interpret and count EEG screens. Existing automated diagnosis algorithms for ESES have major flaws, like struggling to adapt to complex spike-and-wave patterns and not fully exploiting the rich multi-view features of EEG. To overcome these issues, we propose a multi-view feature selection framework integrating reinforcement learning and attention mechanisms for automated ESES diagnosis. A CLEAN reward mechanism is introduced to address complex multi-objective feature selection challenges. Experiments on the clinical data consisting of 36 epilepsy patients prove the proposed method's remarkable spike-and-wave identification ability and high agreement with expert diagnoses. Our approach represents a significant step toward developing automated bedside ESES clinical diagnostic systems.

The Impact of Physical Effort and Cybersickness on Environmental Learning and Navigation: A Comparison of Desktop and Treadmill Interfaces

In the present study, participants learned the locations of 12 landmarks by following a guided route either in a desktop virtual environment or on an omnidirectional treadmill paired with a virtual reality headset. Spatial learning and wayfinding efficiency were later assessed across both interface conditions. Additionally, the ratio of physical to cognitive costs for navigating to goals was manipulated across trials. Results indicated that although the two groups did not differ in spatial learning, participants navigating on treadmills selected more efficient routes than those in the desktop group in trials involving high physical cost. Higher levels of self-reported cybersickness were associated with reduced spatial learning and wayfinding efficiency, independent of interface condition. These findings validate the use of omnidirectional treadmills for investigating the tradeoff between cognitive and physical effort in navigation. At the same time, reducing cybersickness is essential to ensure the effective use of this technology.

How well do models of cross-situational word learning account for the learning of ambiguous words?

Existing theories of word learning largely focus on a learner's ability to learn a single meaning for a word despite the fact that many words have multiple meanings. Several computational models of cross-situational word learning have been proposed to explain how words are learned, but it is unknown to what extent they can learn ambiguous words with multiple meanings. Here, we present an experiment showing that adult learners are able to learn multiple meanings of novel ambiguous words in a cross-situational word learning paradigm, and are especially good at doing so when the meanings of the words are related (polysemous) rather than unrelated (homophonous). We evaluated the ability of ten different computational models of cross-situational word learning to explain the empirical data, and none were able to learn the ambiguous words as successfully as the adult learners. Moreover, because these computational models do not represent any semantic information, they are in principle unable to replicate the key difference between polysemous and homophonous word learning found in the study.
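A minimal sketch of a baseline associative cross-situational learner based on co-occurrence counting; the toy trials and the credit-splitting rule are assumptions. As the abstract notes, such models track word-object statistics but carry no semantic information, so related (polysemous) and unrelated (homophonous) meanings look identical to them.

```python
# Associative cross-situational learning by splitting credit across co-present referents.
from collections import defaultdict

counts = defaultdict(float)

def observe(words, objects):
    for w in words:
        for o in objects:
            counts[(w, o)] += 1.0 / len(objects)   # split credit across referents

def best_referents(word, k=2):
    scored = [(o, c) for (w, o), c in counts.items() if w == word]
    return sorted(scored, key=lambda x: -x[1])[:k]

# An ambiguous word "dax" that labels different referents across trials.
trials = [(["dax"], ["cup", "dog"]), (["dax"], ["cup", "tree"]),
          (["dax"], ["hat", "dog"]), (["dax"], ["hat", "ball"])]
for words, objects in trials:
    observe(words, objects)
print(best_referents("dax"))
```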

Prevalence-Induced Concept Change: Universal or Context-Dependent? Implications for Social Psychology and AI Cognition

Prevalence-induced concept change (PICC) occurs when reduced category prevalence increases the likelihood that ambiguous stimuli are classified as belonging to the now-minority category. PICC has been observed across perceptual and social domains and persists despite instructions or incentives to suppress it. However, other findings suggest its expression is instead shaped by social context. If AI models are to be treated as theories of cognition, they should exhibit PICC as well. We show that a standard AI model for sequential learning, trained on dynamic category distributions, does not display PICC, instead favoring the now-majority category. This opposite-PICC effect suggests that simple sequential learning may be insufficient to produce PICC. Additional mechanisms, such as structured priors, contextual sensitivity, or internal feedback, seem necessary for its emergence. Our findings contribute to the understanding of PICC and its implications for categorization theories, AI-driven decision-making, and the role of media exposure in shaping social perceptions.

Do Large Language Models Reason Causally Like Us? Even Better?

Causal reasoning is a core component of intelligence. Large language models (LLMs) have shown impressive capabilities in generating human-like text, raising questions about whether their responses reflect true understanding or statistical patterns. We compared causal reasoning in humans and four LLMs using tasks based on collider graphs, rating the likelihood of a query variable occurring given evidence from other variables. LLMs' causal inferences ranged from often nonsensical (GPT-3.5) to human-like to often more normatively aligned than those of humans (GPT-4o, Gemini-Pro, and Claude). Computational model fitting showed that one reason for GPT-4o, Gemini-Pro, and Claude's superior performance is that they didn't exhibit the "associative bias" that plagues human causal reasoning. Nevertheless, even these LLMs did not fully capture subtler reasoning patterns associated with collider graphs, such as "explaining away". These findings underscore the need to assess AI biases as they increasingly assist human decision-making.
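A worked numerical example (with hypothetical parameters) of "explaining away" in a collider graph C1 -> E <- C2 under a noisy-OR likelihood: learning that the alternative cause C2 is present lowers the probability of C1 given the effect E.

```python
# Exact enumeration of the collider posterior P(C1 | evidence).
from itertools import product

p_c1, p_c2 = 0.3, 0.3

def p_e_given(c1, c2, w1=0.8, w2=0.8, leak=0.05):
    """Noisy-OR likelihood of the effect given the two causes."""
    return 1 - (1 - leak) * (1 - w1) ** c1 * (1 - w2) ** c2

def posterior_c1(observed):
    """P(C1=1 | observed); `observed` is a dict like {'E': 1, 'C2': 1}."""
    num = den = 0.0
    for c1, c2 in product([0, 1], repeat=2):
        if 'C2' in observed and c2 != observed['C2']:
            continue
        p = (p_c1 if c1 else 1 - p_c1) * (p_c2 if c2 else 1 - p_c2)
        p *= p_e_given(c1, c2) if observed['E'] else 1 - p_e_given(c1, c2)
        den += p
        num += p * c1
    return num / den

print(f"P(C1 | E)       = {posterior_c1({'E': 1}):.3f}")
print(f"P(C1 | E, C2=1) = {posterior_c1({'E': 1, 'C2': 1}):.3f}  (explaining away)")
```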

FutureVision: A methodology for the investigation of future cognition

This paper presents a methodology combining multimodal semantic analysis with an eye-tracking experimental protocol to investigate the cognitive effort involved in understanding the communication of future scenarios. We conduct a pilot study examining how visual fixation patterns vary during evaluation of valence and counterfactuality in fictional ad pieces describing futuristic scenarios, using a portable eye tracker. Participants' eye movements are recorded while evaluating the stimuli and describing them to a conversation partner. Gaze patterns are analyzed alongside semantic representations of the stimuli and participants' descriptions, constructed from a frame semantic annotation of both linguistic and visual modalities. Preliminary results show that far-future and pessimistic scenarios are associated with longer fixations and more erratic saccades, supporting the hypothesis that fractures in the base spaces underlying interpretation of future scenarios increase cognitive load for comprehenders.

Can automated vocal analyses over child-centered audio recordings be used to predict speech-language development?

Understanding how children's spontaneous language behavior relates to standardized metrics of language development remains a crucial challenge in developmental science, particularly given the time and resources required for many traditional lab-based assessments. This study investigates whether automated analysis of naturalistic, child-centered audio recordings can index the developmental trajectory of speech-language abilities. Using a longitudinal design following N=130 preschoolers, we employed deep learning methods to compute Canonical Proportion - a theoretically-motivated metric that reflects both speech motor control development and phonological representation building - from naturalistic, child-centered audio recordings at age 3 years. Canonical proportion measures significantly predicted multiple dimensions of speech-language development longitudinally, formally assessed in the lab one year later at age 4. The strongest relationships were found for consonant articulation skill and vocabulary size, suggesting that early speech production patterns may moderately index numerous later facets of language development. These findings outline a potential relationship between children's spontaneous, everyday language behavior and more traditional language development metrics, while demonstrating the potential for automated measures to expand and diversify research in developmental science.

Dimensions of Vulnerability in Visual Working Memory: An AI-Driven Approach to Perceptual Comparison

Human memory exhibits vulnerability in cognitive tasks; comparing visual working memory with new perceptual input can cause unintended distortions. Prior studies report systematic memory distortions post-comparison, but understanding their impact on real-world objects and identifying contributing visual features remains challenging. We propose an AI-driven framework generating naturalistic stimuli based on behavioral object dimensions to elicit similarity-induced memory biases. Using two stimulus types—image wheels (dimension-edited) and dimension wheels (activation-based)—we conducted three visual working memory experiments under three conditions: no perceptual comparison, image wheel comparison, and dimension wheel comparison. Results show that both similar images and dimensions induce memory distortions. Visual dimensions (e.g., shape/texture) are more distortion-prone than semantic ones (e.g., category), indicating that naturalistic stimuli's object dimensions critically influence memory vulnerability.

Association between cognitive reserve and spatial ability across the lifespan

Spatial navigation is a promising cognitive marker for several cognitive disorders. However, its clinical utility is still limited because spatial abilities are also influenced by non-pathological factors. Among these, the Cognitive Reserve (CR) stands out due to its strong influence on cognitive aging trajectories. However, the association between CR and spatial navigation across the lifespan has never been assessed. In this study, we collected spatial navigation data along with CR measures in a population with diverse demographic profiles. We found a strong decline in spatial ability with age, but no overall effect of the CR. We systematically analyzed the association between each CR item and spatial ability and found that the subscore related to reading habits was significantly associated with spatial navigation performance, even after correcting for multiple comparisons. We discuss the implications of these findings for early and personalized screening for cognitive disorders using a spatial navigation task.

Passive Behavioral Sensing: Using Within-Person Variability Features from Mobile Sensing to Assess Self-Regulated Learning

Self-regulated learning (SRL) significantly influences students' learning behaviors and academic performance. However, research has focused on "between-person" differences, neglecting "within-person" variability. Traditional SRL assessments rely on self-reports, which fail to capture fine-grained behavioral changes (such as hourly variations). We propose a novel approach using mobile sensing to assess SRL through within-person variability. We use passive sensing data from the phones of 211 university students to explore this relationship. To assess behavioral variability, we focus on five sensing behaviors—physical activity, social interactions, sleep, location, and app usage data—and calculate four within-person variability features: standard deviation, circadian rhythm, regularity index, and flexible regularity index. Our findings reveal significant associations between these variability features and self-reported SRL skills, particularly in dimensions such as environment structuring, time management, and help seeking. This research provides new insights into assessing SRL and offers a theoretical foundation for future personalized interventions in educational settings.
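An illustrative sketch of two of the within-person variability features named above computed from hourly sensing data; the specific definitions (day-to-day standard deviation of totals, and a regularity index as the mean pairwise correlation of hourly profiles across days) are assumptions, not the study's exact formulas.

```python
# Within-person variability features from an (n_days, 24) hourly sensing matrix.
import numpy as np

def variability_features(hourly):
    """hourly: (n_days, 24) array of, e.g., minutes of phone use per hour."""
    daily_totals = hourly.sum(axis=1)
    sd = float(np.std(daily_totals, ddof=1))
    # Regularity: mean pairwise correlation of hourly profiles across days.
    corrs = np.corrcoef(hourly)
    regularity = float(corrs[np.triu_indices_from(corrs, k=1)].mean())
    return {"sd_daily_total": sd, "regularity_index": regularity}

rng = np.random.default_rng(0)
data = rng.poisson(lam=5, size=(14, 24)).astype(float)
print(variability_features(data))
```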

Symbolic numerical generalization through representational alignment

The mapping between nonsymbolic quantities and symbolic numbers lays the foundation for mathematical development in children. However, the neural mechanisms underlying this crucial cognitive bridge remain unclear. Here, we investigate the computational principles governing symbolic-nonsymbolic integration using a biologically inspired neural network trained through developmentally inspired stages. Our investigation reveals that generalization from nonsymbolic to symbolic numerical processing emerges specifically when representational alignment forms between these numerical formats. Notably, this alignment appears to be stronger in cross-format comparison-based mapping compared to direct-label-based mapping. Furthermore, we demonstrate that subsequent symbolic specialization creates a representational divergence that impairs nonsymbolic performance while maintaining the ordinal structure of the mapping. These findings highlight representational alignment as a fundamental mechanism in numerical cognition and suggest that targeted cross-format comparison tasks may be particularly effective in improving mathematical learning in children with numerical processing difficulties. Keywords: Emergence of number semantics, Representational alignment, Artificial neural network

Stimulus size influences gaze targets during free viewing of natural video

While modern computational models accurately predict free-viewing scanpaths via learned deep neural networks, investigation of the brain's implementation of the modeled behavior has thus far been limited primarily to bottom-up visual features. One of the key questions in investigating the early visual system's implementation of bottom-up visual saliency is how (and whether) it dynamically adapts its sensitivity to features of different spatial scales. This paper provides a simple test of whether identical stimuli presented at different spatial scales produce different gaze behavior. We asked subjects (n=12) to freely view video stimuli twice in each of two sessions. In one session (intervention) the visual scale changed from large (25 degrees of visual angle) to small (10 degrees) between viewings. In the other session (control) the video size was the same. Gaze was more strongly correlated between viewings of the same video size (r=0.265) than different sizes (r=0.231), independent of whether there was a long (> 24 hours) or short (< 10 min) delay between viewings, implying that memory effects are not a strong factor. Although low, these within-subject correlations are higher than the correlation of gaze between different subjects viewing identical videos (r=0.195).

Boosting Cognitive Modelling for Human Reasoning

AI models are often developed to solve reasoning problems optimally. In contrast, cognitive models focus on explaining and predicting recurring patterns of human information processing. While many such theories aim to explain an assumed 'general' human reasoner, only a few are aimed at the individual. This paper addresses the latter challenge by investigating the automatic generation of individualised predictive algorithms using transformer-based models. These models, which have been trained on huge amounts of human data, potentially exhibit built-in cognitive patterns. Leveraging these characteristics and the architecture of transformer-based models, we outline a generalized methodology for establishing a human-AI collaborative framework to generate explainable and reproducible algorithms with cross-domain applicability. While predictive accuracy and generalizability pose less of a problem, the bigger challenges in using machine-learning approaches or transformer-based models may be explainability and replicability. Hence, instead of 'just' using such a model to fit the data directly, we use it to extract features and to propose cognitive algorithms that are executable in systems outside of the model. Using two datasets pertaining to syllogistic and spatial reasoning, the predictive algorithms generated by applying the presented framework achieve mean accuracies of 68% and 81%, respectively. Both algorithms outperform other established cognitive models by a wide margin, surpassing the previously best state-of-the-art models in syllogistic and spatial human reasoning by 19% and 13%, respectively.

Mapping Communication Disruption in Traumatic Brain Injury with Transformer Embeddings

Recent advances in computational modeling have expanded our capacity to analyze language and communication, particularly through transformer models. The present work investigates how such computational frameworks can be leveraged to address clinical domains in communication disorders. We used semantic embeddings from BERT's layers to analyze language-related adjustments used by participants with traumatic brain injury (TBI) in conversational transcripts. By examining semantic convergence patterns across different layers of the BERT model, we found that TBI participants demonstrated more pronounced "self" convergence -- they tended to stay closer to their own semantic contributions in the conversation -- compared to controls. This effect was particularly noticeable at earlier layers of the BERT model, suggesting that surface-level semantics play a significant role. The findings highlight the potential for language models to enhance our understanding of social interaction dynamics. We further discuss how bridging computational linguistics with clinical domains can address analytic challenges in the study of natural cognition and communication.

Neural responses of Interval Judgment in the Tritone Paradox

The Tritone Paradox is an auditory illusion in which a sequence of two complex tones is perceived as either ascending or descending, depending on the individual. It presents an interesting phenomenon for investigating pitch perception in context; however, no neurophysiological study has been conducted. This study identified event-related potential (ERP) correlates of pitch judgments under different pitch contexts. Twenty-seven participants judged whether the tritone pair was perceived as ascending or descending after listening to a sequence of ascending or descending tone pairs. Cortical auditory evoked responses to the second tone of the tritone pair were compared across contexts. In the Rise context, standard stimuli evoked larger responses at Fp1; in the Fall context, deviant stimuli elicited stronger responses across all sites. These results suggest that frontal and central brain regions are involved in processing ambiguous pitch stimuli, and that ERP responses reflect the interaction between stimulus context and perceptual judgment.

Spatial-Energy-Aware Dynamic Filtering with Sparse Graph Convolutions for EEG Emotion Recognition

Accurate recognition of human emotions from EEG signals plays a critical role in affective computing and human-computer interaction. However, existing methods face significant challenges in effectively capturing the sparse, dynamic, and energy-dependent characteristics of brain activity during emotional experiences. To address these challenges, we propose a novel framework, Spatial-Energy-Aware Dynamic Filtering with Sparse Graph Convolutions (SEASGC), which rethinks EEG graph modeling from three perspectives: (1) sparse graph construction to adaptively capture the essential functional relationships between brain regions, (2) dynamic and location-dependent filtering to model nonlinear interactions between EEG nodes, and (3) energy-aware feature aggregation to leverage energy changes as critical indicators of emotional intensity. By explicitly integrating these principles, SEASGC provides a more comprehensive representation of EEG signals for emotion recognition. Extensive experiments on benchmark EEG emotion datasets demonstrate that SEASGC achieves state-of-the-art performance, highlighting its effectiveness and generalizability in modeling the complex spatial-spectral dynamics of EEG signals.

Reconciling Different Theories of Learning With an Agent-based Model of Procedural Learning

Computational models of human learning can play a significant role in enhancing our knowledge about nuances in theoretical and qualitative learning theories and frameworks. Many existing frameworks in educational settings have been verified using empirical studies, but at times these theories make conflicting claims or recommendations for instruction. In this study, we propose a new computational model of human learning, Procedural ABICAP, that reconciles the ICAP, Knowledge-Learning-Instruction (KLI), and cognitive load theory (CLT) frameworks for learning procedural knowledge. ICAP assumes that constructive learning generally yields better learning outcomes, while theories such as KLI and CLT claim that this is not always true. We suggest that one reason for this may be that ICAP is primarily used for conceptual learning and is underspecified as a framework for thinking about procedural learning. We show how our computational model, both by design and through simulations, can be used to reconcile different results in the literature. More generally, we position our computational model as an executable theory of learning that can be used to simulate various educational settings.

Improving Brain-to-Image Reconstruction via Fine-Grained Text Bridging

Brain-to-Image reconstruction aims to recover visual stimuli perceived by humans from brain activity. However, the reconstructed visual stimuli often miss details and contain semantic inconsistencies, which may be attributed to insufficient semantic information. To address this issue, we propose an approach named Fine-grained Brain-to-Image reconstruction (FgB2I), which employs fine-grained text as a bridge to improve image reconstruction. FgB2I comprises three key stages: detail enhancement, decoding fine-grained text descriptions, and text-bridged brain-to-image reconstruction. In the detail-enhancement stage, we leverage large vision–language models to generate fine-grained captions for visual stimuli and experimentally validate its importance. We propose three reward metrics (object accuracy, text-image semantic similarity, and image-image semantic similarity) to guide the language model in decoding fine-grained text descriptions from fMRI signals. The fine-grained text descriptions can be integrated into existing reconstruction methods to achieve fine-grained Brain-to-Image reconstruction.
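A schematic sketch of combining the three reward terms named above into one scalar used to guide decoding; the weights, the embedding sources, and the linear combination are assumptions for illustration, not the FgB2I implementation.

```python
# Weighted combination of object accuracy and two embedding-similarity rewards.
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

def decoding_reward(pred_objects, true_objects, text_emb, image_emb,
                    recon_emb, target_emb, w=(0.4, 0.3, 0.3)):
    obj_acc = len(set(pred_objects) & set(true_objects)) / max(len(true_objects), 1)
    txt_img = cosine(text_emb, image_emb)      # decoded caption vs. stimulus image
    img_img = cosine(recon_emb, target_emb)    # reconstruction vs. stimulus image
    return w[0] * obj_acc + w[1] * txt_img + w[2] * img_img

rng = np.random.default_rng(0)
print(decoding_reward(["dog", "ball"], ["dog", "frisbee"],
                      rng.normal(size=512), rng.normal(size=512),
                      rng.normal(size=512), rng.normal(size=512)))
```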

Integrating talker and message in language processing: the influence of speaker gender on sentence prediction in Mandarin Chinese

Spoken sentence processing often requires integrating the linguistic message with information about the talker and the social context. Among other things, information about the talker's gender identity could influence how a listener processes what they hear, due to the prevalence of gender-related stereotypes in human societies. Several previous studies showed that listeners anticipated stereotype-consistent content, and that comprehension was affected when gender stereotypes were violated (e.g., when women serve in conventionally male-dominant professions, or vice versa). However, the existing findings are rather mixed. In this study, we examine the influence of talker gender information on language comprehension and prediction in Mandarin Chinese. We report the results of a cloze task where participants were asked to guess the last word of a sentence with or without information about talker gender. When talker gender information was available, we further varied whether gender information was revealed by personal names or voices. Participant responses were evaluated for gender bias by two pre-trained language models based on Word2Vec and GPT2. Results of statistical analysis revealed that participants adjusted their responses to align with the gender category cued by the sentence (i.e., more male/female-biased responses when the sentences implicated a male/female talker), but the effect was only present when gender information was implicated through names but not voices. The current study provides partial evidence for the effect of talker gender on sentence prediction. We discuss the implications of current findings for further study of the integration of linguistic and social information in language comprehension.

Can input statistics over-ride a prior bias in morpheme ordering? A test case with gender and number

In languages which mark both gender and number as distinct morphemes, there is a tendency to place gender closer to the noun stem than number. However, the typological data on this is sparse. Moreover, linguistic theories differ in how they explain ordering patterns of gender and number morphology: some theories focus on the structure of the representations of features in speakers' minds, and others focus on the role of co-occurrence statistics. In a recent study, Saldana, Kanampiu, and Culbertson (2025) use artificial language learning to show that learners with a diverse range of language experience with grammatical gender and number exhibit a consistent bias for orders with gender closer to the noun stem than number. This order reflects the ordering in which most linguistic theories assume number and gender features are derived in word formation. Here, we build on this study to investigate how this bias interacts with the statistics of the linguistic input. In particular, we manipulate co-occurrence between stems and affixes so that learners are exposed to combinations of stems and number morphology more often than to stems and gender morphology. We test whether input statistics can push learners to reverse their natural preference, leading them to place number closer to the noun than gender. We find that our manipulation reduces, but does not eliminate or reverse, the preference for gender-closest order. However, our study also highlights some difficulties learners have in acquiring novel features from sparse data. Ultimately, our findings highlight the dynamic interplay between representations of meaning and input-based learning mechanisms.

AC-CDCN:A Cross-Subject EEG Emotion Recognition Model with Anti-Collapse Domain Generalization

Emotion recognition is a critical area in brain-computer interfaces, with electroencephalography (EEG) shown to be effective for emotional analysis. In domain generalization, cross-subject emotion recognition encounters significant generalization challenges, including excessive feature collapse and insufficient capture of EEG features. To tackle these issues, we propose an Anti-Collapse Cross-Domain Consistency Network (AC-CDCN), which leverages Maximum Mean Discrepancy (MMD) to reduce distribution discrepancies between source domains, facilitating the capture of domain-invariant features, and innovatively introduces an Anti-Feature Collapse Strategy (AFCS), which incorporates an Anti-Collapse Domain Discriminator (ACDD) and a code-rate loss function, effectively preventing excessive feature collapse. Furthermore, we propose a Flexible Feature Rebalance Module (FlexiReMod), a plug-and-play component that enhances generalization and dynamic feature capture through feature fusion and attention mechanisms. Experimental results indicate AC-CDCN achieved 87.14% (±5.60) and 71.77% (±12.92) accuracy on the SEED and SEED-IV datasets, underscoring its significant generalization advantage.

Emotional Resonance in Film: Disentangling the Effects of Music on Moral Judgments in Dilemmas

The Social Intuitionist Model emphasizes the primary role of moral intuitions, rather than analytical moral reasoning, in forming moral judgments. In line with this model, moral emotions, rather than reasoning, are strongly associated with moral actions. This study explored the impact of emotionally evocative film excerpts, specifically those featuring moral dilemmas within varying contexts of social distance, on moral judgment processes. We invited 88 college students to view excerpts from foreign movies and their Chinese remakes under three music conditions: no music, positive music, and negative music. Participants evaluated the characters' actions based on three moral dimensions: Harm, Fairness, and Authority, while also rating their emotional responses. The results indicated that positive music during moral dilemmas elicited strong negative emotional responses. Additionally, viewers judged the protagonist's actions more harshly regarding fairness and placed greater emphasis on social order and authority in Chinese compared to foreign movie excerpts, regardless of the music's emotional valence.

Scroll-Time and Echo-Chambers: Effect of Mass Media on Ingroup Bias and Polarization

Information disseminated through mass media has been known to significantly influence behaviors such as voting (Iyengar & Kinder, 1987), brand preferences (Tversky & Kahneman, 1981), and public opinion (McCombs & Shaw, 1972). More specifically, the way information is framed and presented in mass media (for instance, through "breaking news" and "sensational headlines") can reinforce existing beliefs and contribute to political and ideological polarization, with partisan media creating "echo chambers" that deepen biases (Hobolt et al., 2024). The current study leverages these insights to investigate the effects of echo-chambered media and of the time available for information consumption in an intergroup context across three experiments. Following the minimal group paradigm, Experiment 1 employed a randomized, untimed presentation (non-echo-chamber) of news about an ingroup and an outgroup, while Experiment 2 used a blocked, untimed design (echo-chamber), and Experiment 3 a blocked, timed design (echo-chamber, doomscroll). All experiments involved two news sources of varying reliability (low and high) disseminating valenced (positive/negative) intergroup news headlines, and participants were asked how strongly they believed each news item. Results showed that the manner and timing of news presentation moderated ingroup favoritism: participants were more inclined to believe positive ingroup than positive outgroup news, and vice versa for negative news, with the strongest ingroup bias emerging in Experiment 2 and a negative-news bias in Experiment 3. These findings shed light on how patterns of media consumption may influence intergroup perceptions and lead to polarization in society.

Towards a Vision-Language Episodic Memory Framework: Large-scale Pretrained Model-Augmented Hippocampal Attractor Dynamics

Modeling episodic memory (EM) remains a significant challenge in both neuroscience and AI, with existing models either lacking interpretability or struggling with practical applications. This paper proposes the Vision-Language Episodic Memory (VLEM) framework to address these challenges by integrating large-scale pretrained models with hippocampal attractor dynamics. VLEM leverages the strong semantic understanding of pretrained models to transform sensory input into semantic embeddings, serving as the neocortex, while the hippocampus supports stable memory storage and retrieval through attractor dynamics. In addition, VLEM incorporates prefrontal working memory and the entorhinal gateway, allowing interaction between the neocortex and the hippocampus. To facilitate real-world applications, we introduce EpiGibson, a 3D simulation platform for generating episodic memory data. Experimental results demonstrate the VLEM framework's ability to efficiently learn high-level temporal representations from sensory input, showcasing its robustness, interpretability, and applicability in real-world scenarios.

Simulating variation in infant-caregiver attachment using reinforcement learning

Infants' attachment to their caregivers is a central feature of their early social and emotional development. Attachment Theory posits that these relationships vary systematically across distinct styles, though there has been debate about the extent to which these differences reflect features of caregivers' responsiveness vs. infants' own temperament. We develop a simple reinforcement learning model of infant exploration that allows us to vary the characteristics of simulated infants and caregivers and analyze the resulting patterns of model behavior. A set of equilibria reliably emerges that corresponds qualitatively to canonical attachment styles; particular agents' equilibria are controlled by both caregiver and infant parameters. These simulations point the way towards a quantitative synthesis of prior theoretical debates about the nature of attachment.
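
As a rough illustration of the kind of simulation described, the toy sketch below lets a simulated infant learn values for exploring versus seeking the caregiver under a caregiver-responsiveness parameter; the parameter names, reward structure, and epsilon-greedy rule are illustrative assumptions, not the authors' model.

```python
# Toy reinforcement-learning sketch: an infant agent learns action values for
# "explore" vs. "seek_caregiver". Responsiveness/temperament parameters and the
# rewards are illustrative assumptions only.
import random

def simulate(responsiveness=0.8, temperament=0.5, alpha=0.1, epsilon=0.1, trials=500):
    values = {"explore": 0.0, "seek_caregiver": 0.0}
    for _ in range(trials):
        if random.random() < epsilon:
            action = random.choice(list(values))
        else:
            action = max(values, key=values.get)
        if action == "seek_caregiver":
            reward = 1.0 if random.random() < responsiveness else -0.5
        else:
            reward = 1.0 if random.random() < temperament else 0.0
        values[action] += alpha * (reward - values[action])   # simple delta-rule update
    return values

print(simulate(responsiveness=0.9))   # responsive caregiver
print(simulate(responsiveness=0.2))   # unresponsive caregiver
```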

Deaf Signers Adapt Their Eye Gaze Behaviour When Processing an Unknown Sign Language

Sign languages are perceived visually and externalized using a signer's hands, face, and upper body. During sign language comprehension, deaf signers primarily focus their gaze on the face, while hearing non-signers attend more to the hands of a signer. Little is known about whether deaf signers adapt their gaze behaviour when processing unknown signs. Here, we report eye-tracking data from 15 deaf native signers of German Sign Language (DGS) and 15 hearing non-signers who were presented with videos in either DGS or an unknown sign language, all containing no linguistic mouth actions. Our data confirm that deaf signers generally fixate more on the face of a signer than hearing non-signers, who attend to the hands in sign space. Moreover, only deaf signers increase their attention to the hands when processing video stimuli consisting of unknown signs compared to familiar signs, suggesting adjustment behaviours similar to those observed in spoken languages.

The Maze of Creative Thinking: Pathways of Traits, States, and Intelligence in Shaping Creativity

Creativity is a complex and multifaceted construct that has been linked to various cognitive, emotional, and personality factors. This study explores the interplay between fluid intelligence, creative reasoning, personality traits, and mood states in predicting divergent thinking performance. We hypothesized that extraversion, openness to experience, positive moods, and creative reasoning would predict performance on divergent thinking tasks. A total of 120 young adults participated in the study, completing assessments on fluid intelligence (APM), creative reasoning (CRT-Reasoning), creativity (CRT-Creativity, TCT-DP), personality traits (Big Five Inventory 2), and moods (Current Mood Scale). Results revealed a negative relationship between fluid intelligence and divergent thinking performance, suggesting that higher fluid intelligence may be associated with more convergent thinking strategies. Extraversion emerged as a significant positive predictor of divergent thinking, supporting the idea that sociability and external engagement foster creativity. Openness to experience did not significantly predict divergent thinking, indicating that its impact may vary across domains of creativity. Mood states, especially hopelessness, were negatively associated with creativity, but the frequently found association between positive moods and creativity could not be replicated. These findings underscore the importance of considering cognitive, emotional, and personality factors in the study of creativity and suggest potential pathways for future research into how these elements interact to shape creative thinking.

Extracting Latent Dimensions from Multidimensional Response Timing Data

Computer-based assessments enable the collection of fine-grained response process data through log files. We propose a novel method for extracting latent dimensions from such multidimensional response timing data, based on applying the Weighted MDS (WMDS) model. In our method, dissimilarities among examinees in their response timing vectors are computed (one matrix per test item), and WMDS is then applied to this collection of matrices. The resulting latent dimensions represent variation among examinees in their patterns of response timing variables, with the dimension weights of the WMDS model reflecting differences across items in the importance of the latent dimensions. Latent dimensions are interpreted via permutation-based importance, correlation analysis, and network analysis. Our method is demonstrated using response data from the PISA 2018 Reading and Mathematics assessments. Results show that the extracted latent dimensions are statistically reliable, educationally interpretable, and boost predictive accuracy when used in conjunction with item scores.
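
To make the preprocessing step concrete, the sketch below builds one examinee-by-examinee dissimilarity matrix per item from response-timing vectors; ordinary MDS on the averaged matrix is shown only as a stand-in, since a weighted-MDS (INDSCAL-style) fit with per-item dimension weights requires a dedicated implementation. Data shapes and names are illustrative assumptions.

```python
# Sketch: per-item dissimilarity matrices from response-timing vectors, followed by
# an ordinary MDS fit as a stand-in for weighted MDS; all data here are simulated.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
timing = {item: rng.random((100, 5)) for item in range(10)}   # 100 examinees x 5 timing features

dissim = {item: squareform(pdist(X, metric="euclidean")) for item, X in timing.items()}

avg = np.mean(list(dissim.values()), axis=0)                  # stand-in aggregation
coords = MDS(n_components=2, dissimilarity="precomputed", random_state=0).fit_transform(avg)
# A true WMDS/INDSCAL fit would instead estimate shared dimensions plus
# item-specific dimension weights from the full collection of matrices.
```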

Speaker knowledge modulates the effects of generic language on essentialist beliefs

This research examines how language (generic vs. specific) and speaker knowledge (knowledgeable vs. unknowledgeable) influence essentialist beliefs about a novel social category in children and adults. Across two studies (N = 448 children, 433 adults), adults were more likely to endorse essentialist beliefs when knowledgeable speakers used generic descriptions. Children's responses varied by evaluation timing. In Study 1, when test questions were delayed, children's essentialist beliefs were influenced by language but not speaker knowledge. In Study 2, when memory demands were reduced by having children evaluate claims immediately after hearing them, children showed sensitivity to speaker knowledge, mirroring adults' responses. These findings highlight the role of language and contextual cues in shaping essentialist thought about social groups, suggesting that the effects of generics on social thought are dependent on the cultural expertise of the speaker.

STEREONET: A Network Approach for Stereotype Change

Stereotypes change over time and across cultures, but they are hard to change in experiments. This paper proposes a solution by reconceptualizing stereotypes not as simple associations between groups and traits, but as rich networks of interconnected concepts spanning multiple domains. Building on cross-domain mapping, we used a simple question-answering task ("If X were a Y, what Y would it be?") to construct expansive networks linking 100 social groups to eight domains: animals, jobs, sports, colors, beverages, vehicles, musical instruments, and academic subjects. We found, for instance, that the stereotype of women lacking agency is part of a larger network in which women are associated with preferences for drinks such as wine, colors such as pink, instruments such as the harp, and sports such as softball. We further tested whether rewiring these broader networks could change stereotypes more effectively than prior methods. Network-based interventions showed promising results for some groups (women were seen as more competent and Muslims were viewed as more friendly), but effects varied across groups (minimal changes for criminals). This work suggests that successful stereotype change may require engaging with broader networks of subtle, seemingly unrelated associations rather than targeting individual stereotypical beliefs in isolation.

A minimalistic representation model for the head direction system

We propose a minimalistic representational model for the head direction (HD) system, a crucial component of spatial navigation in mammals. Our model leverages the symmetry of the rotation group U(1) and the inherent circular geometry of the head direction. We develop fully connected and convolutional versions of the model, both aiming to learn a high-dimensional representation of head direction that captures essential properties of HD cells. Our learning method results in representations that form a perfect 2D circle when projected onto their principal components, reflecting the underlying geometry of the problem. We also demonstrate that individual dimensions of our learned representation exhibit Gaussian-like tuning profiles akin to biological HD cells. Our model achieves accurate multi-step path integration, effectively updating its internal heading estimate over time. These results demonstrate that simple symmetry-based constraints can yield biologically realistic HD representations, offering new insights into the computational principles underlying spatial navigation in mammals.

Who Likes What? Comparing Personal Preferences with Group Predictions based on Gender and Extraversion Across Common Semantic Domains.

Some people like coffee while others prefer tea, but little is known about whether preferences like these are shared among groups and whether they vary systematically across many common semantic categories. This study addresses this gap by examining two major sources of variation, gender and extraversion, across twelve categories or domains, ranging from fruit and animals to sports and personal qualities. In Study 1, participants rated their own preferences for a set of 300 exemplars. Results showed significant preference differences between men and women for 40% of items spread across all categories, and smaller but reliable differences between introverts and extraverts for 11% of items concentrated in domains like personal qualities. Study 2 used an allocentric categorisation task where the same participants categorised items based on which they thought would be preferred by men vs women or introverts vs extraverts. Using the ratings from Study 1 to score accuracy, the judgments from Study 2 showed that participants were sensitive to even subtle differences in preference, although accuracy varied by the judge's gender and extraversion: women were more accurate than men across many categories and introverts more accurate than extraverts for a few categories. We also found incorrect but widely shared judgments for about 20% of items, suggestive of inaccurate stereotypes about group preferences. Together, these results suggest widespread and systematic variation by gender (and to a lesser extent extraversion) that can be accurately predicted by others, although with systematic biases. Our results have implications for theories of semantic representation and social cognition.

Functional fixedness and cooties: Children solve insight problems faster when they learn functions from peers of a different gender

Knowing the intended use of an artifact impairs people's ability to think of alternative uses. Here we ask whether children consider not just "what" a tool is used for, but also "who" uses the tool in that way. We focused on gender roles, since children are sensitive to these early in development, and adapted a classic insight learning task (Defeyter & German, 2003). Success on the task requires inserting a stick into a tube to remove a ball. We compared children's (N = 112; 27 children/condition) latency to solve the problem at Baseline and in three Demonstration conditions. In all the Demonstration conditions, the long stick was used as a magnet wand to brush away iron filings. In the Researcher Demonstration condition (a direct replication of Defeyter & German, 2003), this function was demonstrated by a single individual, the experimenter; in the Same and Different Gender Peer Group conditions, the function was demonstrated by a group of children whose gender matched or differed from the participant's. Both Peer Group Demonstration conditions induced functional fixedness comparable to the Researcher Demonstration, and children were slower to solve the problem in all Demonstration conditions than at Baseline. Critically, however, children were faster to solve the problem in the Different Gender Peer Group condition than in the Same Gender Peer Group condition, suggesting that children encode attributes of a function's typical user in their representations of artifacts, and that functional fixedness is affected by children's identification with a social role.

When Machines Speak with Feeling: Investigating Emotional Prosody, Authenticity, and Trust in AI vs. Human Voices

Emotional prosody, the vocal cues that convey affect, profoundly influences how listeners interpret a speaker's intentions. We conducted two studies comparing AI- and human-generated emotional speech. In Study 1 (N=38), participants categorized five emotions (happy, sad, angry, neutral, fear) expressed by human voices and by an advanced text-to-speech (TTS) system. Human recordings exhibited higher overall accuracy (79.82% vs. 72.65%) and were rated significantly more natural, an effect partially explained by micro-perturbations (e.g., jitter, shimmer) that enhanced perceived authenticity. In Study 2 (N=53), these validated stimuli were incorporated into short scenarios, with each speaker labeled as either "human" or "AI." Even when participants heard identical clips, those informed that the speaker was human exhibited greater trust and empathy, resulting in higher donation and advice-following rates. Although contemporary TTS systems effectively convey broad affective states, explicit AI labeling reduces perceived credibility and social engagement, underscoring the critical role that preexisting expectations play in human-AI communication.

From Brainwaves to Understanding: A Study on EEG-Based Communication Systems

This research investigates the feasibility of a communication system using Electroencephalography (EEG) signals mediated by a deep learning model, designed to aid individuals with significant communication challenges. The study builds on prior work in EEG-based image generation, testing the accuracy and efficiency of the system in conveying intended meanings. In the experiment, participants were presented with images generated from EEG signals and were asked to give titles according to their interpretation. These titles were analyzed using text embeddings derived from a large language model (LLM) to measure cosine similarity. Results indicate that while sender and receiver interpretations often diverged, consistent patterns emerged within and between receivers. This suggests that repeated communication trials may align interpretations over time, improving mutual understanding. The findings highlight the potential of the method to facilitate adaptive communication, though further research is required to optimize its reliability, scalability, and practical applicability.
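
As an illustration of the similarity measure described, the sketch below embeds a sender's intended title and a receiver's interpretation and computes their cosine similarity; the embedding model and example titles are assumptions, not the study's LLM or stimuli.

```python
# Minimal sketch: cosine similarity between text embeddings of two titles.
# The embedding model and example strings are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")   # stand-in text-embedding model

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

sender_title, receiver_title = model.encode(["a boat on a calm lake", "a ship at sea"])
print(cosine(sender_title, receiver_title))       # higher values = closer interpretations
```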

Embodiment without Body: The Emergence of Body Ownership in AI through Integrated World and Self-Models

In contemporary consciousness studies, the sense of body ownership (SBO) stands as a key marker of subjective embodiment and self-awareness. Recent progress in multimodal and agent AI has prompted the question: Could an artificial system develop something analogous to SBO, and would this require consciousness? This paper refines two core distinctions: (1) functional versus phenomenal SBO, and (2) world-model versus self-model. Building on a functional reconceptualization of "body" as an interaction boundary, we argue that AI systems equipped with semantic-centric multimodal world models and complementary self-models can, in principle, instantiate a form of SBO. By integrating diverse sensory inputs (visual, tactile, and linguistic) into a cohesive self-representation, this approach suggests the possibility of a virtual body that evokes the contours of the human embodied experience. Such an account questions the strict divide between physical and virtual embodiment, offering new insights into how embodied cognition underpins consciousness. We also locate SBO within the broader debate on AI consciousness. Concrete design proposals are linked to existing multimodal agents (e.g., DeepMind's MIA). This inquiry highlights how AI SBO may arise from the interplay of sensory and semantic frameworks, prompting ethical and theoretical reflection on AI consciousness.

Phonetic accommodation and inhibition in a dynamic neural field model

Short-term phonetic accommodation is a fundamental driver behind accent change, but how does real-time input from another speaker's voice shape the speech planning representations of an interlocutor? We advance a computational model of change in speech planning representations during phonetic accommodation, grounded in dynamic neural field equations for movement planning and memory dynamics. A dual-layer planning/memory field predicts that convergence to a model talker on one trial can trigger divergence on subsequent trials, due to a delayed inhibitory effect in the more slowly evolving memory field. The model's predictions are compared with empirical patterns of accommodation from an experimental pilot study. We show that observed empirical phenomena may correspond to variation in the magnitude of inhibitory memory dynamics, which could reflect resistance to accommodation due to phonological and/or sociolinguistic pressures. We discuss the implications of these results for the relations between short-term phonetic accommodation and sound change.
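
For readers unfamiliar with this model class, the canonical Amari field equation below illustrates the general form of such planning-field dynamics; the model's actual dual-layer planning/memory equations may differ in detail.

$$\tau\,\dot{u}(x,t) = -u(x,t) + h + s(x,t) + \int w(x - x')\, f\big(u(x',t)\big)\, dx',$$

where $u(x,t)$ is activation over the planning dimension $x$, $h$ a resting level, $s(x,t)$ external input (e.g., the model talker's productions), $w$ a lateral interaction kernel, and $f$ a sigmoidal output function.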

The Patchwork Approach: Toward a Perceptual Theory of Intuitive Physics

Human experience is rich with sensory information, from which physical regularities are internalized to guide behavior. This paper presents the Patchwork Approach, a method for modeling sensorimotor predictions based on these regularities, without explicitly encoding physics. This method leverages environmental regularities, enabling resource-rational perceptual predictions. Using data from previous studies (Deeb et al., 2021, 2024) on human perception, the model outperforms Newtonian-based models in capturing human prediction errors and interpolates across untested conditions. One test demonstrates its superiority in a projectile motion task, while another illustrates its ability to predict deflection angles in a collision event from untrained aiming angles. This approach provides valuable insights into how physical laws are internalized and used to guide perception and action, arguing that perception provides foundational input to intuitive physics reasoning, a role that can complement and be extended to higher-level cognitive processes like those explained by the Intuitive Physics Engine (IPE).

Re-evaluating the Numerical-Perceptual Distinction in the Attraction Effect

A widely studied cognitive bias in decision-making is the attraction effect, in which introducing a clearly inferior decoy option into a binary choice set increases the likelihood of selecting the dominating option (target) over the competitor. While this effect is robust with numerical stimuli, findings with perceptual stimuli have been inconsistent, with some recent studies even reporting a negative attraction effect. We argue that this distinction between numerical and perceptual stimuli is superficial, and that choice behavior is better explained by the inter-attribute relationships. In two experiments, we demonstrated positive attraction effects using combined perceptual-numerical stimuli and both positive and null effects with numerical stimuli by manipulating the asymmetry in pairwise comparison difficulty. The numerical stimuli used in Experiment 2 leveraged the findings from fraction research. Together, our results challenge the numerical-perceptual distinction and support a universal cognitive mechanism underlying the attraction effect.

The Effect of Timescale Dependence on Dyadic Interactions

Interactions between agents are supported through a continuous process of detecting and responding to behaviors that are contingent upon the other agent's behavior. Here, we explore the temporal dependence of these mechanisms, focusing on the role of timescale compatibility in inter-agent interactions. Using continuous-time recurrent neural networks (CTRNNs) to control embodied agents in a minimal social interaction task, we demonstrate that effective interactions require agents to operate on compatible timescales. Our results indicate that timescale mismatches disrupt agents' ability to distinguish other agents from non-social entities, revealing a timescale threshold beyond which agents begin misclassifying slower agents as static objects and faster agents as non-social animate objects.
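
To make the controller class concrete, here is a minimal Euler-integration sketch of a CTRNN node update, with the time constant tau controlling the network's timescale; network size, weights, and parameter values are illustrative assumptions, not the agents used in the study.

```python
# Minimal Euler step of a continuous-time recurrent neural network (CTRNN):
# tau * dy/dt = -y + W @ sigmoid(y + theta) + I. Larger tau -> slower timescale.
# Sizes and parameter values are illustrative only.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ctrnn_step(y, W, tau, theta, I, dt=0.01):
    dydt = (-y + W @ sigmoid(y + theta) + I) / tau
    return y + dt * dydt

rng = np.random.default_rng(0)
n = 4
y, theta = np.zeros(n), np.zeros(n)
W = rng.normal(size=(n, n))
tau = np.full(n, 0.5)                      # the timescale parameter at issue
for _ in range(1000):
    y = ctrnn_step(y, W, tau, theta, I=np.full(n, 0.1))
```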

Improving Category Learning through Graded Classification

Real-world categories often exhibit graded structure, yet learners struggle to acquire family resemblance categories compared to unidimensional ones in laboratory studies. We propose that part of this difficulty arises from the binary nature of Traditional Artificial Classification Learning. We introduce Graded Classification Learning, a paradigm integrating category and quality judgments into the response and feedback phases of a learning trial. This allows higher-fidelity exploration of the feature space, aligning more closely with naturalistic learning processes. The "graded" learners showed superior performance, with higher final accuracy and steeper learning curves than the "traditional" learners. While aggregate response patterns appeared similar across conditions, profile analysis revealed an apparent gradedness in the traditional condition that was masked by an overwhelming preference for (bisecting) unidimensional strategies, whereas graded participants mostly exhibited genuinely graded responses. These findings suggest that traditional binary tasks may inadvertently hinder learning of graded structure and that incorporating quality judgments fosters robust category representations.

A metacognitive appraisal of quitting

Stopping decisions are frequently modeled as decisions to switch to alternative activities once the current activity stops being adequately rewarding, such as in optimal foraging theory, as well as in more recent metacognitive models. However, the sense of stopping and making decisions in such frameworks is highly Platonic, with both decisions and stopping actions occurring instantaneously. In contrast, the phenomenology of quitting an activity one is undertaking appears to be temporally extended and metacognitively challenging. We study the metacognitive covariates of quitting decisions made by chess players using a large database of chess games sourced from an online chess portal. Our analysis reveals that players tend to persevere when they are playing against stronger opponents and after having played poor moves. We also find that a history of quitting games makes players more likely to quit in future games, but that having recently quit in a game offers some protective effect against quitting. Finally, we find that quitting a game makes it more likely that a player will play a game again soon. We place these results in the context of modeling quitting as a metacognitive choice affected by multiple competing goals.

Can (A)I Change Your Mind?

The increasing integration of large language model (LLM)-based conversational agents into everyday life raises critical cognitive and social questions about their potential to influence human opinions. Although previous studies have shown that LLM-based agents can generate persuasive content, these studies typically involve controlled, English-language settings. Addressing this, our preregistered study explored LLMs' persuasive capabilities in more ecological, unconstrained scenarios, examining both static (written paragraphs) and dynamic (conversations via Telegram) interaction types. Conducted entirely in Hebrew with 200 participants, the study assessed the persuasive effects of both LLM and human interlocutors on controversial civil policy topics. Results indicated that participants adopted LLM and human perspectives similarly, with significant opinion changes evident across all conditions, regardless of interlocutor type or interaction mode. Confidence levels increased significantly in most scenarios. These findings demonstrate LLM-based agents' robust persuasive capabilities across diverse sources and settings, highlighting their potential impact on shaping public opinion.

An Explorative Investigation into Leveraging LLMs to Predict University Students' Learning Motivation

Learning motivation is a key variable in learning. Therefore, its assessment has consistently been a popular research topic. While traditional methods like self-report still dominate, methods integrating passive mobile sensing have emerged, using smartphones to collect behavioral data and assess learning motivation via statistical and machine learning techniques. Recent advances in large language models (LLMs) offer new perspectives for psychological measurement, yet their application in learning motivation assessment remains underexplored. To bridge this gap, we propose a novel approach that integrates LLMs with passive mobile sensing to assess and predict students' learning motivation. We constructed our dataset using mobile sensing data and self-report measures, then designed zero-shot and few-shot prompting tasks for LLMs and evaluated their performance. Ultimately, our findings indicate the feasibility and highlight the potential of leveraging LLMs to predict learning motivation levels based on mobile sensing data.

A Time-Aware Mental State Space for Multimodal Depression Detection on Social Media

Detecting depression from user-generated posts on social media platforms offers significant potential for early intervention with at-risk individuals. Existing works mainly concentrate on text processing, and only a limited number incorporate images posted by users. These image-integrated methods face challenges in modeling the intricate relationships between textual and visual features. In addition, the absence of approaches that explore users' psychological trajectories by analyzing their posts over time leaves a critical gap in capturing the progression of depressive symptoms. In this paper, we propose a Time-Aware Mental State Space (T-M2S) for detecting depression from social media posts. We introduce a cross-modal learning module that effectively integrates text and image embeddings into sentiment-oriented unified representations. Additionally, we design a Mental State Space to analyze users' posts over time, offering a nuanced understanding of emotional dynamics. Extensive experiments on Twitter and Reddit datasets demonstrate that T-M2S significantly outperforms state-of-the-art methods. Code and models are available on GitHub.

Cognitive-Inspired Hierarchical Attention Fusion With Visual and Textual for Cross-Domain Sequential Recommendation

Cross-Domain Sequential Recommendation (CDSR) predicts user behavior by leveraging historical interactions across multiple domains, focusing on modeling cross-domain preferences through intra- and inter-sequence item relationships. Inspired by human cognitive processes, we propose Hierarchical Attention Fusion of Visual and Textual Representations (HAF-VT), a novel approach integrating visual and textual data to enhance cognitive modeling. Using a frozen CLIP model, we generate image and text embeddings, enriching item representations with multimodal data. A hierarchical attention mechanism jointly learns single-domain and cross-domain preferences, mimicking human information integration. Evaluated on four e-commerce datasets, HAF-VT outperforms existing methods in capturing cross-domain user interests, bridging cognitive principles with computational models and highlighting the role of multimodal data in sequential decision-making.

Humans Learn to Weight Evidence Unevenly Over Time

In perceptual decision-making tasks, humans integrate noisy sensory evidence over time to guide their choices. The optimal integration process assumes that all evidence is weighted equally within a trial and that different trials are independent. However, humans exhibit systematic deviations from optimality, including uneven weighting of evidence within trials and influences from previous trials. Prior studies have demonstrated that biological constraints can account for this suboptimality. In this study, we present evidence that humans adapt their evidence integration strategies over time in response to task demands, and that the suboptimal uneven weighting is gradually learned over the course of the task. By explicitly modeling this adaptation through online gradient-based learning, our model outperforms existing approaches in capturing human behavior and unifies both observed forms of suboptimality in the Click task: dependence across trials emerges from an error-driven learning process that also gives rise to uneven integration weights within trials. We further propose a bounded-rational adaptation account to explain why humans progressively learn to weight evidence unevenly within a trial. Our modeling framework provides a general approach to resource-rational adaptation. It captures how initially uninformed agents can gradually update their strategies through error-driven learning and is applicable to a broad range of learning and decision-making scenarios.

Examining the Robustness of Neural Correlates of Infants' Sociomoral Evaluations

Research has shown that infants prefer prosocial characters over antisocial ones, suggesting that sociomoral evaluation is early-emerging. However, some have argued that infants' preferential responses stem from low-level perceptual processes rather than true social understanding. Using electroencephalography (EEG), past work has suggested that motivational and social, but not attentional, processes are implicated in infants' responses to prosocial versus antisocial acts and individuals. However, the majority of past work utilized a single type of prosocial/antisocial interaction: helping a character to climb a hill. To test the generalizability of past neural findings from the hill paradigm, here we examined infants' responses in a distinct helping/hindering scenario in which a character tries but fails to open a box and is alternatively helped or hindered. Largely replicating past work, infants showed greater activity in social (indexed by the P400) but not attentional (indexed by the Nc) ERP components when seeing hinderers versus helpers, consistent with claims that infants' responses to prosocial and antisocial agents are social in nature. No evidence of differential approach/avoidance motivation during prosocial/antisocial events was found. These findings support the role of social processes in infants' sociomoral evaluations.

Differential Memory for Belief-Congruent versus Belief-Incongruent Arguments Cannot Explain Belief-Driven Argument Evaluation

People often rely more on their prior beliefs than the presented evidence when evaluating arguments. We investigate the cognitive mechanisms underlying this phenomenon. We hypothesise that when individuals encounter an argument that is congruent with their beliefs, it activates related information in memory. For belief-congruent arguments, people should therefore be more likely to both correctly recognise previously encountered information and incorrectly recognise new information as previously seen. To test this, we first investigated the effect of participants' beliefs about political claims on their evaluation of corresponding arguments that varied in quality. We then employed a surprise memory test to assess participants' recognition memory for these arguments. While we replicated the finding that prior beliefs drive argument evaluations, prior beliefs did not affect memory performance for all arguments in the same way. Our results indicate that individuals may use prior beliefs to aid memory only when the memory task is difficult.

Towards a Formal Theory of the Need for Competence via Computational Intrinsic Motivation

Computational modelling offers a powerful tool for formalising psychological theories, making them more transparent, testable, and applicable in digital contexts. Yet, the question often remains: how should one computationally model a theory? We provide a demonstration of how formalisms taken from artificial intelligence can offer a fertile starting point. Specifically, we focus on the "need for competence", postulated as a key basic psychological need within Self-Determination Theory (SDT)—arguably the most influential framework for intrinsic motivation (IM) in psychology. Recent research has identified multiple distinct facets of competence in key SDT texts: effectance, skill use, task performance, and capacity growth. We draw on the computational IM literature in reinforcement learning to suggest that different existing formalisms may be appropriate for modelling these different facets. Using these formalisms, we reveal underlying preconditions that SDT fails to make explicit, demonstrating how computational models can improve our understanding of IM. More generally, our work can support a cycle of theory development by inspiring new computational models, which can then be tested empirically to refine the theory. Thus, we provide a foundation for advancing competence-related theory in SDT and motivational psychology more broadly.

Wanting to be Understood: Modeling Interaction in Early Language Learning

Human language acquisition involves diverse learning resources, including self-supervised learning (sequence prediction) and communicative interactions (talking to caregivers). While recent advancements in language models highlight the power of self-supervised learning, the role of communicative interaction remains unclear. This study uses Reinforcement Learning (RL) and parent-child agent simulations to model interactions and investigate their role in language acquisition, as well as whether RL-like mechanisms may function in children. We pretrained a small transformer model as a child agent, which then interacted with Google's Gemini, acting as a parent agent, to learn language with the goal of being understood. Model evaluations show that the interactive training enhances the intelligibility of the model's communication and increases behavioral similarity to real child speech. However, minimal pretraining alone provides noticeable syntactic and semantic competence, with RL yielding no consistent gains. These findings imply that interaction may play a more critical role in pragmatic aspects of language learning than in the development of linguistic structures, and that learning through interaction is a mechanism used by children.

Modeling Processing Speed in Developmental Language Disorder using Drift Diffusion Modeling

Children with Developmental Language Disorder (DLD) exhibit longer reaction times (RTs) than age-matched neurotypical children. Drift diffusion models estimate parameters influencing RT distributions: drift-rate represents speed of information accumulation, non-decision time represents other factors contributing to longer RTs, e.g., poor attention or motor coordination. Using a hierarchical Bayesian framework, we modeled RT data from visual search and mental rotation tasks completed by 3rd graders (N = 248). Children with impaired verbal abilities without accompanying nonverbal impairment (DLD) and those with global verbal/nonverbal impairments were compared to neurotypical children. Across tasks, children with DLD exhibited a lower drift rate than neurotypical children, indicating slower information accumulation, and no difference in non-decision time. Children with global impairments showed lower drift rates and higher non-decision times than neurotypical children. Results suggest directions for nuanced tests of the generalized slowing hypothesis of DLD. Clinical implications for diagnosis and treatment of DLD are discussed.
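
To make the two parameters concrete, the toy simulation below generates drift-diffusion trials with a drift rate and a non-decision time; the fixed-bound random-walk setup and parameter values are illustrative, not the hierarchical Bayesian model fitted in the study.

```python
# Toy drift-diffusion trial: evidence accumulates at 'drift' until a bound is hit;
# 'ndt' adds non-decision time to the RT. Values are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def simulate_trial(drift=0.8, bound=1.0, ndt=0.3, noise=1.0, dt=0.001):
    evidence, t = 0.0, 0.0
    while abs(evidence) < bound:
        evidence += drift * dt + noise * np.sqrt(dt) * rng.normal()
        t += dt
    return ndt + t, evidence > 0                     # (RT in seconds, choice)

rts_typical = [simulate_trial(drift=0.8)[0] for _ in range(1000)]
rts_slowed = [simulate_trial(drift=0.4)[0] for _ in range(1000)]   # lower drift -> longer RTs
print(np.mean(rts_typical), np.mean(rts_slowed))
```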

Language Control during Bilingual Word Production in Cantonese–English Speaking Autistic and Typically Developing Children

Being bilingual is becoming increasingly common for children worldwide, including those with autism spectrum disorder (ASD). While recent studies have examined the effect of bilingual experience on cognitive and linguistic abilities in autistic children, few have focused on how autistic status influences bilingual language switching and control. For the first time, the present study investigates language control during bilingual word production in autistic and typically developing children with a cued language switching paradigm. Results revealed that autistic children tended to make more cross-language mistakes and had more difficulties when switching between languages, while the overall naming latency and language mixing costs were similar across the two groups. The preliminary findings highlight the potential challenges encountered by autistic children on different levels of language control during bilingual production and also suggest that some aspects of language switching performance are comparable between the groups. Clinical implications are also discussed.

On Valence: A Self-Predictive Processing Model of Emotion Regulation

Emotion regulation is a fundamental process that shapes cognitive, affective, and behavioral responses to emotional stimuli. Traditional emotion regulation models conceptualize regulation as a sequential modulation of emotional responses. However, they do not fully explain how emotions are constructed in a way that allows such regulation to occur. Predictive processing (PP) provides a mechanistic framework for understanding emotion generation by proposing that the brain minimizes prediction errors (PEs) to optimize perception and behavior. Yet, standard PP accounts reduce valence to PE minimization, failing to explain how PEs can generate different subjective feelings. To address these limitations, we propose a valence-focused model of emotion regulation that integrates predictive processing with self-referential cognition. We incorporate emotional valence as interpretative processes of the self-model, which assigns emotional significance based on goals, values, and autobiographical context. This model bridges the gap between emotion generation and regulation, highlighting the dynamic interplay between prediction errors, subjective valuation, and self-referential processes. This approach not only advances theoretical understanding but also opens new avenues for computational modeling and empirical research into the adaptive and maladaptive aspects of emotional experience.

Self-supervised EEG Representation Learning based on Temporal Prediction and Spatial Reconstruction for Emotion Recognition

Affective brain-computer interfaces have achieved remarkable advancements, enabling researchers to interpret labeled EEG data accurately. However, the annotation of EEG data is time-consuming and requires substantial effort, which limits its application in practical scenarios. In this paper, we propose a self-supervised EEG representation learning framework based on temporal prediction and spatial reconstruction (EEG-TPSR) to learn EEG representations from a large amount of unlabeled data. Our model consists of two stages: 1) in the pre-training stage, we use contrastive temporal prediction and spatial reconstruction as proxy tasks, which exploit spatio-temporal information to learn generic representations from EEG data; 2) in the fine-tuning stage, a small amount of labeled data is used to calibrate the pre-trained model. We conduct extensive experiments on three emotion EEG datasets. The results demonstrate that our proposed model achieves excellent performance, with over 20% relative accuracy improvement and more than 15% improvement using only 1% of the labeled data.

The Effects of Late Sign Language Acquisition on Emotion Recall and Expression in Deaf Children

Children's emotional development is linked to language development for typically developing children and deaf children with native sign language exposure. However, approximately 90% of deaf children are born to hearing parents who are not familiar with sign language. These deaf children begin learning a sign language when they attend a school for the deaf. Late sign language exposure has negative consequences for several aspects of language development. We investigate whether acquiring sign language late affects children's emotion recall and channel of emotion expression. After watching a silent video depicting emotions, late- and native-signing deaf children retold the story in Turkish Sign Language. Results showed that late signers recalled fewer emotions and used fewer signs and facial expressions compared to native signers. Manual gestures (non-sign hand movements) and head and body movements did not differ across groups. The findings suggest that late sign language acquisition negatively impacts deaf children's ability to recall and express emotions, highlighting the importance of early language exposure for the development of emotion recall.

How do Humans and Language Models Reason About Creativity? A Comparative Analysis

Creativity assessment in science and engineering is increasingly based on both human and AI judgment, but the cognitive processes and biases behind these evaluations remain poorly understood. We conducted two experiments examining how including example solutions with ratings impacts creativity evaluation, using a fine-grained annotation protocol in which raters were tasked with explaining their originality scores and rating the facets of remoteness (whether the response is "far" from everyday ideas), uncommonness (whether the response is rare), and cleverness. In Study 1, we analyzed creativity ratings from 72 experts with formal science or engineering training, comparing those who received example solutions with ratings (example) to those who did not (no example). Computational text analysis revealed that, compared to experts with examples, no-example experts used more comparative language (e.g., "better/worse") and emphasized solution uncommonness, suggesting they may have relied more on memory retrieval for comparisons. In Study 2, parallel analyses with state-of-the-art LLMs revealed that models prioritized uncommonness and remoteness of ideas when rating originality, suggesting an evaluative process rooted in the semantic similarity of ideas. In the example condition, while LLM accuracy in predicting the true originality scores improved, the correlations of remoteness, uncommonness, and cleverness with originality also increased substantially (to upwards of 0.99), suggesting a homogenization in the LLMs' evaluation of the individual facets. These findings highlight important implications for how humans and AI reason about creativity and suggest diverging preferences for what different populations prioritize when rating.

Content-agnostic online segmentation as a core operation

We approach the problem of explaining segmentation, the human capacity to partition input streams into representations of appropriate form and content for efficient downstream processing, by exploring a theoretically minimalistic and computationally plausible account of phoneme-to-word chunking. Through computational models, mathematical proofs, algorithm design, and observer model simulations in two languages, we suggest that online segmentation can be guided by content-agnostic properties of internal memory structures (i.e., lexicality and length type frequency). Our theoretical and empirical findings point to a formal link between such properties and practical performance benefits. Together, these contributions make progress toward a fully explicit computational- and algorithmic-level account with plausible implementational-level primitives.

Is the Past a Different Culture? Tracking Changes in Prosodic Features of Child-Directed Broadcasting Across Six Decades

While research has explored cross-cultural variation in child-directed speech (CDS), little is known about whether and how it may have changed over time. We explore whether CDS has undergone historical change by analyzing prosodic features in child-directed (CD) broadcasts from a German children's bedtime program (1959–present) and comparing them to adult-directed (AD) weather forecasts from the same period. The program originated in East Germany and continued after German reunification in 1990, potentially reflecting a socio-cultural shift toward more child-centric attitudes characteristic of Western liberal democracies. Pitch variation in CD broadcasts, although higher than in AD broadcasts, remained stable over time. In contrast, articulation rates showed no register difference pre-1990; only after 1990 did CD broadcasts exhibit the slower articulation rates typical of CDS. This suggests that some features of CDS may be subject to cultural evolution over historical time, which can be accelerated by major historical events.

Dual-Path Parallel Graph Convolution Combining Brain Region Partitioning and Data-Driven Learning for EEG Emotion Recognition

Electroencephalogram (EEG) signals have become an important indicator of emotion. Owing to their natural graph structure, significant progress has been made in emotion recognition using graph convolutional networks (GCNs). However, existing methods face limitations: (1) insufficient integration of psychological prior knowledge, limiting the utilization of brain activity patterns, and (2) simplistic node relationship construction, neglecting the universality and functional connectivity of brain regions. We therefore propose a dual-path parallel graph convolutional network (DP-GCN). The first path leverages psychological prior knowledge to segment electrodes into brain regions and employs an attention mechanism to integrate features. The second path employs a data-driven method, using a sparse stacked autoencoder to reconstruct brain region features, while a learnable, input-independent adjacency matrix captures EEG patterns associated with emotions. Finally, a cross-attention mechanism integrates features from both paths. DP-GCN has been evaluated on a public dataset, achieving an accuracy of 82.69%±4.16% and demonstrating competitive performance.

Difference in the Cognitive Mechanism of Predictive Processing in Computer-Mediated Communication: A Comparison Study of L2 Speakers

Previous studies of predictive mechanisms in computer-mediated communication (CMC) suggested that native speakers (L1) rely on auditory cues and emotion in conversation processing. To understand how the prediction mechanism differs for non-native speakers (L2) in CMC, this study assessed how the loss of multimodal cues affects word predictability in turn-taking, considering language background and social factors. L2 speakers watched videos, listened to audio, or read transcripts of conversations, and predicted the same set of omitted words, which varied in predictability and semantic relatedness, across different CMC conditions. Results showed that, similar to the L1 study of He et al. (2025), higher response similarity but longer response times were observed in conditions with richer cues in L2 predictive processing. Semantic relatedness, self-emotion, attention, and language proficiency did not affect predictability. Participants reporting negative emotions and more limited L2 exposure demonstrated reduced prediction accuracy, particularly in cue-rich environments. These findings expand our understanding of L2 predictive processing in CMC by highlighting how multimodal cue integration operates differently for L2 speakers. The results have implications for developing communication technologies and language pedagogies tailored to L2 speakers across various mediated communication contexts.

Common Ground Building through Generative Cognitive Modules: Examining the Roles of Initial Perception, Imaging and Captioning

To advance our understanding of referential communication and common ground formation, this study presents a novel generative cognitive model that integrates deep neural networks for visual perception, image generation, and language captioning. Using the Tangram Naming Task (TNT), we simulate the sender–receiver interaction with modular processes replicating holistic cognitive strategies. Through controlled simulation experiments, we reveal that language generation plays a more crucial role than visual perception in establishing common ground, while intermediate image generation enhances linguistic diversity—a key aspect of natural communication. Our results bridge cognitive modeling and large generative models, demonstrating how internal cognitive dynamics can be visualized and quantitatively evaluated. This study contributes to the growing field of cognitive-inspired human–AI communication and provides a blueprint for grounding-rich simulations in collaborative tasks.

Assessing the Role of Attention in the Animacy Effect Through Directed Forgetting

A growing body of research suggests the presence of an attentional component in the memory advantage of animate items. Most research on this animacy effect is focused on remembering, but its effects on forgetting are less well-researched. With the use of pupillometry, we investigated the attentional processes present in the selective remembering and directed forgetting of animate and inanimate items. More specifically, we investigated whether external and internal attention are affected by the animacy status of to-be-remembered and to-be-forgotten words in an item-method directed forgetting task with retro cues, followed by a recognition task for all previously presented items. Our behavioral results demonstrate the directed forgetting effect: accuracy on the recognition task was higher for to-be-remembered words than for to-be-forgotten words. Additionally, we found an advantage of animacy, with higher accuracy for animate words than for inanimate words, regardless of cue. The pupillometry results demonstrate differential internal attention for to-be-remembered and to-be-forgotten words during cueing, with larger pupil sizes for remember cues than for forget cues, but no effect of animacy. We did, however, find an animacy effect on external attention during stimulus presentation, with smaller pupil sizes for animate words than for inanimate words, which may reflect precision of encoding. This suggests that external attention, rather than internal attention, is influenced by animacy status, potentially explaining the memory advantage of animate items through the richness-of-encoding hypothesis.

Bilinguals exhibit semantic convergence while maintaining near-optimal efficiency

Systems of semantic categories vary across languages, but this variation appears to be constrained by pressure to optimize a complexity-accuracy tradeoff formalized by the Information Bottleneck (IB) principle. This finding, however, has been based primarily on individual languages, and it remains largely unknown how bilinguals navigate the category systems of two different languages, particularly when these languages' category boundaries do not overlap. Here, we address this gap in the literature by combining theory-driven experiments with an extension of the IB framework to bilinguals. Specifically, we investigate bilingual vs. monolingual category boundaries in English and Mandarin via a two-alternative forced-choice (2AFC) labeling task on six continua that interpolate between two distinct everyday objects (e.g., plate and bowl). We find that: (1) bilinguals do not maintain two monolingual-like systems but rather exhibit a converged semantic system influenced equally by both languages; and (2) this departure from monolinguals is nonetheless constrained by the same pressure for efficiency that operates in monolinguals. These findings provide new insight into how bilinguals navigate cross-linguistic semantic variation and suggest that despite having to accommodate myriad sociolinguistic factors, a drive for efficiency is also a key factor that shapes bilingual category systems.
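
As a reminder of the tradeoff the IB principle formalizes, a standard formulation for semantic systems (notation here may differ from the present paper's) evaluates an encoder $q(w \mid m)$ from meanings $m$ to words $w$ by the objective

$$\mathcal{F}_\beta\big[q(w \mid m)\big] \;=\; I(M;W) \;-\; \beta\, I(W;U),$$

which is minimized by efficient systems: $I(M;W)$ measures the complexity of the lexicon, $I(W;U)$ its accuracy (how informative words are about intended meanings $U$), and $\beta$ sets the relative weight of accuracy against complexity.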

Linking Strategies to Think Aloud in A Stochastic Learning Task

Understanding human thoughts is a key goal of cognitive science. Behavioral observations alone limit insight into cognition. The think-aloud protocol, where participants verbalize thoughts, offers a direct probe into reasoning but is underutilized due to challenges in subjectivity and scalability. Advancements in natural language processing (NLP) enable computational analysis of think-aloud data, yet little work explores its role in strategy learning. We test whether think-aloud reports reveal strategy use in a stochastic learning task where participants verbalized their strategies. Our results show diverse strategy usage, with a preference for persistent choices. Think-aloud analysis suggests participants rely on distinct meta-strategies to guide learning. Clustering and predictive modeling reveal strong alignment between choices and verbalized strategies. These findings highlight think-aloud as a scalable tool with NLP techniques for studying high-level cognition, shedding light on a promising paradigm for cognitive sciences.

Linking Production of Mandarin Tonal Contrasts with Musicality in Adult Learners

Mandarin Chinese is a tonal language where variations in voice pitch distinguish word meaning. Acquiring tonal contrasts presents challenges for adult second-language learners. Undergraduates (N = 59) completed a computer-assisted language learning protocol, where they engaged in listening and repeating Chinese disyllabic nouns and matching them with corresponding pictures. Tone production accuracy was assessed at pretest/posttest using complexity invariant distance (CID), a quantitative metric of the distance between time series (learner vs. native-speaker productions). Word-level analyses found lower CID scores at posttest, indicating improvements in tone production after three blocks of word-picture matching. Fine-grained syllable-level analyses showed lower CID scores for first vs. second syllables, suggesting a primacy advantage. Accuracy on the Music Ear Test predicted lower CID scores, linking musicality with aptitude in learning tonal contrasts. No effects of nonverbal intelligence or language background were found. CID offers a robust method of assessing tone production accuracy for future studies.
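
Since the abstract names the metric but not its construction, the snippet below is a minimal Python sketch of the standard complexity invariant distance from the time-series literature (Batista et al., 2014), applied to two equal-length pitch contours. The example values and the assumption that learner and native contours have been resampled to a common length are illustrative, not details taken from the study.

```python
import numpy as np

def complexity_estimate(x: np.ndarray) -> float:
    """Complexity of a time series: length of the line connecting its points
    (Batista et al., 2014)."""
    return float(np.sqrt(np.sum(np.diff(x) ** 2)))

def cid(x: np.ndarray, y: np.ndarray) -> float:
    """Complexity Invariant Distance between two equal-length series:
    Euclidean distance scaled by the ratio of their complexities."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    ed = np.linalg.norm(x - y)                      # plain Euclidean distance
    ce_x, ce_y = complexity_estimate(x), complexity_estimate(y)
    correction = max(ce_x, ce_y) / max(min(ce_x, ce_y), 1e-12)
    return ed * correction

# Illustrative use with made-up pitch contours (not data from the study):
learner_f0 = np.array([220., 225., 232., 228., 210., 200.])
native_f0  = np.array([218., 230., 245., 235., 205., 195.])
print(cid(learner_f0, native_f0))
```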

Influence of a Partner's Behavioral Process on the Sense of Joint Agency During a Collaborative Task

We interact with others daily and experience a sense of joint agency—the feeling of performing an action together. Recent studies suggest that this sense of joint agency is influenced by the perceived "human-likeness" of the partner. This study examined how a partner's behavioral process, specifically adaptation and fluctuation, affects joint agency in a cooperative task, as mediated by human-likeness. Participants completed a cursor-tracing task simulating collaboration, with cursor movement determined by combining their input with pre-recorded data. In this experiment, adaptation was approximated by preprogrammed changes in the cursor movement. The results revealed that adaptation enhanced joint agency, whereas fluctuation had no significant effect. Human-likeness was thus positively correlated with joint agency. Moreover, individual traits such as extraversion and attachment shaped these perceptions in unexpected ways. Poor task performance also increased joint agency. These findings contribute to the field by identifying factors that influence the sense of joint agency.

Reducing Negative Attitudes Towards Immigrants – The Role of Prior Attitudes and Argument Style

Xenophobia and anti-immigrant sentiments have been increasing in Western democratic countries, and it is important to understand how messaging can improve attitudes towards immigrants. Past studies show prior attitudes are associated with how individuals evaluate related arguments. The present study (N = 349) explores whether people's prior attitudes influence how they evaluate the strength of arguments in the context of immigration. We also test whether the style of argument (i.e., narrative or statistical) influences argument evaluation. We measured participants' attitudes towards immigrants before and after an argument evaluation task, where participants rated the quality of a narrative and a statistical argument. Participants with high pre-existing negative attitudes towards immigrants rated pro-immigrant arguments as weak and anti-immigrant arguments as strong, and we observed the opposite pattern for participants with pre-existing positive attitudes towards immigrants. Our findings demonstrate that people can evaluate the same arguments about immigrants very differently depending on their pre-existing attitudes and that argument style can affect argument evaluation.

Exploring the Speech-to-Song Illusion: A Comparative Study of Standard Korean and Dialects

The Speech-to-Song Illusion (STS) phenomenon, where repeated short speech utterances transform into perceived song, has drawn attention to its underlying mechanisms and cross-linguistic differences. This study examines the STS effects among Korean speakers, comparing standard Korean (non-tonal) and dialects such as Gyeongsang (pitch-accent, tonal) and Jeju (non-tonal but intonation-rich), which exhibit varying levels of linguistic tonal features. Participants (N = 60), evenly divided between standard and dialect users, evaluated 180 auditory stimuli comprising standard Korean, Gyeongsang, and Jeju utterances under controlled repetition conditions. Results revealed significant STS effects across all groups and stimuli, with stronger effects observed for dialectal stimuli, particularly Jeju, compared to standard Korean. Interestingly, differences between standard and dialect speaker groups in STS perception were not statistically significant, suggesting that exposure to diverse linguistic environments, facilitated by modern Korean media, may homogenize perceptual responses to tonal variations. The study highlights the influence of tonal and rhythmic elements in STS perception and underscores the cultural and linguistic uniqueness of Korean as a fertile ground for exploring auditory illusions. This research contributes to understanding the interplay of linguistic and perceptual factors in STS and opens avenues for cross-cultural comparisons and neuroscientific investigations of auditory illusions.

A Cognitively Plausible Visual Working Memory Model

Visual working memory (VWM) plays a fundamental role in cognitive processes, such as perception, attention, and reasoning. However, existing approaches to modelling VWM are not integrated into cognitive architectures and lack interpretability with respect to their parameters. To address this limitation, we propose a novel VWM model based on the well-established Semantic Pointer Architecture (SPA). In contrast to previous works, our model is the first to integrate a VWM model with a cognitive attention model. It only requires three interpretable hyper-parameters: spatial capacity, feature certainty, and memory decay. We experimentally show that our base model without memory decay replicates the set-size effect and swap errors of human data on a continuous reproduction task. More importantly, we show that by introducing a memory decay, we can achieve a statistically significant (p ≪ 0.001) improvement in model fit, suggesting a potentially important role of memory decay in VWM. Further, our VWM model can be easily extended to model pre- and post-cue conditions, consistently achieving KL divergence between modelled and human performance of less than 0.05.
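
To make the reported fit criterion concrete, here is a minimal Python sketch of computing a KL divergence between binned human and model response-error distributions; the binning choice, the von Mises placeholder data, and the smoothing constant are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def kl_divergence(p_counts, q_counts, eps=1e-9):
    """KL(P || Q) between two histograms over the same bins (e.g., binned
    response errors from a continuous reproduction task)."""
    p = np.asarray(p_counts, float) + eps
    q = np.asarray(q_counts, float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

# Illustrative use with made-up data: bin response errors (radians) into a
# shared histogram for human and model, then compare the two distributions.
rng = np.random.default_rng(0)
human_errors = rng.vonmises(mu=0.0, kappa=8.0, size=2000)   # placeholder
model_errors = rng.vonmises(mu=0.0, kappa=7.0, size=2000)   # placeholder
bins = np.linspace(-np.pi, np.pi, 37)
h_hist, _ = np.histogram(human_errors, bins=bins)
m_hist, _ = np.histogram(model_errors, bins=bins)
print(kl_divergence(h_hist, m_hist))
```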

Hierarchical Cognitive Graph Autoencoder for Multi-Agent Reinforcement Learning

Communication is essential for enhancing the cognition and cooperation of agents in multi-agent reinforcement learning (MARL). However, existing methods often rely on predefined and rigid cognitive patterns, which cannot adapt to dynamic environmental changes and complex inter-agent interactions. In this work, we introduce the Hierarchical Cognitive Graph Autoencoder (HCGA), an adaptive framework that addresses these limitations. HCGA represents inter-agent messages as nodes in a graph with learnable edges, employs a grouping mechanism to integrate related local information into compact latent representations, and then applies hierarchical aggregation to construct a comprehensive global cognition. This approach effectively distills essential information and adaptively uncovers cognitive patterns from dynamic environments, thereby enhancing the overall robustness and efficiency of cognitive processing in MARL tasks. Experimental results demonstrate that HCGA significantly outperforms state-of-the-art methods across various MARL tasks, highlighting its robustness, adaptability, and efficiency.

The Ungrounded Alignment Problem

Modern machine learning systems have demonstrated substantial abilities with methods that either embrace or ignore human-provided knowledge, but combining benefits of both styles remains a challenge. One particular challenge involves designing learning systems that exhibit built-in responses to specific abstract stimulus patterns, yet are still plastic enough to be agnostic about the modality and exact form of their inputs. In this paper, we investigate what we call the Ungrounded Alignment Problem, which asks: how can we build predefined knowledge into a system when we do not know how a given stimulus will be grounded? This paper examines a simplified version of the general problem, where an unsupervised learner is presented with a sequence of images for the characters in a text corpus, and this learner is later evaluated on its ability to recognize specific (possibly rare) sequential patterns. Importantly, the learner is given no labels during learning or evaluation, but must map images from an unknown font or permutation to their correct class labels. That is, at no point is our learner given labeled images, where an image vector is explicitly associated with a class label. Despite ample work in unsupervised and self-supervised loss functions, all current methods require a labeled fine-tuning phase to map the learned representations to correct classes. Finding this mapping in the absence of labels may seem a fool's errand, but our main result resolves this seeming paradox. We show that leveraging only letter bigram frequencies is sufficient for an unsupervised learner both to reliably associate images with class labels and to reliably identify trigger words in the sequence of inputs. More generally, this method suggests an approach for encoding specific desired innate behaviour in modality-agnostic models.

Reducing Traumatic Memory Intrusions by Timing Their Re-Encoding: An Application of Computational Modeling to Mental Health

Intrusive memories are disruptive to daily functioning and detrimental to well-being; unfortunately, the presence of these memories is a defining characteristic of post-traumatic stress disorder (PTSD). Although PTSD is increasingly understood as a memory disorder, current trauma management and recovery strategies do not often take principal memory theories into account. Many common practices, such as delayed processing after trauma exposure and spaced therapy sessions, might inadvertently strengthen the retention of intrusive memories in the long term. In this paper, model simulations show that altering the timing of different presentations of emotional stimuli might affect subsequent intrusions. Experimentally, in a two-day within-subject image presentation task comparing spaced and consecutive ("massed") presentations of emotional images, we demonstrated that the perceived frequency of intrusions during the 24 hours after first exposure was significantly lower for massed images than for spaced images. Our study presents a novel strategy that can potentially reduce the frequency of intrusive post-traumatic memories, highlighting the advantages of translational applications of computational cognitive models to mental health.

Alpha band activity over the sensorimotor cortex during passive music listening correlates with beat tapping performance

Theories of music perception argue over whether observed motor area activation during passive music listening actively contributes to perception or is the product of a distributed representation. There is a growing body of evidence linking Alpha rhythms in the motor cortex to action inhibition and imagination during passive music listening. In this work, we examine Alpha band power modulation and its association with beat perception using a sensorimotor synchronization task and a natural music listening task with electroencephalography (EEG). We sought to find an association between Alpha band modulation over the primary motor cortex and beat tapping performance. We found that greater Alpha power correlated with worse tapping performance. These results may point to a negative association between motor inhibition and beat perception and a complementary positive association between movement imagination and beat perception and production. We address these findings in terms of the HAPEM theory proposed by Schubotz (2003). This framework suggests that motor activation reflects a predictive representation formed from audio-motor association cortices, lacking proprioceptive information, which could be acquired through musical training.

Preschoolers Compute Literal and Pragmatic Meanings of Conditionals with Contextual Support

Understanding conditional inferences is fundamental to human reasoning, allowing us to predict the consequences of actions. For instance, the conditional, "If you eat your broccoli, you'll get a candy" can be interpreted literally, meaning eating broccoli is one way to get a reward, or pragmatically, implying it is the only way. Past studies show school-aged children (ages 7-12) struggle to arrive at literal meanings but, interestingly, compute adult-like, pragmatic interpretations at this age. A key limitation of past research is the lack of testing in contexts that favor literal meanings. We conducted two studies to examine whether children can derive literal interpretations when supported by context, focusing on scenarios where adults prefer literal over pragmatic interpretations. We found that preschoolers, as young as 4 years old, are adult-like in computing literal meanings of conditionals when contextually supported, and also can arrive at pragmatic meanings of conditionals. These findings inform theories of logical reasoning and implicature acquisition.

Polite Speech Generation in Humans and Language Models

When we give feedback, we face a delicate balancing act – we want to convey accurate information, but we also don't want to hurt someone's feelings. While computational pragmatic models have elegantly shown how politeness emerges from these principles, they have mainly focused on choices among limited predefined responses. Large language models (LLMs) enable the study of open-ended politeness strategies, but their ability to balance informational and social goals like humans remains uncertain. First, we replicate previous work using restricted utterance sets, finding that sufficiently large LLMs (≥70B parameters) capture key human politeness patterns, particularly the strategic use of negation. We then extend this investigation to open-ended contexts, collecting and evaluating naturalistic feedback from both humans and LLMs. Surprisingly, human evaluators preferred LLM responses, which demonstrated sophisticated goal sensitivity and diverse politeness tactics. These findings suggest remarkable pragmatic competence in LLMs' polite language generation while raising questions about the underlying mechanisms.

Reconstruction of Time-Varying Appeal Inputs that Induce Blink Rate Synchrony

The time-varying appeal of an audiovisual stimulus cannot be directly observed because it is not only determined by the expression itself but also involves the viewer's information processing. In this study, we attempted to reconstruct a time-varying common input, interpreted as appeal, that induces blink rate synchrony (i.e., blink rate is suppressed at appealing scenes). In the experiment, 44 university students (22 male and 22 female) watched two videos promoting a local area in Japan while their blinks were detected using an eye-tracking device. The results showed that reconstruction ability depended little on the embedding parameters, and that the peak of the reconstructed common inputs did not always correspond to the most impressive scene. In the future, it would be beneficial to apply this method to physiological index data according to the type of attractiveness.

Base Rate Neglect in Linguistic Category Learning

This paper presents a categorization experiment supporting the hypothesis that base-rate-neglect occurs in linguistic category learning (and, thus, is a cross-domain phenomenon), and that it is more likely for those learners who engage in explicit problem-solving, rather than implicit learning. We find that among those participants who were able to verbalize the cue that was probabilistically associated with category-membership (correct-staters) in a phonological learning task, about a third respond in a way consistent with base-rate-neglect. On the other hand, non-staters are more likely to respond randomly or by probability matching the base-rates. These results suggest that explicit learning is associated with base-rate-neglect to a greater extent. We found that no learners integrate the two probabilistic patterns present in the experiment into a single Bayesian estimate. Rather, some of them focus on the base-rates, others focus on category-internal cues, and others simply fail to learn anything.
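
For concreteness, the block below states the generic Bayesian integration that, according to the abstract, no learners appeared to perform; the notation (category c, cue x) is generic and is not tied to the specific phonological materials of the experiment.

```latex
% Normative integration of a probabilistic cue with category base rates:
\[
  P(c \mid x) \;=\; \frac{P(x \mid c)\, P(c)}{\sum_{c'} P(x \mid c')\, P(c')} .
\]
% Base-rate neglect corresponds to responding as if P(c) were uniform (cue only),
% whereas probability matching to the base rates corresponds to responding from
% P(c) alone, ignoring the cue x.
```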

From Curiosity to Competence: How World Models Interact with the Dynamics of Exploration

What drives an agent to explore the world while also maintaining control over the environment? From a child at play to scientists in the lab, intelligent agents must balance curiosity (the drive to seek knowledge) with competence (the drive to master and control the environment). Bridging cognitive theories of intrinsic motivation with reinforcement learning, we ask how evolving internal representations mediate the trade-off between curiosity (novelty or information gain) and competence (empowerment). We compare two model-based agents using handcrafted state abstractions (Tabular) or learning an internal world model (Dreamer). The Tabular agent shows curiosity and competence guide exploration in distinct patterns, while prioritizing both improves exploration. The Dreamer agent reveals a two-way interaction between exploration and representation learning, mirroring the developmental co-evolution of curiosity and competence. Our findings formalize adaptive exploration as a balance between pursuing the unknown and the controllable, offering insights for cognitive theories and efficient reinforcement learning.

Implicit and Explicit Knowledge after Limited Exposure to Artificial Grammars of Various Complexity

Research on implicit learning using the artificial grammar learning (AGL) paradigm has traditionally relied on tasks that promote active engagement, such as memorization, repetition, or rule discovery during the exposure phase. This study examined whether limited exposure, devoid of active engagement tasks, enables participants to distinguish between grammatical and ungrammatical sequences in both simple and complex artificial grammars. Participants performed above chance on the grammaticality task across both conditions but appeared to rely on explicit strategies to a greater degree than reported in previous AGL studies. These findings highlight the critical role of exposure conditions and suggest that exposure to letter strings without active engagement may not sufficiently restrict learning to implicit processes.

"There Is No Such Thing as a Dumb Question," But There Are Good Ones

Questioning has become increasingly crucial for both humans and artificial intelligence, yet there remains limited research comprehensively assessing question quality. In response, this study defines good questions and presents a systematic evaluation framework. We propose two key evaluation dimensions: appropriateness (sociolinguistic competence in context) and effectiveness (strategic competence in goal achievement). Based on these foundational dimensions, a rubric-based scoring system was developed. By incorporating dynamic contextual variables, our evaluation framework achieves structure and flexibility through semi-adaptive criteria. The methodology was validated using the CAUS and SQUARE datasets, demonstrating the ability of the framework to assess both well-formed and problematic questions while adapting to varied contexts. As we establish a flexible and comprehensive framework for question evaluation, this study takes a significant step toward integrating questioning behavior with structured analytical methods grounded in the intrinsic nature of questioning.

Quantitative Qualitative Correspondence in Grammaticalization

The gradual nature of historical language change is widely acknowledged. We explore a syntactic change model that offers a new view into the theoretical difference between classical and neural network claims about language encoding. Most prior treatments of grammaticalization fail to account for how, exactly, new forms arise, focusing instead on change following innovation. A phenomenon relevant to the innovation puzzle is Quantitative Anticipation of Qualitative Change in Grammaticalization (QAQCG): gradual statistical changes anticipate structural changes. Although prior researchers have given phenomenological descriptions, we know of no rigorous method for testing whether QAQCG exists. Here, we quantitatively examine the case of English "a lot", which has grammaticalized an Adverb function from a Noun Phrase function. A simple feedforward neural network implements QAQCG, predicting a curving trajectory in probability space. Bayes Factor analysis supports the network over a classically-motivated linear model, highlighting continuity and nonlinearity as distinctive theoretical claims.

Who notices object repeats? Individual differences in inner experience influence repetition priming

Category labels such as 'dog' and 'green' appear to induce more categorical representations, highlighting category-diagnostic features and helping to distinguish category members from non-members. Here we investigate whether covert language use has a similar effect by taking advantage of natural variation in people's reported use of inner speech. To measure categoricality, we use a repetition-priming task in which people make a semantic judgment of repeated images. We find a robust repetition effect of categories such that people are faster to respond to a cat if they have seen a previous image of a cat. Differences in reported inner speech were not associated with differences in repetition priming, but interacted in complex ways with differences in visual imagery and susceptibility to a verbal interference task.

Self-verification and the perceived reliability of uncertain feedback sources

People often have a preference for "self-verifying" feedback that confirms their existing self-views. Self-verification can reinforce existing self-views and prevent opportunities to learn from alternative perspectives, as when people with low self-esteem prefer feedback that validates negative self-beliefs. Past work suggests that a major driver of self-verification is a desire for accurate self-assessment, where disconfirmatory feedback that contradicts existing self-views creates doubt about the credibility of the feedback source. The aim of this study was to develop a formal account of self-verification based on a Bayesian model of source reliability. Findings from a behavioral experiment aligned with the model's prediction that confirmatory feedback about traits central to one's self-concept enhances the perceived reliability of a source, while disconfirmatory feedback leads to lower reliability and disinterest in further feedback. This approach clarifies why seemingly biased feedback-seeking behaviors may be motivated by rational epistemic concerns about source credibility.
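
As a toy illustration of the kind of Bayesian source-reliability computation described here (the paper's actual parameterization is not given in the abstract, so the prior, likelihoods, and numbers below are hypothetical), consider updating the probability that a source is reliable after a single piece of confirmatory or disconfirmatory feedback:

```python
def posterior_reliability(prior_reliable: float,
                          p_confirm_if_reliable: float,
                          p_confirm_if_unreliable: float,
                          feedback_confirms: bool) -> float:
    """Posterior probability that a feedback source is reliable, given one
    piece of feedback that either confirms or contradicts the self-view."""
    like_rel = p_confirm_if_reliable if feedback_confirms else 1 - p_confirm_if_reliable
    like_unrel = p_confirm_if_unreliable if feedback_confirms else 1 - p_confirm_if_unreliable
    num = like_rel * prior_reliable
    den = num + like_unrel * (1 - prior_reliable)
    return num / den

# Toy illustration: for a trait central to the self-concept, the person is
# confident in their self-view, so a reliable source should usually agree.
print(posterior_reliability(0.5, 0.9, 0.5, feedback_confirms=True))
print(posterior_reliability(0.5, 0.9, 0.5, feedback_confirms=False))
```

With these toy settings, confirmatory feedback raises the posterior above the prior and disconfirmatory feedback lowers it, matching the qualitative pattern reported above.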

Can reasoning make you humble? Experimental tests to improve intellectual humility

In the present study, we tested whether inducing people to reflect on their knowledge may increase their intellectual humility. We hypothesized that asking participants to answer knowledge tests would prompt them to recalibrate their perception of their own knowledge, thereby fostering intellectual humility. Study 1 demonstrated a significant increase in intellectual humility following the intervention, and Study 2 replicated and extended these findings in a larger sample, confirming the effect despite its small magnitude. The observed increase may be due to the activation of an analytical reasoning style or to the acknowledgement of one's knowledge limitations. However, further research is needed to corroborate these conjectures and explore the long-term effects of interventions to enhance intellectual humility.

Balancing Conventional Pairings and Semantic Fit: Classifier Production in Mandarin-Speaking Children

This study examined how classifier-noun conventions and classifier semantic compatibility influence the selection of classifiers in Mandarin-speaking children aged five to seven. Results indicated that children's classifier use was shaped not only by conventional associations between classifiers and noun categories but also by semantic congruence, particularly in the absence of explicit noun labels. Older children demonstrated greater sensitivity to labels in guiding classifier selection than younger children. Furthermore, explicit noun labels most strongly boosted children's choice of conventional classifiers for non-prototypical stimuli, while having a much smaller impact on prototypical stimuli. Overall, these results highlight the interplay between memorization and semantic compatibility in classifier acquisition and underscore the importance of semantic and perceptual features in shaping language learning.

Effects of jointly recalling emotions in dyads on emotional valence and arousal: A preregistered study using Light Detection and Ranging (LiDAR)

Emotions play a crucial role in social interactions, yet little is known about the effects of experiencing emotions together with others on expressed emotional valence and arousal. We compared changes in adults' body posture when recalling experiences of positive and negative basic emotions (happiness & sadness) and social emotions (pride & shame), either jointly (dyadic condition) or by themselves (individual condition). To capture the dynamic unfolding of the emotional experience, we used a novel depth sensor imaging technique based on LiDAR technology integrated into a commercial tablet. Adults (N = 80) displayed greater postural chest-height elevation and upper-body chest expansion (measuring valence) following positive compared to negative emotion recalls. Furthermore, participants showed more overall movement (measuring arousal) after positive compared to negative emotions, especially in the dyadic condition. These results suggest that recalling emotions together affects non-verbal expression of emotions, and we discuss our findings in light of recent advances in emotion science.

Spatial language and intuitive physics in children and adults: It's not so simple

Do simple spatial terms such as in or on map directly to intuitive physical judgements about spatial relationships between objects that underlie these terms' meaning? We explored this question in the domain of physical support. Adults (N=120) and 4-year-old children (N=42) were shown videos in which a puppet placed an L-shaped object in contact with a table at locations that varied in whether the object was supported or not. Half of the participants were asked for linguistic judgments ("Is X on Y?") and half were asked for intuitive physics judgments ("Will X fall if (agent) lets go?"). Results revealed that linguistic judgments were largely categorical, with child and adult participants labeling objects as on even when the object was not truly supported. In contrast, intuitive physics judgments aligned closely with the object's actual possibility of true support. However, responses also varied by the orientation of the L-shaped object, with on applying categorically to a regularly oriented L, but in a more graded fashion for a mirror-image oriented L. Our findings suggest that the simple spatial term on and the physical reasoning systems that underlie it are not completely coupled, and that the ways in which language draws on intuitive physical reasoning are complex.

Cognitive Mechanisms in Loan Marketing: Insights from Concept Mappings

Understanding the cognitive mechanisms of loan borrowers at peer-to-peer (P2P) loan platforms is helpful for improving communication strategies during loan marketing. Recent research has studied the cognition of loans using field studies or interviews with limited samples, which cannot provide comprehensive cognitive insights in real-world contexts. In this work, we use a concept mapping-driven method and a large amount of real-world data for loan cognitive analysis. We find that there are statistical differences between the concept mappings of loan borrowers who received lenders' support and those who did not. We also identify representative conceptual factors of borrowers that impact lenders' decision-making. For example, lenders are more inclined to support borrowers who present a mindset focused on stability, adaptability, and purposeful transformation, whereas non-recipients often express uncertainty, lack of readiness, or superficial changes in their loan requests. Applicants with an abstract cognitive orientation tend to request loans with higher interest rates, in contrast to those with a more concrete conceptualization. In practice, loan borrowers and lenders can refer to these cognitive findings to support their marketing and decision-making.

Towards a curriculum for neural networks to simulate symbolic arithmetic

Understanding and operating on multi-digit numbers is a critical step in the development of mathematical skills and concepts. Empirical and computational modeling evidence suggests that multi-digit numbers are processed in a decomposed manner, into units, tens, hundreds, etc., for magnitude comparison. Yet, there is currently no computational model to simulate multi-digit number arithmetic. Accordingly, we developed a neural network model with three interconnected modules reflecting single-digit additions, recognition of a carry-over, and decision-making. We then compared model performance after following two training curricula: i) a step-by-step curriculum, where single-digit additions are trained before progressing to multi-digit problems, and ii) an all-at-once curriculum, where all modules of the model were trained simultaneously. Our results indicated that only the step-by-step curriculum made the model learn multi-digit addition successfully, as reflected by replicating empirical effects of carry-over and problem size. These findings highlight the importance of structured, incremental learning in both cognitive modeling and education.

The Influence of Generics on Inherent Reasoning and the Endorsement of Gender Stereotypes

The transition from descriptive regularities to prescriptive expectations is linked to the inherence heuristic (a cognitive shortcut attributing observed associations to inherent properties), reinforcing the perception of internal characteristics as defining features of social categories and contributing to gender stereotype endorsement. We investigated how inherent reasoning and moderating factors (i.e. generics and individual characteristics) influence the endorsement of gendered activities. Using a 3 (framing: generics vs. "most" vs. "some") × 2 (typicality: typical vs. countertypical gender associations) design, 241 French participants provided descriptive and prescriptive judgments about gendered associations, with justifications coded for inherence. Results showed that generic statements increased prescriptive judgments and reliance on inherent reasoning compared to "most" statements. Inherent justifications increased prescriptive judgments for typical and reduced them for countertypical gender associations. Inherent reasoning fully mediated the effect of generics on prescriptive judgments. These findings underscore the role of language and cognition in sustaining normative gender expectations.

Fast and robust Bayesian inference for modular combinations of dynamic learning and decision models

In cognitive neuroscience, there has been growing interest in adopting sequential sampling models (SSM) as the generative choice function for reinforcement learning (RLSSM) to jointly account for decision dynamics within and across trials. However, such approaches have been limited by computational tractability due to lack of closed-form likelihoods for the decision process or expensive trial-by-trial evaluation of complex reinforcement learning (RL) processes. We enable hierarchical Bayesian estimation for a broad class of RLSSM models, using Likelihood Approximation Networks (LANs) in conjunction with differentiable RL likelihoods to leverage fast gradient-based inference methods including Hamiltonian Monte Carlo or Variational Inference (VI). To showcase the scalability and faster convergence with our approach, we consider the Reinforcement Learning - Working Memory (RLWM) task and model with multiple interacting generative learning processes. We show that our method enables accurate recovery of the posterior parameter distributions in arbitrarily complex RLSSM paradigms, and moreover, that in comparison, fitting data with the equivalent choice-only model yields a biased estimator of the true generative process. Moreover, leveraging the SSM with efficient inference allows us to uncover a heretofore undescribed cognitive process within the RLWM task, whereby participants proactively adjust the decision threshold as a function of WM load.

How grammatical gender supports efficient communication

The apparent redundancy of grammatical gender systems presents a puzzle to which information theory offers a solution: since nouns are the least predictable part of language, dividing nouns into semi-arbitrary classes reduces the uncertainty associated with them. Corpus studies show both how gendered articles make nouns more predictable, and how in languages that lack noun class, prenominal adjectives serve a similar function. This raises the question of whether language users are sensitive to this information and actually employ it in communication. In an elicitation study, we manipulated the contextual information provided by gendered articles to German speakers, and compared their behavior to English speakers (whose articles are always uninformative). When German articles were uninformative, German and English speakers produced prenominal adjectives at the same rates. However, when articles were informative, German prenominal adjective production decreased. These results illustrate how languages use both articles and prenominal adjectives to support communicative efficiency.
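
To illustrate the corpus-level claim that gendered articles make nouns more predictable, here is a minimal Python sketch comparing the entropy of a noun distribution with its conditional entropy given the article; the tiny toy corpus and its counts are invented purely for illustration and are not data from the study.

```python
import math
from collections import Counter

# Toy corpus of (article, noun) pairs with made-up counts.
pairs = [("der", "Hund")] * 30 + [("die", "Katze")] * 30 + \
        [("das", "Haus")] * 25 + [("die", "Blume")] * 15

def entropy(counts):
    """Shannon entropy (bits) of a count distribution."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Marginal uncertainty about the upcoming noun.
h_noun = entropy(Counter(n for _, n in pairs))

# Conditional entropy H(noun | article): uncertainty about the noun once the
# gendered article has been heard.
by_article = Counter(a for a, _ in pairs)
h_cond = 0.0
for art, a_count in by_article.items():
    sub = Counter(n for a, n in pairs if a == art)
    h_cond += (a_count / len(pairs)) * entropy(sub)

print(f"H(noun) = {h_noun:.2f} bits, H(noun | article) = {h_cond:.2f} bits")
```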

How Panel Layouts Define Manga: Insights from Visual Ablation Experiments

Manga has gained global popularity, yet how its visual elements, such as characters, text, and panel layouts, reflect the uniqueness of individual works remains underexplored. This study investigates the contribution of panel layouts to manga identity through both quantitative and qualitative analysis. We trained a deep learning model to classify manga titles based solely on facing page images, and performed ablation experiments by removing characters and text, retaining only panel frame structures. Using 10,122 images from 104 works in 12 genres in the Manga109 dataset, we demonstrate that panel layouts alone enable high-accuracy classification. Grad-CAM visualizations further reveal that the models focus on layout features such as size, spacing, and alignment. These findings suggest that panel layouts encode work-specific stylistic patterns and support visual narrative comprehension, highlighting their role as a key component of manga's visual identity.

KnowJudge: A Knowledge-Driven Framework for Legal Judgment Prediction

Large Language Models (LLMs) have been extensively employed in Legal Judgment Prediction (LJP) in recent years. However, existing LLM-based methods often fail to effectively simulate the cognitive processes of human judges, particularly in keyword extraction, leading to suboptimal predictions. Inspired by cognitive science, we propose KnowJudge, a knowledge-driven framework, which explicitly models the cognitive process of legal decision-making, leveraging keyword extraction and precedent-based enhancement to guide LLMs in structured legal reasoning. By integrating external legal knowledge tailored to fact descriptions, it refines keyword identification and selects relevant case precedents, thereby mitigating ambiguity in legal judgment. Unlike conventional methods that rely on fine-tuning, KnowJudge improves performance purely through cognitive-process simulation. Experiments on five benchmarks show that KnowJudge outperforms baseline methods, including both general and legal LLMs.

Contextual restriction and faultless disagreement about generics across development

We report a study examining developmental changes in perceptions of disagreements among speakers who use generics to describe contextually-restricted or unrestricted regularities. Sixty-five adults and 222 5-12-year-olds reacted to generic claims from speakers who attributed ostensibly contradictory attributes to a biological kind ("Xs are striped"; "Xs are spotted"). Crucially, we manipulated the scope of each speaker's claim, described as restricted to a specific context, or unrestricted. Participants assessed faultless disagreement (whether the speakers could "both be right"). Adults were sensitive to contextual restriction: they allowed for faultless disagreement when contextual restrictions mis-aligned, and denied it when both speakers restricted to one context. Young children demonstrated striking partial competence in faultless disagreement judgments much earlier than prior developmental literature suggested. This is the first study to document faultless disagreement between differentially-restricted generics, both in adults and in children. We discuss the developmental trajectory, and implications for social functioning and learning.

Gaze signatures of cognitive conflict while choosing and solving

Current measures of cognitive conflict in experimental settings focus either on whole trial-level measures, such as reaction time and proportion of cohort disagreement, or on intrusive on-task measures such as think-aloud paradigms. Consequently, granular within-trial measurements of the experience of cognitive conflict have been missing from the literature and, as a result, from formal models and theories of decision making. By combining the recently proposed switch paradigm for measuring cognitive conflict with on-task eye-tracking, we ask one such theoretical question: is the experience of cognitive conflict different when choices have clear normative answers and when they don't? Our results answer this question affirmatively and characterize it quantitatively by means of gaze signatures for both classes of experience of cognitive conflict.

Detecting Critical Collapsed Nodes in Social Networks: A Cognitive Model of Resilience under Spatial Constraints

The resilience of social networks hinges on identifying users whose departure causes cascading collapse, influenced by both topology and social cognition, such as spatial relationship constraints. Existing studies often overlook how cognitive and behavioral factors shape network fragility. This paper introduces a cognitive-computational framework to detect critical collapsed nodes under spatial constraints, using the (k, σ)-core model to integrate social cohesion (k-core) and spatial thresholds (σ). We propose a pruning algorithm leveraging spatial locality for efficient querying of collapsed nodes and formalize the problem of finding optimal collapsed nodes as an NP-hard task. Our greedy heuristic prioritizes nodes with the most significant cascading impact, similar to human strategies in crises. Experiments on eight real-world networks show our model outperforms topology-only baselines in predicting collapse patterns, especially in spatially-embedded communities. Our findings highlight how spatial constraints and social cohesion amplify systemic fragility, providing insights for designing cognitively-aligned interventions to boost network resilience, bridging computational analysis with cognitive science.

Does Language Stabilize Quantity Representations in Vision Transformers?

Whether language is essential, sufficient, or a tool for numerical cognition has been hotly debated. Here, we investigate the influence of language on quantity representations by comparing embeddings from vision-only Transformer models (ViTs) and vision-language models (VLMs) exposed to image pairs depicting either the same or different stimulus quantities. If linguistic exposure stabilises quantity representations, VLMs should produce more distinct representations for image pairs with differing numerosity and more similar representations for those with identical numerosity than ViTs. We operationalized this as the variance in Cosine Similarity in response to either categorical (same/different) or continuous differences in stimulus numerosity. We find that VLMs and ViTs are sensitive to the numerosity of visual stimuli, that this sensitivity increases with layer depth, and that VLMs exhibit slightly more sensitivity to image numerosity than ViTs. This work provides initial support for the claim that linguistic exposure can, in principle, stabilise quantity representations.
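
The following Python sketch shows one way to compute the pairwise cosine similarities underlying the sensitivity measure described above; the random placeholder embeddings and the simple variance summary stand in for real ViT/VLM layer activations and are not the authors' exact operationalization.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def numerosity_sensitivity(embeddings: dict) -> float:
    """Crude index of numerosity sensitivity for one model layer: variance of
    pairwise cosine similarities across same- and different-numerosity pairs.
    `embeddings` maps a numerosity (e.g., 3 dots) to a list of embeddings."""
    sims = []
    keys = sorted(embeddings)
    for i, n1 in enumerate(keys):
        for n2 in keys[i:]:
            for e1 in embeddings[n1]:
                for e2 in embeddings[n2]:
                    if e1 is e2:
                        continue  # skip an image paired with itself
                    sims.append(cosine_similarity(e1, e2))
    return float(np.var(sims))

# Placeholder embeddings standing in for one layer of a ViT or VLM:
rng = np.random.default_rng(1)
fake = {n: [rng.normal(size=128) for _ in range(5)] for n in (2, 4, 8)}
print(numerosity_sensitivity(fake))
```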

MS-NHHO: A Swarm Intelligence Optimization Algorithm Incorporating Cognitive Science for Malicious Traffic Detection

The diversification of attacks jeopardizes cyberspace's normal operation. This paper proposes a new Harris Hawks Optimization Based on Multiple Strategies (MS-NHHO), inspired by humans' limited cognitive load, collective decision-making, and dynamic learning mechanisms for processing complex information. This paper utilizes the elite chaos reverse learning strategy to improve the algorithm's convergence speed and population diversity. Then, the dynamic adaptive weights are introduced into the escape energy decline mechanism to improve the algorithm's global exploration and local exploitation ability. Finally, the Gaussian random walk strategy enhances the algorithm's anti-stagnation ability. The experimental results confirm the usefulness of the three optimization strategies. Meanwhile, MS-NHHO exhibits satisfactory performance in terms of computational cost, detection performance, and efficiency in several scenarios.

Can Sequential Persuasion Strategies Referencing Specific Purposes Enhance the Persuasiveness of Online Requests? A Case Study

Improving the persuasiveness of online requests is crucial for achieving acceptance and fostering positive social relationships. The effectiveness of persuasion strategies and the influence of the sequence in which these strategies are applied have been demonstrated in the literature. However, existing research has largely overlooked the importance of linking sequential persuasion strategies to the specific purpose of a request. In this study, we first employ a few-shot Iterative Collaboration Method (ICM) to identify the purpose of the online requests, referencing human needs as well as the persuasion strategies used. Then, the sequential patterns of persuasion strategies supporting respective purposes are mined. Finally, Large Language Models (LLMs), incorporating the identified effective sequential strategies, are used to rewrite the original requests. The results indicate that the sequence of strategies used for different purposes can significantly increase the level of persuasion. The code and dataset can be found at https://github.com/phillip2f/Seq-Strategies-and-Purpose.

Eye movement behavior during mind wandering in older adults

Aging is associated with task-specific changes in eye movements, and thus eye movements during mind wandering (MW) in older adults may differ from those in young adults. We showed that changes in the number of fixations or fixation duration were associated with MW in young adults, but not older adults, when searching for information, possibly due to aging-related changes in these measures. Similarly, larger variance in pupil diameter change was associated with MW when imagining a scenario in young but not older adults, possibly related to aging-related affect stability. In contrast, lower eye movement consistency was associated with MW when implementing well-learned visual routines in older but not young adults, possibly related to their higher susceptibility to MW interferences. Reduced joint attention with another participant was associated with MW for tasks involving clearly defined strategies for both young and older adults. These results have important implications for monitoring task engagement through eye tracking.

Cause and Blame Attribution to AI and Human Agents in Mental Health Context

The present study examined how participants (N = 298) assessed causality, blameworthiness, foreseeability, and counterfactuality of an AI or human therapist, across three levels of empathy, in comparison to their supervisor and a recommending clinician. We found that participants judged the human therapist as more causal and blameworthy than their supervisor when medium or low empathy levels were displayed, whereas no difference emerged between the judgments of the AI therapist and its supervisor across all of the empathy levels. Additionally, participants did not differentiate causality and blameworthiness between the AI and human therapists, regardless of the empathy level. However, they did perceive the human therapist as foreseeing the outcome more than the AI therapist in the medium and low empathy levels. Qualitative analysis revealed that participants considered the directness of the causes to the outcome, counterfactual reasoning, and inherent limitations of AI when making judgments.

Reinforcement learning produces efficient case-marking systems

Many languages mark either accusative case (for objects of transitives) or ergative case (for subjects of transitives), but some 'split ergative' languages mix the two systems depending on the type of nominal. It has been noted that these languages tend towards marking the less frequent case for each nominal type. This raises the question of what mechanism could underlie the emergence of such an efficient system. We propose a model that can provide an explanation, based on a simple reinforcement learning framework and simple assumptions about asymmetries between the kinds of nominals (e.g., pronouns vs. full noun phrases) that appear in subject vs. object position.
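
The abstract does not spell out the model, so the following Python sketch is only one way such a reinforcement learning account could be set up: a Roth-Erev-style learner decides, for each nominal type in each grammatical role, whether to spend effort on an overt case marker, and communication succeeds when the clause follows the typical configuration or at least one argument is marked. The frequencies, marking cost, and update rule are illustrative assumptions, not the authors' parameters.

```python
import random

random.seed(0)
NOMINALS = ("pronoun", "fullNP")
ROLES = ("subject", "object")
P_TYPICAL = 0.8       # assumed: pronoun-subject + fullNP-object clauses dominate
MARK_COST = 0.3       # assumed effort cost of producing an overt case marker

# Roth-Erev propensities for marking vs. not marking, per (nominal type, role).
prop = {(n, r): {"mark": 1.0, "none": 1.0} for n in NOMINALS for r in ROLES}

def choose(cell):
    w = prop[cell]
    return "mark" if random.random() < w["mark"] / (w["mark"] + w["none"]) else "none"

for _ in range(20000):
    typical = random.random() < P_TYPICAL
    subj, obj = ("pronoun", "fullNP") if typical else ("fullNP", "pronoun")
    acts = {"subject": choose((subj, "subject")), "object": choose((obj, "object"))}
    n_marks = sum(a == "mark" for a in acts.values())
    # The hearer recovers who-did-what-to-whom if the clause follows the typical
    # pattern or if at least one argument carries an overt case marker.
    success = typical or n_marks > 0
    reward = (1.0 if success else 0.0) - MARK_COST * n_marks
    for nom, role in ((subj, "subject"), (obj, "object")):
        prop[(nom, role)][acts[role]] += max(reward, 0.0)

for cell in sorted(prop):
    w = prop[cell]
    print(cell, "P(mark) ~", round(w["mark"] / (w["mark"] + w["none"]), 2))
```

Under these toy settings, the learner comes to mark full-noun-phrase subjects and pronoun objects noticeably more often than the frequent pairings, qualitatively mirroring the split pattern described above.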

Interfering with inner speech during action encoding impacts their execution

Most studies so far have overlooked the role of inner speech in action, especially new actions. We conducted a behavioral experiment asking participants to observe videos to acquire two actions. In the experimental group, participants performed an articulatory suppression task, continuously repeating a syllable, so as to interfere with inner speech. In the control group, participants were requested to continuously tap their middle finger on the table. We hypothesized that, with inner speech disrupted, participants could not provide themselves with instructions about how to perform the actions, consequently impacting their ability to acquire them. The results confirmed the hypotheses. Compared to the Dual-task Control Group, the Articulatory Suppression Group was overall more impaired in action acquisition and, in part, in motor performance quality. These results confirm the role of inner speech in cognition, providing new evidence about its function in action learning and execution.

Hearing Beyond Categories: General Adaptation to Nonnative Speech

Listeners can rapidly adapt to non-native accented speech, yet the underlying mechanisms remain debated. This study examines whether accent adaptation reflects adjustments in phonetic representations or shifts in decision-making processes. Using a pretest-exposure-posttest paradigm, we examined native English listeners' perception of the Mandarin-accented /θ/-/s/ contrast across two exposure conditions: exposure to Mandarin-accented sentences (Experiment 1) or to pure tones (Experiment 2). In both experiments, listeners showed increased acceptance of ambiguous /θ/ and /s/ tokens when they formed real words, suggesting that adaptation stems from changes in lexical decision criterion reinforced through task repetition rather than accent exposure alone. Additionally, we observed evidence suggestive of rapid within-test distributional learning from limited trials. Our findings support the notion that listeners lower lexical decision criterion when processing accented speech, while also demonstrating remarkable adaptability to novel accent features even with minimal exposure.

Effect-prompting shifts the narrative framing of networked interactions

Narrative interaction plays an important role in shaping people's beliefs and behaviors both online and in the offline world. We present an experiment examining whether a simple intervention of effect prompting (asking participants to list the effects of complex events) impacts the narrative framing of their networked interactions. After reading a text-based narrative about the Fukushima nuclear disaster, participants in a fully connected network interacted with their neighbors and received rewards for submitting hashtags that matched those of their network partners. Half of the groups received an effect-prompting intervention, which shifted participants toward producing more effect-oriented hashtags during networked interactions; however, the extent of this shift depended on how likely the group was to achieve global coherence. We also examined these dynamics with networks of interacting large language model (LLM) agents using Llama-3.1-8B-Instruct. The study highlights how language-based prompting can subtly shift the narrative framing of online communication.

Efficient compression in locomotion verbs across languages

Converging evidence suggests that languages are shaped by a drive for efficient communication. In particular, it has been shown that languages efficiently compress meanings into words via the Information Bottleneck (IB) principle in domains ranging from visual percepts, such as colors and objects, to non-visual high-level concepts, such as pronouns and number. These domains, however, capture only static elements described by adjectives, nouns, function words, or grammatical markers, leaving open the question of whether the same theory could also apply to verb meanings, which often refer to dynamical aspects of the environment. We address this question by considering locomotion verbs (e.g., walk, run, and jump) across four languages (English, Dutch, Spanish, and Japanese). We show that locomotion verb meanings across languages are shaped by pressure for efficiency, which resonates with similar findings in other domains and suggests that the IB principle may apply more broadly across the lexicon. Our results also open a new avenue for future work to explore whether semantic categories of actions are rooted in a strictly perceptual representation, or perhaps in motor and functional representations as well.

Complexity in Complexity: Understanding Visual Complexity Through Structure, Color, and Surprise

Understanding human perception of visual complexity is crucial in visual cognition. Recently, Shen et al. (2024) proposed an interpretable segmentation-based model that accurately predicted complexity across various datasets, supporting the idea that complexity can be explained simply. In this work, we investigate the failure of their model to capture structural, color, and surprisal contributions to complexity. To this end, we propose Multi-Scale Sobel Gradient, which measures spatial intensity variations; Multi-Scale Unique Color, which quantifies colorfulness across multiple scales; and surprise scores generated using a Large Language Model. We test our features on existing benchmarks and a novel dataset containing surprising images from Visual Genome. Our experiments demonstrate that modeling complexity accurately is not as simple as previously thought, requiring additional perceptual and semantic factors to address dataset biases. Thus our results offer deeper insights into how humans assess visual complexity.
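
Since the abstract names the feature but not its exact construction, here is a minimal Python sketch of a multi-scale Sobel gradient measure: the mean gradient magnitude averaged over a few downsampling scales. The specific scales, the crude striding-based downsampling, and the aggregation rule are guesses for illustration, not the authors' implementation.

```python
import numpy as np
from scipy import ndimage

def multi_scale_sobel_gradient(image: np.ndarray, scales=(1, 2, 4)) -> float:
    """Mean Sobel gradient magnitude averaged over several spatial scales.
    `image` is a 2-D grayscale array in [0, 1]."""
    scores = []
    for s in scales:
        small = image[::s, ::s] if s > 1 else image      # crude downsampling
        gx = ndimage.sobel(small, axis=0, mode="reflect")
        gy = ndimage.sobel(small, axis=1, mode="reflect")
        scores.append(float(np.mean(np.hypot(gx, gy))))
    return float(np.mean(scores))

# Toy comparison: a smooth gradient image scores lower than random noise.
rng = np.random.default_rng(0)
smooth = np.tile(np.linspace(0, 1, 128), (128, 1))
noisy = rng.random((128, 128))
print(multi_scale_sobel_gradient(smooth), multi_scale_sobel_gradient(noisy))
```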

VGG-19 Displays Human-like Biases in Statistical Judgment from Visual Graphs

Convolutional neural networks (CNNs) not only recognize objects with high accuracy, but also acquire from images abstract statistical concepts such as numerosity and correlations. However, it remains unclear whether CNN architectures implement inductive biases that mimic human biases in statistical judgments. In this paper, we examined whether VGG-19 models (a popular CNN architecture) trained to make correlation judgments from scatterplots display human-like biases. In comparisons between model predictions and human data, we found a high correspondence between human biases and machine biases in VGG-19 models. Using explainable AI visualization with saliency maps to unpack the regions on which VGG-19 relies to make correlation judgments, we found that the late layers of the model tend to focus on regions similar to human participants' fixation distributions as captured by eye tracking. We further demonstrate that such models were nearly sufficient to predict human data at an accuracy level rivaling the state-of-the-art model trained on human data in three large-scale correlation discrimination datasets. Our results suggest that VGG-19 models may employ strategies similar to those used by human participants for statistical judgments from visual graphs and, therefore, pave the way to addressing human cognitive biases in visualization-based statistical judgments through the lens of deep neural networks.

Experience-driven discovery of planning strategies

One explanation for how people can plan efficiently despite limited cognitive resources is that we possess a set of adaptive planning strategies and know when and how to use them. But how are these strategies acquired? While previous research has studied how individuals learn to choose among existing strategies, little is known about the process of forming new planning strategies. In this work, we propose that new planning strategies are discovered through metacognitive reinforcement learning. To test this, we designed an experiment to investigate the discovery of new planning strategies. We then presented metacognitive reinforcement learning models, demonstrated their capability for strategy discovery, and showed that they provide a better explanation of human strategy discovery than alternative learning mechanisms. However, when fitted to human data, these models exhibited a slower discovery rate than humans, leaving room for improvement.

Metacognition as a domain of skill

This paper presents a framework for understanding metacognition as a distinct domain of skill, drawing on established research in motor and cognitive domains. It proposes that metacognitive expertise shares key characteristics with other skill domains, including goal-directed action, hierarchical organization, declarative and procedural knowledge, and automatization. By integrating theoretical and empirical insights, this paper aims to establish a comprehensive model of metacognitive skill development, with implications for research and practical applications in education, therapy, and beyond.

Acute Stress Impairs Visual Narrative Comprehension in Younger but Not Older Adults

Visual narrative comprehension is crucial in today's visually driven world, yet research on aging and this ability remains inconclusive. The role of stress in this process is largely unexplored, despite stress becoming an ever-present factor in modern life. This online study investigated how acute stress affects visual narrative comprehension in younger (N = 203, 18-57 years) and older adults (N = 212, 60-85 years). Participants were assessed under stress and neutral conditions. Stress was induced through timed mathematical and logical tasks with social stress elements. Pictorial stories, consisting of three panels with the second intentionally left blank, were presented. On the next page, participants judged whether the given inference for the blank panel was correct. Results showed that acute stress impaired comprehension and confidence in younger adults, while older adults remained unaffected. These findings suggest that with age and experience, individuals develop more differentiated event schemas, enhancing resilience to stress.

Cognitive Measurement with Generative AI: A Novel Interactive Situational Assessment of Learning Motivation and Strategy Using LLM Multi-Agents

Assessing learning motivation and strategy (LMS) in specific situations can more accurately reflect students' self-regulated learning ability. However, traditional assessment methods, such as subjective evaluations and self-reports, are time-consuming, burdensome, and not well-suited to the dynamic nature of situational assessments. To address this, we present LLM-based agents that enable intelligent generation of situational tasks and interactive assessment. Specifically, the Master defines the theme and storylines, the Designers generate situational tasks, the Evaluator reviews content quality, and the Interactor controls the interactive assessment with users. The results of a user study with 97 university students demonstrated the reliability and validity of our approach and a significant enhancement of the user experience. The results further clarify the relationships among indicators of LMS. This study provides a novel paradigm and solution for situational assessment of LMS and offers valuable theoretical insights for intervention research targeting related indicators.

How Generative Music Affects the ISO Principle-Based Emotion-Focused Therapy: An EEG Study

Recently, AI-generated content (AIGC) technologies have made remarkable advancements, even achieving superhuman performance across various domains. However, few previous studies have investigated their impact on emotion-focused therapy with artistic content, e.g., music. In this paper, we conducted an EEG experiment to explore the effects of generative music on emotion-focused music therapy based on the ISO principle. The experiment compared AI-generated and human-created music in terms of changes in participants' valence and arousal following negative emotion induction, under both adherence and non-adherence to the ISO principle. The results show that generative music, with its harmonic consistency and simple rhythm, is more effective in supporting positive emotions and improving temporal lobe activity. In addition, the therapeutic effectiveness of generative music adhering to the ISO principle was also validated. This study highlights the distinct emotional and neural mechanisms of AI-generated music, offering valuable insights into future AI-powered emotion-focused therapy strategies.

Decoding EEG Signals to Explore Next-Word Predictability in the Human Brain

Humans invented reading and have passed down this complex skill across generations through language. This study provides empirical evidence of the neural mechanisms underlying bottom-up (related to high-order linguistic structure) and top-down (related to next-word predictability) processes, which interact to guide comprehension during reading. While previous studies have focused on either the N400 effects of predictability or lexical categories, research on how predictability influences N400 responses across different lexical categories is limited, mainly due to constraints in publicly available datasets. Here, we examine how predictability influences brain responses, recorded at millisecond resolution using electroencephalography (EEG), with a focus on the N400 time window (300-500 ms post-stimulus) across different lexical and grammatical categories. Our results indicate that significant differences in N400 responses between high and low cloze probability levels were more pronounced for content words than function words. Among the two primary content categories, verbs exhibited greater N400 differences than nouns, while nouns carried more distinct information about their predictability than verbs. Moreover, we demonstrate that the decoding technique is more effective than traditional event-related potential (ERP) analysis in capturing detailed and distinct representations of cognitive processes over time.

Clicking, Fast and Slow: Towards Intuitive and Analytical Behaviors Modeling for Recommender Systems

Recommender systems personalize content delivery based on users' interaction history. However, not all clicks result from deliberate decisions—many arise from intuitive reactions. Inspired by the dual process theory, we argue that intuitive clicks are primarily driven by System 1, reacting to superficial cues, while analytical clicks involve deeper processing by System 2, considering the semantic meaning and long-term preference. However, existing models overlook these cognitive mechanisms. To address this, we propose DualRec, a novel recommendation method that models both intuitive and analytical behaviors. DualRec encodes items using language models, leveraging shallow layers for superficial understanding (System 1) and deep layers for semantic comprehension (System 2). It employs Transformer-based encoders with two attention mechanisms to capture intuitive "fast" and analytical "slow" click patterns. A learnable fusion layer balances these behaviors. Extensive experiments demonstrate that DualRec outperforms existing methods and highlights the importance of integrating both cognitive processes in recommendations.

An Incremental Program Induction Model of Slow Mapping Words to Meanings

The process by which people adjust, enrich, and revise their understanding of word meanings over time – so-called 'Slow Mapping' – has often been overlooked, particularly in terms of how a computationally bounded learner might approach such a task. To address this gap, we propose a process model of incremental word-meaning induction. This proposal is inspired by recent work on concept and theory change grounded in a probabilistic language of thought (pLOT). We focus on the problem of fixing the meanings of words from usage examples, taking kinship terms as our test domain. We frame word meaning induction at a computational level as a program induction problem, and hypothesize that individual learners search for possible meanings as evidence arrives via a mutative Markov chain Monte Carlo (MCMC) search scheme. We show this idea provides a better description of how participants' generalizations and tentative definitions of alien kinship words shift as evidence arrives, outperforming normative accounts and other baselines.
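
To make the search scheme concrete, here is a toy Metropolis-Hastings loop over a handful of candidate meanings scored against usage examples; the hypothesis space, data, and likelihood are hypothetical stand-ins rather than the paper's pLOT grammar.

```python
# Minimal sketch (illustrative only): MCMC search over candidate word meanings,
# scored by how well each hypothesis fits the usage examples seen so far.
import math
import random

random.seed(0)

hypotheses = ["parent", "sibling", "older-sibling", "cousin"]   # hypothetical meanings
data = [{"compatible": {"sibling", "older-sibling"}},           # toy usage examples
        {"compatible": {"older-sibling"}}]

def log_likelihood(hypothesis, data):
    # toy noisy-consistency score: consistent examples are likely, others unlikely
    consistent = sum(1 for d in data if hypothesis in d["compatible"])
    return consistent * math.log(0.9) + (len(data) - consistent) * math.log(0.1)

current = random.choice(hypotheses)
for _ in range(1000):
    proposal = random.choice(hypotheses)                        # mutative proposal
    log_accept = log_likelihood(proposal, data) - log_likelihood(current, data)
    if log_accept >= 0 or random.random() < math.exp(log_accept):
        current = proposal                                      # Metropolis acceptance
print("hypothesis after search:", current)
```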

Functional category induction with theory-neutral cognitive biases

This paper probes the influence of a particular kind of domain-general cognitive bias in first language acquisition with the aid of computational models. We introduce a novel task: inducing functional categories from morphemically tokenised sentences, and supply a manually annotated dataset of English child-directed speech (CDS). We operationalise a widely assumed type of cognitive bias, "less-is-more", as three computational principles—ordering input, gradually increasing model complexity, and priming the learner—and develop a theory-neutral experimental setup to evaluate their impact on functional category induction. Our experiments with CDS demonstrate that models incorporating reflexes of "less-is-more" outperform the purely statistical baseline. As part of our exploration of ordering effects, we employ the morpheme acquisition order proposed by Brown (1973) and, for the first time in the literature, present statistical evidence that Brown-compliant orders outperform non-Brown-compliant ones.

Efficient Audience Design in LLMs

During human communication, speakers balance informativeness and effort by tailoring their language to their audience. Large Language Models (LLMs) appear human-like in their communication and succeed at some tasks thought to involve social reasoning about interlocutors (in humans). Here, we tested audience design in LLMs using tasks modeled on Isaacs and Clark (1987). In Experiment 1, replicating findings with humans, LLMs produced longer descriptions of pictures from a city when addressing an audience unfamiliar with that city and used more proper nouns when addressing a familiar audience. In Experiments 2 and 3, similar to previous findings with humans, LLMs used fewer words to describe pictures over the course of a multi-turn interaction. However, this pattern appeared to be sensitive to whether the user prompts also got shorter across turns, suggesting that efficient audience design in LLMs reflects patterns in training data and reinforcement learning, rather than an inherent drive towards least effort.

Summarization Reflects Characteristics of Memory Recall

Memory has traditionally been studied in well-controlled laboratory environments, which, while effective, do not fully capture the range of dynamics and behaviors shown in real-world contexts. To address this gap, we propose using summarization as a novel task to study memory recall in naturalistic settings. We argue that a key component of summarization is the ability to represent and retain information from the original material. Inspired by approaches in the free recall literature to analyze temporal dynamics of memory recall, such as how recall begins and transitions to subsequent items, we analyzed the temporal dynamics of summary patterns. Using three publicly available summarization datasets and a naturalistic narrative recall dataset, we found alignments between the summarization patterns and established free recall patterns, including primacy, recency, temporal contiguity, and the effect of list length. These results support that summarization involves processes of memory recall and open up opportunities to use summarization as a naturalistic task to study memory recall in the future.

Personalized Knowledge Tracing Based on Generative Models: Cognitive Exploration of Learning Preferences

PLGAN is a generative model-based framework for personalized knowledge tracing, designed to explore and model individual learning preferences. By integrating a Personalized Attention Mechanism (P-Attn), PLGAN effectively extracts learners' distinct learning patterns and behavioral tendencies, addressing the limitations of traditional knowledge tracing models that assume homogeneous learner groups. Unlike conventional approaches, PLGAN dynamically adjusts the weighting of behavioral features, enabling a more nuanced representation of learning preferences and improving the accuracy of knowledge state predictions across diverse learning environments. Experimental results on multiple public datasets demonstrate that PLGAN achieves an average performance improvement of 3.5% in knowledge tracing tasks. Furthermore, the generative nature of PLGAN enhances its generalization and robustness, effectively capturing individualized learning dynamics. This work advances the study of personalized learning by leveraging generative models to model and analyze learner behavior, providing a cognitively informed approach to knowledge tracing.

Non-literal Understanding of Number Words by Language Models

Humans naturally interpret numbers non-literally, effortlessly combining context, world knowledge, and speaker intent. We investigate whether large language models (LLMs) interpret numbers similarly, focusing on hyperbole and pragmatic halo effects. Through systematic comparison with human data and computational models of pragmatic reasoning, we find that LLMs diverge from human interpretation in striking ways. By decomposing pragmatic reasoning into testable components, grounded in the Rational Speech Act framework, we pinpoint where LLM processing diverges from human cognition: not in prior knowledge, but in reasoning with it. This insight leads us to develop a targeted solution: chain-of-thought prompting inspired by an RSA model makes LLMs' interpretations more human-like. Our work demonstrates how computational cognitive models can both diagnose AI-human differences and guide development of more human-like language understanding capabilities.
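
For orientation, the sketch below implements the textbook Rational Speech Act recursion (literal listener, speaker, pragmatic listener); the states, prior, and rationality parameter are invented for illustration, and the paper's hyperbole model reasons over richer state spaces than this.

```python
# Minimal vanilla RSA sketch (illustrative values, not the paper's model).
import numpy as np

states = np.array([50, 51, 500, 501, 1000, 10000])     # hypothetical prices
utterances = [50, 500, 1000, 10000]                     # number words a speaker might use
prior = np.array([0.3, 0.1, 0.25, 0.05, 0.2, 0.1])      # hypothetical prior over prices

# literal semantics: a number word literally denotes its exact value
literal = np.array([[1.0 if s == u else 1e-6 for s in states] for u in utterances])

L0 = literal * prior
L0 /= L0.sum(axis=1, keepdims=True)                     # literal listener P(state | utterance)

alpha = 1.0
S1 = (L0 ** alpha).T                                    # speaker P(utterance | state), normalized over utterances
S1 /= S1.sum(axis=1, keepdims=True)

L1 = S1.T * prior
L1 /= L1.sum(axis=1, keepdims=True)                     # pragmatic listener P(state | utterance)
print(np.round(L1, 3))
```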

Preschool Children's Learning and Generalization of Continuous Causal Functions

Many causal relations can be represented by continuous functions that map inputs to outputs. Can young children learn continuous causal functions and generalize them from observed data to new scenarios? We found that 4- and 5-year-olds can represent continuous functions with different abstract forms. After observing a few input-output pairs, children can accurately infer positive linear and step functions by predicting the outputs of novel input values. They also have emerging knowledge of negative linear and triangular functions. While children do not yet make consistently accurate predictions for these functions, they can distinguish these functions from the positive linear function and show inferential patterns that are consistent with the respective functions. Like adults and older children, preschoolers show an inductive bias towards positive linear functions. Their understanding of negative linear functions--which strongly requires inhibiting this inductive bias--improves with age.

Prior beliefs impair logical reasoning about syllogisms on sexual violence

Belief bias in syllogistic reasoning occurs when individuals' agreement with a conclusion influences their assessment of its logical validity. While this effect has been widely studied in domains such as politics and personality-related reasoning, its role in evaluating arguments about sexual violence remains underexplored. In a pre-registered study, we examined whether participants' sexist beliefs and cognitive reflection influenced their ability to assess the validity of syllogisms related to sexual violence. Participants (N = 104) completed a syllogistic reasoning task with sexism-supportive, sexism-challenging, and neutral syllogisms, as well as the Ambivalent Sexism Inventory and a Cognitive Reflection Test. The results indicate that when evaluating such syllogisms, participants' beliefs play a significant role. People tend to perceive syllogisms as logically valid if the conclusions align with their beliefs and as logically invalid if the conclusions contradict their beliefs. Furthermore, cognitive reflection moderated belief bias effects, but only for sexism-supportive syllogisms. These findings highlight the extent to which reasoning about gender and sexual violence is shaped by preexisting beliefs and suggest that cognitive reflection may help mitigate some bias-driven reasoning errors.

Mapping between Telicity and Event Representations

How does linguistic telicity map onto mental representations of events? Recent work suggests considerable flexibility in how people mentally represent temporal event structure, yet we know little about how linguistic cues modulate these representations. We investigated how different forms of quantization correspond to event construal using a novel experimental paradigm that bridges event perception and linguistic processing. Participants first learned to distinguish bounded from unbounded events, then categorized sentences varying in quantization strength. Our results revealed a systematic relationship between linguistic form and event representation: strongly quantized expressions ("drink one beer") reliably corresponded to bounded event construal and activity descriptions ("did some writing") to unbounded interpretations, while bare plurals ("drink beers") showed genuine flexibility in interpretation. This graded pattern indicates that temporal boundedness in cognition operates along a continuum, with linguistic cues providing weighted probabilistic constraints on event representations. The findings demonstrate how different linguistic forms correspond to varying degrees of flexibility in event understanding, contributing to our knowledge of how language interfaces with event cognition.

DDSPR: Dynamic Domain Selection and Pseudo-label Refinement for Cross-Subject EEG-based Emotion Recognition

Automatic emotion recognition using electroencephalography (EEG) signals has garnered significant attention in recent years. While multi-source domain adaptation methods provide a promising framework for cross-subject emotion recognition, the distributional discrepancies among different source domains often result in negative transfer. To address these challenges, we propose a two-stage Dynamic Domain Selection and Pseudo-label Refinement (DDSPR) model. In the first stage, we introduce a novel Dynamic Domain Selection (DDS) module and an Agent Domain Adaptation Strategy (ADAS) to dynamically select and align source domains. In the second stage, a confidence-based pseudo-label correction strategy is employed to refine target domain labels and mitigate noise. We evaluate the proposed model through cross-subject experiments on the SEED and SEED-IV datasets, achieving accuracies of 91.50% ± 7.05 and 78.05% ± 13.56, respectively. The results demonstrate its effectiveness in emotion recognition.

Idiosyncratic but not opaque: Linguistic conventions formed in reference games are interpretable by naïve humans and vision–language models

When are in-group linguistic conventions opaque to non-group members (teen slang like "rizz") or generally interpretable (regionalisms like "roundabout")? The formation of linguistic conventions is often studied in iterated reference games, where over repeated reference to the same targets, a describer--matcher pair establishes partner-specific shorthand names for targets. To what extent does the partner-specificity of these linguistic conventions cause them to be opaque to outsiders? We use computational models and experiments with naïve matchers to assess the opacity of descriptions from iterated reference games. Both human matchers and the computational model perform well above chance, suggesting that conventions are not fully arbitrary or opaque, but reflect aspects of shared semantic associations.

Communicating through Acting: The Role of Contextual Affordance in Intuitive Pantomimetic Gestural Communication

When observing an action, how do people intuitively determine if it has communicative intent, such as in the case of pantomimes? We focus on two alternative theories: one suggests that instrumental intention competes with communicative intention, where the weaker the former is, the stronger the latter is; the other proposes that instrumental intention is nested within communicative intention, where the presence of the former facilitates the latter. To test these theories, we introduced the concept of contextual affordance, which manipulates the degree to which the instrumental components of an action are revealed. Through behavioral experimentation, we found a non-linear relationship between contextual affordance and communicativeness rating: partial affordance–providing implicit cues to an action's instrumental component without being fully rational–elicited the strongest perception of communicative intention, whereas full affordance or no affordance resulted in a weaker perception of communicative intention. Our study provides a novel definition of communicative intention and reveals that recognizing the instrumental goal and perceiving the suboptimality in achieving it work together to create a strong communicative signal. This work represents a step toward developing an integrated theory of pantomimes, specifying how the rationality principle can be applied to serve multiple purposes simultaneously.

Behavioral Evidence is Still Insufficient to Identify Consciousness

Researchers have started seriously considering the epistemic issue of whether and when we can claim an artificial intelligence (AI) has developed machine consciousness. Most cognitive theories of consciousness employ a functional characterization of the property of consciousness. That is, they are committed to an account of consciousness as a rule-governed process over mental states. Some cognitive scientists concerned with AI advocate an epistemically behaviorist approach to machine consciousness; however, such approaches, taken ontologically, systematically fail to satisfy reasonable intuitions about what consciousness ought to consist in, and, taken epistemically, fail to provide sufficient evidence to individuate any internal property, including consciousness, in non-human subjects. Therefore, in order to assess consciousness in ways that adequately account for reasonable intuitions as to its proper definition, such that we can reasonably assert the presence of machine consciousness in some AI, it is necessary to propose, test, and revise functional theories of consciousness.

Exploring the Cognitive Diversity of Political Concepts

Prior research has shown that people vary considerably in how they interpret political concepts, a variability often attributed to liberal–conservative differences underlying political polarization. In this study, rather than focusing on the liberal–conservative dichotomy, we considered personality and morality variables as possible predictors of cognitive diversity in subjects' interpretation of political concepts. Participants completed brief personality (HEXACO) and morality (MAC) assessments, followed by a series of association ratings for the concepts of freedom, justice, and authority. We found that certain personality traits and moral dimensions correlate with higher associations between probe concepts. Furthermore, clustering of political inclination on morality dimensions and concept ratings suggested that the latter made a limited contribution to political diversity, only raising the number of clusters from 2 to 3.

A Quantum Statistical Model of Decision Making in a Single-Cell Eukaryote

The single-celled protist Stentor roeselii has long been observed to exhibit complex decision-making behaviors, yet existing machine learning and classical computational models have struggled to replicate its actions. In this paper, we propose a novel quantum-statistical framework to model S. roeselii's behavioral responses to environmental stimuli. By leveraging quantum circuits with amplitude damping and memory effects, we construct a quantum behavioral model that captures the probabilistic and hierarchical nature of S. roeselii's decision making. Our results suggest that quantum statistical theory provides a powerful tool for representing and simulating biological decision processes.

Enactment and Embodiment Impact the Recall of Object-Location Associations

The role of self-generated movement in memory retrieval has been demonstrated in enactment paradigms. However, in the context of object-location memory, the impact of action during learning has not yet been investigated, despite the ecological relevance of such behaviors. In the current project, we present new evidence that actively placing an object in a target location during learning leads to more precise, and faster, subsequent recall of the object-location associations than simply observing this placement. We further demonstrate differences in object-location memory depending on the category of stimuli that participants are engaging with by showing that images of objects with high manipulability are placed more precisely, more quickly, and more directly (mouse-tracking) than images of objects with low manipulability. We suggest that these latter differences are due to the motor information implicitly activated during processing of high manipulability items, and reflect the embodied nature of concepts. Although both enactment and manipulability impacted object-location recall, they did not interact. This research extends findings on enactment to associative encoding processes, and informs our understanding of the relationship between enactment and embodiment in human memory.

Instruction tuning modulates discourse biases in language models

Instruction tuning (IT) has been a fruitful technique for aligning Large Language Models with human preferences. However, the linguistic implications of IT remain unclear. In two experiments on coreference and coherence biases in the context of Implicit Causality, we investigate how IT modulates these discourse biases in relation to model size. Our results show that IT interacts with model size -- instruction-tuned models display enhanced coherence biases and more human-like coreference patterns, sometimes exceeding human performance. However, this effect appears size-dependent, suggesting that IT causes some linguistic patterns to emerge that are dormant in the respective foundation models.

Task Resolution Time Estimation through Cognitive Load: An EEG Study of Chess Players

Assessing attention is essential for optimizing performance. This study identifies a single-channel EEG biomarker based on cognitive load to estimate Task Resolution Time (TRT). Thirty-seven chess players were recorded with an 8-channel EEG headset while solving chess problems under two conditions: distracting noise (65 dB) and ambient noise (40 dB). Participants were grouped by chess expertise (ELO rating), and cognitive load was measured via theta (4-8 Hz) power on the C4 channel. EEG signals underwent preprocessing with a bandpass filter, Artifact Subspace Reconstruction (ASR), and Independent Component Analysis (ICA). Spectral power, estimated with Welch's method, was normalized to a 30-second resting Eyes Open (EO) period. TRT analysis indicated shorter engagement times and slightly lower performance in novices under noise, while experts remained relatively stable, possibly due to better cognitive resilience. This biomarker could be further integrated into portable EEG systems for real-time neurofeedback in educational and workplace settings.
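
As a rough illustration of the band-power computation described above, the sketch below estimates theta (4-8 Hz) power with Welch's method and normalizes it to a resting baseline; the sampling rate, epoch lengths, and random signals are placeholders, not the study's recordings or preprocessing.

```python
# Minimal sketch (synthetic signals): Welch-based theta power on one channel,
# normalized to a 30 s resting baseline, in the spirit of the abstract.
import numpy as np
from scipy.signal import welch

fs = 250                                             # assumed sampling rate (Hz)
rng = np.random.default_rng(0)
task_c4 = rng.standard_normal(fs * 60)               # stand-in for C4 during a chess problem
rest_c4 = rng.standard_normal(fs * 30)               # stand-in for the 30 s eyes-open baseline

def theta_power(signal, fs):
    freqs, pxx = welch(signal, fs=fs, nperseg=fs * 2)  # 2 s windows
    band = (freqs >= 4) & (freqs <= 8)
    return pxx[band].mean()                          # mean power in the 4-8 Hz band

normalized = theta_power(task_c4, fs) / theta_power(rest_c4, fs)
print(f"normalized theta power: {normalized:.3f}")
```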

Drawing Privacy: How Children Conceptualize Regulation and Content Across Development

Children's understanding of privacy develops as they navigate both physical and digital spaces. This study examines how children aged 3 to 13 years conceptualize privacy through their drawings, analyzing data from the Privacy Illustrated dataset. We explored two key dimensions: regulation (mechanisms controlling access, such as doors) and content (what children consider private, such as bedrooms or intellectual property). Our findings suggest that as children get older, they are more likely to view privacy as something that can be actively managed using physical barriers and control mechanisms. In contrast, younger children often depicted privacy as simply being alone. Content-related depictions remained relatively stable across ages, though older children included more abstract ideas, such as digital privacy. This study provides a novel framework for examining privacy development, highlighting distinct but interrelated dimensions of privacy.

Decomposed Inductive Procedure Learning: Learning Academic Tasks with Human-Like Data Efficiency

Human learning relies on specialization---distinct cognitive mechanisms working together to enable rapid learning. In contrast, most modern neural networks rely on a single mechanism: gradient descent over an objective function. This raises the question: might humans' relatively rapid learning from just tens of examples, rather than the tens of thousands required in data-driven deep learning, arise from our ability to use multiple specialized mechanisms of learning in combination? We investigate this question through an ablation analysis of inductive human learning simulations in online tutoring environments. Comparing reinforcement learning to a more data-efficient 3-mechanism symbolic rule induction approach, we find that decomposing learning into multiple distinct mechanisms significantly improves data efficiency, bringing it in line with human learning. Furthermore, we show that this decomposition has a greater impact on efficiency than the distinction between symbolic and subsymbolic learning alone. Efforts to align data-driven machine learning with human learning often overlook the stark difference in learning efficiency. Our findings suggest that integrating multiple specialized learning mechanisms may be key to bridging this gap.

Age Inference on both Textual and Social Perspectives with Semi-supervised Learning

Large-scale annotated data is essential for age prediction in social media, yet obtaining such data is costly. Age is a key psychological and cognitive marker influencing communication and social behavior. Understanding age-related patterns in online interactions can provide insights into cognitive development and identity formation. To address data limitations, we propose a semi-supervised multimodal regression model leveraging Transformer-based variational autoencoders to infer age from textual and social features. This approach aligns with cognitive and social science theories on age-related behavioral patterns. Our framework effectively utilizes unlabeled data, reducing annotation dependency while enhancing predictive accuracy. Empirical results demonstrate superior performance over traditional classification and supervised baselines, advancing interdisciplinary research in age inference and online behavior modeling.

Evaluating the Structure of Chunk Hierarchies in a Naturalistic Educational Task Using Gaussian Mixed Models

Our knowledge of a topic such as mathematics is reliant upon the hierarchies of chunks we build in our memories. The time course of knowledge-based tasks, such as the transcription of algebraic formulas, provides rich signals that reflect the structure of the chunk hierarchies being processed. By building Gaussian Mixture Models, this paper provides evidence that decomposing the overall distribution of pauses between actions in a sequential task can give meaningful characterizations of the structure of the chunk hierarchy. We also examine whether individual competence in mathematics can be measured using a metric derived from the models.
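
The sketch below shows the general idea of decomposing a pause distribution with a Gaussian mixture; the synthetic pause durations, the two-component choice, and the log transform are assumptions for illustration, not the paper's fitted models.

```python
# Minimal sketch (synthetic data): fitting a Gaussian mixture to inter-action
# pauses, on the idea that within-chunk and between-chunk pauses form components.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
within_chunk = rng.normal(0.25, 0.05, 300)     # fast pauses inside a chunk (seconds, assumed)
between_chunk = rng.normal(1.10, 0.30, 100)    # slower pauses at chunk boundaries (assumed)
pauses = np.concatenate([within_chunk, between_chunk])
log_pauses = np.log(np.clip(pauses, 1e-3, None)).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=0).fit(log_pauses)
print("component means (log s):", gmm.means_.ravel())
print("mixing weights:", gmm.weights_)
```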

Large Language Model Discourse Dynamics

With the rise of Large Language Models (LLMs), interest in simulating interaction dynamics has grown, raising questions about their validity as cognitive models of human discourse. While extensive research focuses on their performance in various applications, we aim to quantify LLM conversational processes akin to traditional human studies. By analyzing how convergence entropy evolves across different conversational tasks, we propose a framework for quantitatively assessing LLMs' ability to exhibit specific features. This approach offers a pathway to characterizing LLMs for agent-based modeling and broader discourse analysis.

Sense of Joint Agency: The Role of Prior Partner Information

This study examines how prior information about a partner affects the sense of joint agency during a collaborative task. Previous research suggests that perceiving a partner as human enhances this sense. However, the effects of prior instructions remain unclear. We designed a 2 × 2 factorial experiment in which participants were told their partner was human or a program, while their actual partner was either human or a program. In the experiment, each participant and their partner jointly controlled a cursor to trace a circle. Additionally, we measured the sense of agency to compare it with joint agency. The data revealed that the instructional factor only influenced the sense of joint agency, while the actual partner factor solely affected the sense of agency. These findings suggest that beliefs about the partner prior to interaction influence the sense of joint agency, offering insights into the cognitive processes underlying collaborative actions.

Cross-Cultural Emotion Concept Representation: A Comparison of English, Korean, and Large Language Model Representations

Each person develops a unique emotional landscape shaped by their experiences and linguistic-cultural contexts, partly personal and partly shared with others. This enables personally unique emotional experiences while maintaining shared understanding. This work aims to advance a framework for investigating what's shared and distinct across individuals, beginning with linguistic communities as an essential level of analysis, using English and Korean speakers as our case study. We examined how emotion concept representations differ between English and Korean speakers using representational similarity analysis and network analysis. English and Korean speakers' judgments of pairwise similarity between 57 emotion concepts evidenced both substantial shared structure and language-specific patterns (Spearman's ρ = 0.72, indicating 48% unshared variance). While valence emerged as a key organizing dimension in both languages, network analyses with strength centrality showed distinct patterns for each language. First, the Korean emotion concept network demonstrated higher strength centrality across all emotion concepts than the English network, indicating higher interconnectedness between concepts. Second, high-centrality emotions were predominantly negative in both languages but formed language-specific local networks with different sets of neighboring concepts. The statistics of language usage encode a substantial part of the conceptual structure of emotion, enabling large language models to capture aspects of human emotion. Despite their advanced multilingual capabilities, GPT4-o and Claude-3.5 showed stronger alignment with English speakers' representations, regardless of prompt language. These findings demonstrate that while languages reflect common principles in emotion representation, they shape distinct patterns, with implications for cross-linguistic/cultural emotion understanding and AI system development.
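
For readers unfamiliar with the method, here is a minimal representational similarity comparison between two groups' similarity matrices over the same 57 concepts, summarized with a Spearman correlation; the matrices below are random stand-ins, not the study's ratings.

```python
# Minimal RSA sketch (random stand-in data): compare two groups' 57x57
# similarity matrices by correlating their upper triangles.
import numpy as np
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr

n_concepts = 57
rng = np.random.default_rng(2)

def random_symmetric_matrix(n):
    m = rng.random((n, n))
    m = (m + m.T) / 2           # symmetrize
    np.fill_diagonal(m, 0)      # zero diagonal so squareform can condense it
    return m

english = random_symmetric_matrix(n_concepts)
korean = random_symmetric_matrix(n_concepts)

rho, p = spearmanr(squareform(english), squareform(korean))   # correlate upper triangles
print(f"Spearman rho = {rho:.2f}, unshared variance = {1 - rho**2:.0%}")
```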

Behavioral signatures of temporal context retrieval during continuous recognition

An influential mathematical model of memory, the temporal context model (TCM), posits that we encode items and their associations with temporal context (Howard & Kahana, 2002). Temporal context is conceived of as a recency-weighted average of past experiences. Critically, the model assumes that when an item is retrieved later, the associated temporal context is also obligatorily retrieved. Existing evidence for the idea of retrieved temporal context primarily comes from free-recall studies. However, free recall introduces some critical confounds that are difficult to resolve (Folkerts et al., 2018) and also encourages memory strategies that may mimic temporal context effects (Hintzman, 2011). To address these confounds, we investigate temporal context using an image recognition task. Schwartz et al. (2005) examined temporal context in an image recognition task using a short-list-based experimental design, and found that temporal context influenced recognition performance. Building on this, we use the Natural Scenes Dataset (NSD) to show that reinstating temporal context enhances recognition accuracy even across substantially longer timescales. We demonstrate that images that were temporally closer during encoding facilitated the recognition of each other. Critically, we show that this influence falls off with temporal distance at encoding only when the temporal context is successfully retrieved, as predicted by TCM. Furthermore, the slope of this temporal gradient increases as a function of the strength of the influence of the retrieved temporal context. These findings extend our understanding of temporal context effects in episodic memory by showing that temporal context is retrieved even in tasks that do not encourage linking between items as a memory strategy.
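
To make the model's core assumption concrete, here is a minimal simulation of the recency-weighted context drift at the heart of TCM-style accounts; the dimensionality, drift parameters, and random item vectors are assumptions, not the paper's analysis of the NSD data.

```python
# Minimal sketch of TCM-style temporal context drift (illustrative parameters).
import numpy as np

rng = np.random.default_rng(3)
n_items, dim = 5, 8
items = rng.standard_normal((n_items, dim))
items /= np.linalg.norm(items, axis=1, keepdims=True)   # unit-length item vectors

rho, beta = 0.9, 0.4                                    # assumed drift parameters
context = np.zeros(dim)
encoding_contexts = []
for item in items:                                      # context drifts toward each studied item
    context = rho * context + beta * item
    context /= np.linalg.norm(context)
    encoding_contexts.append(context.copy())

# items studied closer together in time have more similar encoding contexts
similarity_to_first = [encoding_contexts[0] @ c for c in encoding_contexts]
print(np.round(similarity_to_first, 2))
```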

To Honor or Dishonor Student Choices? The Impact of Self-Regulation on Instructional Methods and Learning Outcomes

Self-regulated learning involves making decisions about how to study, but students often choose suboptimal strategies. Our experiment investigated how preferences for instructional methods—video, practice, or both combined—affect learning outcomes. We randomly assigned 130 participants to either receive their preferred method of instruction (honoring initial choice) or a different method (dishonoring initial choice). Contrary to previous research showing preferences for lectures, our participants initially selected practice-based approaches. However, when asked again after instruction, the majority of participants chose the combined approach. This shift in preferences suggests that students may overvalue comprehensive approaches, even when practice alone was equally effective and reduced instructional time by 66%. Whether preferences were honored or dishonored did not significantly affect performance or efficiency, suggesting that control over instructional methods may be less important than the methods themselves. Based on our findings, future research should focus on guiding students to utilize practice to optimize learning efficiency.

EvoAgents: A Cognitive-Driven Framework for Personality Evolution in Generative Agent Society

Generative artificial intelligence (GenAI) is rapidly advancing, providing innovative tools and methods for a wide range of applications. Among these, Generative Agents, a key domain in GenAI, are valued for their fine-grained definitions and simulations of human-like behaviors. These agents provide new avenues for studying and modeling various domains, including social interactions, education, and cognitive sciences. However, existing works suffer from cognitive dynamics disconnect and affective absence, which hinder researchers from exploring human-related cognitive processes in depth. To address these limitations, we propose EvoAgents, a novel cognitive-driven framework that pioneers the exploration of dynamic personality evolution in Generative Agents. By defining the emotional content of agents and integrating a cyclical personality evolution process, EvoAgents represent a significant step toward creating more adaptive and authentic agent behaviors. Comprehensive simulations and evaluations show that EvoAgents achieve superior performance in key automated metrics compared to prior work, while uniquely enabling reasonable and robust personality evolution processes that align with cognitive and psychological expectations. By constructing a new simulation environment, SmallClassroom, based on the EvoAgents framework, we validate the framework's ability to provide deeper cognitive insights into social dynamics, aligning closely with established psychological theories.

When Rules Don't Cut It: The Relative Frequency of Inductive and Deductive Language During Real-World Surgical Training

There are two competing predictions for how experts will teach novices. According to one account, general rules and principles are better for equipping learners to perform the complex tasks that are characteristic of many domains of expertise. On the other hand, synthesizing a rule can be challenging for an expert, particularly in real-life environments with complex tasks, and learners may not be able to apply the rule to novel cases. In the present study, we examine a sample of six expert-novice pairs in the context of a robot-assisted surgical procedure in a teaching hospital. The nuances of this domain allow us to examine how an expert's decision to teach inductively or deductively varies systematically across task complexity. We find initial evidence that experts in this domain primarily teach using specific case-by-case examples, as opposed to general rules and principles, regardless of the complexity of the surgical task.

Automatic Detection of Phonestheme-like features in Japanese – Insights for Cognitive Sound Symbolism Research

Sound symbolism is a linguistic feature that may suggest a non-arbitrary link between sound and semantic content that has been shown to play a role in language acquisition, and potentially language evolution. One sound-symbolic structure known as phonesthemes has been identified in many languages. Phonesthemes are sub-morphemic sound patterns associated with specific meanings more frequently than chance. Cognitive approaches to phonesthemes typically rely on a known example. Phonesthemes are not well attested in Japanese, so this approach is not an easy option. This research provides a route for research into the cognitive effects -- in particular language acquisition -- of Japanese sound symbolism. I present a model for identifying phonestheme-like features in Japanese adapted from a model originally used with English and identify two candidates in Japanese. I also outline methods for empirically testing the psychological reality of these clusters based on existing English and Japanese sound symbolism literature.

Integrating Textual and Emotional Dynamics for Accurate Detection of Mental Health Disorders in Social Media

Mental health disorders impact nearly one billion people worldwide, yet stigma and insufficient awareness often prevent individuals from seeking timely professional help. The proliferation of social media platforms has introduced new opportunities for detection of mental health conditions, enabling the analysis of user-generated content to identify whether one has a mental disorder. Traditional approaches to this task have largely relied on content-based models, such as n-grams or language embeddings, which are prone to domain-specific biases and often fail to account for the emotional dynamics inherent in mental health expressions. In this work, we propose a novel framework for detecting mental disorders through the analysis of Reddit conversations, integrating both temporal textual data and emotional cues. Our model addresses the limitations of prior methods by explicitly capturing the evolving relationship between textual content and emotional expression over time. Experimental results demonstrate a significant improvement in detection accuracy compared to existing approaches, while ablation studies highlight the critical role of temporal emotional information in enhancing performance. These findings suggest that a more nuanced, emotion-aware approach offers substantial promise for advancing computational mental health diagnostics.

Rhesus monkeys show no preference for a left-to-right number-space mapping

Humans who use a left-to-right writing system often associate smaller numbers with the left side of space and larger numbers with the right. Whether this left-to-right number-space mapping is innate or culturally learned is unclear. Here, we test whether monkeys who lack human cultural practices show a left-to-right number-space mapping. Previous work in monkeys has found mixed evidence on whether monkeys show a left-to-right bias in their number-space mappings. Replicating the methods of Drucker and Brannon (2014), monkeys were trained to touch the fourth circle from the bottom in a vertical array of five circles. Then, they were tested with a horizontal array of five circles. Overall, monkeys showed no preference for the fourth circle from the left compared to the fourth from the right. This suggests monkeys may not have a directionality bias for number-space mappings. Therefore, the left-to-right bias in humans may be due to specific cultural practices.

Watch out for Bears: Do People Behave Differently in Perceptual and Financial Decisions?

Financial decisions such as those involved in stock trading should, at least partly, be based on features similar to those used in detecting trends in time-series data. When presented as a purely perceptual task, people's accuracy in detecting trends is generally considered good, but real-world individual investors underperform the market globally. In a series of controlled experiments, we contrast financial decisions to perceptual ones, presenting participants with real-time evolving time series whilst manipulating the reward structure and the context. Our results show that participants' decisions were not affected by trend direction in a classic perceptual decision-making scenario, whilst in a classic trading scenario they performed worse in both speed and accuracy during downward (i.e., bear markets) compared to upward trends (i.e., bull markets). In a final experiment where we carefully controlled the reward structure of both scenarios and the only relevant differentiating factor was the labels of the decisions, we did not find evidence for this difference between scenarios, but participants were slower in the trading scenario. Employing the Drift-Diffusion Model, we found evidence of lower efficiency in the classic trading scenario compared to the classic perceptual decision-making scenario. Our results provide much-needed insight into the cognitive basis of trading decisions and the general underperformance of real-world individual investors.
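
As background on the modeling step, the sketch below simulates a standard drift-diffusion process generating a choice and a response time; the parameter values are arbitrary illustrations and this is not the authors' fitted model.

```python
# Minimal textbook drift-diffusion simulation (arbitrary parameters).
import numpy as np

def simulate_ddm(drift, boundary=1.0, noise=1.0, dt=0.001, non_decision=0.3, rng=None):
    rng = rng or np.random.default_rng()
    evidence, t = 0.0, 0.0
    while abs(evidence) < boundary:                 # accumulate noisy evidence to a boundary
        evidence += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    choice = "upper" if evidence > 0 else "lower"
    return choice, t + non_decision                 # add non-decision time to get RT

rng = np.random.default_rng(4)
print([simulate_ddm(drift=0.8, rng=rng) for _ in range(5)])
```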

A cognitive model of the factors controlling the characteristics of Shiritori word sequences

Recent advances in large language models (LLMs) have renewed interest in understanding the cognitive mechanisms underlying human language use. In this study, we focus on the Japanese word game "Shiritori" as a simple task related to language use, and aim to clarify the cognitive factors involved in its characteristics. To this aim, we model the execution of Shiritori based on the basic memory mechanisms of a cognitive architecture and investigate the characteristics that appear in Shiritori word sequences through simulations with different parameter settings. The results show that word sequences with different properties were obtained depending on the levels of lexical activation, inhibition, and semantic association. To complement the simulation findings, we also conducted a preliminary human evaluation in which participants rated the robot-generated word sequences. By constructing a model that can flexibly control Shiritori behavior, we explore its potential applications as a stimulus for research on human–robot interaction and language acquisition support.

From positive to negative "craziness": Changes in emotional valence of words across adulthood

Accumulating knowledge and experience across the lifespan are bound to have an impact on the meaning of words. Here, we investigated this idea using primarily the emotional valence of words as a test case. We used French databases that gather psycholinguistic variables, including the emotional valence of words, in four groups of individuals: young (18-25; 26-39 years), middle-aged (40-59 years), and older (>60 years) adults. Following the hypothesis that words may display age-related differences in their psycholinguistic properties, we computed linear regressions over all individual words as a function of age group. Notably, results revealed that between 5 and 10% of words show significant linear changes in emotional valence as a function of age. This pattern highlights the situated and flexible nature of word meanings and suggests that the self-relevance of experience affects semantic memory.

Optimizing Learning Efficiency: Balancing Spacing and Repetition Under Time Constraints

Spaced retrieval practice has been repeatedly demonstrated to improve learning, but its implementation is often constrained by real-world time limitations. This study investigated whether, under fixed study durations, learners should prioritize spacing or repetition. Across two experiments (total N = 1589), participants practiced Indonesian vocabulary under four conditions that varied in spacing and repetition. Item difficulty was also manipulated. Results showed that increasing repetitions at the cost of spacing enhanced immediate test performance, particularly for harder items. These findings suggest that spaced retrieval practice is effective only when learners have sufficient prior repetitions to retrieve information successfully. This study highlights the trade-offs between spacing and repetition under time constraints and offers practical insights for optimizing learning strategies.

Investigating the Role of Sensorimotor Dominance in Semantic Feature Listings: When I Say Dog, Will You Say Tail?

Object knowledge comprises a virtually limitless feature space. When cued to generate attributes for an exemplar (dog), any feature is possible (has molecules). Yet, some features are definitively favored (tail). We hypothesize a sensorimotor bias in feature generation wherein perceptually salient features upon first pass are more cognitively accessible than abstract, verbally mediated knowledge. We examined the role of sensorimotor dominance in semantic feature generation by yoking concreteness values to cues (N=4436) and features (N=69,284) within the Buchanan et al. (2019) norms. We predict that cues, regardless of their concreteness, evoke relatively more concrete features (e.g., dogs evoke tails, justice evokes lawyers). The data moderately supported this hypothesis. Feature concreteness increased linearly with cue concreteness (R=.83), but the y-intercept (2.78) indicates that, overall, features were more concrete than their cues. We discuss alternate factors (e.g., frequency, familiarity) that may moderate the likelihood that people retrieve tail when cued with dog.

The Moral Costs of Growth Mindset: Blaming People for Their Intellectual Struggles

Research on growth mindsets, emphasizing the malleability of intelligence through effort, often highlights their benefits of boosting performance and reducing achievement gaps. Across four studies (N = 785), we investigated the unintended consequences of the growth mindset, hypothesizing that its emphasis on intelligence as controllable would lead to greater blame toward others for intellectual failure, compared to the fixed mindset, which views intelligence as largely innate. Study 1 found that participants primed with the growth mindset assigned more blame for low-difficulty intellectual failures than those primed with the fixed mindset. Study 2 showed that this effect diminished when intellectual failures involved highly challenging tasks. Study 3 highlighted the harm caused by an individual's intellectual failures and found that participants in the growth mindset condition still assigned greater blame than those in the fixed mindset condition. Study 4 explored a possible mechanism, finding that a growth mindset, compared to a fixed mindset, increased blame by leading participants to perceive less effort from the protagonist in the vignettes, even when both conditions were faced with identical intellectual failures. These findings underscore the need for nuanced implementations of the growth mindset.

Interactions Between Linear Order and Lexical Distributions in Artificial Language Learning

How do children learn the appropriate scope of linguistic generalizations? One proposal is that prediction error and cue competition enable them to implicitly reduce their uncertainty about the various cues to linguistic patterns. Previous work has employed artificial language studies to test the predictions of error-driven models against the performance of (adult) human participants (Ramscar et al., 2010). A critical prediction of these models - that linear relations between linguistic and environmental cues can critically affect generalization - has received much empirical support. For example, Vujovic et al. (2021) found that suffixing languages supported the learning of discriminating cues, and overgeneralization avoidance, better than equivalent prefixing languages. The current study addresses a limitation of previous studies: the use of unnatural flat distributions, which contrast with the skewed distributions ubiquitous in natural language. Although some of our results are consistent with model predictions, there were divergences. Possible reasons for these are discussed.
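
Since the error-driven models referenced here are typically of the Rescorla-Wagner / delta-rule family, a minimal cue-competition sketch is given below; the cues, outcomes, trial frequencies, and learning rate are invented for illustration and are not the study's materials.

```python
# Minimal sketch of error-driven (delta-rule) cue-outcome learning with
# cue competition; everything here is an illustrative stand-in.
import numpy as np

cues = ["stem_A", "stem_B", "context_X"]
outcomes = ["suffix_1", "suffix_2"]
W = np.zeros((len(cues), len(outcomes)))          # cue-outcome association weights
lr = 0.1

# skewed trial frequencies: one cue-outcome pairing is much more frequent
trials = [({"stem_A", "context_X"}, "suffix_1")] * 8 + \
         [({"stem_B", "context_X"}, "suffix_2")] * 2

for present, outcome in trials * 20:
    x = np.array([c in present for c in cues], dtype=float)
    target = np.array([o == outcome for o in outcomes], dtype=float)
    prediction = x @ W                            # summed prediction from present cues
    W += lr * np.outer(x, target - prediction)    # update driven by prediction error
print(np.round(W, 2))
```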

The impact of engagement and partisan influence campaigns in an isolated social media environment

Despite growing concerns about the effect of social media engagement on people's beliefs and behavior, estimating the actual impact is difficult. Here we present preliminary results from our own isolated social media platform named Magpie Social. In it, participants could interact with each other like typical social media, but we had control over the platform and measured people's beliefs and behavior before and after using it. This allowed us to more closely approximate the ecological validity of naturally occurring social-media data, while retaining the ability to measure variables and infer causation. Our week-long task had three between-subject conditions (total N = 311): a CONTROL in which people engaged on Magpie with no external influence, and two (LEFT and RIGHT) in which a small number of posts were secretly made by us, sharing typical talking points from one political side. We found small but statistically reliable effects suggesting that, relative to the CONTROL, the presence of right-wing trolls resulted in a higher level of right-wing belief and a greater perception of political division in the US. Conversely, the left-wing troll campaign did not appear to have any statistically reliable effect on these measures. We also found considerably more overall engagement in both troll conditions, probably because content with a clear political stance tended to receive more activity. However, participants (especially those on the left) disliked the RIGHT condition more than the others.

Exploring Neural Correlates of Predictability in Natural Face-to-Face Conversation

Prediction is central to human language processing, as the brain continuously predicts upcoming words using prior knowledge and context. Surprisal theory quantifies predictability using word surprisal. While previous studies link neural activity to surprisal during passive listening or reading, we investigate how surprisal is tracked in dynamic face-to-face conversations. Two key challenges arise: estimating surprisal and identifying prediction-related responses in EEG data in natural conversation. We address the first challenge by adapting a pre-trained large language model to a dataset of spontaneous conversation capturing features like hesitations and repetitions. We then relate the surprisal estimated by the adapted model to EEG data using temporal response functions. Our experimental results show neural tracking of surprisal at different time lags after word onset, supporting surprisal theory in face-to-face conversation. To the best of our knowledge, we are the first to address the application of surprisal theory in such interactive settings.
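
For concreteness, the sketch below computes per-word surprisal from an off-the-shelf GPT-2 via the transformers library; GPT-2 and the example utterance are stand-ins for the paper's adapted conversational model and corpus.

```python
# Minimal surprisal sketch: -log2 P(token | preceding context) under GPT-2,
# standing in for the paper's adapted language model.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

text = "well I uh I think so"                      # made-up conversational fragment
ids = tok(text, return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits                     # (1, seq_len, vocab)

log_probs = torch.log_softmax(logits[0, :-1], dim=-1)          # P(next token | prefix)
next_ids = ids[0, 1:]
surprisal = -log_probs[torch.arange(next_ids.numel()), next_ids] / torch.log(torch.tensor(2.0))
for token, s in zip(tok.convert_ids_to_tokens(next_ids.tolist()), surprisal):
    print(f"{token:>10s}  {s.item():5.2f} bits")
```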

What or where? Infants Interpret Pointing as Referring to a Location Rather Than to a Specific Object

Humans often interpret pointing as referring to an object; however, it can also indicate a direction or relevant spatial location. We investigated which one of these interpretations can explain 14-month-olds' responses in a two-alternative choice task. We conducted three experiments, in which an experimenter pointed at one of the two lateral objects, swapped their positions in full view of the infant, and then allowed the infant to choose. Pointing was either produced in an Ostensive Addressing (Experiment 1), Nonostensive Addressing (Experiment 2), or Ostensive Labelling context (Experiment 3). In the Ostensive Addressing and Ostensive Labelling experiments infants chose the non-indicated object in the indicated direction significantly more often than predicted by chance. In contrast, in the Nonostensive Addressing experiment, infants' performance was at chance. These findings suggest that infants follow the direction of pointing rather than interpreting it as indicating a specific object in a communicative context.

Perceived clusters may not explain people's judgments of approximate numerosity

The approximate number system (ANS) helps people quickly estimate the numerosity of objects in their environment. In this study, we explore one proposed mechanism for visually perceiving numerosity: visual clustering. Participants completed a magnitude comparison task, a magnitude estimation task, and a clustering task using the same set of numerosity stimuli. The stimuli varied in the spatial configuration of the points (cluster structure -- clustered or dispersed) and in the number of points present. Participants judged stimuli with dispersed cluster structure to be more numerous in the magnitude comparison task. However, there was a minimal effect of cluster structure in the magnitude estimation task and no effect of the number of clusters perceived in both tasks. We also found that the clusters people perceived in the third task did not explain the effects of cluster structure. These findings go against strong claims that people use visual clustering to judge numerosity and set the stage for further investigations into the mechanisms underlying the ANS.

When Simple Counting Fails: Young Children Understand Event Prevalence Using Proportional Reasoning

Proportional reasoning is essential for many real-world tasks, yet its developmental trajectory remains debated. Children's performance in nonsymbolic proportional reasoning varies across tasks and plummets when numerical information is misleading. The present study investigates whether 5- to 7-year-old children can accurately compare proportions in a naturalistic context where counting strategies are ineffective. Children listened to short stories in which a subset of people from each of two groups experienced an event (e.g., catching the flu). Given the equal numbers of affected individuals in both groups and different group sizes, children needed to rely on proportional reasoning to compare the prevalence of the event. Results showed that children performed significantly above chance overall. Moreover, they were more accurate in adverse scenarios (e.g., avoiding illness) than in favorable ones (e.g., acquiring rewards). These preliminary findings suggest that the ability to compare nonsymbolic proportions emerges by age 5 but varies depending on context.

A Similarity-Aware Graph Transformer-enhanced Probabilistic Case-based Reasoning Model for Knowledge Graph Reasoning

Knowledge Graph Reasoning (KGR) is an effective way to ameliorate sparsity and incompleteness problems by inferring new knowledge from existing knowledge. The probabilistic case-based reasoning (CBR) model can gather reasoning paths from similar entities and relations in a KG, thus outperforming rule-based and embedding-based KGR methods. However, it is still limited by problems such as insufficient learning of similarity features and sparse intermediate activations. This paper proposes a Similarity-Aware graph transformer-enhanced probabilistic CBR model for KGR, namely SA-KGR. The proposed model treats the reasoning task as KG query answering and is composed of two phases. The first phase is similarity-aware, graph transformer-based graph feature encoding, which uses a similarity matrix and a Mixture-of-Experts network to obtain fine-grained similarity features that are more helpful for reasoning path generation. The second phase is similarity-enhanced probabilistic case-based reasoning, which retrieves and infers query answers from the generated candidate paths to complete brain-like cognitive reasoning. Extensive experimental results on various benchmarks demonstrate that the proposed SA-KGR model achieves state-of-the-art results among current CBR-based methods.

ECCoT: A Framework for Enhancing Effective Cognition via Chain of Thought in Large Language Model

In the era of large-scale artificial intelligence, Large Language Models (LLMs) have made significant strides in natural language processing. However, they often lack transparency and generate unreliable outputs, raising concerns about their interpretability. To address this, the Chain of Thought (CoT) prompting method structures reasoning into step-by-step deductions. Yet, not all reasoning chains are valid, and errors can lead to unreliable conclusions. We propose ECCoT, an End-to-End Cognitive Chain of Thought Validation Framework, to evaluate and refine reasoning chains in LLMs. ECCoT integrates the Markov Random Field-Embedded Topic Model (MRF-ETM) for topic-aware CoT generation and Causal Sentence-BERT (CSBert) for causal reasoning alignment. By filtering ineffective chains using structured ordering statistics, ECCoT improves interpretability, reduces biases, and enhances the trustworthiness of LLM-based decision-making. Key contributions include the introduction of ECCoT, MRF-ETM for topic-driven CoT generation, and CSBert for causal reasoning enhancement.

GIBRNet: A Multimodal Spatiotemporal Reasoning Network Integrating Emotion, Gaze, and Position for Gaze Interaction Behavior Recognition

Gaze Interaction Behavior Recognition (GIBR) plays a significant role in understanding social behaviors and diagnosing mental health conditions. However, existing methods are limited by inadequate task modeling, resulting in suboptimal performance. To address this issue, we model the GIBR task as a spatiotemporal reasoning problem integrating three modalities: emotion, gaze, and position. Based on this, we propose GIBRNet, which enhances the representation of gaze interaction tendencies through an Emotion-Aware Refinement Matrix and dynamically aggregates multi-frame, multi-modal information using GP-GNN, enabling more precise interaction behavior reasoning. Comparative experiments on the VACATION dataset demonstrate that GIBRNet significantly outperforms existing approaches. Additionally, we constructed a GIBR dataset suite, consisting of three extended datasets, for generalization evaluation, demonstrating GIBRNet's superiority. All datasets and code are publicly available.

Unveiling Cultural Cognition in AI: A Systematic Investigation of Horizontal-Vertical Individualism-Collectivism Traits in Large Language Models

This study investigates the Horizontal-Vertical Individualism-Collectivism (HVIC) traits of Large Language Models (LLMs), addressing the gap in understanding their cultural and social cognition. HVIC, a cross-cultural psychology framework, offers insights into cognitive patterns shaped by culture. We systematically evaluate multiple LLMs using a quantitative method (the INDCOL scale), assessing their intrinsic HVIC traits and their ability to simulate cultural and gender-based differences. Our findings reveal LLMs' capacity to capture HVIC nuances, providing a unique lens for studying human cognition through human-LLM comparisons. This research contributes to developing culturally sensitive AI systems and offers new perspectives on human HVIC traits, advancing both theoretical understanding and practical applications of AI.

Variable Properties of Auditory Image Analysis: A Case Study of Selected Musical Works

The aim of this study is to analyse the auditory scene of musical works and to demonstrate that different compositions may prompt the emergence of distinctly interpreted perceptual streams in the listener's mind. The research focuses on selected excerpts from works by Alexandre Guilmant, Ludwig van Beethoven, and Antonio Vivaldi, which, due to their unique characteristics, elicit diverse auditory impressions. By combining score analysis with auditory scene analysis, this paper seeks to explain how different interpretations of the same sounds result in dissimilar auditory impressions. The auditory scene analysis presented here provides deeper insight into the process of stream formation and its implications for musical performance and aesthetic perception. The findings indicate that perceptual stream formation in music is considerably more complex and context-dependent than previously assumed, with implications for how listeners interpret auditory scenes.

Exploring a Problem Before Instruction Using Graphs versus Tables

Traditional education is instructor-centered. Providing exploratory learning activities before instruction typically engages students and improves learning; however, the design of exploratory learning activities can affect learning processes. This study investigated whether using tables or graphs in a statistics activity affected exploratory learning processes and outcomes. Undergraduate students (N=252) in classroom and lab settings were taught about standard deviation. In instruct-first conditions, students received instruction followed by an activity including either graphs or tables; in explore-first conditions, students explored either activity before instruction. After exploring, participants in the explore-first condition reported greater knowledge gaps and curiosity than those in the instruct-first condition. Graphical materials reduced cognitive load compared to tabular materials. However, instructional order and activity design did not affect learning outcomes (procedural knowledge, conceptual knowledge, representational transfer). Conceptual understanding was highest when students attempted multiple solutions while exploring graphical materials. Depth of exploration may affect conceptual benefits, especially when using graphical materials.

Using Exploratory Learning Methods to Challenge Sociopolitical Beliefs

Exploratory learning before instruction has been effectively employed in STEM education to promote deeper conceptual understanding. However, its application to sociopolitical reasoning is underexplored. This research investigated whether exploratory learning can mitigate biased information processing, fostering more reflective evaluation of counter-attitudinal sociopolitical information. We examined how the order in which participants engaged with exploratory (data table) versus directly persuasive (verbal message) stimuli influenced the strength of, and confidence in, their sociopolitical beliefs. Participants reported increased support for positions they had initially opposed, with a large effect, regardless of stimulus order. However, an interaction of time and order on confidence levels hinted at potential metacognitive benefits for participants who explored first. Exploratory learning may be less beneficial in the context of sociopolitical decision making—at least when individuals are likely to update their beliefs anyway. However, exploratory learning might affect individuals' metacognition when the messenger contradicts their political position.

Advancing Adolescent Depression Detection through Multi-Task EEG Signals and Biosignal Learning

The rising prevalence of adolescent depression is a major public health concern. Current diagnostic methods, often burdensome and lacking objective biomarkers, hinder early detection. This paper proposes a novel multi-task framework integrating attention and resting EEG signals using Biosignal Learning and Agent Transformer (BLAT) for depression detection. EEG data is segmented, channeled, and positionally encoded, followed by feature extraction and classification via Agent Transformer. Experiments with 50 depressed adolescents and 50 controls achieved 85% accuracy using BLAT, offering a promising method for early adolescent depression screening.

Exploring the Face Inversion Effect as an Indicator of Age Bias: The Impact of External Facial Features

We report two behavioural experiments that investigated the Own Age Bias (OAB), the better recognition of own-age than other-age faces, as measured by the Face Inversion Effect (FIE). Both experiments employed an old/new recognition task in which upright and inverted own-age (young adult) and other-age (older adult) faces were presented intermixed. Experiment 1 (n=48) used real-life faces and revealed a robust OAB: a significantly larger FIE (higher recognition for upright vs. inverted faces) was found for own-age than for other-age faces, driven by reduced performance for upright other-age faces. Experiment 2 (n=48) used standardised faces and revealed no OAB and no difference between upright own-age and other-age faces. We interpret our results in the context of the perceptual learning and face recognition literature.

SAMM: A Selective Attention Sequential Model for EEG-EOG Vigilance Estimation

Driver vigilance estimation plays a critical role in preventing fatigue-related traffic incidents. Current multimodal methods leveraging EEG and EOG signals often suffer from high computational costs due to the reliance on self-attention mechanisms like Transformers. To address these challenges, we propose a novel framework, Selective Attention Sequential Model (SAMM), which integrates a dynamic channel attention mechanism and the Mamba sequence modeling approach. By replacing traditional Transformer modules with Mamba's selective state spaces, our model achieves linear-time complexity while effectively capturing both local and global features. The SAMM framework fuses EEG and EOG signals using early fusion and employs a deep channel attention mechanism to enhance localized feature extraction. Mamba further complements this by efficiently modeling global dependencies in multimodal data, thus reducing computational costs while maintaining high accuracy. Extensive experiments on public datasets, SEED-VIG and SADT, demonstrate that SAMM achieves state-of-the-art performance with a significant reduction in inference time.

Detection of AI-Generated Contents Based on Dyadic-Brain Neural Synchronization

The proliferation of DeepFakes has engendered widespread societal concern, making their detection a pressing imperative. Although existing studies have used single-subject EEG to distinguish between real and AI-generated content (AIGC), research exploring dual-brain EEG and multimodal experimental paradigms is still lacking. This study introduced a novel experimental paradigm, employing EEG hyperscanning to construct a dyadic EEG dataset for AIGC detection. We used inter-subject correlation (ISC) analysis to investigate differences in interpersonal neural synchronization (INS), and we propose a novel neural network model named Squeeze-and-Excitation Depthwise Separable Convolution (SEDSC) for predicting whether content is real or AI-generated. ISC analysis revealed clear differences in INS across modalities, valences, and animacy. Specifically, across the four frequency bands, both text and audio modalities elicited higher inter-brain synchronization for real materials than for AIGC materials. SEDSC used the phase locking value to assess inter-brain functional connectivity and weighted the inputs from the four frequency bands before feeding them into the network for classification. This approach achieved a classification accuracy of 92.42% in distinguishing real from fake content. Overall, this study designed a new experimental paradigm and constructed a dataset, confirming evident differences in INS during tasks involving real and AIGC materials; furthermore, SEDSC successfully predicted the authenticity of the content.

Balancing Rigor and Utility: Mitigating Cognitive Biases in Large Language Models for Multiple-Choice Questions

This paper examines the role of cognitive biases in the decision-making processes of large language models (LLMs), challenging the conventional goal of eliminating all biases. We show that, when properly balanced, certain cognitive biases can enhance decision-making efficiency through rational deviations and heuristic shortcuts. By introducing heuristic moderation and an abstention option, which allows LLMs to withhold responses when uncertain, we reduce error rates, improve decision accuracy, and optimize decision rates. Using the Balance Rigor and Utility (BRU) dataset, developed through expert collaboration, our findings demonstrate that targeted inspection of cognitive biases aligns LLM decisions more closely with human reasoning, enhancing reliability and suggesting strategies for future improvements. This approach offers a novel way to leverage cognitive biases to improve the practical utility of LLMs across various applications.

Age-related differences in processing event knowledge during real-time language comprehension

To understand language, we use knowledge about everyday events to create rich internal (situation) models. Although knowledge increases with age, fluid cognitive abilities tend to decline, potentially making it more difficult to access that knowledge. Here, we asked how aging affects the ability to use event knowledge during real-time language comprehension. We recorded event-related brain potentials as younger and older adults read vignettes about everyday events. Both groups showed facilitation on the N400 (a neuroelectric marker of semantic processing) for words that fit the context. However, only younger adults showed facilitated N400s to anomalous but event-related words compared to unrelated anomalies. Among older adults (aged 53-80), there was a negative correlation between age and N400 effects of event-relatedness. We conclude that real-time access to event knowledge during language comprehension may shift across the course of the adult lifespan such that older adults restrict activation to the most immediately relevant content.

Investigating children's performance on object- and picture-based vocabulary assessments in global contexts: Evidence from Kisumu, Kenya

Assessments of early cognitive and linguistic abilities typically involve picture stimuli. As these assessments spread worldwide, researchers make an implicit assumption: that children across contexts understand pictures in the same way, at the same developmental timepoint. What if this assumption does not hold for some or all kinds of pictures? In the present research, a preregistered sample of 128 3- to 7-year-olds from Kisumu, Kenya participated in a Swahili vocabulary assessment. Using a within-subjects design, each participant completed vocabulary trials in four formats (i.e., objects, photographs, cartoons, black-and-white line drawings). Preregistered analyses showed that children performed equally accurately across object, photograph, and cartoon formats, but less accurately in the line drawing format. However, exploratory analyses suggested that a subset of line drawings drove this difference. These findings suggest that caution is necessary in the use of picture stimuli and that assessments involving line drawings may sometimes underestimate children's capacities.

Coherence-Based Evidence Filtering: A Computational Exploration

This study explores the role of coherence-based reasoning in belief updating within uncertain environments. We develop a novel computational model where agents update their beliefs based on observed evidence, with some evaluating the coherence of their belief set before accepting new evidence. Our results show that coherence-based evidence filtering improves belief accuracy in noisy environments and when agents' prior beliefs are accurate. However, when agents encounter systematically misleading evidence, coherence considerations lead to less accurate beliefs. These findings shed light on how coherence interacts with evidence quality and belief accuracy.
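
To make the simulation setup concrete, the following is a minimal sketch of an agent of the kind described above; the update rule, the coherence criterion, and all names here are illustrative assumptions rather than the authors' implementation.

import random

def bayes_update(belief, evidence_supports, reliability=0.7):
    """Posterior belief in a binary hypothesis after one piece of evidence."""
    lik_true = reliability if evidence_supports else 1 - reliability
    lik_false = 1 - reliability if evidence_supports else reliability
    return belief * lik_true / (belief * lik_true + (1 - belief) * lik_false)

def simulate(n_steps=100, hypothesis_true=True, noise=0.3,
             coherence_filter=False, threshold=0.2, seed=0):
    """Agent updates its belief from noisy evidence; a coherence-filtering agent
    rejects evidence that would both flip and sharply shift its current belief."""
    rng = random.Random(seed)
    belief = 0.5
    for _ in range(n_steps):
        # Evidence supports the true state with probability (1 - noise)
        evidence_supports = hypothesis_true if rng.random() > noise else not hypothesis_true
        candidate = bayes_update(belief, evidence_supports, reliability=1 - noise)
        incoherent = (candidate > 0.5) != (belief > 0.5) and abs(candidate - belief) > threshold
        if coherence_filter and incoherent:
            continue  # evidence conflicts with the current belief set; ignore it
        belief = candidate
    return belief

print(simulate(coherence_filter=False), simulate(coherence_filter=True))

Sweeping the noise level, the accuracy of initial beliefs, or the presence of systematically misleading evidence in such a simulation would mirror the comparisons reported above.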

A simulation-heuristics dual-process model for intuitive physics

The role of mental simulation in human physical reasoning is widely acknowledged, but whether it is employed across scenarios with varying simulation costs, and where its boundary lies, remains unclear. Using a pouring-marble task, our human study revealed two distinct error patterns when predicting pouring angles, differentiated by simulation time. While mental simulation accurately captured human judgments in simpler scenarios, a linear heuristic model better matched human predictions when simulation time exceeded a certain boundary. Motivated by these observations, we propose a dual-process framework, the Simulation-Heuristics Model (SHM), in which intuitive physics relies on simulation when simulation is brief but switches to heuristics when simulation becomes costly. By integrating computational methods previously viewed as separate into a unified model, SHM quantitatively captures their switching mechanism. The SHM aligns more precisely with human behavior and demonstrates consistent predictive performance across diverse scenarios, advancing our understanding of the adaptive nature of intuitive physical reasoning.

Structuralist Approach to AI Literary Criticism: Leveraging Greimas Semiotic Square for Large Language Models

Large Language Models (LLMs) excel in understanding and generating text but struggle with providing professional literary criticism for works with profound thoughts and complex narratives. This paper proposes GLASS (Greimas Literary Analysis via Semiotic Square), a structured analytical framework based on the Greimas Semiotic Square (GSS), to enhance LLMs' ability to conduct in-depth literary analysis. GLASS facilitates the rapid dissection of narrative structures and deep meanings in narrative works. We introduce the first dataset for GSS-based literary criticism, featuring detailed analyses of 48 works, and propose quantitative metrics for GSS-based literary criticism using the LLM-as-a-judge paradigm. Compared with expert criticism across multiple works and LLMs, our framework performs strongly. Finally, we applied GLASS to 39 classic works, producing original, high-quality analyses that address existing research gaps. This research provides an AI-based tool for literary research and education, offering insights into the cognitive mechanisms underlying literary engagement.

The self-regulated learning paradox: Or, one reason why educational interventions might fail

Why do large-scale field experiments in education often have muted effects? Drawing on system dynamics and self-regulated learning theory, we sought to answer this question by simulating how self-regulated (discrepancy-reducing) learners behave over time under different types of educational interventions. We analyze three types of interventions: changing students' learning rates (learning strategies), intercepts (prior knowledge or teaching effectiveness), and norms of study (achievement goals). We uncover situations where educational interventions can affect achievement in the short run, yet typical cross-sectional analyses find no measurable effect in the long run. Results indicate that highly motivated, self-regulated learners may resist external interventions, particularly those targeting learning strategies or prior knowledge. In contrast, interventions show the greatest effect on achievement when students are under time constraints and struggling to achieve their desired performance. Ultimately, self-regulated learners may be the hardest to help, a phenomenon we call the "self-regulated learning paradox."

Co-Constructing Meaning with Large Language Models: A Longitudinal Analysis of Human–AI Dialogues in Emotional Support Contexts

This study investigates how Large Language Models (LLMs), specifically Baidu's Ernie Bot, shape personal narratives when users seek emotional support over repeated sessions. Sixteen participants from China engaged in weekly chatbot interactions for four weeks, supplemented by reflective diaries and pre-/post-study interviews. Conversation analysis and quantitative measures (e.g., mood ratings, meaning-making scales) revealed incremental shifts in user language, including increased lexical alignment with AI-generated phrases and more positive emotion words. In-depth interviews highlighted the complex process by which participants alternately embraced or resisted the AI's framing, with many reporting newfound perspectives and a sense of empathic resonance. However, some voiced skepticism regarding the AI's genuine capacity for emotional understanding, underscoring ethical dilemmas related to anthropomorphism and data privacy. Overall, the findings suggest that iterative dialogues with an empathic-seeming LLM can facilitate meaningful narrative reframing, albeit with notable variations in user experience and potential risks of over-reliance.

MGHGCN: Boosting EEG-based Emotion Recognition Through Multi-granular Hypergraph Convolutional Networks

Emotion recognition using electroencephalography (EEG) represents a significant area of study in brain-machine interfaces. To address this multifaceted challenge, it is crucial to improve the ability of EEG features to represent emotional states. A hypergraph-based methodology allows higher-order spatial correlations to be depicted in order to develop distinguishing emotional features. However, the original hypergraph may lack robustness due to potential interference among local channels, and excessively coarse hypergraph granularity can result in the loss of critical information. To mitigate these issues, we propose hypergraph group learning, which aims to balance robustness with the retention of detailed information. We model temporal and spatial dependencies across varying granularities using hypergraph group learning to achieve a discriminative representation of emotional features, and we use multiple CNN convolutions to map EEG signals from different brain regions and time segments into a unified distribution. The multi-granularity hypergraph convolutional network (MGHGCN) is specifically designed to effectively capture long-term temporal correlations among channels. By integrating multiview fusion, we significantly improved the accuracy and robustness of EEG-based emotion recognition. Experimental results on publicly available datasets, including SEED, SEED-IV, and EMOT, validate the effectiveness of our approach, achieving precisions of 98.51 (2.46)%, 89.20 (6.13)%, and 97.79 (1.31)%, respectively. These results demonstrate that our hypergraph effectively maintains both robustness and detailed information.

Prediction of Cognitive Impairment in Middle-aged and Elderly People: A Method Based on Granger Causality

Cognitive impairment is a common disease among middle-aged and elderly people, which seriously affects health outcomes and quality of life, and carries a risk of progressing to severe stages such as dementia. Early identification is beneficial for timely intervention and treatment. This study proposes a new model for predicting cognitive impairment that integrates static and dynamic data, including medical, demographic, and social relationship features. It combines Granger causality with deep learning and uses multiple metrics to evaluate model performance. The performance comparison results between our model and the baseline model demonstrate that our model's predictions have a certain level of accuracy. In addition, causal features derived from Granger causality analysis are used to identify cognitive impairments. Statistical analysis shows that the selected features have statistical significance, further verifying the robustness of our model and its potential for predicting cognitive impairment.
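
As background on the causal-feature step, the standard bivariate Granger-causality test (a general formulation, not necessarily the exact specification used in this model) compares a restricted autoregression of the target series y with one augmented by lags of a candidate predictor x:

y_t = \sum_{i=1}^{p} a_i\, y_{t-i} + \varepsilon_t
y_t = \sum_{i=1}^{p} a_i\, y_{t-i} + \sum_{i=1}^{p} b_i\, x_{t-i} + \eta_t

x is said to Granger-cause y when the lagged x terms significantly reduce prediction error (e.g., an F-test of b_1 = \cdots = b_p = 0); features passing such a test can then be supplied to the downstream deep learning predictor.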

Precise SO-ripple coupling facilitates the signal transmission during slow-wave sleep

The interaction between hippocampal ripples and cortical slow oscillations (SOs) has been proposed to play a critical role in memory consolidation during sleep. However, the neuronal mechanisms underlying the transmission of ripples within cortical regions remain poorly understood. In this study, we used a computational model to investigate how ripple events propagate through cortical networks. We found that sparse and weak inter-areal connections impede ripple propagation, while dense and strong inter-areal connections facilitate it. Notably, our findings reveal that when cortical networks exhibit SOs, characterized by alternating up and down states, ripples occurring before the SO peak can propagate to distant cortical areas even in the presence of sparse and weak inter-areal connections. These results indicate that precise coordination between SOs and ripples promotes efficient communication across cortical regions during sleep. This study offers new mechanistic insights into the role of SOs during slow-wave sleep, deepening our understanding of the processes underlying memory consolidation.

The Discovery of the Artificial and the Use of Synthetic Method between Physics and Cognitive Science

In this work, we outline a methodological analogy between the cognitive sciences and physics regarding the use of models and the synthetic method. Beginning with brief historical remarks on 'the discovery of the artificial' (as defined by Roberto Cordeschi) in the early 20th-century behavioral sciences and the contemporaneous methodological turning point in statistical physics, we demonstrate that the 'envy' for the analytical theories of the exact sciences—often referred to as physics envy—which has significantly influenced the development of psychology and the sciences of human behavior, is ultimately unfounded. Finally, we use this 'overcoming' of physics envy, along with some brief considerations of notable 20th-century theoretical results on the 'limits' of computability and complexity, to demonstrate how the cognitive sciences—like the natural sciences—must necessarily rely on models and simulations. This necessity arises from the inherent complexity of the systems under study, which precludes their treatment in analytical and exact terms.

The Impact of Similar Place Avoidance on Novel Word Learning in Adults

Similar Place Avoidance (SPA) is the cross-linguistic tendency whereby languages avoid transvocalic consonants with the same place of articulation within a word. In this study, we examine if SPA is the result of learning biases against words where the consonants share a place of articulation. In two experiments we examine whether adults show a learning difference between place-disagreeing novel words (e.g. [tip]) and place-agreeing novel words (e.g. [tid], where [t] and [d] are coronal). Participants are taught novel words and are then tested in an object-mapping or lexical decision task. We measure participants' learning performance based on accuracy and reaction times. Results indicate that, while accuracy is comparable for place-agreeing and place-disagreeing words in both tasks, participants' lexical decision responses are generally slower for place-agreeing words. These results suggest that participants experience processing difficulties when accessing newly-formed representations of place-agreeing words, which may contribute to the existence of SPA.

A Mechanistic Perspective of Face Perception Latency: Predictive Coding

Face processing is widely regarded in cognitive science as the integration of individual features into a holistic percept. However, recent neuroscience research highlights a more nuanced interplay between holistic and featural mechanisms, with specific facial features receiving greater emphasis during early perception. Event-related potential studies reveal that the number and type of parafoveal features significantly influence neural response delays, yet the underlying mechanistic model remains unclear. This paper examines these phenomena through the lens of the predictive coding network, a biologically plausible alternative to traditional deep neural networks. Our findings show that predictive coding networks accurately simulate the influence of parafoveal features on neural response times while upholding the saliency hierarchy of facial features. These results provide a computational explanation for the observed neural delays and highlight the potential of predictive coding as a robust framework for understanding face perception in the human brain.

Cognitive Priming Prompting Facilitates Knowledge Elicitation in Multilingual Large Language Models

Multilingual large language models (MLLMs) typically underperform when answering questions in non-native languages compared to their native language. Although existing translate-then-answer prompting methods partially alleviate this issue, their performance remains suboptimal compared to directly answering questions in the native language, and current studies lack a clear explanation for this gap. In this study, we attribute the issue to incomplete Cognitive Priming, a phenomenon observed in human cognition, and we point out that while existing methods achieve Language Priming (LP), they overlook Domain Priming (DP). To address this, we propose Cognitive Priming Prompting (CogPrim), which employs a Role-Enhanced Multi-MLLM Collaboration strategy to ensure both LP and DP, thereby improving knowledge elicitation for non-native QA tasks. Across five language question-answering benchmarks, CogPrim outperforms all state-of-the-art related methods. This approach advances the understanding of human-like cognitive behaviors in MLLMs, fostering better MLLM service for a broader range of multilingual users.

Cognitive Insights into Document Comprehension: The Role of Reading Order and Visual Attention in Human and Large Language Models

This study investigates how integrating human eye-tracking data into Large Language Models (LLMs) and Visual Large Language Models (VLLMs) can enhance document comprehension in tasks that require both linguistic understanding and visual attention, specifically Semantic Entity Recognition (SER) and Document Question Answering (DQA). Despite rapid advancements in AI-based document understanding, LLMs still face challenges in replicating the depth of human cognition, particularly in how reading order and visual attention affect comprehension. The results demonstrate that human reading order and the regions they focus on significantly impact performance in both tasks. Additionally, while LLMs do not need to fully mimic human reading sequences, their performance improves when their attention patterns align more closely with human visual strategies. This highlights the importance of incorporating cognitive-inspired attention mechanisms in AI systems, offering a path to better AI models that reflect human cognitive strategies in complex document understanding.

Linguistic Creativity affects Discourse Expectations related to Contiguity Relations but not Implicit Causality

The present study employed a creativity-on-demand task to investigate the variability of the discourse biases associated with Implicit Causality verbs when language users are asked to produce original discourse continuations. While Implicit Causality was unaffected by the creativity manipulation, the likelihood of continuing with a contiguity relation was influenced by it. These findings are in line with the Two-Mechanism Account of Solstad & Bott (2022, 2023), which grounds Implicit Causality in constraints imposed by lexical semantics and relates other next-mention biases to the Contiguity Principle, a more general discourse-pragmatic principle.

Is baseline pupil size a good measure to assess attentional control?

Research has indicated that baseline pupil size may reflect attentional control capabilities. Several studies have demonstrated links between resting pupil measurements and various cognitive functions, including working memory capacity and fluid intelligence. The existing literature presents mixed evidence, with some studies supporting this relationship while others fail to replicate these findings. Moreover, the validity of this relationship across different populations has come into question, particularly when considering age-related physiological changes that affect pupil size. Our study sought to investigate whether the relationship between baseline pupil size and attentional control remains consistent across age groups. Results showed no meaningful correlation between baseline pupil size and attentional control. In examining possible explanations for these disparate results, we identified several methodological challenges, including inconsistencies in testing environments and variations in pupil measurement protocols, that may account for the conflicting findings in previous research. These observations highlight the importance of developing more standardized experimental approaches to properly evaluate baseline pupil size as an indicator of attentional control, particularly when studying aging populations.

Research on Urban Data Visualization Based on Big Data: Transforming Insights into Action

This paper presents a big data urban visualization platform for Guangdong Province, aimed at enhancing the efficiency and intelligence of urban planning. The platform utilizes Python to collect city-specific data, with data storage implemented using a MySQL database, complemented by NoSQL technologies to support the integration of unstructured data. The Flask backend employs deep learning and data mining algorithms to identify complex relationships among urban data, and we have also integrated graph neural network methods to capture spatial dependencies across different geographic regions within the city. Meanwhile, the ECharts frontend generates dynamic charts to present diverse information. Through a front-end and back-end separation architecture, the system ensures real-time updates, enhancing user experience. This research further emphasizes the need to explore challenges in real-time data integration, expanding data sources, optimizing user interactions, and protecting data privacy, providing important directions for future AI-driven urban planning.

DHRec: A Debiased Hyperbolic Recommendation Model

Personalized recommendation aims to recommend candidate items to users based on their preferences by simulating their cognitive decision-making process. User-item interaction data typically follows a power-law distribution. However, existing works usually learn the representations of users and items in Euclidean space, resulting in a mismatch between the data volume space and the embedding space, which causes significant distortion in the representations. Moreover, cognitive biases such as conformity can also introduce distortion in representation learning. Therefore, we propose a Debiased Hyperbolic Recommendation model, called DHRec. Specifically, we first model the representations of users and items in hyperbolic space, which has exponential growth capabilities. Second, in addition to the user-item interaction graph, we construct semantic graphs to capture the semantic neighbor information of users and items. Then, by adjusting the weights of neighbor nodes, we learn debiased representations of users and items, effectively alleviating the bias caused by conformity. Finally, we compute the predicted scores between users and candidate items in hyperbolic space. Extensive experiments on three datasets demonstrate that our model surpasses the strongest baseline, achieving improvements of 11.04% in Recall and 10.09% in NDCG.

A Multimodal In Vitro Diagnostic Method for Parkinson's Disease Combining Facial Expressions and Behavioral Gait Data

Parkinson's disease (PD), characterized by its incurable nature, rapid progression, and severe disability, poses significant challenges to the lives of patients and their families. Given the aging population, the need for early detection of PD is increasing. In vitro diagnosis has garnered attention due to its non-invasive nature and low cost. However, existing methods present several challenges: 1) limited training data for facial expression diagnosis; 2) specialized equipment and acquisition environments required for gait diagnosis, resulting in poor generalizability; 3) the risk of misdiagnosis or missed diagnosis when relying on a single modality. To address these issues, we propose a novel multimodal in vitro diagnostic method for PD, leveraging facial expressions and behavioral gait. Our method employs a lightweight deep learning model for feature extraction and fusion, aimed at improving diagnostic accuracy and facilitating deployment on mobile devices. Furthermore, we have established the largest multimodal PD dataset in collaboration with a hospital and conducted extensive experiments to validate the effectiveness of our proposed method.

JUDICIOUS: Evaluating Robustness of Large Language Models in the Legal Realm

In recent years, the remarkable performance of large language models (LLMs) in tasks such as legal judgment prediction (LJP) has garnered widespread attention. An increasing number of LLMs have been successfully implemented to assist judges in performing various legal tasks. However, their robustness and reliability in complex judicial scenarios remain a subject of debate, particularly when confronted with real-world legal cases. Existing research often overlooks the systematic evaluation of these LLMs in terms of judicial fairness, robustness and other ethical considerations. To fill this gap, we propose a novel benchmark that integrates authentic legal cases to evaluate the robustness of LLMs in the legal judgment prediction (LJP) task. Our work establishes foundational safety standards for applying LLMs in the legal domain.

The Role of Object Attention in Relational Mapping Changes Over Development

Relational reasoning develops slowly. Children's difficulty may stem from difficulty inhibiting object attention, but the role of object attention in relational reasoning remains unclear. Experiment 1 tested the hypothesized trade-off between processing relational and object information using a novel match-then-recognize paradigm. The relational-mapping task required participants to match ordinal positions of objects, followed by an object memory test. In adults, there was a clear trade-off: object recognition was negatively correlated with relational mapping. Children's object recognition was lower than adults' but positively correlated with relational matching. Experiment 2 further tested the role of inhibition by occluding object matches. Surprisingly, older children and adults actively removed occluders, to their own detriment, whereas the youngest children neither removed occluders nor were helped by them. Findings suggest that selective attention can be crucial for relational reasoning, but ignoring object information may be less important for success than total allocated attention.

Verbs are sometimes redundant: Korean preschoolers' comprehension of Korean active transitive construction

Motivated by two contrasting forces that shape linguistic knowledge (efficiency and redundancy), the present study examines sentence comprehension behaviour amongst Korean children aged three to six, focusing on verbs (relative to case markers) in the interpretation of transitive events. Through picture-selection experiments that systematically omit and obscure portions of transitive sentences, we find (1) a reduced role of verbs in sentence comprehension and (2) age-related variation in the application of case-marking knowledge. These findings suggest that verbs may sometimes become redundant during comprehension, which is attributable to the early maturation and strong automatisation of verbs. This provides support for verb-periphery strategies that maximise efficiency in the language activities of Korean preschoolers.

Korean monolingual children's comprehension of suffixal passive construction: A webcam eye-tracking study

Building on Shin (2022), the present study examines how Korean monolingual children comprehend suffixal passive constructions by employing a webcam eye-tracking method, aiming to test two theoretical accounts of grammatical generalisation (gradual vs. early abstraction). Twenty-eight children aged three to six, alongside 20 adults, joined picture-selection experiments paired with eye-gaze measurements. The findings indicate that children's utilisation of passive-voice heuristics remains limited yet developing, overshadowed by well-entrenched active-voice knowledge. In particular, the eye-gaze data reveal processing challenges related to the passive voice, mainly interpretive difficulties arising from passive morphology. These results replicate those of Shin (2022), offering further support for a moderate version of each account that emphasises the pivotal role of linguistic exposure in mastering linguistic knowledge. From a methodological standpoint, this study enhances the accessibility of webcam eye-tracking research for understudied languages in the field.

The Computational Mechanism of How Music Influences Food Choices: A Drift-Diffusion Model Analysis

Food choices, as a type of value-based decision, are affected by environmental cues. We conducted three studies to investigate the computational mechanisms through which background music influences the food choice-making process. Hierarchical drift-diffusion modeling revealed that nature-related music led to higher drift rates than urban-related music, indicating faster evidence accumulation toward certain choices. Specifically, participants processed the value of vegetable-forward meals more efficiently when exposed to nature-related music compared to urban-related music. Moreover, the effect of music on the drift rate varied with the vividness of music-induced mental imagery and the perceived identity of performers (human or robot). Collectively, these findings reveal the computational mechanisms underlying the influence of environmental cues on value-based decision-making.
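
For readers unfamiliar with the model class, the drift-diffusion process underlying such an analysis (the textbook formulation; the paper's hierarchical priors and exact parameterization are not reproduced here) accumulates noisy evidence x(t) toward one of two decision bounds:

dx = v\,dt + \sigma\,dW, \qquad x(0) = z,

where the drift rate v is the parameter reported above to be higher under nature-related music, \sigma\,dW is Gaussian noise, z is the starting point, and a response is made when x(t) reaches 0 or the boundary separation a; a higher v therefore implies faster, more consistent accumulation toward the favored option.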

Think outside the box: Making up causal hypotheses from unreliable evidence

Human adults think of the natural world as orchestrated by rules, yet these rules are neither uniformly rigid nor always evident: some are beset by exceptions, and others are not intuitive. The problem of rule learning is especially salient in development, as children are continuously learning how the world works. What cognitive mechanisms underpin this rule learning? We propose a computational model for formulating and testing hypotheses in naturalistic contexts that combines Bayesian inference under uncertainty over self-generated and social evidence with formal rules and optimistic information-seeking heuristics. We validate our model experimentally, showing that it explains 7- to 10-year-olds' behavior in a rule-based physical task, including the distribution and types of evidence children sampled. The proposed model outperforms both a purely rule-based Bayesian hypothesis search and a resource-rational random sampling approach. Our results suggest that children implement an internal mechanism for generating and testing a limited number of hypotheses, including formal programmatic rules and heuristics generated from salient problem features, to seek more evidence when formal rule generation fails.

RPW-EEG: A Unified Framework for Robust and Practical Watermarking of EEG

With the growth of the metaverse and XR technologies, BCIs are expanding from medical applications to various consumer industries. However, this broader adoption has raised concerns about the privacy and security of EEG data. To address this challenge, we propose the RPW-EEG framework, which embeds copyright information as perturbations to enhance the security and traceability of data, while ensuring the robustness and usability of the data. The framework adopts an encoder-decoder architecture for end-to-end training, incorporating a noise layer to enhance the stability and anti-attack capabilities of the watermark data. Additionally, to prevent the loss of task-related features in EEG data, we introduce a plug-and-play fine-tuning module that restores these features within the watermark-embedded signals. Experimental results show that RPW-EEG outperforms baseline models in watermark quality, with extraction accuracy exceeding 61% under various attacks, and achieves 88.5% classification accuracy for task paradigms, effectively balancing copyright protection and EEG data usability.

Condensed Representation Learning for Interactive Driving Styles Recognition

Automated vehicle (AV) validation faces the "billions of miles" challenge, requiring high-fidelity simulations to replicate diverse interactive driving behaviors for safety. Traditional methods oversimplify by using uniform behavioral models, ignoring the diversity of human driving styles, which are deeply influenced by individual psychological traits. This research introduces a condensed framework for representing interactive driving styles, by incorporating these psychological dimensions, balancing completeness and complexity. Key features include: i) individual style recognition via attention mechanisms and hierarchical contrastive learning, capturing subtle cognitive-based interaction patterns that reflect underlying differences in driver psychology (e.g., risk tolerance, decision-making heuristics); ii) scenario-independent style compression, filtering external factors to extract intrinsic driver intentions; iii) dimensionality-aware refinement, mapping complex behaviors to low-dimensional psychological axes for efficient computation. Tests on the NGSIM dataset reduced testing complexity by decoupling styles from scenarios. Compared to traditional methods, style distinctiveness improves by 28% (entropy-based), with 85% edge-case behavior coverage. This framework supports scalable AV testing by integrating diverse, psychologically-informed driving styles without combinatorial complexity.

Tactile Perspective-Taking: Cognitive Process of Estimating Others' Subjective Tactile Similarity

Understanding how others perceive the tactile world is essential for human communication. Each individual has a unique and multidimensional tactile perceptual space, which makes it challenging to understand others' tactile perceptions within those spaces. This study investigates the cognitive ability and process of estimating another person's subjective similarity of various textures, a key aspect of tactile perceptual space. Participants performed tasks in which they estimated how a target individual would rate the similarity of pairs of tactile stimuli. Results showed that participants could partially infer the target's similarity ratings, and their estimated tactile similarity ratings converged toward an intermediate point between their own and the target's. The estimation process included exploration, adaptation, and overgeneralization phases. Moreover, participants' own similarity ratings shifted closer to those of the target after performing the task. These findings suggest that estimating another person's subjective similarity between textures involves incorporating elements of the target's tactile perceptual space into one's own.

Cognitive maps are generative programs

Making sense of the world and acting in it relies on building simplified mental representations that abstract away aspects of reality. This principle of cognitive mapping is universal to agents with limited resources: living organisms, people, and algorithms all face the problem of forming functional representations of their world under various computing constraints. In this work, we explore the hypothesis that human resource-efficient planning may arise from representing the world as predictably structured. Building on the metaphor of concepts as programs, we propose that cognitive maps can take the form of generative programs that exploit predictability and redundancy, in contrast to directly encoding spatial layouts. We use a behavioral experiment to show that people who navigate in structured spaces rely on modular planning strategies that align with programmatic map representations. We describe a computational model that predicts human behavior in a variety of structured scenarios. This model infers a small distribution over possible programmatic cognitive maps conditioned on human prior knowledge of the world, and uses this distribution to generate resource-efficient plans. Our model leverages a Large Language Model as an embedding of human priors, implicitly learned through training on a vast corpus of human data. Our model demonstrates improved computational efficiency, requires drastically less memory, and outperforms unstructured planning algorithms with cognitive constraints at predicting human behavior, suggesting that human planning strategies rely on programmatic cognitive maps.

Are object state changes represented during language comprehension? A non-replication and extension

Previous work suggests that non-visual object properties like weight are automatically integrated into event models during language comprehension. Horchak and Garrido (2021) found that Portuguese speakers' response times were faster when the state of a presented object (e.g., a smashed tomato) matched the event implied by the preceding sentence (e.g., You drop a bowling ball on a tomato). In an exact replication in English (Experiment 1), we failed to replicate this weight-state match effect. In Experiment 2, we examined the potential role of sentence focus, manipulating whether the target item served as the subject or direct object in the sentence. Response times revealed a weight-state match effect, but only when the target object was the focus (i.e., subject) of the sentence. Overall, these findings suggest that the representation of object state changes during language comprehension may depend on the interaction of object properties and language-specific syntactic constraints.

Investigating the Impact of Vocabulary Size on Lexical Networks using Latent Space Modeling

Lexical networks may vary as a function of individual differences in vocabulary knowledge and word-level features. Analyses often rely on descriptive network statistics, which do not support robust inferences. This study introduces the latent space model as a method for assessing the degree to which network structure is accounted for by word-level features. We analyzed lexical networks from adults with below-average vs. above-average receptive vocabulary knowledge (n = 22 per group), using latent space models to assess the effects on network structure of semantic, taxonomic, and phonological similarity between words, as well as effects of part-of-speech, concreteness, age-of-acquisition, and word frequency. For both groups, we found significant effects of semantic and taxonomic similarity, with additional effects of phonological similarity and concreteness for the low-vocabulary group. These findings suggest that adults with larger vocabularies rely on a smaller set of cues in their lexical networks. Implications for inferential modeling of lexical networks are discussed.
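
For reference, a standard latent space network model of the kind described above (a generic formulation; the covariates actually fitted are those listed in the abstract) places each word i at a latent position z_i and models the probability of an edge between words i and j as

\Pr(y_{ij} = 1) = \mathrm{logit}^{-1}\!\left(\alpha + \beta^{\top} x_{ij} - \lVert z_i - z_j \rVert\right),

where x_{ij} collects dyadic and word-level covariates (e.g., semantic, taxonomic, or phonological similarity; concreteness), so the fitted coefficients \beta quantify how much each feature accounts for network structure beyond the latent geometry.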

Neural Thurstone Model: Leveraging Latent Spaces for Collective Intelligence in Ranking Predictions

Thurstone models have been widely applied in wisdom-of-the-crowd applications to aggregate individual rankings due to their ability to represent individual knowledge and achieve high accuracy. However, they lack the ability to generalize even across highly similar items and cannot leverage external knowledge bases or learned machine representations. In this work, we extend Thurstone models for partial ranking data by introducing a latent construct that maps pretrained vector representations to latent truths. These representations are fine-tuned through a single neural network layer, enhancing the model's ability to capture meaningful ranking structures. We evaluate our neural Thurstone model on objective ranking tasks, including animal speeds, material hardness, and the longitudinal positioning of U.S. states from west to east. Our results demonstrate that the extended model improves aggregation accuracy in sparse data settings and generalizes to novel items with moderate predictive accuracy, highlighting its potential to enhance collective intelligence in ranking-based inference.
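
For context, the classical Thurstone (Case V) formulation that this neural extension builds on (a standard statement, not the authors' exact likelihood for partial rankings) assigns each item i a latent value \mu_i and models a pairwise ordering as

\Pr(\text{item } i \text{ ranked above item } j) = \Phi\!\left(\frac{\mu_i - \mu_j}{\sqrt{2}\,\sigma}\right),

with \Phi the standard normal CDF; the extension described above replaces freely estimated \mu_i with values mapped from pretrained item embeddings through a single fine-tuned layer, roughly \mu_i = f_\theta(e_i), which is what allows generalization to novel items.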

Modeling word overextension in a Grey Parrot

Word meaning extension refers to the process by which a single word form develops multiple related meanings. Young children exhibit the capacity to extend word meaning, and previous research shows that such word overextension relies on multimodal semantic knowledge. We explore the evolutionary trace of word meaning extension by asking whether nonhuman animals might share this capacity. We compare meaning extension in children with attested cases of overextension collected from the YouTube channel of a Grey Parrot, Apollo, who has acquired some English words. Our results show that parrot overextension is predicted by a multimodal child overextension model better than by baselines, which suggests that the Grey Parrot may be using semantic knowledge similar to children's when choosing words to express new referents. Our finding suggests that meaning extension is a cognitive ability identifiable in a species separated from humans by about 320 million years of evolution.

Spacing Meets Cross-Situational Word Learning: How the Temporal Structure of Labeling Events Affects Word Learning

Limited work has considered how the temporal distribution of labeling events affects word learning in ambiguous contexts, such as the cross-situational word learning paradigm, over real-world timescales. The temporal distribution of learning events can impact how well information is retained: spacing out information promotes retention more than presenting information in close succession. In the current study, adults were presented with novel object-word pairings across six different temporal schedules over four consecutive training days. Word learning was assessed either immediately after the final training session (N = 50) or after a week (N = 54). Results revealed that adults successfully disambiguated word-object mappings across all learning schedules at both test times, except for the massed and most spaced schedules at the 1-week delay. These findings suggest that temporal distribution effects emerge across extended timescales, but there might be constraints on the amount of spacing that is optimal for word learning.

Testing counterintuitive predictions about cost-based inferences in learning from the Rational Speech Act model

The Rational Speech Act (RSA) model has been employed to explain word learning and inferences based on the costs of forms. Here, we focus on a hitherto untested and counterintuitive cost-based effect predicted by RSA: In learning a lexicon with two forms and meanings, learners should prefer an ambiguous costly form over an ambiguous cheap form. We demonstrate this prediction in an RSA model. We then measure reaction times and lexicon ratings in a novel word learning task to test whether a lexicon with an ambiguous costly form is less surprising than one with an ambiguous cheap form. We found no clear evidence for this effect in either measure. We discuss alternative explanations for documented cost-based effects, and the possibility that cost-based inferences may not occur during learning.
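
For reference, the cost-sensitive Rational Speech Act recursion that generates this prediction (the generic formulation; the paper's specific lexica, priors, and parameter values are not reproduced here) is

L_0(m \mid u) \propto [\![u]\!](m)\, P(m), \qquad
S_1(u \mid m) \propto \exp\big(\alpha\,[\log L_0(m \mid u) - c(u)]\big), \qquad
L_1(m \mid u) \propto S_1(u \mid m)\, P(m),

where c(u) is the production cost of form u; because a rational speaker avoids a costly form unless it is sufficiently informative, a learner reasoning about such a speaker should, in principle, find a lexicon in which the costly form is the ambiguous one more probable than a lexicon in which the cheap form is ambiguous.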

Evidence-Enhanced Triplet Generation Framework for Hallucination Alleviation in Generative Question Answering

To tackle the issue of hallucination in generative question answering (GQA)—where the generated answer is nonsensical or unfaithful to the provided document—we introduce a novel framework called evidence-enhanced triplet generation (EATQA). This framework incentivizes the model to generate all possible combinations of ⟨Question, Evidence, Answer⟩ triplets by reversing the source pair and target label to grasp their logical interrelationships. Specifically, the model predicts the Answer (A), Question (Q), and Evidence (E) given the QE, EA, and QA pairs, respectively. Furthermore, we address the distribution gap during the inference stage to extract knowledge from the evidence more effectively. Our framework ensures that the model comprehends the logical connections between queries, evidence, and answers, thereby simultaneously enhancing evidence generation and question answering capabilities. In this study, we apply the EATQA framework to the LLaMA model, demonstrating superior performance compared to other large language model (LLM)-based methods and hallucination mitigation techniques on two challenging GQA benchmarks. Further analysis reveals that our method not only preserves the pre-existing knowledge within the LLM but also reduces hallucination and produces more accurate answers.

Empathy and Music Preferences: Exploring Valence-Arousal Patterns and Sequential Listening Behaviors in Naturalistic Settings

Empathy has been linked to music preferences in controlled laboratory settings, but naturalistic settings like music streaming platforms remain unexplored. This study investigates how trait empathy influences music preferences and sequential listening behaviors in real-world settings. To this end, we collected and analyzed one-year Spotify listening histories of 290 Indian university students alongside their trait empathy scores, measured using the IRI scale. Our results reveal that individuals who score high on the IRI subscales of Empathic Concern, Fantasy, and Perspective Taking prefer sad music. Moreover, those scoring high on Empathic Concern or Perspective Taking were found to be more likely to transition from happy to sad music. These findings partially align with previous lab-based research, specifically for the subscales of Empathic Concern and Fantasy, while providing novel insights into the relations between the Perspective Taking subscale and music consumption. The study also offers new insights into sequential listening behaviors, strengthening the evidence that empathy shapes musical preferences and listening behaviors across diverse contexts.

Externalizing Imagery: Exploring the Phenomenology of Outsight

We adapted Irving's (2014) Image Control and Recognition Task (ICRT) to explore a phenomenon we term outsight. The ICRT is a visual synthesis task: Participants construct a mental image of an object following stepwise instructions. They are then asked to name and subsequently draw the imagined object. We focus on trials in which participants, after having failed to name their mental image, could do so after having drawn it. In this exploratory study, such outsight recognition occurred on 29% of the ICRT trials. In addition, outsight recognition was accompanied by some of the phenomenological markers associated with aha! experiences. We offer some reflections on the importance of reified imagery for creativity.

Do Young Children Learn Words from the Company they Keep?

The challenge of early word learning is often framed as one of individually mapping words to their referents. Yet children do not experience words just as individual labels, but as parts of broader language contexts, such as conversations and stories. In principle, word contexts might support word learning because words similar in meaning tend to occur in similar contexts. Thus, a child who knows some fruit words and has heard them in the context of "juicy" might learn that a "juicy mango" is also a fruit even without ever seeing a mango. Although children can use such contextual support to learn words in the lab, we do not know whether they harness contextual support in everyday language for real-world word learning. We quantified words' contextual support in children's everyday language input and found that it reliably predicted normative word learning, even accounting for other established predictors such as word frequency.

Some Assembly Required: Learning Facts in Isolation Limits Inferences

When learning with self-testing alone, will a learner make inferences between the tested items? This study examines whether self-testing's benefits extend beyond isolated facts to support broader connections between the facts. Comparing self-testing to self-explanation (a strategy known to facilitate inferential learning), we find that while self-testing participants show superior recall of individual facts, they perform significantly worse at making connections between those facts.

Temporal proximity inferences in complex sentence comprehension: Evidence from English complement and relative clauses

In language comprehension, mental representations of temporal relations between described situations are often construed by inference. While the basis for these inferences remains unclear, growing evidence suggests that abstract predicate properties – such as dynamicity and causal structure – play a crucial role in temporal event construal. Across two self-paced reading experiments, replicating and extending Gennari (2004), we find that temporal proximity inferences are shaped by these properties, but only for stative predicates that generally encode non-dynamic situations without causal structure: Participants consistently expected states to overlap in time, in both complement (Experiment 1) and relative clause constructions (Experiment 2). These findings indicate that temporal proximity inferences arise as a general feature of (non)-dynamicity, supporting models of language comprehension that prioritize abstract event structural properties in shaping temporal inferences.

MSCNN-ADDA: A Cross-Subject P300 EEG Decoding Algorithm Based on a Multi-Scale Convolutional Neural Network and Adversarial Discriminative Domain Adaptation

A brain-computer interface (BCI) enables direct communication between the brain and external devices. Despite progress, EEG decoding still faces challenges: 1) how to shorten or eliminate the calibration process in cross-subject BCI scenarios; 2) how to capture more characteristic features from different scales in EEG data; and 3) how to extract subject-independent EEG features more effectively. To address these, we propose a cross-subject EEG decoding algorithm based on a multiscale convolutional neural network (MSCNN) and domain adaptation for P300-based BCIs. The MSCNN was trained on a large-scale EEG dataset to extract subject-independent features, then fine-tuned via ADDA to align cross-subject data. In offline analysis, we achieved a cross-subject average accuracy exceeding 83%, indicating that we successfully established a domain adaptation-based cross-subject EEG decoding algorithm, which can eliminate the subject-specific calibration process for new subjects.

Understanding is Seeing: Metaphorical and Visual Reasoning in Multimodal Large Language Models

Drawing from the Conceptual Metaphor Theory and the Structure-Mapping Theory, this paper introduces two exploratory works in the field of metaphorical and visual reasoning using vision models and multimodal large language models. (i) The Multimodal Chain-of-Thought Prompting for Metaphor Generation task aimed to generate metaphorical linguistic expressions from non-metaphorical images by using the multimodal LLaVA 1.5 model and the two-step approach of multimodal chain-of-thought prompting. The results showed the model's ability to generate metaphorical expressions, as 92% of them were classified as metaphors by human evaluators. Additionally, the evaluation revealed interesting patterns in terms of metaphoricity, familiarity and appeal scores across the generated metaphors. (ii) The Metaphorical Visual Analogy (MeVA) task consisted of solving visual analogies of the kind "source_domain : target_domain :: source_element : ?" by choosing the correct target element among three difficult distractors, varying in semantic domains and roles. The results showed that all six models and humans performed above chance level, with only GPT-4o and ConvNeXt scoring higher than humans. Moreover, the error analysis showed that, in solving the analogies, the most frequent error was the selection of distractor 1. These works showed encouraging results for future research in the field of metaphorical and visual reasoning, contributing to the broader question of whether AI models serve as empirical tests of existing cognitive theories.

Multi-Option Polarization: How Deliberating More Options Both Increases and Decreases Polarization

Formal models in social epistemology explore why rational agents might polarize. While paradigmatic models focus on binary topics, e.g., "Is H true or false?", many real-world issues involve multi-option topics: "Which of n > 2 options is true/best?" This paper introduces a model of rational deliberation on multi-option topics to address the following question: As a group discusses more options, should we expect their beliefs to polarize more or less? We find a dual effect: as the number of options increases, agents are more likely to disagree on which option is most likely correct. This makes it harder to reach consensus on a single position. At the same time, their beliefs—and thus their disagreements—become less extreme. Hence, while agents are more likely to disagree, these disagreements are less intense. Since each trend aligns with a familiar concept of polarization, more options can increase or decrease polarization, depending on one's measurement.
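The following toy sketch is our illustration only, not the paper's deliberation model: agents update beliefs over n options from noisy private signals, and we compute the two quantities the dual effect concerns, how often agents disagree about the most likely option and how extreme their beliefs are. Signal reliabilities and counts are arbitrary.

```python
# Minimal sketch, assuming a simple Bayesian signal-updating setup.
import numpy as np

rng = np.random.default_rng(0)

def simulate(n_options, n_agents=50, n_signals=5, reliability=0.6):
    truth = 0
    beliefs = np.full((n_agents, n_options), 1.0 / n_options)
    for a in range(n_agents):
        for _ in range(n_signals):
            # a private signal points to the true option with prob. `reliability`,
            # otherwise to a uniformly random option
            s = truth if rng.random() < reliability else rng.integers(n_options)
            like = np.full(n_options, (1 - reliability) / (n_options - 1))
            like[s] = reliability
            beliefs[a] *= like
            beliefs[a] /= beliefs[a].sum()
    favorites = beliefs.argmax(axis=1)
    disagreement = (favorites[:, None] != favorites[None, :]).mean()  # pairwise disagreement on the top option
    extremity = beliefs.max(axis=1).mean()                            # how extreme beliefs are
    return disagreement, extremity

for n in (2, 4, 8):
    d, e = simulate(n)
    print(f"{n} options: pairwise disagreement={d:.2f}, mean extremity={e:.2f}")
```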

Exploring the Intuitive Theory of Empathy

Empathy is an emotion that plays a key role in emotional understanding and perspective-taking, and has been identified as a strong motivator for prosocial behavior. We explore people's intuitive theory of empathy, focusing more specifically on the role that the concept of empathy plays in people's causal model of prosocial behavior. We suggest that people implicitly think of empathy as indexing the weight that the actor puts on the welfare of the recipient when deciding whether to help. We test this proposal by asking participants (N=150) to read a series of vignettes in which an actor has the opportunity to help a recipient in need. We find that participants have a robust expectation that actors who feel empathy for the recipient are more likely to help. Furthermore, participants seem to expect that actors who feel empathy are more sensitive to the potential benefits of an action when deciding whether to help. We also test if people can "invert" this intuitive theory to make inferences about an actor's empathy, given their observable behavior. We find only weak evidence that they can do so, although this might be due to limitations in our experimental design. Overall, our work is a first step toward elucidating the computational principles underlying laypeople's conception of empathy.

Do our theories of moral progress predict whether we vote? Evidence from the 2024 US election

Why do people vote—or fail to? We explore whether people's intuitive theories of moral progress shape their intentions and behavior. Specifically, does believing that human action is the driver of moral progress predict voting intention and actual voting behavior? In Study 1a (N=356), conducted one week before the 2024 U.S. presidential election, participants who endorsed stronger beliefs in human action as necessary for moral progress reported stronger voting intentions, mediated by a greater sense of personal responsibility. Study 1b (N=287), conducted post-election, found that human action beliefs did not directly predict actual voting, but indirectly predicted voting when mediated by responsibility. Efficacy (believing that voting is effective) was the only significant predictor of actual voting. Together, these findings highlight the role of personal responsibility and efficacy in driving voting behavior, with potential implications for the role of lay theories in shaping intentions and behavior more broadly.

Discovering Hidden Laws in Innovation by Recombination

Combining two things can create amazing new things, whether mixing water and flour or feeding large datasets into neural networks. Hypothesizing rules and theories for recombination, testing those hypotheses, and communicating our findings to each other are key cognitive mechanisms that allow us to navigate an open-ended world of possible combinations. However, in contrast to this open-ended and highly complex search problem, cognition is constrained by its capacity. Using ideas from information theory, we hypothesize that the compressibility of recombination rules predicts how successfully people find and use these rules. In a combinatorial discovery game, we find that people indeed learn more quickly and collect more points when the rules are more compressible. Interestingly, people use fewer words to communicate their findings when the rules are either too easy or too hard to compress, revealing an inverse-U shaped relationship between compressibility and communication effort.

Morphological Structure in the Arabic Mental Lexicon: Productivity and Priming Effects in Nominal and Verbal Patterns

Semitic languages are characterized as having two types of discontinuous morphemes: roots and word patterns. The role of these morphemes in lexical access and representation remains debated, especially in the case of word patterns. Roots exhibit robust priming for both nouns and verbs, while word patterns yield mixed results—verbal patterns tend to show stronger priming effects than nominal ones. While previous research (e.g., Deutsch et al., 1998) suggested that differences in productivity might explain these word class effects on word pattern priming, no study has directly investigated this hypothesis. To isolate the contributions of productivity and word class, we used a 2×2 factorial design crossing productivity (high vs. low) with word class (verb vs. noun). This design allows us to disentangle the two variables that were previously confounded in studies such as Boudelaa & Marslen-Wilson (2015). We did this using a masked visual priming experiment in Arabic. We found that, regardless of word class, high-productivity patterns showed robust priming, whereas low-productivity patterns did not. Additionally, priming from high-productivity patterns was distinct from semantic and orthographic effects, confirming the independent role of word patterns in morphological decomposition. These results support the dual-route model of lexical access (Baayen et al., 1997).

Stochastic search algorithms can tell us who to trust (and why)

Relying on information from other people (social testimony) is essential for efficiently learning and reasoning about the world. However, determining who to trust is often challenging. In this paper, we argue that trust in social agents (i.e., those providing testimony) can be evaluated by assessing how optimally they have acquired their knowledge. Building on theories that describe knowledge acquisition as a stochastic search through a space of hypotheses, we present a framework which yields predictions about which agents will provide better testimony (because they are more likely to have uncovered higher-probability hypotheses) in different contexts. This approach allows us to jointly predict how the quality of testimony is affected by 1) features of the agents themselves, like their expertise; 2) consensus among multiple agents; and 3) features of the topic and hypothesis space, like its knowability. We present initial simulations demonstrating how even a basic implementation of our framework yields insight into which types of agents and topics are more likely to result in accurate testimony (and why). We conclude by discussing how this preliminary research might be extended to address more complicated social reasoning scenarios.
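As a rough illustration of the stochastic-search framing (our reconstruction, not the authors' implementation), the sketch below treats an agent's knowledge acquisition as a Metropolis-style random walk over a fixed hypothesis space; "expertise" is simply the number of search steps, and testimony quality is proxied by the probability mass of the hypothesis the agent ends up reporting. The landscape and step counts are arbitrary.

```python
# Minimal sketch, assuming a 100-hypothesis space with a peaked ("knowable") posterior.
import numpy as np

rng = np.random.default_rng(1)
n_hypotheses = 100
posterior = rng.dirichlet(np.ones(n_hypotheses) * 0.1)   # peaked landscape over hypotheses

def search(steps):
    h = rng.integers(n_hypotheses)
    for _ in range(steps):
        proposal = rng.integers(n_hypotheses)
        # Metropolis rule: always move uphill, sometimes move downhill
        if posterior[proposal] / posterior[h] > rng.random():
            h = proposal
    return h

novice = np.mean([posterior[search(5)] for _ in range(500)])
expert = np.mean([posterior[search(100)] for _ in range(500)])
print(f"mean probability of reported hypothesis: novice={novice:.3f}, expert={expert:.3f}")
```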

Modeling Human Sequential Decision-Making in the Tower of London: Incorporating Individual Differences and Timing-Based Replanning Inference

Modeling human sequential decision-making behavior presents a significant challenge for researchers in artificial intelligence, robotics, and cognitive science. In this paper, we introduce a human behavior model designed to predict actions in the Tower of London task, addressing two critical aspects that have been largely overlooked in existing methodologies. First, we propose a profile-based action prediction framework that extracts user and task profiles from historical data, enhancing action prediction in novel scenarios. Second, we introduce a replanning detection component that leverages thinking time as an indicator of planning processes in the human mind, enabling a more precise representation of cognitive dynamics. Our evaluations demonstrate the effectiveness of the proposed model, achieving superior performance in behavior prediction within the Tower of London task. This work lays the foundation for more robust human behavior modeling in sequential decision-making environments.

A Cognitive-Computational Model of Comfort Categorization in Civil Aviation Propeller Aircraft

The present work is dedicated to modeling and analyzing processes of vibro-acoustic comfort perception among passengers in a civil aviation propeller aircraft. The experimental data analyzed in this study were collected as part of the European project IDEA PACI (IDEntification of an Aircraft PAssenger Comfort Index) and encompass both vibro-acoustic and psychometric characteristics. We introduce a computational model of the perceptual processes of passenger comfort for this vehicle class based on an automatic classification system with cognitive plausibility. This system incorporates both prototype theory and exemplar theory of categorization. The study has two primary objectives: first, to develop an artificial classification system with high performance, serving as a valuable support tool for designing more comfortable aircraft; second, to investigate which of the considered cognitive theories most accurately represents the categorization process of human comfort. The experimental results, which include comparisons with other instance-based systems, demonstrate that the computational model used (the Prototype Exemplar Learning Classifier) effectively predicts human passenger comfort, and the type of representative instances inferred by the system indicates a clear predominance of exemplar theory over prototype theory in modeling the perception of comfort.

No directional preference for grammaticalization in semantic extension game

Grammaticalization is the process by which a lexical item (e.g., noun) acquires a more functional role (e.g., preposition) over time. Grammaticalization is considered largely unidirectional, that is, change from functional to lexical is far less common (Hopper & Traugott, 2003). What is the cause of this unidirectionality? Our experiment tests whether individuals have a preference in the direction of grammaticalization when performing semantic extension in communication. We focus on the phenomenon of using body part nouns as a source of spatial prepositions. We predicted that participants extending body parts to use as prepositions would find the task easier and more intuitive than participants extending prepositions to use as body parts. However, our results show no directional bias, indicating that the historical unidirectional tendency for body parts to be used as spatial relations does not originate in a bias that individuals have for using one to refer to the other.

Portraying Large Language Models as Machines, Tools, or Companions Affects What Mental Capacities People Attribute to Them

How do people determine whether non-human entities have thoughts and feelings — an inner mental life? Prior work has proposed that people use compact sets of dimensions (e.g., body-heart-mind) to form beliefs about familiar kinds, but how do they generalize to novel entities? Here we investigate emerging beliefs about the mental capacities of large language models (LLMs) and how those beliefs are shaped by how LLMs are portrayed. Participants (N = 470) watched brief videos that encouraged them to view LLMs as machines, tools, or companions, and then took a survey measuring mental capacity attributions. We found that the companion group more strongly endorsed statements regarding a broad array of mental capacities that LLMs might possess relative to the machine and tool groups, suggesting that people's beliefs can be rapidly shaped by context. Our study highlights the need to explore the factors shaping people's beliefs about emerging technologies to promote accurate public understanding.

What and How Schema Networks Are Acquired During the Learning of Line Graphs: Modelling Using Representational Systems Theory.

This paper addresses a gap in our understanding of cognition with external representations: what memory structures are acquired and how they change during learning? The focus will be on line graphs. We adopt Representation Interpretive Structure Theory (RIST) and its modelling notation (RISN) as an approach to answer those questions. RIST is a schema-theoretic account and RISN operationalizes its assumptions. Models for stages in the gradual acquisition of growing interpretive sophistication are built, from basic precursor components to advanced interpretations. Learning mechanisms are proposed to explain the transition between the stages. During learning, the memory structures undergo localized incremental changes and more global radical restructuring of the networks.

Modeling Face Recognition Challenges in Autism Spectrum Disorder: A CNN-Based Approach

Computational modeling has been a crucial tool in cognitive science to understand human cognitive functions and impairments in neurocognitive disorders. Convolutional Neural Networks (CNNs) exhibit striking similarities to human visual processing systems for object recognition, making them a powerful tool for studying visual processes. In this study, we examined the neurobiological theories, namely, the Excitation/Inhibition (E/I) Imbalance and Internal Noise (IN) in explaining face recognition challenges in autism spectrum disorder (ASD) using CNNs, and revealed that over-excitation and increased noises in the CNNs led to compromised performance on face recognition and atypical patterns of internal representations of face stimuli. This approach enables systematic comparisons between typical and atypical cognition, offering a theory-driven perspective to investigate cognitive challenges and their neurocognitive mechanisms with a computational approach.
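One simple way to instantiate the "internal noise" manipulation described above (our illustration under stated assumptions, not the study's actual network or noise model) is to add Gaussian noise to a CNN's intermediate activations via a forward hook and compare recognition performance with and without the perturbation. The backbone, layer, and noise level below are placeholders.

```python
# Minimal sketch, assuming PyTorch/torchvision are available and resnet18 stands in
# for a face-recognition backbone.
import torch
import torchvision.models as models

def make_noise_hook(sigma):
    def hook(module, inputs, output):
        # returning a value from a forward hook replaces the layer's output
        return output + sigma * torch.randn_like(output)
    return hook

model = models.resnet18(weights=None)                  # illustrative backbone only
model.eval()
handle = model.layer2.register_forward_hook(make_noise_hook(sigma=0.5))

with torch.no_grad():
    x = torch.randn(1, 3, 224, 224)                    # placeholder "face" input
    noisy_logits = model(x)                            # forward pass with internal noise injected
handle.remove()                                        # restore the unperturbed network
```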

Feedback-correcting ConvLSTM-driven Neural Model for Stable Saccadic Visual Perception

The brain utilizes corollary discharge signals to anticipate the visual consequences of saccadic eye movements and provide a coherent visual perception. However, discrepancies between a saccade's predicted and actual sensory outcomes challenge the brain's capacity to maintain visual stability. In this work, we introduce a comprehensive computational framework for visual perception incorporating a feedback corrective mechanism that dynamically adjusts predictions based on sensory discrepancies. We show that this feedback mechanism refines internal world models, and provides robust performance with an increasing number of saccades. Our results highlight the delicate balance between the benefits and vulnerabilities of predictive feedback systems supporting and extending current theories of sensory prediction and visual stability.

Foraging Connections: Optimal Foraging in Letter Fluency

The letter fluency task is the timed listing of words that begin with a specific letter (e.g., words starting with T). Participants often list words in phonologically related clusters (e.g., tank, task, tap) and occasionally switch clusters (e.g., tap, thud). This process has been likened to patch switching in animal foraging. Optimal performance requires switching clusters in a manner that maximizes the rate of retrieving words, known as the marginal value theorem. Previous work has found evidence for this in semantic fluency. The current study tests whether people adhere to the marginal value theorem in letter fluency and whether executive functioning is associated with optimal performance. Three letter cues (T, N, and J) and one semantic cue (animals) were administered. Results are consistent with optimal search in N and J, but not T or animals. These findings provide mixed support that people search optimally during letter fluency.
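The marginal value theorem test used in such fluency studies can be made concrete with a short sketch (our illustration, not the study's analysis code): under an MVT-optimal policy, a cluster switch should occur roughly when the time since the last retrieved word exceeds the participant's overall average inter-retrieval time. The retrieval times below are hypothetical.

```python
# Minimal sketch of an MVT-style switch criterion for fluency data.
def mvt_switch_points(retrieval_times):
    """retrieval_times: cumulative times (s) at which successive words were produced."""
    gaps = [t2 - t1 for t1, t2 in zip(retrieval_times, retrieval_times[1:])]
    avg_gap = sum(gaps) / len(gaps)          # long-run average rate is ~ 1 / avg_gap
    # switches are predicted where the local gap exceeds the overall average gap
    return [i + 1 for i, g in enumerate(gaps) if g > avg_gap]

times = [1.0, 2.0, 2.8, 6.5, 7.1, 7.9, 13.0]   # hypothetical letter-fluency retrievals
print(mvt_switch_points(times))                # word indices at which a switch is predicted
```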

Variation in Adults' Judgements about Relative Proportional Magnitude and Proportional Equivalence

Proportional reasoning is critical for successful functioning across domains and development. However, proportional information is also complex, resulting in behavioral variation across contexts and tasks. In the current study, we systematically compare adults' proportion judgements on a proportion magnitude comparison task and an equivalent proportion matching task with both dot arrays and continuous rectangles. We find that the match-to-sample task is more difficult than the magnitude comparison task and dot arrays are more difficult than the rectangles. Interactions between task and format, as well as specific patterns of errors, provide additional insight into possible explanations for these patterns. Overall, findings provide theoretical insight into the cognitive processes involved in solving proportional tasks and methodological insight into how to best design and interpret performance on both comparison and match-to-sample proportion tasks.

Amplifying Truth? Vocal Volume and Speakers' Self-Perceived Truthfulness

This study was the first to examine whether the volume of one's voice serves as an embodied cue for assessing information credibility. Eighty U.S. undergraduate students were randomly assigned to one of three conditions: loud, soft, or control. They read aloud in their assigned loudness condition while rating the truthfulness of trivia statements, followed by silently rating additional statements. The results revealed no significant effect of voice loudness on truthfulness ratings. When examining confidence levels reflected in the ratings, an interaction effect between reading status and loudness condition emerged. Participants who controlled their volume (either loud or soft) rated statements with higher confidence compared to when rating statements silently. These findings suggest that speakers do not associate their own voice loudness with the truthfulness of information.

Capturing User Intent through Integration of Item ID and Modality Information in Session Recommendation

Session-based recommendation aims to capture user intent from short-term, anonymous interaction sequences to recommend relevant items. From a cognitive science perspective, understanding user intent is closely tied to how humans process information, allocate attention, and make decisions under limited cognitive resources. While existing session-based methods mainly rely on ID-based modeling, such approaches face severe data sparsity and lack alignment with how users cognitively process information. Incorporating modality information can alleviate this issue. However, simple integration of ID and multimodal information often results in modality underfitting, limiting the effective use of multimodal features. To address these challenges, we propose SRIM (Session-based Recommendation with ID and Modality), a model that integrates ID and multimodal representations through a two-phase strategy: independent training followed by joint optimization. SRIM can better capture session-level intent by simulating users' actual perceptual contexts. Experiments on three real-world datasets demonstrate that SRIM significantly outperforms existing methods in session recommendation. The code for SRIM is available on GitHub: https://github.com/liang-tian-tian/SRIM.

A Dictionary-based Quantitative Analysis of the Sound-symbolic System of Japanese Ideophones

Ideophones are characterized by sound symbolism, a non-arbitrary relationship between speech sounds and meaning, which is observed across languages worldwide. Despite the increasing interest in sound symbolism and ideophones in cognitive science, Japanese ideophone research has lacked a robust quantitative approach. This paper proposes a dictionary-based method to quantify the sound-symbolic system of Japanese ideophones and presents statistical results that support and extend previous findings. The proposed method aims to advance the objective and comprehensive analysis of ideophones, sound symbolism, and iconicity, offering valuable contributions to ongoing discussions in the cognitive sciences.

Overconfidence as Truth Approximation

Human reasoning and decision-making under uncertainty often deviate from normative standards of rationality. Over the past decades, cognitive scientists have extensively investigated heuristics and cognitive biases, such as overconfidence—the tendency to overestimate the probability that one's judgments are correct. Meanwhile, philosophers have explored different "cognitive utilities" that guide both scientific and everyday reasoning, including the concept of truthlikeness, i.e., how well a hypothesis, be it a statement or a numerical interval, approximates the whole truth about a target domain. In this paper, we integrate empirical findings with philosophical perspectives, showing how formal models of truthlikeness offer valuable insights for empirical research on overconfidence. In particular, by conceptualizing overconfidence through the lens of expected truthlikeness maximization, we argue that many instances of this phenomenon may be construed not as cognitive biases, but rather as rational strategies for approaching the truth under conditions of uncertainty.
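The abstract does not spell out the formal truthlikeness measure, so the sketch below is only a generic illustration of "expected truthlikeness maximization": an interval estimate is scored by a made-up function that rewards precision and penalizes distance from the truth, and its expectation is computed under the agent's own subjective belief distribution and compared across interval widths.

```python
# Minimal sketch with an illustrative (not the paper's) truthlikeness score.
import numpy as np

rng = np.random.default_rng(2)
belief_samples = rng.normal(loc=100.0, scale=15.0, size=5000)   # agent's subjective belief

def truthlikeness(lo, hi, truth):
    width = hi - lo
    miss = 0.0 if lo <= truth <= hi else min(abs(truth - lo), abs(truth - hi))
    return 1.0 / (1.0 + width) - miss / 50.0        # precision reward minus distance penalty

for width in (5, 15, 30, 60):
    lo, hi = 100 - width / 2, 100 + width / 2
    expected = np.mean([truthlikeness(lo, hi, t) for t in belief_samples])
    print(f"width {width}: expected truthlikeness {expected:.4f}")
```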

Modeling Understanding of Story-Based Analogies Using Large Language Models

Recent advancements in Large Language Models (LLMs) have brought them closer to matching human cognition across a variety of tasks. How well do these models align with human performance in detecting and mapping analogies? Prior research has shown that LLMs can extract similarities from analogy problems but lack robust human-like reasoning. Building on Webb, Holyoak, and Lu (2023), the current study focused on a story-based analogical mapping task and conducted a fine-grained evaluation of LLM reasoning abilities compared to human performance. First, it explored the semantic representation of analogies in LLMs, using sentence embeddings to assess whether they capture the similarity between the source and target texts of an analogy, and the dissimilarity between the source and distractor texts. Second, it investigated the effectiveness of explicitly prompting LLMs to explain analogies. Throughout, we examine whether LLMs exhibit similar performance profiles to those observed in humans by evaluating their reasoning at the level of individual analogies, and not just at the level of overall accuracy (as prior studies have done). Our experiments include evaluating the impact of model size (8B vs. 70B parameters) and performance variation across state-of-the-art model architectures such as GPT-4 and LLaMA3. This work advances our understanding of the analogical reasoning abilities of LLMs and their potential as models of human reasoning.

Plausibility sampling rather than difficulty influences sequential selection of episodic counterfactual thoughts

People often engage in episodic counterfactual thinking: simulating alternative ways in which past events might have occurred. Existing research has shown that the perceived plausibility of episodic simulations modulates judgments of regret, mood and prosocial behavior. However, knowledge about the factors influencing the perceived plausibility of episodic counterfactuals is limited or derived from studies using vignette-based hypothetical scenarios. Inspired by research on modal cognition, here we test whether counterfactual plausibility is influenced by a sampling process that prioritizes the generation of plausible alternatives. Additionally, we evaluated whether the sequential generation of episodic counterfactual simulations is associated with vividness and difficulty. Across two experiments we demonstrated that when people generate episodic counterfactual thoughts, they initially produce the most plausible and vivid mental simulations, without concurrent changes in difficulty. Our results provide support for a sampling process that prioritizes the generation of more plausible and vivid counterfactual alternatives over less difficult ones.

DilatedSleepNet: A Novel EEG Waveform-Aware Model for Single-Channel Automatic Sleep Staging

Sleep plays a crucial role in maintaining human health and improving quality of life. However, traditional manual sleep staging methods are not only time-consuming but also heavily reliant on expert experience, limiting their feasibility for large-scale applications. Therefore, developing high-precision and fully automated sleep staging methods is essential for assisting clinical diagnosis. To address this research need, we propose an innovative automatic sleep staging network, DilatedSleepNet. This model introduces a novel multi-scale dilated convolution strategy to effectively capture the waveform characteristics of EEG signals, enabling accurate sleep stage classification using only single-channel EEG input. We systematically evaluated the performance of DilatedSleepNet on three publicly available datasets, achieving classification accuracies of 86.8%, 83.2%, and 85.4%, respectively. Experimental results demonstrate that DilatedSleepNet exhibits outstanding generalization ability and robustness across multiple datasets, providing a strong technical foundation for the diagnosis and research of sleep-related disorders.
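To make the multi-scale dilated convolution idea concrete, here is a minimal sketch (not the released model): several parallel 1-D convolutions with different dilation rates are applied to single-channel EEG and concatenated, so each branch covers a different temporal scale. Kernel sizes, dilation rates, and channel counts are illustrative.

```python
# Minimal sketch, assuming PyTorch; shapes chosen for a 30 s EEG epoch at 100 Hz.
import torch
import torch.nn as nn

class MultiScaleDilatedBlock(nn.Module):
    def __init__(self, in_ch=1, out_ch=16, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList([
            # padding = 3 * d keeps the output length equal to the input length
            nn.Conv1d(in_ch, out_ch, kernel_size=7, dilation=d, padding=3 * d)
            for d in dilations
        ])
        self.act = nn.ReLU()

    def forward(self, x):                              # x: (batch, 1, samples)
        feats = [self.act(branch(x)) for branch in self.branches]
        return torch.cat(feats, dim=1)                 # concatenate scales along channels

epoch = torch.randn(8, 1, 3000)
print(MultiScaleDilatedBlock()(epoch).shape)           # -> torch.Size([8, 64, 3000])
```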

Learning telic-controllable state representations

Computational models of purposeful behavior comprise both descriptive and prescriptive aspects, used respectively to ascertain and evaluate situations in the world. In reinforcement learning, prescriptive reward functions are assumed to depend on predefined and fixed descriptive state representations. Alternatively, these two aspects may emerge interdependently: goals can shape the acquired state representations and vice versa. Here, we present a computational framework for state representation learning in bounded agents, where descriptive and prescriptive aspects are coupled through the notion of goal-directed, or telic, states. We introduce the concept of telic-controllability to characterize the tradeoff between the granularity of a telic state representation and the policy complexity required to reach all telic states. We propose an algorithm for learning telic-controllable state representations, illustrating it using a simulated navigation task. Our framework highlights the role of deliberate ignorance, knowing what to ignore, for learning state representations that balance goal flexibility and cognitive complexity.

Effective but untrustworthy: How artificial intelligence bias opposing human bias affects judgments

Today, people make judgments with the help of artificial intelligence (AI) assistance in many situations, such as medical diagnoses. Although many studies have examined the effects of AI assistance, they have mainly focused on aspects of AI (e.g., AI's accuracy). Here, we emphasize the importance of interactions between AI and human biases. A highly accurate AI may not always be a promising intervention; rather, AI with biases (especially in the direction opposite to individuals' biases) may work effectively because AI's biases may cancel out individuals' biases (e.g., individuals' overestimation bias may be corrected by AI's underestimation bias). We investigated these issues using a simple perceptual task assuming medical judgments. First, computer simulations showed that appropriate AI assistance would differ depending on individuals' prior beliefs. Behavioral experiments demonstrated that AI with biases in the direction opposite to participants' biases could effectively reduce their biases. However, participants tended to favor AI whose biases ran in the same direction as their own and considered it more trustworthy. Our theoretical and empirical results raise questions about conventional beliefs that more accurate, trustworthy AI should be better. Our findings provide practical implications for designing AI as a collaborator with people.

The effect of learning Chinese Sign Language on spatial conceptualisation of time in hearing Mandarin speakers

Temporal-spatial metaphors can differ across languages, and such cross-linguistic differences may affect people's spatial conceptualization of time. Mandarin (including gestures) has different spatial metaphors for time than Chinese Sign Language (CSL). This paper investigated whether native Mandarin speakers' mental space-time mappings change after learning CSL for 14 weeks. Sixty native Mandarin speakers who had no prior knowledge of sign language took a pretest and posttest of space-time mappings before and after taking a CSL course. The results showed that participants changed their temporal-spatial mappings after learning CSL. Specifically, they had more sagittal space-time mappings and fewer lateral ones than before. They also had more "future-in-front/past-in-back" mappings consistent with CSL space-time mappings. Furthermore, these changes were more significant in high-proficiency learners than in low-proficiency learners. Our results not only demonstrate an effect of bodily experience on time conceptions, but also suggest that sign language can impact spatial-temporal reasoning.

Learn What is Detectable, Detect What is Useful

Many computational models of morphology represent complex words by n-grams to account for lexical processing and acquisition. However, while n-gram models are simple and efficient, they are not without problems. From a cognitive perspective, it is unclear how n-gram words are represented in the mental lexicon and how these representations affect language use and acquisition. From a computational perspective, these models are problematic because n-gram representations are often ambiguous and redundant: they make very limited use of distributional information and neglect the role of efficiency and sequential processing in language use and acquisition. In this paper, we present a new computational approach to morphology that is cognitively more plausible than standard n-gram models. By analyzing data from the nominal number system in German, we show that a task-specific algorithm of linear processing guided by the principles of efficiency and reliability outperforms state-of-the-art n-gram models and also makes predictions about lexical processing that are consistent with the judgments of German native speakers in a psycholinguistic experiment.

Eliciting the Priors of Large Language Models using Iterated In-Context Learning

As Large Language Models (LLMs) are increasingly deployed in real-world settings, understanding the knowledge they implicitly use when making decisions is critical. One way to capture this knowledge is in the form of Bayesian prior distributions. We develop a prompt-based workflow for eliciting prior distributions from LLMs. Our approach is based on iterated learning, a method that has been used to explore implicit knowledge in human decision-makers, in which successive inferences are chained together to converge to the prior distribution. We validated our method in settings where iterated learning has previously been used to estimate the priors of human participants: causal learning, proportion estimation, and predicting everyday quantities. We found that priors elicited from GPT-4 qualitatively align with human priors in these settings. We then used the same method to elicit priors from GPT-4 for a variety of speculative events, such as the timing of the development of superhuman AI.
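The iterated-learning loop can be sketched as follows (our reconstruction under stated assumptions, not the authors' exact prompts or workflow): each iteration generates a datum from the model's previous response and asks the model for a fresh estimate, and the chain of estimates is expected to drift toward the model's prior. The `query_llm` helper and the prompt are hypothetical placeholders.

```python
# Minimal sketch of chained ("iterated") prompting for prior elicitation.
import random

def query_llm(prompt: str) -> float:
    """Hypothetical wrapper around an LLM chat-completion API that returns a
    numeric estimate parsed from the model's reply. Replace with a real call."""
    raise NotImplementedError

def iterated_learning(seed_value: float, n_iterations: int = 20) -> list[float]:
    chain = [seed_value]
    for _ in range(n_iterations):
        # sample a noisy "observation" from the previous estimate, then ask the
        # model to infer the underlying quantity from that observation
        datum = random.gauss(chain[-1], 0.1 * abs(chain[-1]) + 1e-6)
        prompt = (f"A movie has grossed ${datum:.0f} million so far. "
                  f"What is your best guess for its total gross, in millions?")
        chain.append(query_llm(prompt))
    return chain   # later elements approximate draws from the model's prior
```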

INTUIT: Investigating intuitive reasoning in humans and language models

We introduce the INtuitive Theory Use and Inference Test (INTUIT), a cognitive test battery targeting common-sense physical and social reasoning. INTUIT adapts classic story-based question-and-answer methods for AI evaluation using VIGNET, a novel tool that addresses some limitations of existing test batteries through procedurally generated vignettes. We evaluated INTUIT on three GPT models (GPT-4o, GPT-4o-mini, GPT-4.1-mini), one reasoning model (o3-mini), and a human sample (N = 147). Humans generally outperformed models, especially on object function and agent intention inference types. These results highlight INTUIT's sensitivity to intuitive reasoning capabilities and VIGNET's broader application for the evaluation of cognitive capabilities in humans and AI.

Empowering Cross-Patient Adaptive-Length Epilepsy Diagnosis with ECNorm: A Channel-wise Approach

Automatic seizure detection leveraging artificial intelligence has gained widespread attention. However, existing research has predominantly focused on scenarios with patient-specific and fixed-time lengths, with the practical clinical applications across non-specific patients and variable time lengths remaining underexplored. To address this gap, we introduce a novel method named Electroencephalogram Channel-wise Normalization (ECNorm), designed to thoroughly explore the physical significance and data distribution characteristics of different EEG channels to minimize inter-patient variability. We applied ECNorm to a two-layer LSTM model to facilitate cross-patient adaptive-length epilepsy diagnosis. Ablation studies demonstrate that ECNorm significantly enhances the performance of simple architectures like the two-layer LSTM when compared to batch normalization and layer normalization. Leave-one-out experiments on the public CHB-MIT dataset verify that our approach surpasses existing studies across segments of varying lengths (1 and 100 seconds), establishing a new benchmark for patient-independent automated epilepsy diagnosis.
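The paper's exact ECNorm formulation is not reproduced here; the sketch below only illustrates the general idea of normalizing each EEG channel separately within a recording, so that channel-specific scale differences (and, by extension, inter-patient variability) do not dominate before the segments are fed to an LSTM. Channel counts and sampling rate are illustrative.

```python
# Minimal sketch of per-channel normalization for one EEG recording.
import numpy as np

def channel_wise_normalize(eeg, eps=1e-8):
    """eeg: array of shape (n_channels, n_samples) for a single recording."""
    mean = eeg.mean(axis=1, keepdims=True)     # one mean per channel
    std = eeg.std(axis=1, keepdims=True)       # one std per channel
    return (eeg - mean) / (std + eps)

# 23 channels, 100 s at 256 Hz, with artificial per-channel scale differences
recording = np.random.randn(23, 256 * 100) * np.arange(1, 24)[:, None]
normalized = channel_wise_normalize(recording)
print(normalized.mean(axis=1).round(3), normalized.std(axis=1).round(3))
```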

Categories from dimensions: Population-level computational modelling of neurodevelopmental conditions

Theoretical understanding of neurodevelopmental conditions (NCs) has shifted from a categorical approach to a dimensional one, characterized by an acceptance of comorbidity and heterogeneity. Previous computational modelling of NCs has tended only to accommodate categorical views. The current work presents a mechanistic simulation framework that fits with the dimensional view, using artificial neural networks to model populations of learners, with underlying causes of variation in developmental outcomes viewed as continuous, polygenic, and in part environmental. We show how the dimensional and categorical approaches can be linked using latent profile analysis and outlier methods, recovering profiles and specific deficits from dimensional variation. We show how altering the distribution of hyper-parameters shifts the population composition of developmental profiles and frequencies of deficit patterns, and we test their robustness to stochastic factors.

Breaking it Down: Expertise and Dance Segmentation

This study investigates how expertise influences the mental representation of dance choreography, focusing on differences between expert and novice ballet dancers. Event Segmentation Theory (EST) was used to examine chunking in a 50-step ballet sequence. Participants, classified as experts or novices based on dance experience, were tasked with segmenting choreography across repeated viewings. Results showed that experts segmented the sequence into fewer, larger chunks, and showed greater consistency and greater similarity with each other. Novices, in contrast, identified more segments and showed less agreement. These findings underscore the role of domain-specific knowledge that incorporates the structure of the domain in forming mental representations, and set the stage for exploring how this may enable superior expert learning and memory for very long sequences of dance.

Improvisation in Motion: Exploring How Expertise Affects Perception of Joint Actions

Joint improvisation is central to how we navigate the social world, engage and maintain social interactions, and perceive interactions between other people. This project investigates people's ability to distinguish between joint and individual actions (contemporary joint vs. solo dance improvisation) and the information they use to make this determination. In Experiment 1, participants were asked to identify whether two people were improvising dance movements together or alone. Experiment 2 explored how much people's decision-making relies on information about the dancers' facial expressions and gaze direction. Overall, results showed that people can accurately identify improvised joint actions, even when the actors' faces and gaze direction are occluded.

Learning visual appearance from language is mediated by causal intuitive theories

What and how do people learn about visual appearance from language? We test the hypothesis that in the absence of sensory evidence, people born blind use abstract causal knowledge to infer object appearance. Congenitally blind (n=19) and sighted adults (n=59) reported how many colors two types of artifacts were likely to have: artifacts for which having many colors is intended to facilitate function (n=30, e.g., fairytale book, fruit candies), and artifacts for which colorfulness is irrelevant or distracting (n=30, e.g., instructional manual, painkillers). The number of colors estimated per object was highly correlated across groups. Blind and sighted people assigned more colors to artifacts for which colorfulness facilitates function and appealed to makers' intentions in open-ended explanations. A text-only version of GPT-4 generated similar but non-identical colorfulness estimates compared to humans. Our findings suggest that people infer the appearance of unseen objects using causal "intuitive theories" informed by linguistic evidence.

Sexual Selection Preferences in Anthropomorphized Imagery of Interpretative Graphics in Quantitative Visualization

When choosing what we find visually attractive, men and women tend to focus on different features, even for simple shapes. This study investigates gender differences in visual feature preferences during the anthropomorphization of graphics in the context of sexual selection. We constructed a feature set consisting of 48 geometric attributes to explore how these elements affect sexual selection preferences across genders. In Study 1, we quantitatively visualized these features using genetic algorithms, GANs, and manual design. Study 2 assessed gender preferences through an online survey of 288 participants, revealing the most significant features and differences in male and female preferences. Finally, in Study 3, we applied these findings to real-world art (Chinese calligraphy) to verify the explanatory power of the features. Our results provide new insights into the role of visual features in sexual selection and have practical applications in art, product design, and user experience optimization.

A Quantum Model of Arousal and the Yerkes-Dodson Law

We present the Oscillating Field Perturbation (OFP) model, a quantum model that provides a quantitative account of the Yerkes-Dodson law concerning the relationship between arousal and performance. Inspired by neural models, OFP conceptualizes cognitive control as an oscillating field that perturbs a quantum system, and represents arousal as the "gain" induced by this field. By integrating OFP with the Multiple Particle Multiple Well (MPMW) framework, we demonstrate that OFP successfully explains how the shape of the Yerkes-Dodson law varies with task difficulty and familiarity, consistent with empirical findings. To the best of the authors' knowledge, OFP is the first model to provide a unified account of the empirical variations in the shape of the Yerkes-Dodson law.

Communicative efficiency of distributional and semantically-based core vocabularies in narrative text comprehension

High-frequency words are often assumed to be the most useful words for communication, as they provide the greatest coverage of texts. However, the relationship between coverage and comprehension may not be straightforward; how words relate semantically to people's mental representations is also important. In this study, we evaluate how useful different sets of "core vocabularies" are in text comprehension. The core vocabularies, which reflect different aspects of distributional and semantic information, provide different amounts of information for different vocabulary sizes and amounts of text coverage. In our experiment, we showed people narrative texts with all but the core words removed, and measured comprehension in a variety of ways. Our results show that both distributional (e.g., frequency-based) and semantic (e.g., word association-based) core vocabularies are communicatively useful, but that the semantically-based core vocabularies provide more information when textual coverage is held constant.
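The text-coverage statistic that the comparison holds constant can be computed with a few lines (our illustration only; the tokens and vocabularies below are toy examples, not the study's materials):

```python
# Minimal sketch of core-vocabulary coverage of a text.
from collections import Counter

def coverage(text_tokens, core_vocab):
    counts = Counter(text_tokens)
    covered = sum(n for w, n in counts.items() if w in core_vocab)
    return covered / max(1, sum(counts.values()))

tokens = "the dog chased the ball and the child laughed".split()
frequency_core = {"the", "and", "of", "to"}                 # distributional core (toy)
semantic_core = {"dog", "ball", "child", "chased"}          # association-based core (toy)
print(coverage(tokens, frequency_core), coverage(tokens, semantic_core))
```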

Changes in cognitive effort across infancy and early childhood

Cognitive functioning across development has predominantly been assessed through task performance. However, the role of cognitive effort infants and young children exert has been largely neglected in understanding cognitive functioning. In a large longitudinal sample (YOUth cohort, N = 2241) of infants and young children aged 5, 10 and 36 months, we extracted dynamic baseline-corrected pupil responses to measure cognitive effort during a gap-overlap eye-tracking task. Results revealed a shift from predominantly reactive effort when infants were younger to more preparatory effort in older children, especially for the more demanding condition. Moreover, preparatory effort in infancy predicted cognitive effort in childhood. These findings underscore the importance of measuring cognitive effort in addition to task performance to capture a more complete picture of early cognitive development.

Modality-Specific Mental Imagery Abilities are Unrelated to Modality-Specific Category Learning

Category learning is an important ability that underlies complex cognitive processes such as object recognition and speech perception. Categories are ubiquitous across modalities and people differ greatly in their ability to learn novel categories. Here, we addressed a modality-specific cognitive individual difference that may relate to category learning – mental imagery. We examined how individual differences in self-reported auditory and visual mental imagery abilities related to individual differences in auditory and visual category learning. Overall, according to Bayesian analyses, there was anecdotal to moderate evidence for the null hypothesis that differences in self-reported modality-specific mental imagery are unrelated to differences in modality-specific category learning. These results have implications for theories of category learning and raise questions regarding the functions of mental imagery in cognitive processes such as categorization and learning.

Using Head-Mounted Eye Tracking to Examine Infant Face Looking During Naturalistic Freeplay

Infants' attention to faces is a critical component of social development, yet little is known about how face looking unfolds during real-world interactions. Using head-mounted eye tracking and computer vision, we investigated the availability of faces in infants' field of view and their face-looking behaviors while playing with caregivers. Faces were visible only 23% of the time, and even then, infants looked at them only 19% of the time. Time series analyses revealed that face visibility is unrelated to face looking: infants orient their head and body to bring a face into view when they intend to look at it, rather than looking because a face is present. Face looking is related to the face's visual properties and is only slightly influenced by parents' face looking. These findings highlight the active nature of infant face looking and suggest that it is shaped by both visual features and infants' own interest.

Wisdom of the (expert) crowd: Performance of Aggregation Models for Fetal Heart Rate Judgments

Existing work has attributed the low clinical utility of continuous cardiotocography (CTG) to the reliability and accuracy of CTG judgments. The aim of this study was to determine whether aggregating the judgments of multiple obstetricians (leveraging the "wisdom of crowds") using obstetricians' optimized estimates of the probability of hypoxia improves accuracy. In the current study, we apply three different aggregation techniques to the evaluations of nine obstetricians from the CTU-CHB Intrapartum Cardiotocography Database. The evaluations were optimized estimates of the probability of hypoxia in each evaluation category. The three aggregation models ranged in complexity from an unweighted aggregation scheme to an approach that weighted evaluations based on the contribution of the obstetricians. All the aggregation models were shown to improve judgment accuracy above chance performance. However, the most accurate model was the one which calculated the simple average of obstetricians' judgments. There was no additional benefit of selecting obstetricians who were positive contributors to an ensemble and weighting the evaluations based on their contribution to the ensemble. Aggregating obstetricians' evaluations may be a solution to the ongoing reliability and accuracy issues in fetal heart rate judgment.
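The two ends of the complexity range described above can be sketched as follows (our illustration; the abstract does not specify the actual weighting scheme, so the contribution weights here, based on Brier scores, are an assumption, and the judgments and outcomes are fabricated toy values):

```python
# Minimal sketch: unweighted vs. contribution-weighted aggregation of probability judgments.
import numpy as np

# rows = obstetricians, columns = cases; entries = judged probability of hypoxia (toy data)
judgments = np.array([[0.2, 0.7, 0.9],
                      [0.1, 0.6, 0.8],
                      [0.4, 0.9, 0.7]])
outcomes = np.array([0, 1, 1])                       # ground truth per case (toy data)

simple_average = judgments.mean(axis=0)              # the unweighted scheme

# illustrative contribution weights: how much better each judge is than the group (Brier score)
group_brier = ((simple_average - outcomes) ** 2).mean()
indiv_brier = ((judgments - outcomes) ** 2).mean(axis=1)
weights = np.clip(group_brier - indiv_brier + 1e-6, 1e-6, None)
weights /= weights.sum()
weighted_average = weights @ judgments               # the contribution-weighted scheme

print(simple_average, weighted_average)
```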

Perceptions of A.I.-Enhanced Bodies: Autonomy, Authenticity, and Preferences Among Young Adults

This study explores the psychological impact of AI-generated and user-manipulated images on body image perception, particularly in the context of social media platforms. Focusing on young adults, the research examines their ability to identify A.I.-enhanced, user-enhanced, and unaltered images. Results indicate that participants can readily detect AI-enhanced images due to exaggerated features but struggle to identify subtle alterations from traditional photo-editing apps. Interestingly, participants showed a preference for minimally edited or unaltered images, despite faster detection of AI-enhanced images. Qualitative data suggest a divide in participants' attitudes toward AI manipulation: some expressed concern about its effects on body image and self-esteem, while others expressed indifference. These findings highlight the increasing difficulty in distinguishing authentic content from digital manipulation and raise important questions about rapidly evolving definitions of beauty and authenticity. Overall, findings underscore the need for media literacy interventions to address these challenges.

Singular "they" exposure increases singular interpretations in ambiguous pronouns

Evidence is accumulating that patterns of use for singular "they" are changing in English. The pronoun is becoming the preferred generic when the gender of the referent is unknown or backgrounded. This change reflects a shift in patterns of acceptability for uses of singular "they" which is in turn linked to the increased frequency of singular "they". We predict that adaptation may be a cognitive mechanism underlying this change, and if so, we may see short-term adaptation within a lab session. In the present study, we use a between-subjects priming paradigm to test whether participants adapt to the frequency with which they encounter singular or plural senses of "they" in the local discourse. We find that selections of singular "they" are significantly more likely after participants have been exposed to unambiguously singular vs. plural uses of "they". This finding implicates adaptation and suggests that adaptation may link changes in the frequency of linguistic forms to changes in their acceptability.

Humans learn proactively in ways that language models don't

Do large language models (LLMs) learn like people do? We investigate this question with a simple task that compares human learning and LLM finetuning on the same set of novel inputs. We find that while humans learn and generalize robustly, finetuned LLMs largely fail to generalize from what they learned and are more influenced by prior expectations than humans are. We then analyze human solutions of our task and find that stronger performance is characterized by the proactive formation of efficient representations that aid learning and generalization. Although LLMs can use in-context learning to match the performance of humans who do not form these representations, and can match the performance of humans who do when similar representations are provided in context, they do not form these representations on their own. Given these findings, we then consider how future theories of human learning might be built in the age of LLMs.

How Altruistic Motivation Synchronizes Brain-Muscle Coherence to Enhance Motor Performance

While extrinsic and intrinsic motivation have been well studied, the effects of altruistic motivation on motor performance remain largely unexplored. This study investigates the influence of altruistic motivation on brain–muscle coherence and its effect on improving time to task failure. Thirty-one participants performed two high-intensity isometric grip tasks to failure. The first trial was conducted without any extra motivational incentive, while the second trial was performed under one of three conditions: altruistic, extrinsic, or control. Electroencephalogram (EEG) and electromyogram (EMG) signals were recorded during both trials. Our results demonstrate that only the altruistic group improved their performance from the first to the second trial (68%; p = 0.004). The altruistic group exhibited increased EEG–EMG coherence in the alpha and beta bands and reduced delta coherence. These findings suggest that prosocial motivation restructures neural oscillatory activity, optimizing force control and endurance.

Improving Cognitive Capability of Large Language Model: A Multi-Step Symbolic Reasoning Approach

The emergence of large language models (LLMs) has accelerated research progress in many fields, but these models still face challenges in imitating human logical reasoning, especially in the step-by-step reasoning of complex tasks and in zero-shot logical cognition. To address these challenges, we propose a multi-step symbolic reasoning strategy that decomposes complex tasks into subtasks and optimizes the decomposition using a subtask verification module. We also introduce a new zero-shot symbolic module that helps improve the model's reasoning ability on unseen samples through symbolic representation and logical schemes. We evaluated our method on four reasoning datasets: the industrial private dataset Ship Assembly Technology and the public datasets ProntoQA, ProofWriter, and OpenBookQA. Our framework demonstrates substantial improvements in reasoning interpretability and generalization capacity compared to existing prompting paradigms. The proposed method establishes a new pathway for enhancing LLMs' cognitive architectures through symbolic system integration, showing strong potential for efficient knowledge transfer to downstream applications while preserving human-understandable reasoning traces.

Comparing eye movement and lexical decision experiments on Thai compound recognition

This study investigated how visual processing of simple and compound words in Thai is modulated by the experimental task. A sentence-reading eye-tracking experiment (Experiment 1) and a lexical decision experiment (Experiment 2) were conducted on Thai bisyllabic compounds and simple nouns. Experiment 1 had two interword spacing conditions and examined how adding interword spacing to Thai sentences (normally unspaced in writing) modulated lexical processing. Experiment 1's results showed that compounds incurred longer fixation durations than simple words, and that interword spacing facilitated word recognition for both word types. Experiment 2 revealed a different result: word recognition was faster for compounds than for simple words. We attribute this disparity to different task demands. Sentence reading requires the integration of semantic and syntactic processing, and compounds would require additional semantic integration. By contrast, lexical decision focuses on isolated word recognition, so the word's syntactic and semantic features are not activated.

A Dense Convolutional Bi-Mamba Framework for EEG-Based Emotion Recognition

Emotion recognition based on electroencephalograms (EEG) has found extensive applications in recent years. Although numerous approaches leveraging convolutional neural networks (CNNs) and Transformers have been put forward for automatic emotion recognition and have achieved commendable performance, several challenges remain: (1) Transformer-based models are proficient at capturing long-term dependencies within EEG signals, but their quadratic computational complexity poses a significant hurdle. (2) Models that combine Transformers with CNNs often fail to effectively capture the coarse-to-fine temporal dynamics of EEG signals. State Space Models (SSMs), exemplified by Mamba, have emerged as a promising solution: they not only show outstanding capabilities in modeling long-range interactions but also maintain linear computational complexity. To address these challenges, we introduce Emotion-Mamba, a framework designed specifically for EEG-based emotion recognition. The framework first employs a CNN Encoder to extract information from both the temporal and spatial dimensions of EEG signals. The extracted features are then relayed to the Hierarchical Coarse-to-Fine Bi-Mamba (HBM) block, which processes them efficiently. A Dense Temporal Fusion (DTF) module then capitalizes on the multi-level, purified temporal information from the CNN Encoder and HBM blocks to bolster decoding accuracy. We conduct comprehensive evaluations of Emotion-Mamba on the SEED and SEED-V datasets. The experimental findings demonstrate that our proposed approach surpasses existing state-of-the-art methods.

How Well Do People Perform on Novel Logic Puzzles Requiring Higher-Order Theory of Mind?

Theory of mind (ToM) refers to the ability to reason about the behaviour of others and oneself by attributing internal mental states, such as knowledge, desires and intentions. ToM can be applied recursively - for example, "Amy thinks that Bernard knows that it is raining" is said to be a second-order ToM statement from the reader's perspective. Past research suggests that there is a limit to the number of times humans can apply ToM recursively - for example, they tend to use up to second-order ToM reasoning in strategic games. In the present study, we propose and conduct a novel human experimental design, in which different orders of ToM reasoning in the logic puzzle "Cheryl's Birthday" can be distinguished. Results show that higher-order ToM reasoning is associated with longer times to solve the puzzle(s) and a higher rate of mistakes.

MPPFND: A Dataset and Analysis of Detecting Fake News with Multi-Platform Propagation

Fake news spreads widely on social media, leading to numerous negative effects. Most existing detection algorithms focus on analyzing news content and social context to detect fake news. However, these approaches typically detect fake news based on specific platforms, ignoring differences in propagation characteristics across platforms. In this paper, we introduce the MPPFND dataset, which captures propagation structures across multiple platforms. We also describe the commenting and propagation characteristics of different platforms to show that their social contexts have distinct features. We propose a multi-platform fake news detection model (APSL) that uses graph neural networks to extract social context features from various platforms. Experiments show that accounting for cross-platform propagation differences improves fake news detection performance.

It takes one to know one: Theory of mind helps children to detect lies that are revealed by semantic leakage

Can children detect a lie when the liar unintentionally reveals essential information (displaying so-called semantic leakage)? Furthermore, because previous research found that theory of mind (ToM) is a factor in children's lie production ability, what role does ToM play in their lie detection ability? An experiment was carried out with 128 Dutch-speaking children (4-14 years old). Children's lie detection ability was assessed using a story in which one of the characters produced a lie signalled by semantic leakage. A false-belief task was used to test children's first-order ToM (Kevin thinks that…) and second-order ToM (Marieke thinks that Kevin thinks that…). Children were 73% accurate in identifying the liar by referring to semantic leakage. Their performance improved with age, with 12-year-olds showing ceiling performance. Finally, first-order ToM was a significant predictor, even after controlling for age, suggesting that lie production and lie detection partly rely on shared cognitive functions.

Individual Differences in the Tendency to Use Multiword Information in Natural and Artificial Languages

Work over the last decades has shown that learning from multiword units is often beneficial for language learning, impacting mastery of arbitrary linguistic relations and predicting efficient language processing. Much of this work has looked at differences between first (L1) and second language (L2) learning, documenting differences in how children and adults approach language learning, with only a few studies looking at individual differences in reliance on multiword units. Here, we ask whether adults differ in their tendency to draw on multiword units when learning a new language, and if so, whether such differences are related to learning outcomes and to language processing. We used an artificial language with grammatical gender to measure participants' tendency to treat article-noun sequences as one unit during language learning, and a multiword recall task to measure their tendency to benefit from multiword units in their native language (L1). Our findings show that individuals differ in their tendency to chunk information into multiword units when learning an artificial language: most participants fell into one of two distinct groups, showing a steady preference to treat article-noun sequences as either one unit or more than one unit throughout the task. This tendency was numerically, albeit not significantly, related to how much individuals benefit from multiword information in their L1. These findings document a novel dimension of individual differences, the tendency of a learner to rely on multiword units, which may be related to different aspects of language learning and processing.

Exploring resource-rational planning under time pressure in online chess

Human planning is incredibly efficient. Even in complex situations with many possible courses of action, people are able to make good decisions. Recent proposals suggest that a primary contributor to this efficiency is the intelligent use of cognitive resources, but how people allocate these resources under time constraints is not fully understood. In this work, we conduct a resource-rational analysis of planning in a large data set of online chess games. We first demonstrate that players spent more time thinking when they had more time to do so, and that this effect was especially prevalent when computation was more valuable. Then, we show that additional time spent planning resulted in the selection of better moves when one existed, and we compare signals of general and immediate time pressure. Finally, we highlight the role of expertise in this setting. Our results provide evidence that people make resource-rational choices when planning under time pressure.

Does Precision Affect Categorization? Magnitude Categorization and Measurement Scales

How do systems of measurement influence our conceptualization of relative magnitudes? This study investigates the cognitive interplay between measurement precision and magnitude categorization. By employing morphed shapes organized by an arbitrary dimension, we examine whether exposure to high- vs low-precision numerical systems affects conceptual judgments and well-known phenomena such as semantic distance and semantic congruity effects as found for familiar dimensions. Participants trained on novel scales revealed differences in their sensitivity that depended on the precision of the trained measurement system, consistent with high-precision systems leading to relatively expanded dimensional encodings compared to low-precision systems. Our findings also shed light on other topics such as the interplay of perception and language in learning novel dimensions and the association of directionality with a mental number line.

Does Explicit Analogical Reasoning Help Second Language Acquisition? Evidence From Artificial Language Learning

When people acquire a second language (L2), do they benefit from analogical reasoning? Past research showed that people are likely to engage in analogical reasoning to support their L2 learning, yet it is unclear whether this process is explicit or occurs only automatically and implicitly. In the current study, English-speaking participants (N = 100) learned a miniature artificial language with grammatical markers that were either morphologically congruent or incongruent with English grammar. We then assessed participants' acquisition of the artificial language and their explicit use of analogical reasoning. Acquisition was better when the artificial language was structurally congruent with English, and was also better for participants who reported explicit analogical reasoning. This was especially pronounced for the ability to generate novel content. These findings provide evidence that learners acquiring a new language spontaneously leverage analogies with their existing languages, and that this is especially beneficial when the analogies are recognized explicitly.

Spatially Upward and Emotionally Uncertain: A Pilot Study on Mental Representations of Lexical Tones

The present study investigated cross-modal correspondences between Cantonese tones and two dimensions: (1) spatial motion and (2) emotional valence, via a forced-choice mapping task with native Hong Kong Cantonese speakers. Results show that the two contour tones (rising and falling) could be reliably matched with motions (upward and downward) congruent with their pitch trajectories; in contrast, the correspondence between contour tones and emotional valence (rising tone as positive and falling tone as negative) was less robust and limited to certain vowels. In summary, our findings indicate that beyond arbitrary form-meaning correspondence, vertical spatial information, both concrete (motion) and abstract (valence), is also encoded in the lexical tones of Hong Kong Cantonese, although the relative strengths of the two types of correspondences were not equivalent.

Young Children's Understanding of Prior and Posterior Probability

This study investigates 4-6-year-olds' ability to reason about prior and posterior probabilities, and how they update their decisions based on new evidence. Across two experiments, children made a prior probability guess and then, after receiving additional information, a posterior probability guess. Our findings suggest that children as young as four can make accurate prior probability guesses and in some cases, update them when given new evidence. Children's ability for probability updating improves with age. These results suggest that the ability to reason about posterior probabilities emerges earlier than previously thought, by age 4.

Estimating and Correcting Yes-No Bias in Language Models

When presented with a yes-no question, humans tend to say 'yes' regardless of the ground truth. This 'yes-bias' can be attributed either to the social pressure to agree with an interlocutor or simply to the tendency to mimic the distribution of the input data. Here, we estimate yes-no response bias in language models (LMs), with the goal of distinguishing the two theories, and explore two strategies for bias correction. We develop two yes-no question datasets derived from existing world knowledge datasets, and test 16 open-weight LMs. We find that LMs often show response bias on yes-no questions, but that it is highly variable, deviating from the bias observed in humans. We further present a novel bias correction method, which eliminates bias and improves model performance. Evidence of non-humanlike response bias in LMs informs us about the source of yes-bias in humans, and the efficacy of our bias correction method holds promise for LM evaluation.
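
The abstract does not spell out the correction method, so the sketch below shows one generic way a yes-no bias could be estimated and removed: compute the model's log-odds for "Yes" versus "No", estimate the mean log-odds shift on a balanced calibration split, and subtract it before thresholding. All numbers and the correction rule are illustrative assumptions, not the paper's method.

```python
import numpy as np

# Hypothetical log-probabilities an LM assigns to "Yes" vs. "No" for each question,
# plus gold answers (1 = yes, 0 = no). In practice these would come from the model under test.
logp_yes = np.array([-0.4, -0.9, -0.3, -1.2, -1.00, -0.2, -0.7, -1.0])
logp_no  = np.array([-1.2, -0.6, -1.0, -0.4, -1.05, -1.4, -0.9, -0.5])
gold     = np.array([1, 0, 1, 0, 0, 1, 1, 0])

logit = logp_yes - logp_no          # log-odds in favour of answering "yes"

# Estimate the bias on a calibration split containing equal numbers of yes- and no-questions:
# an unbiased model's mean log-odds on such a split should be close to zero.
calib = np.arange(4)                # the first four items serve as the balanced calibration split
bias = logit[calib].mean()

raw_pred       = (logit > 0).astype(int)
corrected_pred = (logit - bias > 0).astype(int)

print("estimated yes-bias (log-odds):", round(float(bias), 3))
print("raw accuracy:      ", (raw_pred == gold).mean())
print("corrected accuracy:", (corrected_pred == gold).mean())
```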

Understanding Visual Representation of Linear Models: A Comparison of Real Students and ChatGPT

The rise of ChatGPT has sparked interest among educators in integrating it into teaching and learning. However, effective teaching requires a deep understanding that allows instructors to use multiple representations to support comprehension and address students' misconceptions. Before relying on ChatGPT as a teaching tool, it is crucial to assess its ability to interpret multiple representations. This study evaluates ChatGPT's understanding of the fundamental statistical concept "data = model + error," which underpins a number of statistical analyses in introductory statistics courses. Through tasks involving graphical representations, we qualitatively examined GPT models' understanding and compared their strengths and limitations to those of students. The results showed that while ChatGPT demonstrated competence in certain areas, it also exhibited misunderstandings—some resembling students' and others unique to the models themselves.

An Ideal Observer Model of Audiovisual Detection Captures Modality-Specific, but not Amodal, Confidence Ratings

Detecting objects in the environment and forming a sense of confidence in these decisions typically involve multisensory processing. We sought to characterize how humans form amodal and modality-specific confidence judgments during audiovisual detection. We found that participants made more accurate detection and confidence judgments for audiovisual than for unimodal stimuli. To explain these results, we extended a Bayesian evidence accumulation model to audiovisual detection and successfully reproduced both unimodal and audiovisual detection judgments. Despite being fitted to decisions and decision times alone, our model accurately reproduced modality-specific confidence. It failed, however, to account for amodal confidence, suggesting that the latter might not arise from optimal signal integration in detection contexts. Our results indicate that, in the presence of audiovisual signals, different integration rules apply for perceptual and metacognitive decisions.

How Empathy Promotes Socially Adaptive Behaviors in Interpersonal Conflicts?: An Exploratory Study on the Role of Intention Inference

In interpersonal conflicts, empathy fosters socially adaptive behaviors that facilitate reconciliation, such as offering comprehensive and non-defensive apologies. Perceivers adjust their behaviors based on inferences about the intentions of social targets during social interactions. How do these inferences relate to perceivers' empathy and shape their adaptive behaviors in conflicts? This study examined two aspects of intention inference: inference accuracy (how accurately perceivers infer targets' intentions) and target-oriented inference (how much perceivers focus on targets' states). We investigated whether these inferences mediate the relationship between empathy and adaptive behaviors. Our results showed that empathetic perceivers focused more on targets' states when inferring their intentions, which was associated with offering more comprehensive apologies. Inference accuracy, however, did not influence the relationship between empathy and either the provision of comprehensive apologies or the reduction of defensive responses. Our study underscores the importance of considering others' states in promoting adaptive behaviors during interpersonal conflicts.

Assessing the Verbal Redundancy Effect by Adding Narration to Written Text in Native and Non-Native Speakers

The addition of verbal narration to written text can impede reading performance, a phenomenon called the verbal redundancy effect. Two experiments in this study explored the moderating role of readers' vocabulary size and text difficulty in this effect. A total of 77 native English speakers in Experiment 1 and 45 non-native English speakers in Experiment 2 were divided into two vocabulary-size groups in each experiment. In both experiments, participants read eight passages and answered questions; the study manipulated narration presentation and text difficulty. Experiment 1 showed that adding narration impedes comprehension when high-vocabulary participants read easy passages, whereas it enhances comprehension when low-vocabulary participants read easy passages. The redundant-narration effect was thus moderated by reading skill. Experiment 2 showed no significant narration effect, but comprehension scores were higher when high-vocabulary multilinguals read neutral passages with narration than without. These effects align with previous research and are well explained by cognitive load theory.

More Than Boundaries: Exploring the Characteristics and Attributes of Daily Life Events

Conventional methods in event cognition often focus on identifying boundaries by instructing participants to mark transitions. While effective for detecting shifts, they offer limited insight into how events unfold. This study examines six characteristics—location, people, activities, mood, bodily states, and purposeful actions—and their stability across daily events. Using nightly segmentation, 41 participants captured and reviewed daily images over 14 days, defined events, and described each using the six dimensions. People (0.58) and location (0.55) were most stable, followed by mood (0.38) and bodily states (0.31). Activities (0.07) and purposeful actions (0.18) were highly variable. These findings emphasise that characteristics such as goals and activities not only serve as effective markers for identifying transitions at boundaries but also provide valuable perspectives on how events are distinguished and understood within the broader context of daily life.

State Sensitivity in an Additive Discovery Game

Successful innovation hinges on balancing exploring new ideas and exploiting existing ones. A rational innovator should be state-sensitive, effectively switching to exploitation when the best available idea reaches some standard. We tackle innovation with a discovery-by-recombination game under additive reward growth, and compare the optimal state-dependent policy with a state-independent policy. Our experiment reveals that participants made state-dependent decisions, exploring more in rounds with early successes, despite being told the same true success probability. In contrast, the optimal state-dependent policy switches to exploitation earlier. This suggests that participants' state-sensitivity may be driven by ad-hoc subjective probabilities. Participants also deviated from optimality through excessive exploration, switching multiple times between exploration and exploitation, and their switching points differed from the theoretical optimum.

Dimensions of Identity-Representing Belief

Recent work has proposed that there are symbolic beliefs. These beliefs do not serve primarily to track the facts in the world but rather to express the believer's own identity. On this view, several disparate features of belief – from whether a belief is important to identity to whether it is sensitive to evidence – would be related to an underlying "symbolicness" dimension. We converted the features potentially related to symbolicness into items and asked people to rate their own beliefs on them. Study 1 found that beliefs which were high on one feature (importance) were rated higher on all the items, except for insensitivity to evidence. Study 2 found that ratings of any beliefs on almost all the items loaded onto a single, symbolicness factor, except again for evidence insensitivity. Study 3 asked participants to rate their own beliefs on all the items in Study 2 and several additional items designed to measure whether a belief was subjective. We recovered the symbolicness factor, but found it was largely orthogonal to the subjectivity and evidence insensitivity items. These findings suggest that most of the features we tested relate to an underlying symbolicness factor, which corresponds to whether a belief represents identity. But, surprisingly, this factor was not related to items that get at whether a belief represents facts about the world. It would seem that the degree to which a belief aims to express the believer's identity and the degree to which a belief aims at accurately tracking facts in the world are two orthogonal dimensions, which can vary independently.

Do Analogies in Geoscience Textbooks Inflate Judgments of Understanding?

Widely used analogies in science textbooks relate unfamiliar phenomena to more familiar everyday objects. However, literature on the illusion of explanatory depth suggests that people are highly overconfident in their understanding of the causal processes underlying how everyday objects and devices work. Thus, such analogies may inflate judgments of understanding (JOUs) for to-be-learned scientific explanations, even when the analogies are presented too superficially to aid understanding. A first experiment showed that superficial analogies notably increase JOUs for geology phenomena when presented and judged as individual sentences. However, a second experiment showed that analogies did not increase the more global JOUs made about the longer segments of the textbook in which the analogies were embedded. The superficial analogies did not help (and sometimes harmed) actual understanding of the geology concepts, suggesting that analogies could sometimes increase the general overconfidence that readers have about their understanding of what they read.

Same, but Different: Sequential and Simultaneous Fraction Comparison Tasks Elicit Different Distance and Congruency Effects

How humans process fractions is a topic of debate in numerical cognition research. While some studies suggest that humans process the holistic magnitude of fractions, others suggest we process only the fraction components (numerators and denominators). Two cognitive effects present in fraction processing data have shaped this debate: the distance effect (better performance with fractions separated by a far numerical distance) and the congruency effect (better performance when individual numbers are larger in the larger fraction). In a study with 160 young adults, we compared distance and congruency effects across two task formats of fraction comparisons, using simultaneous and sequential presentation of stimuli. Results revealed that the distance effect and the congruency effect were stronger in the simultaneous task than in the sequential task. These findings suggest that participants used both holistic and componential strategies when comparing fractions and highlight that task formats should not be used interchangeably.
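
The two effects hinge on how each comparison trial is coded. The sketch below illustrates one plausible coding of distance and congruency for fraction pairs; the example pairs and the far/near threshold are hypothetical, not the study's stimuli.

```python
from fractions import Fraction

def code_pair(a, b, c, d, far_threshold=0.25):
    """Code a fraction-comparison trial a/b vs. c/d for distance and congruency.

    The pair is 'far' when the holistic magnitudes differ by more than the threshold, and
    'congruent' when the larger fraction also has the strictly larger numerator and
    denominator, so componential and holistic cues point the same way.
    """
    f1, f2 = Fraction(a, b), Fraction(c, d)
    distance = abs(float(f1 - f2))
    larger_first = f1 > f2
    congruent = (a > c and b > d) if larger_first else (c > a and d > b)
    return {"distance": round(distance, 3),
            "far": distance > far_threshold,
            "congruent": congruent}

# Hypothetical stimuli: 3/4 vs. 5/8 (near pair whose component cues conflict with the answer)
# and 7/9 vs. 2/5 (far pair whose component cues agree with the answer).
print(code_pair(3, 4, 5, 8))   # small distance, incongruent
print(code_pair(7, 9, 2, 5))   # large distance, congruent
```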

Disgust Reactions and Their Justifications: The Case of Meat

Disgust reactions significantly impact food choices, particularly in meat consumption, yet the factors influencing their intensity and how individuals justify them remain underexplored. This study (n = 217) provides a novel, comprehensive examination of both disgust intensity and justification patterns across seven meat categories: cultured meat, genetically modified meat, game meat, small farm meat, factory-farmed meat, endangered animal meat, and pet meat. Results revealed that disgust sensitivity and gender significantly impact responses, with women reporting higher disgust intensity and greater likelihood to cite moral concerns as justification. Importantly, our study reveals a previously unidentified interaction effect: familiarity moderates the relationship between perceived naturalness and disgust intensity, suggesting a strategy to enhance acceptance of sustainable food alternatives. The justification patterns exhibited systematic variation by meat type. By bridging core and moral disgust research traditions, this work advances our understanding of how disgust functions at the intersection of biological protection and moral judgment.

Modeling a network of beliefs surrounding parents' endorsement of COVID vaccines for children

Cognitive science offers powerful tools for addressing pressing public health needs. Here we apply the cognitive science of intuitive theories and the tools of Bayesian networks to shed light on why so few children in the US have received COVID vaccines and how we might encourage caregivers to seek out these and other life-saving vaccines for their children. A total of 1,700 US parents completed 13 belief scales on a range of topics likely to influence their endorsement of pediatric COVID vaccines. We deployed structure learning techniques to develop a cognitive model of the relationships among beliefs and their influence on endorsement of vaccines for children, which accounted for 70% of the variance in participants' beliefs in a held-out testing split. Model-based simulations suggested that educational interventions focused on the effectiveness of COVID vaccines for supporting individual and community health may be most effective in increasing uptake of the COVID vaccine for children.

Examining the Influence of Stress and Anxiety on Visual Working Memory and Decision-Making

This study investigates how stress and anxiety influence the interplay between visual working memory and decision-making in human participants. Using the Socially Evaluated Cold Pressor Test (SECPT) to induce acute stress, we examined cognitive performance on a computerized behavioral paradigm (the Marble Jar Task) requiring the storage, manipulation, and utilization of visual information. Results revealed that while experimentally induced stress did not significantly affect overall accuracy, higher self-reported state anxiety was correlated with both lower decision-making accuracy and poorer visual memory performance. Interestingly, higher state anxiety was also correlated with increased attention towards high-value outcomes in decision-making. This work highlights the importance of understanding how stress and anxiety affect the interaction between interconnected cognitive functions, rather than studying isolated cognitive phenomena.

Development in the comprehension of phonetically reduced spoken words

The speech young children hear is highly variable. For example, reduced pronunciations, where some sounds in the canonical pronunciation are naturally dropped or altered, are common even in speech to children. The present study employed a new story-guided looking method (a variation on language-guided looking) to create felicitous conditions for testing young children's recognition of reduced pronunciations of familiar words. Experiment 1 (18-24 months, n=32) found that toddlers succeeded at recognizing clear pronunciations, but failed to recognize reduced pronunciations, even in repetition trials when target words were preceded by a clear mention of the same word in the previous sentence. In Experiment 2, 3-year-olds (35-39 months, n=17 out of 44 pre-registered, ongoing) succeeded at recognizing reduced pronunciations, and benefited from preceding repetition. Overall, these results demonstrate a powerful new method for studying children's language comprehension under more naturalistic conditions, and highlight an important psycholinguistic development over the 2–3 year span.

Infants' understanding of rates and probability matching in a foraging task

Cognitive scientists have long debated whether human learners are rational decision-makers. Much work has found that adults and children tend to use probability matching strategies in probability learning tasks despite probability maximizing being the optimal strategy. However, other work provides conflicting findings on which decision-making strategies are used and under what circumstances. Unlike previous studies that employed a typical design with a single individual making decisions (where probability maximizing is the optimal strategy), we investigate decision-making strategies in a group foraging context where probability matching is the optimal strategy. In the current study, we tested 14- to 20-month-old infants' ability to (1) distinguish rates of reward distribution in a group foraging scenario and (2) form expectations for probability matching based on these rates. Our results are the first to suggest that infants are capable of quantitative reasoning involving rates and that they form expectations for optimal decision-making strategies based on rate information.

How infants' understanding of goal-directed actions differs from a large language model

Theory of mind (ToM) is a hallmark feature of human cognition that emerges very early in development. Much work has explored human infants' implicit social reasoning abilities. Recent work has examined whether LLMs reliably make ToM inferences in explicit social reasoning tasks. However, it remains unclear how reliably LLMs generate human-like social reasoning when this capacity is invoked implicitly. We systematically examined GPT-4's ability to implicitly reason about goal-directed actions by adapting well-studied infant paradigms. Our results suggest that, unlike infants who can understand goal-directed actions from a very young age, GPT-4 fails to correctly attribute goal-directed actions to agents. These findings suggest that LLMs may lack key aspects of implicit social reasoning and provide insight into the emergence of these abilities in infants.

Perceptually Training Viewers against Misleading Data Visualizations with Informative Feedback

Misleading visualizations are increasingly prevalent, and their negative impact on data interpretation often leads to misunderstandings and poor decision-making. To address this issue, we investigate the impact of informative feedback within perceptual training targeting misleading visualizations. Our results show that informative feedback significantly enhances viewers' perceptual skills, improving both accuracy and efficiency in interpreting misleading data visualizations. Additionally, participants demonstrated a transfer effect, applying their developed perceptual skills to novel misleading visualizations beyond those encountered during training. These findings highlight the potential of perceptual training with informative feedback to strengthen viewers' resistance to misleading data visualizations, offering valuable insights for educational practice.

Evaluating actions: Do young children prefer actions completed efficiently over those completed inefficiently?

Efficiency informs perceptions and expectations of people's actions from early in life. We examined whether young children aged three and four consider efficiency when evaluating how well agents completed goals. In three experiments, we showed children scenarios where two characters each walked to target objects, and then asked children which character did a better job. In the first experiment, children appeared to consider efficiency. They more often chose a character who took a direct path over one who took an indirect path, but only when the latter character could have taken a shorter path. Two follow-up experiments, though, failed to replicate this pattern and the success in the initial experiment could be explained in terms of the features of the paths (not strictly related to efficiency) used in that experiment. The findings suggest, then, that three- and four-year-olds do not yet use efficiency to normatively evaluate actions. We consider two alternative explanations for this.

Understanding Intrinsic Motivation in Causal Learning Through Exploration

Intrinsic motivation plays a crucial role in shaping exploration and learning, yet its specific contributions to causal discovery remain underexplored. This study examines the impact of three intrinsic motivation metrics—entropy, information gain, and empowerment—on causal learning outcomes. Across two experiments, participants engaged in interactive tasks requiring them to infer causal structures through exploration. Results indicate that information gain and empowerment significantly predict learning success, whereas broad, undirected exploration (entropy) does not. These findings suggest that learners optimize causal discovery by prioritizing actions that maximize information and control, rather than engaging in indiscriminate exploration. Our study offers insights into how strategic exploration facilitates causal reasoning and how these principles can be applied to machine learning.
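
As a hedged illustration of two of the metrics named above, the snippet below computes the entropy of a belief over a toy space of candidate causal structures and the expected information gain of a single intervention. Empowerment is omitted, and the hypotheses, priors, and likelihoods are made-up values rather than the study's models.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Toy hypothesis space: a belief over three candidate causal structures, and the likelihood
# of each of two possible outcomes of a chosen intervention under each hypothesis.
prior = np.array([0.5, 0.3, 0.2])
likelihood = np.array([      # rows: hypotheses, columns: outcomes of the intervention
    [0.9, 0.1],
    [0.5, 0.5],
    [0.1, 0.9],
])

# Expected information gain of the intervention: prior entropy minus expected posterior entropy.
p_outcome = prior @ likelihood                           # marginal probability of each outcome
posterior = (likelihood * prior[:, None]) / p_outcome    # posterior over hypotheses, per outcome
expected_posterior_entropy = sum(
    p_outcome[o] * entropy(posterior[:, o]) for o in range(likelihood.shape[1])
)
info_gain = entropy(prior) - expected_posterior_entropy

print("prior entropy (bits):            ", round(float(entropy(prior)), 3))
print("expected information gain (bits):", round(float(info_gain), 3))
```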

Studying Cross-linguistic Structural Transfer in Second Language Learning

Adults who learn a new language often report feeling that their first language gets in the way. Systematic effects of the first language on additional languages would have straightforward implications for both theory and pedagogical practice if they could be adequately characterized. Unfortunately, this has been shown to be challenging. Languages are vast and complex, and there are a very large number of them. Thus, most studies focus on a few narrowly-defined phenomena and one or two language pairs. The potential complexity of the phenomenon and the sparsity of the observations conspire to make it difficult to establish clear patterns. We present whole-language analyses of the morphosyntax of 133,659 second-language essays spanning 273 L1-L2 pairs. We find clear, consistent effects of the L1 on the morphosyntax of the L2, independent of the L2. We find that not all aspects of morphosyntax are equally informative about the L1, suggesting avenues for more precisely specifying how and why L1 influences L2.

Linking student psychological orientation, engagement, and learning in college-level introductory data science

Introductory data science courses have the potential to provide students from diverse backgrounds skills for working with and reasoning about data. However, what predicts success in these courses remains poorly understood. Here we investigate how students' initial psychological orientation relates to their subsequent engagement and learning. In Study 1, we took an observational approach, analyzing data from 1306 students across 11 institutions using an interactive online textbook. Students' psychological orientation (e.g., math anxiety, stress expectations) predicted performance on assessments administered throughout the term. In Study 2, we developed and tested an intervention targeting these aspects of students' learning experience among 146 students enrolled in a single course. Preliminary analyses suggest that this intervention shifted students' beliefs about the relationship between stress and learning. This work highlights the promise of combining observational studies with interventions for advancing understanding of how affective and cognitive processes interact to support learning in real-world settings.

Early Evaluations of Caregivers Who Help and Hinder Safe and Dangerous Goals

Two experiments collected third-party evaluations from U.S. 4–5-year-old children (N = 80) who heard stories about caregivers helping or hindering their infants' achievement of safe, dangerous, or ambiguous goals. Children's evaluations were sensitive to danger: They switched from positively evaluating parents who helped access safe objects, to negatively evaluating those who helped access dangerous objects. Older children offered robustly positive evaluations of parents who protectively hindered access to dangerous objects, but younger participants were more likely to negatively evaluate these parents. Given a moderately risky goal that participants themselves judged as unsafe, children's evaluations of helping and hindering were mixed, though there was preliminary evidence of a developmental shift. These findings show that young children go beyond basic inferences about whether an act promotes or hampers another agent's goal when considering whether the action was good or bad. Instead, young children consider the broader consequences for the target's welfare.

Manipulating Predictive Focus Improves the Taste Appreciation of Coffee

Predictive processing plays a fundamental role in perception and decision-making. However, prediction can sometimes undermine our accurate evaluation of perceptual information. This study aimed to demonstrate this undermining effect in taste appreciation for everyday scenarios and to improve appreciation through targeted manipulations of predictive focus. We conducted cognitive experiments in which participants evaluated high-quality coffee with unusual flavors. We hypothesized that the initial appreciation would be low because of the coffee's unusual flavors, resulting in negative prediction errors. However, by directing the predictive focus toward specific taste features through instructions, we expected to observe an improvement in their appreciation. Our results support this hypothesis, suggesting that manipulating predictive focus can improve taste appreciation.

Impact of Mask Use on Face Recognition in Children: An Eye-Tracking Study

We examined the impact of mask use on face recognition in children across early childhood, middle childhood, and early adolescence. While children across the developmental stages were similarly affected in recognition accuracy and eye movement pattern, they differed in the impact on response time and eye movement consistency during two challenging scenarios where the mask conditions during face learning and recognition did not match. As compared with learning and recognizing unmasked faces, when recognizing masked faces learned without a mask on, similar to adults, children in early adolescence had slower responses, whereas younger children did not. When recognizing unmasked faces learned with a mask on, younger children had decreased eye movement consistency, whereas children in early adolescence did not, similar to adults. These findings suggest that children in early and middle childhood have different vulnerability to mask use in society from adolescents and adults, with important implications for age-specific interventions.

Learning a Doubly-Exponential Number of Concepts From Few Examples

Recent research has shown that people can learn more new concepts than the number of examples they are presented with. However, these results relied on strong assumptions about what skills and prior knowledge are required to perform this kind of less-than-one-shot learning. This has included having participants disentangle soft labels that fuzzily map stimuli to multiple concepts, interpret continuous feature weights, and parse complex compositional statements. We propose a novel minimal paradigm that strips away these assumptions to explore how efficiently people can simultaneously learn visual and symbolic concepts. We show theoretically that it should be possible to learn up to $2^{k-1}$ binary features from $k$ examples, and to learn up to $2^{2^{k-1}}$ unique combinations of those features. We validate this empirically, showing that people may be able to learn as many as 8 novel binary features and up to 256 concepts corresponding to unique compositions of those features from just 4 examples.

Cognitive Decision-Making in TSP Tasks: The Impact of Line Stylization Features of Point Arrays

The Traveling Salesman Problem (TSP) is a classic NP-hard problem, and research on its cognitive decision-making often focuses on internal factors like memory and experience while neglecting the influence of the problem's structural characteristics. This study identifies potential linear features in the TSP point distribution (such as implied paths formed by visual aggregation) that may significantly impact human path-selection strategies. To test this hypothesis, we propose a method for quantifying the Line Stylization Degree, generate TSP instances with varying characteristics by introducing perturbations, and experimentally analyze participants' decision-making patterns. The results show that participants tend to plan paths along implied lines, and this strategy may reduce cognitive load. The contribution of this paper lies in revealing the shaping role of visual structural features in cognitive decision-making, providing theoretical support for designing human-centered path planning algorithms.
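
The abstract does not define how the Line Stylization Degree is computed, so the following is purely a hypothetical proxy: the fraction of point triples that are nearly collinear, which should be higher for instances whose points hug implied lines. The tolerance, the point sets, and the measure itself are assumptions for illustration only, not the paper's method.

```python
import numpy as np
from itertools import combinations

def line_stylization_proxy(points, angle_tol_deg=5.0):
    """Hypothetical proxy for how 'line-like' a TSP point array is: the fraction of point
    triples that are nearly collinear (within angle_tol_deg of a straight line)."""
    pts = np.asarray(points, dtype=float)
    near_collinear, total = 0, 0
    for i, j, k in combinations(range(len(pts)), 3):
        v1, v2 = pts[j] - pts[i], pts[k] - pts[j]
        n1, n2 = np.linalg.norm(v1), np.linalg.norm(v2)
        if n1 == 0 or n2 == 0:
            continue
        cos = abs(np.dot(v1, v2)) / (n1 * n2)            # orientation-independent collinearity
        angle = np.degrees(np.arccos(np.clip(cos, 0.0, 1.0)))
        total += 1
        near_collinear += int(angle < angle_tol_deg)
    return near_collinear / total if total else 0.0

rng = np.random.default_rng(0)
scattered = rng.uniform(0, 1, size=(12, 2))                       # unstructured instance
lined = np.column_stack([np.linspace(0, 1, 12),                   # points hugging a line
                         0.5 + 0.01 * rng.standard_normal(12)])
print("scattered instance:", round(line_stylization_proxy(scattered), 3))
print("line-like instance:", round(line_stylization_proxy(lined), 3))
```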

Scientists & Women Scientists: Exploring Gender Biases in Institutional Category Systems

For many categories of people, men are perceived as the more default or typical members whereas women are perceived as more atypical. This bias can lead to an asymmetry in the existence and frequency of categories marked by gendered language. Here we explore the extent to which this asymmetry exists in two institutional category systems: the Library of Congress Subject Headings (LCSH) and English Wikipedia. We find that the LCSH exhibits more bias towards women than Wikipedia, and that in the LCSH this bias has not changed in the last 30 years, whereas Wikipedia shows a noticeable increase in gender balanced categories during the early 2010s. These findings suggest that more can be done to reduce gender bias in the LCSH and demonstrate how principles of typicality and categorization play out in real-world settings.

Identity-Preserving Face Privacy Enhancement via Diffusion Models with Cognitive-Aware Obfuscation

Face recognition technology raises privacy concerns as face images contain identity and soft biometric attributes. Existing methods struggle to balance privacy, image quality, and identity retention, often neglecting human perception. We propose a diffusion-based identity-preserving face privacy method that enhances privacy at the cognitive level while maintaining identity recognition. Unlike GAN-based approaches, our model generates higher-quality, more diverse, and fine-detail privacy-enhanced faces. It selectively obfuscates identity-critical regions and enables flexible attribute modifications via natural language prompts, eliminating reliance on predefined classifiers. Additionally, our method significantly reduces inference time from minutes to seconds, improving practical feasibility. Experiments show superior performance over state-of-the-art methods in both algorithmic and human cognition-based evaluations, effectively confusing human observers while ensuring reliable machine-based identity recognition.

Thinking through syntax: Expanding the scope of "thinking for speaking"

The "thinking for speaking" hypothesis proposes that our language can influence cognition during language produc- tion or interpretation, directing our attention to the grammat- ical and/or semantic categories readily codable in our lan- guage. Beyond the codability of grammatical and semantic categories, the role of syntactic hierarchy–a core feature of human language—has not been studied so far. This study addresses this gap by investigating the effect of learning dif- ferent complex noun phrase (NP) structures on English na- tive participants' similarity judgments of objects which dif- fered in color or number. In Experiment 1, as the training proceeded, participants who learned to describe novel objects following an unusual NP structure that highlights the dimen- sion of number over color were more likely to judge objects matching on number; by contrast the judgments of partici- pants who learned a more typical NP structure that highlights color over number did not change significantly over time. The training-specific effect observed in Experiment 1 failed to emerge in Experiment 2 where online language involvement was reduced. These results extend the scope of "thinking for speaking", suggesting that hierarchical structure in syntax may also influence cognition during language use. They also shed light on the potential for cognitive flexibility in representations of the NP.

A Factorized Probabilistic Model of the Semantics of Vague Temporal Adverbials Relative to Different Events

Vague temporal adverbials, such as "recently," "just," and "long time ago," describe the temporal distance between a past event and the utterance time, but leave the exact duration underspecified. In this paper, we introduce a factorized model that captures the semantics of these adverbials as probabilistic distributions. These distributions are composed with event-specific distributions to yield a contextualized meaning for an adverbial applied to a specific event. We fit the model's parameters using existing data capturing native speakers' judgments of the applicability of these vague temporal adverbials to events that took place a given time ago. Comparing our approach to a non-factorized model based on a single Gaussian distribution for each pair of event and temporal adverbial, we find that, while both models have similar predictive power, our model is preferable in terms of Occam's razor, as it is simpler and more extendable.
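
One way to picture the factorized composition described above is to give each adverbial and each event type its own distribution over (log) time-ago and take the normalized product as the contextualized meaning. The sketch below does this with made-up Gaussian parameters; the fitted distributions, event inventory, and exact composition rule belong to the paper and are not reproduced here.

```python
import numpy as np
from scipy.stats import norm

# Illustrative factorized composition (made-up parameters, not the paper's fitted model):
# each adverbial and each event type gets a distribution over log10(days ago), and the
# contextualized meaning of "adverbial applied to event" is the normalized product.
t = np.logspace(-2, 4, 500)          # time ago in days, from about 15 minutes to about 27 years
log_t = np.log10(t)

adverbials = {"just": norm(-1.5, 0.5), "recently": norm(0.5, 0.7), "long time ago": norm(3.0, 0.6)}
events = {"sneezed": norm(-1.0, 1.0), "moved house": norm(2.5, 0.8)}

def contextualized(adverbial, event):
    density = adverbials[adverbial].pdf(log_t) * events[event].pdf(log_t)
    return density / density.sum()   # normalize over the discretized grid

for ev in events:
    for adv in adverbials:
        peak_days = t[np.argmax(contextualized(adv, ev))]
        print(f"'{adv}' + '{ev}': most probable reading is about {peak_days:.3g} days ago")
```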

Overcoming Learning Imbalance with Fusing Vision-Language Model Knowledge for Black-Box Domain Adaptation

Once the human brain learns a concept, it can easily transfer the learned knowledge across diverse environments without referring back to the original learning materials. Inspired by this cognitive process, Black-Box Domain Adaptation (BBDA) has been proposed to transfer the knowledge learned by a black-box source model to the target domain without requiring access to source data or model parameters. Existing BBDA methods mainly rely on knowledge distillation or sample selection with pseudo labels, overlooking the differing learning difficulty of classes. As a result, easy-to-learn classes dominate the adaptation process, which degrades adaptation performance. Motivated by the significant success of vision-language (ViL) models, we propose a novel method that integrates the knowledge of a ViL model to achieve adaptation while mitigating learning imbalance. Experiments on various datasets demonstrate the effectiveness of the proposed method.

Who owns new creations? Initiation weighs more than completion

People contribute to new creations through different tasks, skills, and levels of effort. How do these contributions influence judgements of ownership? Past research has shown that initial effort towards a goal and net labour contribution are both relevant to intuitions about rightful ownership. However, their relative weights remain unclear, and the two factors often have different implications for who the owner should be. Here, we examine ownership judgements of new creations, contrasting different contributions. Across four online experiments (total N=704), we describe an initiator and a completer who work together on the creation of a new object, with their contributions varying in terms of effort, skill, and how much they meet the original goal. We find an especially strong preference for the initiator to take ownership. This may be because initiation seems more necessary: without it, the subsequent steps of creation would not be needed.

From Infants to AI: Incorporating Infant-like Learning in Models Boosts Efficiency and Generalization in Learning Social Prediction Tasks

Early in development, infants learn a range of useful concepts, which can be challenging from a computational standpoint. This early learning comes together with an initial understanding of aspects of the meaning of concepts, e.g., their implications, causality, and using them to predict likely future events. All this is accomplished in many cases with little or no supervision, and from relatively few examples, compared with current network models. In learning about objects and human-object interactions, early acquired concepts are often used in the process of learning additional, more complex concepts. In the current work, we model how early-acquired concepts are used in the learning of subsequent concepts, and compare the results with standard deep network modeling. We focused in particular on the use of the concepts of animacy and goal attribution in learning to predict future events. We show that the use of early concepts in the learning of new concepts leads to better learning (higher accuracy) and more efficient learning (requiring less data). We further show that this integration of early and new concepts shapes the representation of the concepts acquired by the model. The results show that when the concepts were learned in a human-like manner, the emerging representation was more useful, as measured in terms of generalization to novel data and tasks. On a more general level, the results suggest that there are likely to be basic differences in the conceptual structures acquired by current network models compared to human learning.

The increase in brain network modularity leads to improved memory performance in the volitional eyes-closed state

Previous research has shown that participants exhibit better memory performance when consciously closing their eyes (EC) than when keeping their eyes open (EO). However, the underlying dynamical mechanism remains unclear. Here, we propose a reservoir computing (RC) algorithm based on EEG-derived functional connectivity of resting-state brain networks to simulate the brain's memory processes and reproduce the EC-related memory advantage. Our findings indicate that, compared to EO-based connectivity, the RC constructed from EC-based resting-state connectivity demonstrates superior memory performance. Further graph-theoretical analysis reveals that the EC networks exhibit stronger modularity and that the modularity index is positively correlated with memory ability. We therefore conclude that the functional connectivity of the whole brain underlies memory function, and that the broad reorganization of connectivity in the EC state leads to its memory advantage over the EO state.
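
For readers unfamiliar with reservoir computing, the sketch below shows a minimal echo-state-style reservoir whose recurrent weights are shaped by a connectivity matrix and whose memory is probed with a delayed-recall linear readout. The connectivity here is random rather than EEG-derived, and the architecture, scaling, and task are illustrative assumptions, not the paper's RC algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

def run_reservoir(W, inputs, leak=0.3):
    """Drive a simple echo-state reservoir whose recurrent weights are a scaled connectivity
    matrix W, and return the state trajectory."""
    n = W.shape[0]
    W = 0.9 * W / max(abs(np.linalg.eigvals(W)))      # keep the spectral radius below 1
    w_in = rng.uniform(-0.5, 0.5, size=n)
    x, states = np.zeros(n), []
    for u in inputs:
        x = (1 - leak) * x + leak * np.tanh(W @ x + w_in * u)
        states.append(x.copy())
    return np.array(states)

# Stand-in for an EEG-derived functional connectivity matrix (symmetric, here just random).
n_nodes = 50
C = rng.uniform(0, 1, size=(n_nodes, n_nodes))
C = (C + C.T) / 2

# Memory probe: train a linear readout to reproduce the input signal delayed by k steps.
u = rng.uniform(-1, 1, size=2000)
X = run_reservoir(C, u)
k, washout = 5, 100
target = u[washout - k : len(u) - k]
features = X[washout:]
w_out, *_ = np.linalg.lstsq(features, target, rcond=None)
pred = features @ w_out
print("delay-%d recall correlation: %.3f" % (k, np.corrcoef(pred, target)[0, 1]))
```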

Feedback Maintains Stability in Cultural Transmission

Second language learners can destabilize and change the languages they acquire, due in part to competition between their first and second languages. There is reason to think that how one acquires a second language affects this competition. One such way of affecting language learning is feedback. In our preregistered study, 90 native English speakers learned and transmitted an artificial language with flexible word order and case marking across 15 iterated learning chains while receiving positive, negative, or no feedback. The original flexible word order remained most stable across generations of transmission when feedback was given; otherwise English SVO word order was likely to predominate by the final generation. These findings elucidate the role feedback may play in negotiating between competing linguistic variants and ensuring their stable transmission across generations.

Working Memory Retention Span in the Invisible Displacement Task in Horses

Working memory is vital for complex tasks but is under-explored in non-human species such as horses. This study evaluates horses' working memory through an invisible displacement task involving delayed choices. Twenty-six horses were tasked with selecting between two buckets, one of which contained a treat, with delay intervals of 10, 30, or 45 seconds between presentation and decision-making. The accuracy of correct choices was influenced by delay duration, the presence of distracting stimuli, and performance in the basic condition. Results showed that horses could retain information for up to 30 seconds and process information about object displacement. These findings enhance our understanding of horse cognition, revealing their capacity for simple reasoning and a longer working memory span than previously acknowledged.

Problem-solving Strategies in Frictional Force Problems: Evidence from Think-aloud Protocols

Prior work has shown that novices' intuitive conception of frictional force is often different from that of formal physics theory. These discrepancies can lead to challenges in physics education. The current paper focuses on a common misconception: "frictional force always resists motion." Though frictional force is linked to relative motion (motion of one object compared to another), it is not always determined by absolute motion. We collected think-aloud protocols from participants, who had varying levels of prior knowledge, on physics problems about motion, relative motion, and frictional force. We analyzed how participants selected properties, made comparisons, and drew inferences. We found that regardless of prior knowledge, participants were able to extract relevant information from the descriptions. When collapsed across the three problem types, frequencies of comparisons did not differ. However, participants with high prior knowledge were more likely to compare objects when making inferences about force and motion. In contrast, participants with low prior knowledge were more likely to rely on insufficient information like motion of a single object. Participants with high prior knowledge were also more likely to leverage experience with other problems, by comparing across processes or constructing hypothetical scenarios.

The Role of Nonverbal IQ in Diagnosing Developmental Language Disorder in Multilingual Children

Previous research has been inconsistent in approaching the exclusion criterion of nonverbal IQ when investigating developmental language disorder (DLD) in monolingual and multilingual children. The present study investigates the influence of the controversial low nonverbal IQ range (between one and two standard deviations (SD) below the mean) on lexical and morpho-syntactic abilities. 91 multilingual children, aged 4-8;11, were tested on the Crosslinguistic Lexical Task and a Sentence Repetition Task in Germany. Data were analyzed using generalized linear mixed models, considering the factors nonverbal IQ, DLD status, age, gender, and length of exposure (LoE) to German. Results show that children with typical language development (TLD) outperformed those with DLD on the LITMUS tests, independent of their nonverbal IQ, supporting the validity of these tools. Language status (TLD/DLD) and LoE had the strongest impact on test performance, exceeding the effect of nonverbal IQ. Regardless of language status, nonverbal IQ affected only receptive vocabulary but not productive vocabulary or morpho-syntax. However, when applying the one-SD threshold, its influence shifted from receptive vocabulary to morpho-syntactic abilities. No significant differences were found between average and low nonverbal IQ groups across most tests within the TLD and DLD groups.

Distinct inhibitory control systems underlie individual differences in dynamic responses to an ultimatum game: A preliminary investigation

Adult participants were asked to accept or reject third-party distributions that were fair or unfair in ways that created advantageous or disadvantageous inequities while their manual responses were tracked. Reach tracking measures how participants resolve conflict between alternate options and the timing of that resolution. Participants were less likely to accept disadvantageous inequities than fair distributions or advantageous inequities. Participants showed greater deviance in their curvature towards the alternative option when they rejected the distribution than when they accepted it. However, they resolved their deviance towards the alternate option faster when they rejected a distribution. This suggests a new way of considering the role of inhibitory control in economic decision making: Participants more quickly detect conflict when rejecting distributions, but the act of costing oneself resources requires more inhibitory processes related to conflict resolution.

Curiosity is linked to information seeking in the moral domain

Curiosity motivates information seeking and knowledge acquisition across the lifespan, yet its role in the moral domain remains unexplored. To address this gap, we developed a task to examine how people seek information about real-life ambiguous moral scenarios. Across two experiments, we found that curiosity predicts greater information seeking, regardless of whether the action was seen as morally good or bad. In Experiment 1, we also found that moral goodness was associated with less information seeking, suggesting an asymmetry in how people engage with moral information. Experiment 2 replicated these findings and showed reputational concerns also influenced information-seeking, and that judgments of action justification help explain the link between perceived morality and information-seeking behavior. Together, these results demonstrate the role of curiosity in moral information seeking, and highlight the influence of perceived morality, justification, and reputational concerns in this process.

The development of polyseme learning under uncertainty

Acquiring multiple meanings for a word is often proposed to be difficult for word learners. However, the difficulty may depend on the meanings: prior work has demonstrated that word learning is easier for both adults and children when words' multiple meanings are related (polysemous, like "cap") than unrelated (homophonous, like "bat"). However, it remains an open question how learners infer polysemous meanings when they encounter these words in more referentially ambiguous contexts. In two studies, we examine children's and adults' learning of polysemes under uncertainty, using both artificial stimuli from prior work (Study 1) and attested non-English polysemes (Study 2). Results suggest that while adults can use similarities between referents to infer polysemous meanings across multiple exposures, children generally struggle to do so. This indicates that polyseme learning improves with age and suggests current computational models of cross-situational word learning may capture children's word-learning strategies better than those of adults.

Learning task rule updating strategies requires extensive practice

People can adjust how fast they update task rules, depending on the volatility of their environment. We investigated whether this adaptivity is primarily driven by recently experienced volatility in task demands, or whether it can also be shaped by learned, environment-specific associations with expected levels of volatility. To this end, we trained participants on a Wisconsin Card Sorting Task in which different environments required different speeds of task rule updating. We demonstrate that, initially, participants updated strategies depending on the most recently experienced levels of volatility and feedback (Experiment 1). However, after extensive training over four days (Experiment 2), participants also developed environment-specific associations. Our findings provide important insights into how people learn to regulate cognitive flexibility.

Cross-Language Typicality Effects in a Multilingual Large Language Model

The typicality effect is the finding that some members of a category are more "central" and others more "peripheral". This effect is seminal for understanding the mental representation of concepts. Recently, researchers have looked for typicality effects in the representations learned by machine learning models as evidence of their cognitive alignment. Studies of the typicality effect in Large Language Models (LLMs) have focused on models trained on English corpora and category norms collected from English speakers. Here, we use existing norms to investigate the typicality effect across five languages: English, French, Portuguese, German, and Spanish. We focused on eight categories common across these norms, and asked whether a multilingual LLM, GPT-4o-mini, shows human-like typicality effects across these languages. The results show variation in typicality gradients across languages. Importantly, GPT-4o-mini's typicality judgments show strong alignment with human norms for some languages: English and French. The strong performance for French, in particular, cannot simply be attributed to the representation of that language in the training corpus. We discuss the implications of these findings for future studies exploring alternative model prompting approaches, different languages, and the modeling of new category norms collected using uniform methods.

Data Ownership and Privacy: Investigating a Shared Psychological Basis

People often share their personal data online despite reporting that they should not. They also show surprise and distress when their data is used in ways they have authorized, despite having given consent. But what underlies this inconsistency in people's thinking? In the present study, we investigated the proposal that thinking about control over information, or informational autonomy, likely underlies variability in thinking about privacy and data ownership. Namely, we propose that threats to one's autonomy might account for the aforementioned changes in people's concern for their data. To test this account, we used a survey-style design to measure how a hypothetical threat to the self, and thereby to control, influenced adults' (N = 51) judgments about the ownership and privacy of their personal data. The threat was police lawfully obtaining their data with a warrant. We found that privacy and ownership judgments significantly increased over time. We also found that the variability in participants' ownership and privacy judgments was related. Together, our findings suggest that privacy and ownership likely have a shared psychological basis, and that this shared psychology can likely explain the variability in people's judgments about personal data across time.

Mixing Words and Pictures: Mixed Evidence for Common Conceptual Representations

The relationship between symbolic and nonsymbolic representation has been a subject of long-standing debate, with the common system model and the separate systems model providing contrasting predictions. To test these models, we asked participants to compare the size of animals or numerical value of numbers, presented in symbolic-symbolic, symbolic-nonsymbolic, and nonsymbolic-nonsymbolic formats. Consistent with the common system model, performance improved as the ratio between stimuli increased in both animal and number domains, regardless of the format. However, supporting the separate systems model, we observed a switch cost in the symbolic-nonsymbolic number comparison task. Future research should explore the factors contributing to these mixed findings across domains.

How Gain- and Loss-Framed Incentives Affect Water Usage Behavior

This study examined the impact of gain- and loss-framed monetary incentives on a peak shift in water usage behavior. It was based on the idea that framing incentives as losses (e.g., "You will lose $13 if you fail to shift your water consumption from peak to off-peak hours") may enhance the perceived value of the monetary component, triggering a greater peak shift than framing them as gains (e.g., "You will receive $13 if you shift your water consumption from peak to off-peak hours"). This study found that both gain- and loss-framed incentives significantly triggered a peak shift; however, the loss frame proved more effective than the gain frame. Moreover, participants prioritizing multiple environmental values were more likely to adjust their usage. Nonetheless, no interaction was observed between values and framing. These findings shed light on the influence of individual environmental values on pro-environmental behavior, offering deeper insights into the cognitive processes that drive these actions.

Gestural Relativity of Spatial Cognition: Speakers' co-speech gestures shape listeners' spatial frame of reference

To think about objects' locations, people adopt a spatial frame of reference anchored either to their own body (egocentric; e.g., left vs. right) or to something external (allocentric; e.g., cardinal directions). Within cultures, people habitually rely on the same frame of reference, manifested in language, gesture, and memory. How are these norms transmitted? One account, linguistic relativity, argues they are transmitted through language. Here we explore a complementary route: gesture. In a between-subjects experiment (N = 70), we manipulated the spatial frame of reference used in gesture to describe table-top locations. As predicted, participants reliably adopted this frame of reference in a subsequent spatial search task, even after the speaker stopped gesturing. This suggests that a speaker's gesture has the capacity to reshape listeners' spatial reasoning. We argue that this offers a mechanism for "gestural relativity," which we consider in light of a larger cognitive-ecological perspective on spatial cognition.

Proprioceptive Recalibration by Moving Viewpoint: The Effect of Indirect Positional Information of Torso

Manipulated visual information can recalibrate proprioception such that participants perceive their body parts to be at a location different from their actual position. Previous studies have provided direct visual information regarding the manipulated position of the body parts, which have most often been the limbs. In our experiments, we manipulated the location of the viewpoint and/or the virtual right arm during the training session. The viewpoint corresponds to the position of the head, indirectly indicating the position of the trunk connected to the head. The results of the experiments showed that this indirect visual information could cause a proprioceptive recalibration of the trunk, a very fundamental body part. In addition, a comparison with the arm, for which direct visual information was provided, suggests that the recalibration of the trunk in the absence of direct visual information was weaker than that of the arm.

Computer Vision Models Show Human-Like Sensitivity to Geometric and Topological Concepts

With the rapid improvement of machine learning (ML) models, cognitive scientists are increasingly asking about their alignment with how humans think. Here, we ask this question for computer vision models and human sensitivity to geometric and topological (GT) concepts. Under the core knowledge account, these concepts are innate and supported by dedicated neural circuitry. In this work, we investigate an alternative explanation, that GT concepts are learned "for free" through everyday interaction with the environment. We do so using computer vision models, which are trained on large image datasets. We build on prior studies to investigate the overall performance and human alignment of three classes of models -- convolutional neural networks (CNNs), transformer-based models, and vision-language models -- on an odd-one-out task testing 43 GT concepts spanning seven classes. Transformer-based models achieve the highest overall accuracy, surpassing that of young children. They also show strong alignment with children's performance, finding the same classes of concepts easy vs. difficult. By contrast, vision-language models underperform their vision-only counterparts and deviate further from human profiles, indicating that naïve multimodality might compromise abstract geometric sensitivity. These findings support the use of computer vision models to evaluate the sufficiency of the learning account for explaining human sensitivity to GT concepts, while also suggesting that integrating linguistic and visual representations might have unpredicted deleterious consequences.

Relational reasoning as a learned bias: Evidence that generating explanations facilitates relational matching in adults

Reasoning based on relations is essential for human learning and thinking. However, how this ability is acquired remains unclear. Two accounts offer different views: whereas one suggests that relational reasoning is tied to cognitive development, the other views it as a learned bias. We conducted two experiments to investigate whether relational reasoning is a learned bias. Experiment 1 tested adults on the Relational Match-to-Sample task. The results revealed that a significant proportion of adults failed to engage in the expected relational reasoning; instead, they relied on object similarity. Scores on Raven's Advanced Progressive Matrices (APM-12) suggest that the preference for similarity is not tied to general cognitive ability. Experiment 2 tested whether similarity-based reasoners can learn a relational bias when prompted to generate explanations. The results showed that participants who primarily generated relational explanations successfully learned a relational bias. Taken together, this study suggests that relational reasoning is a learned bias.

Early experiences shape children's explore-exploit decisions: evidence from the rural-urban gap

Explore-exploit decisions begin to emerge early in life, and children raised in different environments, such as urban and rural settings with differing childhood conditions, may develop distinct exploration preferences. However, existing research presents conflicting evidence: some studies suggest that lower-quality early experiences lead to a heightened sensitivity to risk or stress, resulting in a tendency toward over-exploitation, while others argue that higher-quality early experiences can promote cognitive development, enabling children to exhibit a more adult-like, exploitative tendency. To investigate the impact of early life experiences on explore-exploit decisions, we compared the responses of urban and rural children in an explore-exploit task within a reward collection game. Our findings indicate that urban children tend to favor exploitation and achieve better reward performance. These rural-urban differences highlight the need for further research into how cognitive maturity and developmental stage shape explore-exploit choices, with the potential for extending these findings across a range of scenarios.

Bridging Perception and Language: A Systematic Benchmark for LVLMs' Understanding of Amodal Completion Reports

One of the main objectives in developing large vision-language models (LVLMs) is to engineer systems that can assist humans with multimodal tasks, including interpreting descriptions of perceptual experiences. A central phenomenon in this context is amodal completion, in which people perceive objects even when parts of those objects are hidden. Although numerous studies have assessed whether computer-vision algorithms can detect or reconstruct occluded regions, the inferential abilities of LVLMs on texts related to amodal completion remain unexplored. To address this gap, we constructed a benchmark grounded in Basic Formal Ontology to achieve a systematic classification of amodal completion. Our results indicate that while many LVLMs achieve human-comparable performance overall, their accuracy diverges for certain types of objects being completed. Notably, in certain categories, some LLaVA-NeXT variants and Claude 3.5 Sonnet exhibit lower accuracy on original images compared to blank stimuli lacking visual content. Intriguingly, this disparity emerges only under Japanese prompting, suggesting a deficiency in Japanese-specific linguistic competence among these models.

Dual-Branch EEG Decoding Method for Collaborative Multi-Brain Motor Imagery

Collaborative multi-brain motor imagery is an innovative brain-computer interface (BCI) paradigm that records and decodes brain signals from multiple individuals to collectively complete motor imagery tasks. However, existing decoding methods often rely on techniques such as averaging, concatenating, or cross-brain coupling of data or features, and lack coordination between single-brain and multi-brain decision-making in this context. To address this, we propose a dual-branch electroencephalogram (EEG) decoding method that jointly learns private and shared domain information. The method employs a Siamese network for private common spatial pattern (CSP) learning and a feature-sharing network for shared features, then combines the outputs for classification. Experiments with EEG data demonstrated a 10.27% improvement over the single-brain scenario and a 9% improvement over state-of-the-art methods. This approach effectively integrates private and shared domain learning, advancing collaborative BCI technology.
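
The private/shared split described above can be illustrated with a much simpler stand-in than the paper's dual-branch network: per-subject ("private") common spatial pattern features concatenated with features fit on the pooled multi-brain data ("shared"), classified with LDA. The epochs, labels, and dimensions below are hypothetical, and this is a toy training-set evaluation only.

```python
# Simplified stand-in for private/shared feature fusion in collaborative motor
# imagery decoding; not the paper's Siamese/feature-sharing architecture.
# X_a, X_b, y are hypothetical EEG epochs and labels.
import numpy as np
from mne.decoding import CSP
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
n_epochs, n_channels, n_times = 80, 16, 250
X_a = rng.normal(size=(n_epochs, n_channels, n_times))   # subject A epochs
X_b = rng.normal(size=(n_epochs, n_channels, n_times))   # subject B epochs
y = rng.integers(0, 2, size=n_epochs)                     # shared task labels

# Private branches: one CSP per subject
feat_a = CSP(n_components=4).fit_transform(X_a, y)
feat_b = CSP(n_components=4).fit_transform(X_b, y)

# Shared branch: CSP fit on channel-concatenated multi-brain epochs
X_shared = np.concatenate([X_a, X_b], axis=1)
feat_shared = CSP(n_components=4).fit_transform(X_shared, y)

features = np.hstack([feat_a, feat_b, feat_shared])
clf = LinearDiscriminantAnalysis().fit(features, y)
print("training accuracy (toy data):", clf.score(features, y))
```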

A computational model of poetry appreciation based on a spreading activation network and the incongruity resolution theory

In the studies of aesthetic appreciation, poetry has received less attention than other art forms. Furthermore, although a number of recent studies have focused on the factors that influence poetry appreciation, very few attempts have been made to explain the appreciation process. In this study, to explore the cognitive process of poetry appreciation, we propose a computational model of poetry appreciation based on the incongruity resolution theory, which has been proposed to explain the process of evoking poetic effects. Assuming that poetry reading comprehension is a process of evoking mental imagery from words in a poem, we model it as spreading activation in a semantic network. The notion of incongruity resolution is then modeled as the difference in the diversity of activation in the network. The computational model was tested using multiple regression analysis with simulation results as independent variables and human ratings of aesthetic appeal as the dependent variable. The analysis demonstrated that the degree of incongruity resolution computed by the model accounted for a significant portion of the variance in aesthetic appeal. This finding provides evidence for the validity of the model and suggests its potential as a basis for a more comprehensive model of poetic appreciation in literary works.
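
The modeling idea can be sketched in a few lines: activation spreads from a poem's words over a semantic network, and the diversity of the resulting activation pattern is summarized with an entropy measure whose change serves as a proxy for incongruity resolution. The network, seed words, propagation rule, and decay parameter below are hypothetical, not the authors' implementation.

```python
# Minimal sketch of spreading activation plus a diversity (entropy) summary.
# The toy network W, seed nodes, and decay value are illustrative assumptions.
import numpy as np

def spread_activation(adjacency, seed_nodes, n_steps=3, decay=0.5):
    """Iteratively propagate activation along weighted edges from seed nodes."""
    act = np.zeros(adjacency.shape[0])
    act[seed_nodes] = 1.0
    # Row-normalize so each node passes on a share of its activation
    norm = adjacency / np.maximum(adjacency.sum(axis=1, keepdims=True), 1e-12)
    for _ in range(n_steps):
        act = act + decay * (norm.T @ act)
    return act / act.sum()

def activation_diversity(act):
    """Shannon entropy of the normalized activation distribution."""
    p = act[act > 0]
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(2)
W = rng.random((50, 50)); W = (W + W.T) / 2     # toy symmetric semantic network
np.fill_diagonal(W, 0)
before = spread_activation(W, seed_nodes=[0, 1, 2])        # early words of the poem
after = spread_activation(W, seed_nodes=[0, 1, 2, 40])     # after an incongruous word
print("diversity change (incongruity-resolution proxy):",
      activation_diversity(after) - activation_diversity(before))
```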

Confidence in absence as confidence in counterfactual visibility

When things are perceived clearly they can be detected with confidence. But under what conditions can one be confident that something is absent? Here we use a meta-perceptual illusion to show that confidence in absence scales not with visibility itself, but with the subjective belief that a stimulus would have been visible, if present. In two pre-registered experiments, participants detected the presence or absence of letters in frames of dynamic noise, and rated their decision confidence. Across trials, stimuli could appear bigger or smaller. Critically, while perceptual sensitivity was increased for smaller stimuli, participants' meta-perceptual beliefs (measured with post-experiment debriefing and prospective confidence ratings) were that larger letters were easier to detect. Accordingly, while confidence in presence scaled with objective visibility (and was therefore higher for smaller stimuli), confidence in absence scaled with beliefs about counterfactual visibility (and was therefore higher for bigger stimuli). This dissociation between the effect of stimulus size on confidence in presence and absence diminished as the experiment progressed: a sign of meta-perceptual learning. Furthermore, the effect of size on confidence in absence, but not in presence, correlated with a meta-perceptual parameter from an ideal observer model of perceptual detection, fitted to decision and response time data alone. Overall, we conclude that confidence in absence closely tracked participants' model-derived expectations about the visibility of counterfactual stimuli.

Oscillating Echoes: Primary Memory in MINERVA 2

Although the MINERVA 2 model provides a detailed account of many findings related to episodic memory, its assumption that primary (working) memory is a single vector of features is unrealistic because it fails to explain the feature-binding problem—how the features of multiple items or objects can be simultaneously maintained but still differentiated. For example, the model cannot explain how the features of "red circle" and "blue square" would be maintained in primary memory in a manner that allows the color features to be correctly associated with their respective shapes. Here we propose a more plausible, biologically inspired implementation of primary memory within MINERVA 2, evaluating its performance using serial-order memory tasks and contextualizing the model's new assumptions within the broader cognitive science and neuroscience literatures.
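
For readers unfamiliar with the base model, the standard MINERVA 2 retrieval computation that the new primary-memory mechanism extends can be summarized in a short sketch: a probe activates every stored trace by the cube of its similarity, and the activation-weighted sum of traces forms the echo. This is a simplified rendering of Hintzman's formulation with toy feature vectors, not the authors' extended model.

```python
# Simplified MINERVA 2 echo computation (not the paper's new primary-memory module).
# Trace and probe feature vectors are random toy data.
import numpy as np

def minerva_echo(probe, traces):
    """Return echo intensity and echo content for a probe against a trace matrix."""
    # Similarity: dot product normalized by the number of features that are
    # nonzero in either the probe or the trace
    relevant = (probe != 0) | (traces != 0)
    n_relevant = np.maximum(relevant.sum(axis=1), 1)
    similarity = (traces @ probe) / n_relevant
    activation = similarity ** 3          # cubing preserves sign and sharpens matches
    intensity = activation.sum()
    content = activation @ traces
    return intensity, content

rng = np.random.default_rng(3)
traces = rng.choice([-1, 0, 1], size=(100, 20))   # secondary (long-term) memory
probe = traces[0] * (rng.random(20) > 0.3)        # degraded copy of a stored item
intensity, content = minerva_echo(probe, traces)
print("echo intensity:", intensity)
```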

Do Large Vision-Language Models Distinguish between the Actual and Apparent Features of Illusions?

Research has begun exploring the performance of large vision language models (LVLMs) in recognizing illusions. However, studies often have not distinguished actual and apparent features, leading to ambiguous assessments of machine cognition. We introduce a visual question answering (VQA) dataset, categorized into genuine and fake illusions. Genuine illusions present discrepancies between actual and apparent features, whereas fake illusions have the same actual and apparent features even though they look illusory. We evaluate the performance of LVLMs for genuine and fake illusion VQA tasks and investigate whether the models discern actual and apparent features. Our findings indicate that although LVLMs may appear to recognize illusions by correctly answering questions about both feature types, they predict the same answers for both Genuine Illusion and Fake Illusion VQA questions. This suggests that their responses might be based on prior knowledge of illusions rather than genuine visual understanding.

Multi-site fMRI-based mental disorder detection using adversarial learning: an ABIDE study

Heterogeneity in open fMRI datasets, caused by variations in scanning protocols, confounders, and population diversity, hinders representation learning and classification performance. To address these limitations, we propose a novel multi-site adversarial learning network (MSalNET) for fMRI-based mental disorder detection. First, a representation learning module is introduced with a node information assembly (NIA) mechanism to extract features from functional connectivity (FC). This mechanism aggregates edge information from both horizontal and vertical directions, effectively assembling node information. Second, to generalize the features across sites, we propose a site-level feature extraction module that can learn from individual FC. Lastly, an adversarial learning network is proposed to balance the trade-off between individual classification and site regression tasks. The proposed method was evaluated on the Autism Brain Imaging Data Exchange (ABIDE) dataset. The results indicate that the proposed method achieves an accuracy of 75.56% while reducing variability from a data-driven perspective. The most discriminative brain regions revealed by NIA are consistent with statistical findings, uncovering the black box of deep learning to a certain extent. MSalNET offers a novel perspective on multi-site fMRI-based mental disorder detection and considers the interpretability of the model, which is a crucial aspect of deep learning.
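
The adversarial trade-off between diagnosis classification and site prediction resembles domain-adversarial training, which is commonly implemented with a gradient reversal layer. The sketch below shows that generic mechanism in PyTorch; the feature dimensions, number of sites, and network heads are hypothetical, and this is not the MSalNET architecture itself.

```python
# Generic gradient-reversal (DANN-style) sketch: features are trained to support
# disorder classification while hurting site prediction. Illustrative only.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Flip the sign of gradients flowing back into the encoder
        return -ctx.lambd * grad_output, None

encoder = nn.Sequential(nn.Linear(200, 64), nn.ReLU())   # FC features -> embedding
diagnosis_head = nn.Linear(64, 2)                        # patient vs. control
site_head = nn.Linear(64, 17)                            # hypothetical number of sites

def adversarial_loss(fc_features, diagnosis, site, lambd=1.0):
    z = encoder(fc_features)
    loss_dx = nn.functional.cross_entropy(diagnosis_head(z), diagnosis)
    loss_site = nn.functional.cross_entropy(site_head(GradReverse.apply(z, lambd)), site)
    return loss_dx + loss_site
```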

Inferring Traders' Price Expectations from Time Series Data with POMDPs

Price expectations drive traders' buy, hold, and sell decisions. They are often estimated by surveying investors; however, verbal accounts may differ from latent expectations. In this paper, we propose how to infer traders' price expectations from trading data instead. We assume traders' goal is maximizing final earnings by sequentially buying, holding, or selling shares. Due to sequentiality, trading is represented as a Partially Observable Markov Decision Process solved with Deep Reinforcement Learning. This model follows an approximately optimal trading policy with respect to price paths used in training. Meanwhile, we assume traders choose optimal trading actions given their price expectations. Therefore, price paths characterized by trend and volatility parameters are assumed to approximate expectations. We then infer which values of these parameters produce a human-like trading policy. While this approach achieves a good model fit with worst-performing traders in an empirical study, our results are more ambiguous for top traders.
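
A small sketch clarifies the expectation parameterization described above: candidate price expectations are represented as price paths generated from trend (drift) and volatility parameters, here via geometric Brownian motion. The inference step of matching model policies to human trades is not shown, and the parameter values are arbitrary illustrative choices.

```python
# Sketch: generate candidate price paths from trend and volatility parameters.
# Values are illustrative; this is not the paper's training or inference code.
import numpy as np

def simulate_price_path(trend, volatility, n_days=250, start_price=100.0, rng=None):
    """Generate one geometric-Brownian-motion price path from daily log returns."""
    rng = rng if rng is not None else np.random.default_rng()
    daily_returns = rng.normal(loc=trend, scale=volatility, size=n_days)
    return start_price * np.exp(np.cumsum(daily_returns))

rng = np.random.default_rng(4)
optimistic = simulate_price_path(trend=0.001, volatility=0.01, rng=rng)
pessimistic = simulate_price_path(trend=-0.001, volatility=0.03, rng=rng)
print("final prices (optimistic vs pessimistic expectation):",
      optimistic[-1], pessimistic[-1])
```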

Plasticity and Speed-Accuracy Trade-Off in Color Discrimination: Insights from the 100-Hue Test

This study examines the plasticity of color discrimination and the speed-accuracy trade-off in color discrimination using the 100-hue test. To prevent ceiling effects from task simplicity, an unprecedentedly short time limit of 75 seconds was implemented, a novelty of this study. Unlike conventional 100-hue tests, accuracy rate—rather than total error score—was used to assess color discrimination ability because of its greater precision. The results showed that practice enhances color discrimination accuracy by 40% to 59%, with significant improvements observed after only three trials. Additionally, longer response times correlate with higher color discrimination accuracy. These findings suggest that color vision is highly plastic, with color discrimination ability improving significantly after three practice trials, and that a speed-accuracy trade-off exists in color discrimination.

Using Cross-Domain Data to Predict Syllogistic Reasoning Behavior

Humans can reason across multiple domains (e.g., syllogistic, conditional, relational reasoning). Previous research has often investigated these domains in isolation, hence often identifying domain-specific strategies. A fundamental question in cognitive science, however, is whether we apply a more general reasoning process or more domain-specific mechanisms. To approach this question and allow for analyses and modeling across domains, we first present a general dataset that is well-grounded in the state of the art of reasoning research and covers not only syllogistic, conditional, and spatial reasoning but also includes a test battery (e.g., memory). Second, we investigate relationships between the domains and present a preliminary step towards cross-domain modeling by predicting an individual's syllogistic conclusions based on behavior observed in the other domains. Our results show that the domains are heavily interrelated with subtle differences between them, highlighting the need for explanations that integrate subtle domain-specific strategies within a general theory of human reasoning.

Flexibly biased learning rates in social learning

Research on individual decision-making often finds a positivity bias, where people weight positive outcomes more strongly than negative ones during learning. This can be beneficial when rewards are rare, by amplifying relative value differences. Yet, we know very little about learning rate biases in social settings, where a key advantage is being able to vicariously learn from the negative experiences of others. This would imply a benefit for focusing on negative outcomes when learning socially, but is at odds with the seemingly inflexible positivity bias found in individual learning. Here, we examine learning rate biases across both individual and social settings, testing for adaptivity versus generally stable biases. Overall, participants appear more flexible in their learning rate biases when learning socially than when learning individually. This implies that human social learning may be more flexible and closer to normatively optimal behavior than individual learning.

Norms moderate causal judgments in cases of double prevention

If Peter prevents Jack from catching a falling bottle that Mike knocked over, most people would think that Mike caused the spill to a greater degree than Peter. Cases of double prevention like these are famously inconsistent with the idea that causal judgments rely on counterfactual dependence; the spill wouldn't have happened if Mike hadn't knocked the bottle over or if Peter hadn't prevented Jack from catching the bottle. But newer counterfactual models are more flexible, and they assume that people imagine different counterfactuals in proportion with their perceived normality. Following recent work showing that these newer models can account for causal judgments in cases of double prevention, here we find that normality affects such judgments. Specifically, when the productive factor is normal and the double preventer is abnormal, we find that participants preferentially rate either the productive factor or the double preventer as more causal depending on the normality of the possible preventer. Contrary to standard interpretations, then, our results suggest that cases of double prevention are actually more problematic for competing theories of causal judgment than they are for counterfactual theories.

Transcranial Magnetic Stimulation for Modeling Alzheimer's Disease: A Neural Dynamics Approach with Pre-training

Alzheimer's disease (AD) is a neurodegenerative disorder with limited treatment options. While transcranial magnetic stimulation (TMS) has demonstrated therapeutic potential, its underlying mechanisms and personalized applications are not yet fully explored. This study integrates deep learning with neurodynamics, employing a pretrained resting-state AD self-supervised model to constrain dynamic training and simulate TMS-induced neural dynamics in AD patients. The results reveal regional heterogeneity, showing enhanced excitability in stimulated regions, in contrast to reduced global neural dynamics. We identified significant differences in brain responses between AD patients and healthy controls, which provide critical theoretical support for developing TMS treatment strategies. These insights advance the understanding of AD pathophysiology and highlight the potential of TMS interventions, which may lead to more effective therapeutic approaches.

Mandarin-Speaking Late Talkers and Gesture Production at 24 Months

We studied gesture and language production in 21 Mandarin-speaking late talkers at 24 months of age and compared them with 28 age-matched typically developing children to determine their gestural and cross-modal communicative abilities. Spontaneous cross-modal data were collected during naturalistic mother-child interactions. Results from the Words and Sentences survey of the MCDI-T showed that late talkers had underdeveloped vocabulary and grammatical complexity. Nonetheless, their gestural competence was intact and comparable to that of typically developing children. Both groups demonstrated similar patterns in using declarative pointing, imperative pointing, showing, giving, representational, and conventional gestures to achieve communicative goals. Among these, declarative pointing was the most common for establishing joint attention and sharing interests or information with the addressee. Although late talkers were capable of reinforcing, clarifying, and supplementing speech with gestures, they did so less frequently than their typically developing peers. In sum, late talkers used gestures effectively to support communication; however, they showed limitations in integrating varied information for cross-modal communication.

MGSleepNet: A Multi-Granularity Sleep Staging Network Based on EEG and EOG Signals

Sleep staging is the foundation of sleep analysis. Recent studies have attempted to integrate multimodal signals, such as the electroencephalogram (EEG) and electrooculogram (EOG), to enhance the sensitivity of models. However, these attempts still face limitations in effectively merging multimodal signals, particularly in simultaneously capturing the interplay of global and local information during sleep stages. To address this issue, we propose a Multi-granularity Sleep Staging Network (MGSleepNet), which integrates two core modules: the Global Feature Integration (GFI) module and the Fine-grained Information Capture (FIC) module. The GFI module effectively captures the global features of EEG and EOG signals through multi-scale convolution, channel attention mechanisms, and spatial attention mechanisms. The FIC module obtains fine-grained interaction information between EEG and EOG by segmenting time periods and employing cross-attention mechanisms. The combination of these modules resulted in accuracies of 83.16% and 82.46% in five-fold cross-subject cross-validation on the Sleep-edf-20 and Sleep-edf-78 datasets, respectively. In addition, we also achieved above-chance performance on our own dataset. Finally, ablation studies confirmed the benefits of integrating global and fine-grained relevance paradigms to enhance sleep staging performance. An analysis of model inputs indicated that MGSleepNet performs well on sleep staging outcomes.
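
The cross-modal interaction step attributed to the FIC module can be illustrated with a generic cross-attention fusion between EEG and EOG feature sequences. The sketch below uses arbitrary shapes and a plain multi-head attention layer; it is not the MGSleepNet implementation.

```python
# Illustrative cross-attention fusion between per-segment EEG and EOG features.
# Shapes, dimensions, and the pooling/classifier are hypothetical assumptions.
import torch
import torch.nn as nn

batch, n_segments, dim = 8, 30, 64
eeg_feats = torch.randn(batch, n_segments, dim)   # per-segment EEG features
eog_feats = torch.randn(batch, n_segments, dim)   # per-segment EOG features

cross_attn = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)
# EEG queries attend over EOG keys/values, yielding EEG features informed by EOG
fused, attn_weights = cross_attn(query=eeg_feats, key=eog_feats, value=eog_feats)

classifier = nn.Linear(dim, 5)                    # 5 sleep stages (W, N1, N2, N3, REM)
logits = classifier(fused.mean(dim=1))            # pool over segments per epoch
print(logits.shape)                               # torch.Size([8, 5])
```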

Gesture Use Contributes to Autobiographical Remembering

Gestures support communication and mental processes. However, the contribution of co-speech gestures to autobiographical retrieval has only recently started to receive attention. This study examines whether gestures facilitate autobiographical constructions by activating existing episodic details and integrating new ones, through a within-participant manipulation of gesture use (spontaneous and encouraged) and event type (past and future). Our main findings showed that representational gestures accounted for an increase in episodic details within autobiographical memory constructions. Although participants gestured more when they were encouraged to do so, and past events elicited more details than future events, the association between gestures and increased episodic details did not differ across conditions. These findings suggest that representational gestures are particularly instrumental in autobiographical memory processes, as they contribute to the activation and retrieval of episodic details in mental simulations.

CV-Probes: Studying the interplay of lexical and world knowledge in visually grounded verb understanding

How do vision-language (VL) transformer models ground verb phrases, and do they integrate contextual and world knowledge in this process? We introduce the CV-Probes dataset, containing image-caption pairs involving verb phrases that require both social knowledge and visual context to interpret (e.g., "beg"), as well as pairs involving verb phrases that can be grounded based on information directly available in the image (e.g., "sit"). We show that VL models struggle to ground verb phrases that are strongly context-dependent. Further analysis using explainable AI techniques shows that such models may not pay sufficient attention to the verb token in the captions. Our results suggest a need for improved methodologies in VL model training and evaluation. The code and dataset will be available at https://github.com/ivana-13/CV-Probes.

Representational similarity analysis between ADHD and SCZ based on functional brain network

Attention deficit hyperactivity disorder (ADHD) and schizophrenia (SCZ) have complex neural mechanisms. This study used fMRI data from resting state and 2-back working memory tasks to analyze the abnormal brain network characteristics of the two disorders. The results showed that ADHD patients had dispersed functional connections under task conditions, with increased node degrees in the prefrontal cortex, insula, anterior cingulate gyrus, and cerebellum, indicating compensatory network reorganization. SCZ patients showed abnormal connections between the default mode network and the limbic cortex, and weakened coupling between the parietal lobe and the attention network, reflecting deficits in cognitive integration and emotion regulation. Through representational similarity analysis (RSA), we found that the two disorders shared abnormal connections in the prefrontal cortex, middle temporal gyrus, and hippocampus, which may be related to working memory regulation disorders. These findings provide potential biomarkers for targeted intervention.
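
The representational similarity logic can be sketched compactly: vectorize each group's connectivity matrices, express each patient group's abnormality as a deviation from controls, and correlate the resulting patterns. The connectivity matrices below are random placeholders, and the specific deviation-and-correlation recipe is an assumption rather than the authors' exact analysis.

```python
# Sketch of an RSA-style comparison of abnormal connectivity patterns.
# adhd_conn, scz_conn, control_conn are hypothetical group-average FC matrices.
import numpy as np
from scipy.stats import spearmanr

def upper_triangle(conn):
    """Flatten the upper triangle of a symmetric connectivity matrix."""
    idx = np.triu_indices_from(conn, k=1)
    return conn[idx]

rng = np.random.default_rng(5)
n_regions = 90
adhd_conn = np.corrcoef(rng.normal(size=(n_regions, 200)))
scz_conn = np.corrcoef(rng.normal(size=(n_regions, 200)))
control_conn = np.corrcoef(rng.normal(size=(n_regions, 200)))

# Abnormality pattern = deviation from controls; RSA = similarity of the patterns
adhd_abnormal = upper_triangle(adhd_conn - control_conn)
scz_abnormal = upper_triangle(scz_conn - control_conn)
rho, p = spearmanr(adhd_abnormal, scz_abnormal)
print(f"ADHD-SCZ abnormality pattern similarity: rho={rho:.3f}, p={p:.3f}")
```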

Two Stage Psychology-Guided Fine-Grained Editing and Sampling Approach for Mitigating Hallucination in Large Language Models

The hallucination issue in large language models (LLMs) significantly restricts their application in high-stakes domains such as healthcare, cognitive science and law. Existing approaches primarily focus on data optimization or decoding strategies but lack a fine-grained analysis of the underlying mechanisms of hallucinations. This paper proposes a psychology-guided two-stage fine-grained editing and sampling framework (PGFES), which, for the first time, introduces psychological classifications of hallucinations into LLM optimization. Firstly, an attention-augmented MLP probe is designed to identify "truthfulness directions" corresponding to different hallucination types through feature channel reweighting, enabling fine-grained editing of the model's internal representations during inference. Then, a dynamic weighting mechanism based on Jaccard similarity is employed to compute the weights of multi-path edited outputs, achieving adaptive sampling. Experiments demonstrate that the optimization method incorporating psychology-related concepts improves truthfulness by 20.4% on the TruthfulQA open-domain question-answering task compared to baseline models and exhibits strong generalization across cross-domain datasets.
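
The Jaccard-based weighting step described for combining multi-path edited outputs can be sketched simply: each candidate output is weighted by how much it overlaps, on average, with the other candidates. The tokenization (whitespace splitting) and candidate strings below are hypothetical illustrations, not the paper's implementation.

```python
# Sketch: weight candidate outputs by mean Jaccard similarity to the others,
# so candidates that agree with the consensus receive larger sampling weights.
def jaccard(a_tokens, b_tokens):
    a, b = set(a_tokens), set(b_tokens)
    return len(a & b) / len(a | b) if a | b else 0.0

def candidate_weights(candidates):
    """Normalize each candidate's mean Jaccard similarity to the other candidates."""
    scores = []
    for i, cand in enumerate(candidates):
        others = [c for j, c in enumerate(candidates) if j != i]
        scores.append(sum(jaccard(cand.split(), o.split()) for o in others) / len(others))
    total = sum(scores) or 1.0
    return [s / total for s in scores]

candidates = [
    "The Great Wall is not visible from space with the naked eye.",
    "The Great Wall cannot be seen from space with the naked eye.",
    "The Great Wall can easily be seen from the Moon.",
]
print(candidate_weights(candidates))
```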

Dense Sentence Sets Induce an Anchor-and-Baseline Strategy in Likert Scale Acceptability Judgments

Research in experimental syntax typically assumes that the five-point Likert scale offers an ordinal probe that maps monotonically onto a latent degree of sentence acceptability. We challenge that assumption by showing that, when the stimulus space is densely populated, speakers repurpose the scale into an anchor-and-baseline device. Two large-N experiments (Russian and Serbo-Croatian; N=237; 120 permutational word-order variants per language) elicited over 28000 sentence acceptability ratings. Plotting Shannon entropy of the response distribution against the mean rating reveals a robust 'entropy arch': uncertainty climbs to a sharp peak at the midpoint and collapses toward both ends. We interpret the arch as the quantitative fingerprint of constraint competition: the scale extremes serve as categorical anchors ('completely acceptable' vs. 'completely unacceptable'), while the center functions as a floating baseline against which speakers register maximally uncertain, cue-balanced configurations for which grammatical, information-structural and frequency-based cues pull in opposite directions. Our findings reframe Likert data as the outcome of dynamic calibration rather than static preference strength and provide a simple diagnostic, entropy profiling, for locating linguistic 'tipping-point' constructions. Beyond sentence acceptability, the approach offers a principled way to map regions of maximal competition in any domain where categorical anchors and graded uncertainty coexist.
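
The entropy-profiling diagnostic proposed above amounts to computing, for each sentence, the mean Likert rating and the Shannon entropy of its response distribution, then inspecting entropy as a function of the mean. A minimal sketch follows; the ratings dictionary is hypothetical toy data, not the Russian or Serbo-Croatian datasets.

```python
# Sketch: per-sentence mean rating and response-distribution entropy, the two
# quantities whose relationship produces the reported 'entropy arch'.
import numpy as np
from scipy.stats import entropy

def entropy_profile(ratings_by_sentence, scale=(1, 2, 3, 4, 5)):
    profile = []
    for sentence, ratings in ratings_by_sentence.items():
        ratings = np.array(ratings)
        counts = np.array([np.sum(ratings == point) for point in scale])
        probs = counts / counts.sum()
        profile.append((sentence, float(ratings.mean()), float(entropy(probs, base=2))))
    return profile

ratings_by_sentence = {
    "anchor_high": [5, 5, 5, 4, 5, 5],        # near-unanimous: low entropy
    "anchor_low": [1, 1, 2, 1, 1, 1],
    "midpoint": [2, 3, 4, 3, 2, 4, 3, 5, 1],  # cue-balanced: high entropy
}
for sentence, mean_rating, h in entropy_profile(ratings_by_sentence):
    print(f"{sentence}: mean={mean_rating:.2f}, entropy={h:.2f} bits")
```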

Unifying inference and selection in singular causal explanation

Explaining why events occurred involves solving different information-processing problems: inferring what actually happened (causal inference) but also highlighting a subset of the causes that contributed to the outcome (causal selection). While past research has investigated causal inference and causal selection separately, we report results of an experiment (N=284) examining how people solve both problems jointly, as is the case in real-world explanation settings. We find evidence that participants infer the state of unobserved variables on the basis of available evidence, and observe common behavioral signatures of causal selection. However, explanation preferences deviate in important ways from the predictions of a computational model combining existing theories of causal inference and causal selection. In particular, participants were disproportionately likely to select inferred over observed variables. We suggest a possible preference for producing explanations that allow the explainee to benefit from inferential work performed by the explainer.

Extending a Mathematical Theory of the Emergence of Knowledge from the Experience to Capture Learning Dynamics in Transformers

The Transformer architecture used in LLMs has garnered widespread attention due to these models' human-like conceptual knowledge and language understanding. Yet understanding how these models' capabilities result from experience-guided learning, and connecting this learning process with the structure of their training data, can seem intractable. Here we present preliminary steps toward characterizing the developmental trajectory of a minimal Transformer trained on a next-token prediction task, using a simple dataset with quantifiable uncertainty and a simple, intuitively characterizable structure that captures some aspects of the natural semantic structure learned by LLMs from large datasets. We show how the dynamic learning process of this model is a predictable consequence of the structure of the training data, exhibiting attested features of human semantic development, as captured in a theory of neural network learning dynamics (Saxe et al., 2019) previously used to capture such dynamics in a network originally introduced by Rumelhart & Todd (1993).
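
In the Saxe et al. framework referenced above, the singular modes of the item-property association structure govern the order in which a linear network learns: modes with larger singular values are acquired earlier. The sketch below computes those modes for a small hierarchical toy dataset; the items, properties, and the claim that this toy resembles the paper's training set are illustrative assumptions.

```python
# Toy sketch of the analysis style in Saxe et al. (2019): singular modes of an
# item-property matrix, with larger singular values learned earlier by a linear
# network. The tiny dataset below is hypothetical, not the paper's data.
import numpy as np

items = ["canary", "robin", "salmon", "trout"]
properties = ["can_fly", "has_wings", "can_swim", "has_gills", "is_animal"]
Y = np.array([
    [1, 1, 0, 0, 1],   # canary
    [1, 1, 0, 0, 1],   # robin
    [0, 0, 1, 1, 1],   # salmon
    [0, 0, 1, 1, 1],   # trout
], dtype=float)

# Item-by-property matrix; with one-hot item inputs its singular values match
# those of the network's input-output correlation matrix.
sigma = Y / len(items)
U, S, Vt = np.linalg.svd(sigma, full_matrices=False)
for k, s in enumerate(S):
    if s > 1e-9:
        print(f"mode {k}: singular value {s:.3f}, property loadings {np.round(Vt[k], 2)}")
```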

The Roles of Speech Complexity and Pointing Gesture in Guiding Children's Attention During Shared Book Reading

Shared book reading is widely acknowledged for its positive impact on language development, as it exposes children to complex linguistic structures not typically encountered in daily conversation. However, the mechanisms through which shared reading supports language acquisition remain less well understood. This study investigates the effects of speech complexity and gesture use on children's real-time word learning from books. Using a dual head-mounted eye-tracking paradigm, we assessed gaze dynamics in 18- to 24-month-old children during naturalistic book reading with their parents. Our findings indicate that while parental speech is rich in linguistic diversity, children at this age exhibit a preference for simpler sentence structures. Simpler sentences and imperatives, particularly when paired with child gestures, appear to capture children's attention most effectively. This study emphasizes the interplay between speech complexity, gesture, visual attention, and word learning, demonstrating that multimodal input plays a critical role in facilitating language acquisition.

Novel Noun Generalization And Stimulus Comparisons In Children: Manipulating The Number And Structure Of Learning Stimuli.

Studies in novel word learning show that comparison settings (i.e., several stimuli introduced simultaneously) favor taxonomically based generalization. Most comparison studies have used forced-choice designs. Here, we investigated, in a free-choice comparison design, the types of stimuli four- and five-year-old children chose in a novel noun generalization task: items from the same basic-level category, a near superordinate category, or a distant superordinate category, as well as perceptual, thematic, and unrelated lures. Same basic-level category items were chosen more often than other taxonomically related items. Perceptual lures and near superordinate items did not differ, suggesting that children did not arbitrate between perception and taxonomy. Results are discussed in terms of different theoretical perspectives on stimulus generalization, lexical constraints, stimulus comparison, and, finally, Bayesian approaches. They suggest that children integrate the results of their comparison rather than sampling probabilistic regularities.

Cross-modal serial dependence emerges in sequential numerosity comparison task

Serial dependence is a phenomenon in which current perception is attracted to the immediately preceding perception and is thought to reflect a mechanism for stabilizing perception. It has been demonstrated across a variety of stimuli and has also been observed in the numerosity perception. A previous study suggested that cross-modal serial dependence in numerosity perception from audition to vision did not occur; however, differences in the stimulus presentation format might have prevented serial dependence from emerging. Therefore, we used a standardized temporal presentation format. As a result, we observed bidirectional cross-modal serial dependence between audition and vision. However, the effects in each direction were not consistent within individuals. These findings provide important insights into the mechanisms underlying serial dependence, indicating that both higher-level processing, such as abstract numerical representation, and lower-level processing, such as auditory and visual cortex involvement, are engaged in cross-modal serial dependence in numerosity perception.

Decomposing Implicit Bias in Distributional Semantic Models: The Roles of First- and Second-Order Co-Occurrence

Distributional semantic models (DSMs) are computational models that learn semantic relationships through word co-occurrence patterns, broadly aligning with human statistical learning mechanisms. Prior research has shown that DSMs capture not only general semantic structure but also human social biases. For example, Caliskan et al. (2017) demonstrated that pre-trained word embeddings encode associations that mirror implicit stereotypes measured by the Implicit Association Test (IAT). To better understand how DSMs acquire these biases, we examined the roles of two distinct sources of distributional information: first-order (direct co-occurrence) and second-order (indirect co-occurrence) statistics. Our analysis revealed that nearly all biases tested could be accounted for by first-order statistics alone, while about half were significant in second-order statistics. Every bias was present in at least one of these co-occurrence types, with nuanced variation in how different topics exhibited bias across first- and second-order associations. These findings suggest that implicit biases in DSMs can be attributed to simple co-occurrence patterns, predominantly direct associations. Moreover, they support theories positing that implicit biases reflect statistical regularities in the environment rather than personal attitudes. This work highlights how these biases are embedded in natural language and how a cognitive system capable of statistical learning could acquire implicit biases through the same mechanisms that shape human semantic memory.
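
The WEAT-style bias measure underlying this kind of analysis (Caliskan et al., 2017) compares how strongly two target word sets associate with two attribute word sets, using cosine similarity between word vectors. The sketch below computes that effect size over rows of a raw co-occurrence matrix as a stand-in for "first-order" vectors; the vocabulary, counts, and word groupings are hypothetical toy data.

```python
# Sketch of a WEAT-style effect size computed from first-order co-occurrence vectors.
# All data are toy placeholders, not the analyzed corpus.
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def weat_effect_size(X_vecs, Y_vecs, A_vecs, B_vecs):
    """Standardized difference in association of target sets X, Y with attribute sets A, B."""
    def assoc(w):
        return np.mean([cosine(w, a) for a in A_vecs]) - np.mean([cosine(w, b) for b in B_vecs])
    x_assoc = [assoc(w) for w in X_vecs]
    y_assoc = [assoc(w) for w in Y_vecs]
    pooled_sd = np.std(x_assoc + y_assoc, ddof=1)
    return (np.mean(x_assoc) - np.mean(y_assoc)) / pooled_sd

rng = np.random.default_rng(6)
cooc = rng.poisson(3.0, size=(8, 500)).astype(float)   # toy word-by-context counts
flowers, insects = cooc[0:2], cooc[2:4]                # target word vectors
pleasant, unpleasant = cooc[4:6], cooc[6:8]            # attribute word vectors
print("WEAT effect size d:", weat_effect_size(flowers, insects, pleasant, unpleasant))
```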

Curiosity Exploration Styles in Word Association Tasks

Recent analyses of human creativity and curiosity have identified the existence of three styles of exploration: busybody, hunter, and dancer. These styles were recognized largely by observing participants' explorations within a task, converting those observations into networks, and measuring networks' properties. But do these exploration styles still appear across different tasks? And when graph-based descriptors of an individual's exploration style are identified, how well do they transfer to similar tasks? We study inter- and intra-individual differences in two similar, but distinct, word association tasks: Chain Free Association and Semantic Fluency. We demonstrate that in some cases, graph-theoretic features do seem to capture individual semantic exploration patterns across tasks. Furthermore, our results provide evidence supporting the existence of the dancer style and its relationship to the Busybody-Hunter score. These findings highlight the potential of graph analysis as a tool for characterizing and exploring individual cognitive styles in semantic tasks.

Using Network Science to Measure Centrality and Standardness in Event Knowledge

An important issue in event cognition concerns how activities come to mind when people think about events (eat at a restaurant). Linear theories suggest that people think of activities in a temporally linear order, whereas hierarchical theories suggest that activities come to mind based on their centrality (i.e., importance). The current study used five network science centrality measures (CheiRank, PageRank, 2D Rank, Betweenness, and Closeness) derived from 80 temporally structured event networks to predict participants' centrality and standardness rankings and ratings. Participants were provided with 40 events and 4-10 activities per event, and ranked or rated each activity's centrality or standardness. Linear mixed-effect regression showed that CheiRank, which assigns importance to activities that have many influential outgoing links, was the strongest predictor. This suggests that people's understanding of centrality relates to the degree to which an activity leads to other activities, supporting hierarchical models and the Event Horizon Model.
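
The key centrality contrast can be made concrete in a few lines: PageRank weights an activity by its incoming links, while CheiRank is PageRank computed on the graph with all edges reversed, so it weights an activity by its influential outgoing links. The small "eat at a restaurant" script below is a hypothetical illustration, not one of the study's 80 event networks.

```python
# Sketch: PageRank vs. CheiRank (PageRank on the reversed graph) for a toy
# temporally structured event network.
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ("enter", "get seated"), ("get seated", "read menu"),
    ("read menu", "order"), ("order", "wait for food"),
    ("wait for food", "eat"), ("order", "eat"),
    ("eat", "pay"), ("pay", "leave"),
])

pagerank = nx.pagerank(G)              # importance from incoming links
cheirank = nx.pagerank(G.reverse())    # importance from outgoing links

for activity in G.nodes:
    print(f"{activity:15s} PageRank={pagerank[activity]:.3f} CheiRank={cheirank[activity]:.3f}")
```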

People do not engage in ad-hoc reasoning about alternative messages when interacting with a literal speaker

Derivation of pragmatic inferences typically assumes that both interlocutors behave rationally, as described by the cooperative principle. However, real-world communication often involves speakers who cannot behave fully rationally due to factors such as limited language proficiency, high cognitive load, or insufficient reasoning skills. In such cases, listeners may adjust their inferences to account for the speaker's limitations. In this study, we investigate whether participants engage in ad-hoc reasoning about alternative messages available to the speaker when the speaker is explicitly literal. Our findings reveal that people overwhelmingly do not do so and instead behave as literal listeners, and that nudging participants to consider alternative messages does not improve performance. This suggests that while people readily consider the speaker's intent, they do not tend to engage in ad-hoc reasoning about probabilities of alternative messages in the absence of a rational speaker.

The benefits of one-sided iconicity

Iconicity has recently been shown to be widespread in language and to play a particularly important role in bootstrapping new referring expressions or even getting whole new languages off the ground. The basis of this role has long been assumed to depend primarily on transparency for the receiver of the iconic signal, but might there also be producer-side advantages that play a significant role? We investigated this using an experimental referential communication game in which dyads communicated about fruits and vegetables. We manipulated whether the sender could generate iconic signals and whether the receiver saw them. Results suggested that iconicity gave dyads a head start, via stability in production, even if the receiver did not perceive the iconicity. However, this benefit declined over time, most likely due to memory constraints on the receiver.

Comparing One-Boundary and Two-Boundary Evidence Accumulation Models for Go/No-Go Processes: An Application to the Decision to Shoot

The cognitive processes underlying Go/No-Go performance may be explained by two plausible evidence accumulation models: Two-Boundary (2-B) and One-Boundary (1-B) drift diffusion models (DDMs). While both embed a Go decision, the 2-B DDM embeds a definitive No-Go decision, whereas the 1-B DDM embeds a response window for Go. Using simulations, we found that model comparison methods such as leave-one-out cross-validation (LOO), coupled with Bayesian hierarchical modeling, can correctly identify the underlying model. Additionally, using the correct model reduces the risk of missing true effects or detecting spurious findings. We therefore recommend that researchers implement and compare both models for Go/No-Go studies to reduce misleading results. Lastly, we implemented these models to investigate race effects in the decision to shoot during police training. We found that the accumulated evidence needed to reach the Shoot decision is lower for Black suspects, which explains the heightened error rates for shooting unarmed Black suspects in the data.
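
The contrast between the two architectures can be made concrete by simulating both accumulation processes: a two-boundary trial terminates at either a Go or an explicit No-Go boundary, whereas a one-boundary trial terminates only at Go and otherwise times out at the end of the response window. The drift, boundary, and noise values below are arbitrary illustrative choices, not the fitted parameters from the study.

```python
# Sketch: simulate one-boundary vs. two-boundary evidence accumulation for a
# Go/No-Go trial. Parameters are illustrative only.
import numpy as np

def simulate_trial(drift, boundary, no_go_boundary=None, dt=0.001, noise=1.0,
                   max_time=1.0, rng=None):
    """Return ('go', rt), ('no-go', rt), or ('timeout', max_time)."""
    rng = rng if rng is not None else np.random.default_rng()
    evidence, t = 0.0, 0.0
    while t < max_time:
        evidence += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
        if evidence >= boundary:
            return "go", t
        if no_go_boundary is not None and evidence <= -no_go_boundary:
            return "no-go", t
    return "timeout", max_time

rng = np.random.default_rng(7)
two_boundary = [simulate_trial(0.5, 1.0, no_go_boundary=1.0, rng=rng) for _ in range(1000)]
one_boundary = [simulate_trial(0.5, 1.0, rng=rng) for _ in range(1000)]
print("2-B Go rate:", np.mean([resp == "go" for resp, _ in two_boundary]))
print("1-B Go rate:", np.mean([resp == "go" for resp, _ in one_boundary]))
```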

Semantic Congruency Across Development

Multisensory processing often results in facilitation and/or interference effects, yet the mechanisms remain unclear. The two reported experiments used a Stroop-like task to examine how congruent, incongruent, and irrelevant information presented in one sensory modality (e.g., visual) affects processing and responding in a different modality (e.g., auditory). Across two experiments, adults (E1) and 5-year-olds (E2) were presented with pictures and sounds, and they had to determine whether what they saw or heard was an animal or a vehicle. Experiment 1 with adults showed evidence of both facilitation on congruent trials and interference on incongruent trials, with unattended visual stimuli having a larger effect on auditory processing than vice versa. Results in 5-year-olds were slightly more symmetrical than in adults, but there was no evidence that auditory input dominated visual processing. Possible mechanisms underlying these effects are discussed.

Scaffolding to Support Analogical Comparisons with Science Images

This study examines scaffolding techniques to support students' analogical comparisons with science images. Analogical comparison involves identifying deep relational structure over superficial similarities, which can be challenging without guidance. Across experiments, participants compared analogous evolutionary science images with or without various scaffolds: describing the relation, completing a mapping table, and spatial support for alignment. Results demonstrated that describing the relation and completing a mapping table, especially in combination, significantly enhanced scientific interpretations of the images. However, spatial support was ineffective. The findings highlight practical strategies for improving conceptual learning from science visuals.

Path encoding and manner salience in motion event descriptions: the case of Bulgarian and English

Examining motion event descriptions allows us to evaluate what information individuals deem to be salient when communicating about events. Systematic variations between languages regarding how they encode details about motion events have given rise to theories of typological classifications of languages. In this study we evaluate the classification of Bulgarian, a South Slavic language, along Talmy's typology of verb-framed and satellite-framed languages, and relate this to Slobin's concept of manner salience. Based on behavioural evidence from an experiment using free-form descriptions in Bulgarian and English, we show that path encoding in Bulgarian is distinct from that of English (a satellite-framed language), but the use of complex path expressions in Bulgarian means it cannot be easily captured by Talmy's two-way classification system. We show that Bulgarian displays a lower rate of manner salience than English and patterns similarly to verb-framed languages in its treatment of the default manner of motion (walking).

Teasing Apart Architecture and Initial Weights as Sources of Inductive Bias in Neural Networks

Artificial neural networks can acquire many aspects of human knowledge from data, making them promising as models of human learning. But what those networks can learn depends upon their inductive biases -- the factors other than the data that influence the solutions they discover -- and the inductive biases of neural networks remain poorly understood, limiting our ability to draw conclusions about human learning from the performance of these systems. Cognitive scientists and machine learning researchers often focus on the architecture of a neural network as a source of inductive bias. In this paper we explore the impact of another source of inductive bias -- the initial weights of the network -- using meta-learning as a tool for finding initial weights that are adapted for specific problems. We evaluate four widely-used architectures -- MLPs, CNNs, LSTMs, and Transformers -- by meta-training 430 different models across three tasks requiring different biases and forms of generalization. We find that meta-learning can substantially reduce or entirely eliminate performance differences across architectures and data representations, suggesting that these factors may be less important as sources of inductive bias than is typically assumed. When differences are present, architectures and data representations that perform well without meta-learning tend to meta-train more effectively. Moreover, all architectures generalize poorly on problems that are far from their meta-training experience, underscoring the need for stronger inductive biases for robust generalization.
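
A minimal sketch of the core idea of meta-learning initial weights, here as a first-order (Reptile-style) update on a hypothetical linear-regression task family; the authors' actual meta-learning procedure, architectures, and tasks are not assumed.

    import numpy as np

    rng = np.random.default_rng(0)

    def sample_task(n=20, dim=3):
        """Hypothetical task family: linear regression with task-specific weights."""
        w = rng.normal(size=dim)
        X = rng.normal(size=(n, dim))
        return X, X @ w

    def inner_sgd(theta, X, y, lr=0.05, steps=10):
        """Adapt the current initialization to one task with a few gradient steps."""
        for _ in range(steps):
            grad = 2 * X.T @ (X @ theta - y) / len(y)
            theta = theta - lr * grad
        return theta

    theta = np.zeros(3)                    # meta-learned initial weights
    for _ in range(1000):                  # outer (meta-training) loop
        X, y = sample_task()
        adapted = inner_sgd(theta.copy(), X, y)
        theta += 0.1 * (adapted - theta)   # nudge the initialization toward the adapted solution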

Action Slips: The Effects of Devaluation and Amount of Training

Repetitive habitual behaviour can persist even when it conflicts with goals. This is termed an action slip. Wood et al. (2023) demonstrated this effect using a novel procedure, where participants' performance was strong on congruent trials but comparatively poor on incongruent trials. The current study extends their work by exploring a devaluation version of the experiment, followed by a further experiment that manipulates training duration. The devaluation approach eliminated the congruency effect, while reinstating standard testing with varied training led to its reappearance—even after a short amount of training. These findings challenge the Stimulus-Response (S-R) account and dual-process theory. We discuss the question of whether the congruency effect seen in these experiments is evidence to support a dual-process account of habitual behaviour.

Lexical Search Dynamics in Taxonomic, Thematic, and Ad hoc Categories

Searching through semantic memory involves navigating across clusters of related items, but the current body of work has primarily focused on search within hierarchically structured taxonomic categories (e.g., Animals). The present work explored search behavior via the verbal fluency task in thematic categories, where relatedness is construed via complementary roles in a shared environment, and ad hoc categories, where relatedness is construed via shared service to an external goal. We found strong differences across domains within each type of category, but compared to Animals, responses in thematic and ad hoc categories had higher average phonological similarity and word frequency. Phonology-inclusive process models provided the best account of search in taxonomic categories, but overall model performance was poor for ad hoc categories. There were also important differences in the contribution of lexical sources to within and between-cluster transitions across domains. These results underscore the necessity of exploring lexical search and validating computational models in categorical domains with different compositions and types of relatedness.

Curriculum learning in humans and neural networks

The sequencing of training trials can significantly influence learning outcomes in humans and neural networks. However, studies comparing the effects of training curricula between the two have typically focused on the acquisition of multiple tasks. Here, we investigate curriculum learning in a single perceptual decision-making task, examining whether the behavior of a parsimonious network trained on different curricula would be replicated in human participants. Our results show that progressively increasing task difficulty during training facilitates learning compared to training at a fixed level of difficulty or at random. Furthermore, sequences designed to hamper learning in a parsimonious neural network also impair learning in humans. As such, our findings indicate strong qualitative similarities between neural networks and humans in curriculum learning for perceptual decision-making, suggesting the former can serve as a viable computational model of the latter.

Replication of Prefrontal Asymmetry in Approach-Avoidance Motivation in fMRI

A large body of research suggests that approach-related emotional states are lateralized to the left prefrontal cortex (Harmon-Jones & Gable, 2018). However, because affective motivation and valence have often been entangled in experimental designs, it is unclear which construct drives this laterality. In one fMRI study designed to dissociate motivation and valence, Berkman and Lieberman (2010) found that approach motivation was more left-lateralized than avoidance motivation in the dorsolateral prefrontal cortex (DLPFC), controlling for valence. Our study did not replicate this key finding from Berkman and Lieberman (2010). Furthermore, whereas Berkman and Lieberman (2010) found that individuals' trait approach motivation predicted the laterality of approach-related DLPFC activity, we found that trait approach motivation predicted the laterality of positive valence, controlling for motivation. Overall, our results do not provide any clear support for the 'textbook' model of affective motivation in the frontal lobes.

Ease of Access to Information Does Not Impact Curiosity

For some learning problems, information is readily available. For others, substantial time and effort is needed to acquire information. The present work tests whether this variation in "information accessibility" affects curiosity. In two experiments, we prompted adult participants to rate their curiosity about the answers to trivia questions. For each trivia question, participants were informed that information accessibility would be high—they would receive the answer with minimal time and effort—or low—they would receive the answer with substantial time and effort. We found that information accessibility affected decisions to seek information, but not self-reported curiosity. This suggests that curiosity is unhindered by the practical costs of information search.

The Black Stories Experiment: Two Groups are Trying to Solve a Riddle Game Behind a Screen, Only One Group Is Alive

Studying large language models (LLMs) can provide valuable insights into their strengths and limitations. This study explores the problem-solving capabilities of GPT-4 by comparing the model's performance in solving Black Stories riddles to human performance. The study utilized a set of 12 adjusted Black Stories, each tested twice within the human and GPT-4 groups. The experiment was conducted through text messaging for a comparable set-up. The primary measure of performance was the number of questions and hints needed to solve the riddle. Results indicated no significant difference between the groups. Qualitative results showed that GPT-4 excelled in precise questioning and creativity but often fixated on details. Humans covered broader topics and adapted their focus quickly but struggled with uncommon details. This research suggests that despite different approaches, GPT-4's performance was comparable to that of humans, demonstrating its potential as a capable participant in these types of problem-solving games.

Can children represent and compute over mixed sets with the Approximate Number System?

Considerable debate exists over the kinds of numbers the Approximate Number System (ANS) can represent and compute over. Across three experiments (N = 218), we show that children can represent and add large mixed sets (i.e., large collections that include two types of items) with the ANS. In Experiment 1, 5-7-year-olds completed a replication of a large non-symbolic number addition task using an online asynchronous format. In Experiments 2 and 3, 5-7-year-olds completed a variation of that addition task with mixed sets of stimuli either area-controlled or area-correlated and again performed above chance level. Taken together, these findings are a crucial first step in examining whether the ANS can represent all positive rational numbers (i.e., fractions or ratios), as opposed to exclusively integers. In sum though, our findings suggest that children can represent and compute over large mixed sets of stimuli with the ANS.

Connection Between Lexical Processing and Phonological Regularity: A Comparison Between English and Turkish

Phonological theories rely on rules being able to apply regularly to all members of a particular domain. When there are exceptions to these regular patterns, accounting for inconsistencies poses a challenge for phonological theories. This is especially true when morphophonological factors alone cannot explain the observed variation. In addition to phonological or morphological factors, this paper argues that the way words are recognized in the process of lexical access can account for exceptional morphophonological patterns. In two corpus studies, one on English (a morphologically poor language) and one on Turkish (a morphologically rich language), results show a correlation between factors consistent with lexical decomposition and morphophonological exceptionality. For both languages, exceptional suffixes are more likely to be decomposed from stems than non-exceptional suffixes. This suggests that there is a connection between the way complex words are processed in the lexicon and the way they are treated by the phonological grammar.

When Walls Talk: People Make Social Inferences From Towns' Protective Features

Human towns are shaped by intentional design. Here we ask whether people use societal features to make social inferences, specifically focusing on how the presence of protective architectural features influences people's intuitions about towns' residents. U.S. adults (N = 100) were presented with two novel societies – a ‘protected' town with walls, locks, and gates, and an ‘unprotected' town lacking such features. We manipulated whether residents had chosen or been randomly assigned where to live. Across both conditions, people judged that unprotected society residents felt safer, happier, and were nicer; and that protected society residents dressed more similarly, stayed inside more, and had more rules. Most people preferred to live in the unprotected society. Positive attributions and preference for the unprotected society were associated with liberal (vs. conservative) political affiliation. Overall, we show that people use the physical features of built environments to make social inferences about residents' behaviors, traits, and mental states.

Enhanced Prototype Formation for Other-Race Faces in Infancy: Developmental Trajectories and Environmental Adaptations

Face prototype formation plays a pivotal role in translating face experience into robust internal representations. However, the early developmental trajectory and experiential influences on this cognitive capacity remain underexplored. Across five within-subject experiments conducted in Canada and China across different time periods, we investigated how face race and environmental context modulate infant prototype formation. Using a novel prototype formation paradigm, we discovered infants consistently exhibited stronger prototype formation for unfamiliar other-race faces compared to own-race faces, and this bias increased significantly with age. Furthermore, we demonstrated remarkable plasticity: infant cohorts tested during and after COVID-19 lockdowns showed opposite age-related trajectories in face prototyping, reflecting differential environmental exposure to diverse faces. These findings illuminate the experience-dependent nature of early face processing specialization, suggesting infants' prototype formation dynamically adapts to optimize face processing within specific environments. We discuss implications for understanding the developmental origins of face processing biases and potential social consequences.

Language models demonstrate the good-enough processing seen in humans

Comparative illusions, also called Escher sentences, are comparative sentences that appear acceptable but challenge the boundaries of comprehension. These illusions, the subject of this paper, require structural reinterpretation to resolve their anomalies, making them ideal phenomena for probing the mechanisms of language processing in humans and machines. Using human behavior as a benchmark for large language models' (LLM) performance, we assessed LLMs' behavior with three methods: prompting, probability measurement, and lexical disturbance. Our results indicate that, when prompted, LLMs display "human-like" behavior; that they struggled to reliably rank surprisals; and that they were sensitive to both lexical and syntactic cues in Eschers. These results indicate that LLMs seem to manifest the good-enough processing seen in human cognition. Understanding and narrowing the remaining discrepancies could result in AI systems that are more in tune with human reasoning and more interpretable, ultimately enhancing our grasp of sentence processing in humans and machines.

Do infants prefer owners over thieves?

Normative concerns for ownership fundamentally regulate social interaction regarding resources in humans, but little is known about the origins of ownership inferences and evaluations in early human development. Here, we ask if normative evaluations of ownership respect are rooted in an early-developing protomoral dispreference for those who violate ownership. Across two studies (N = 66), infants saw puppet shows where an owner was approached by a non-owner who attempted to take the owner's resource. Infants were subsequently allowed to choose between the two puppets. We controlled for the dominance implications of the outcomes of resource competition by systematically varying whether the non-owner or owner prevailed in attaining the resource. Both studies yielded clear Bayesian evidence that infants did not preferentially reach towards either the owner or the ownership violator. These results suggest that infants do not hold strong negative evaluations of those who violate ownership.

Integrating Specialist Judgments With and Without Mentalizing

In daily life, we frequently interact with specialists—individuals whose complementary insights augment our limited first-hand experiences in decision-making. Efficient as this may be, the cognitive demands of employing our theory of mind to determine and integrate the evidential contributions of diverse specialists may offset its benefits. In this study, we explore human judgments in a scenario where people must aggregate opinions from multiple social peers, each possessing expertise in a different aspect of the problem. We examine whether participants integrate specialist judgments with or without mentalizing using a normative Bayesian model, and we propose two heuristic approaches. Our results show that the majority of participants across two experiments relied on heuristics, suggesting that people tend not to integrate specialist judgments through mentalizing, or have only a limited ability to do so.
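
For intuition only, the sketch below contrasts a normative Bayesian combination of independent specialists with a simple averaging heuristic; the likelihood-ratio formulation and the particular heuristic are assumptions, not necessarily the models tested in the study.

    import numpy as np

    def bayes_aggregate(prior, likelihood_ratios):
        """Normative combination: multiply independent specialists' likelihood
        ratios into the prior odds (assumes conditional independence)."""
        odds = prior / (1 - prior) * np.prod(likelihood_ratios)
        return odds / (1 + odds)

    def averaging_heuristic(specialist_probs):
        """One plausible heuristic: take the unweighted mean of specialists' reports."""
        return float(np.mean(specialist_probs))

    print(bayes_aggregate(0.5, [3.0, 0.5, 4.0]))   # ~0.857
    print(averaging_heuristic([0.75, 0.33, 0.8]))  # ~0.627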

Children consider costs to owners when reasoning about ownership transgressions

Ownership affords different rights and privileges to owners than non-owners. We investigated whether children view transgressions that impose a large or permanent cost to the owner as less acceptable than (1) actions that impose small or temporary costs, and (2) actions that do not impose any costs to owners. Children aged three to eight years (N=72) and adults (N=72) were shown vignettes in which an agent interacts with someone else's property without permission. Both adults and children judged actions that imposed severe costs to owners as less acceptable than minor transgressions that imposed temporary costs and actions that did not involve physical contact. These findings reveal that children and adults consider the costs imposed on owners when judging the acceptability of people's interactions with others' property. Critically, these findings also provide preliminary evidence that children's concept of ownership may be embedded into their broader social cognitive framework of intuitive psychology.

Higher perceptual attention cost slows contingency learning after a modality shift

Learning perception-action contingencies from the environment depends on attention to efficiently control cognition and selectively process sensory information. Shifting attention between sensory modalities incurs a cost in humans, other primates, and rodents, resulting in slower learning in the new modality. Previous set-shifting work in rats showed that increased difficulty of perceptual discrimination in the preceding modality increases the following shift cost. We studied this in humans by manipulating perceptual attention in one sensory modality, titrating task demand by staircase design, to test the effect on perceptual contingency learning in another modality. To accommodate the complexity of human learning, we introduce a Bayesian method to decompose and estimate learning characteristics from performance data. This method identifies the completion of rule acquisition and consolidation, accounting for individual variation in learning. Results show the expected modality shift cost, and offer new evidence in humans that shift cost is exacerbated by prior demands on attention.

Children's Expectations of Emotional Intimacy in Close Relationships

Humans across cultures distinguish intimate, close, or family-like relationships from those that are merely affiliative. Recent work suggests that this distinction is so fundamental that even humans as young as 8 months recognize a common cue of social intimacy: close physical contact. In the current studies, we investigate whether children, ages 6 to 9 years, recognize another hallmark of intimate relationships: emotional intimacy. In Study 1, children used the disclosure of sad emotions, as opposed to facts or happy emotions, as a cue for close social relationships. Interestingly, adults thought that disclosing emotions more generally was indicative of closer relationships. In Study 2, children expected that people in close social relationships would more often disclose sad emotions, but not happy emotions or facts. Again, adults did not distinguish between happy and sad emotions: they thought people in closer relationships would disclose both happy and sad emotions rather than facts. In Study 3, neither children nor adults thought that disclosing sad emotions was a way to create social relationships. Together, these results suggest that by the age of six years, children connect close social relationships with emotional intimacy, but that they do not treat emotional disclosure as a means of creating such relationships.

The Role of Task-Unrelated Thinking Characteristics and Function in Affect Regulation During Online and On-site Classes

Task-unrelated thinking (TUT) can impact both performance and well-being, yet its role in affect regulation remains underexplored, especially in an educational context. This study examined TUT level, characteristics, and functions in 173 on-site and 143 online students, assessing their affect and class experiences in an ecological setting. While overall TUT levels did not differ between groups, distinctions emerged in characteristics (e.g., inner speech) and functions (e.g., stimulation or avoidance). Valence was the only characteristic predicting prospective sadness or anxiety. Using TUT for problem-solving or avoidance was linked to increased sadness, whereas using it for stimulation was linked to reduced anxiety. These findings highlight that TUT's effects depend more on its nature and purpose than its frequency. The observed link between avoidance-related TUT and negative affect has significant implications for clinical psychology and educational settings, particularly in understanding emotion regulation in online and on-site learning.

Children's expectations of paternalistic helping

Requests for help often guide our goal-directed helping behavior. However, sometimes the requested help does not actually accomplish the requester's ultimate goal. Here we ask whether 3- to 12-year-old children expect others to help in ways that prioritize others' ultimate goals instead of their requests, known as paternalistic helping. We also investigate whether children's expectations of paternalistic helping vary based on the relationship between the partners (classmates, enemies, and friends). In two studies (total N = 502), children were read a short vignette about a character who unknowingly requests a broken cup. Children were asked to predict whether a second character would give the requested but broken cup, or a different, unbroken cup. Children expected paternalistic helping when the requested item could not accomplish the target's goal. But, they were less likely to expect paternalistic helping when characters were described as enemies.

Learning Efficient Recursive Numeral Systems via Reinforcement Learning

It has previously been shown that by using reinforcement learning (RL), agents can derive simple approximate and exact-restricted numeral systems that are similar to human ones (Carlsson, 2021). However, it is a major challenge to show how more complex recursive numeral systems, similar to, for example, English, could arise via a simple learning mechanism such as RL. Here, we introduce an approach towards deriving a mechanistic explanation of the emergence of efficient recursive numeral systems. We consider pairs of agents learning how to communicate about numerical quantities through a meta-grammar that can be gradually modified throughout the interactions. Utilising a slightly modified version of the meta-grammar of Hurford (1975), we demonstrate that our RL agents, shaped by the pressures for efficient communication, can effectively modify their lexicon towards Pareto-optimal configurations which are comparable to those observed within human numeral systems in terms of their efficiency.

Frequency and informativity of phonological input directed to children in the first four years of life

Information theory characterizes how signals are optimized for transmission from source to receiver across noisy channels, yet little is known about how these principles manifest when the receiver's capabilities change over time. Using child-directed speech as a natural experiment, we analyzed >7.5 million phones in North American English caregiver speech to children aged 3-44 months (N=218) from the CHILDES database. We found that while the relative frequency of individual phones remained stable over this developmental time period, phonological informativity increased from early infancy (3-8 months) through toddlerhood (27-32 months), before plateauing in the preschool years. This result suggests that speech directed to children sounds less redundant, with a phonological structure that is harder to predict in context, as children progress through early childhood. Our findings demonstrate how linguistic signals may be optimized to accommodate receiver (child) characteristics, with implications for both general principles of information transmission and theories of how children carve out linguistic representations and patterns from limited, noisy input.
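
As background (not the authors' exact estimator), a phone's informativity is often approximated as its average surprisal given the local context; a minimal bigram version is sketched below on a toy pair of phonemic utterances.

    import math
    from collections import Counter

    def phone_informativity(utterances):
        """Estimate each phone's informativity as its mean surprisal given the
        immediately preceding phone, averaged over that phone's tokens."""
        bigrams, contexts = Counter(), Counter()
        for phones in utterances:
            seq = ["#"] + phones                      # utterance-initial boundary symbol
            for prev, cur in zip(seq, seq[1:]):
                bigrams[(prev, cur)] += 1
                contexts[prev] += 1
        surprisal_sums, totals = {}, Counter()
        for (prev, cur), n in bigrams.items():
            s = -math.log2(bigrams[(prev, cur)] / contexts[prev])
            surprisal_sums[cur] = surprisal_sums.get(cur, 0.0) + n * s
            totals[cur] += n
        return {ph: surprisal_sums[ph] / totals[ph] for ph in totals}

    print(phone_informativity([["k", "ae", "t"], ["k", "ae", "p"]]))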

Language models assign responsibility based on actual rather than counterfactual contributions

How do language models assign responsibility and reward, and is it similar to how humans do it? We instructed three state-of-the-art large language models to assign responsibility (Experiment 1) and reward (Experiment 2) to agents in a collaborative task. We then compared the language models' responses to seven existing cognitive models of responsibility and reward allocation. We found that language models mostly evaluated agents based on force (how much they actually did), in line with classical production-style accounts of causation. By contrast, humans valued actual and counterfactual effort (how much agents tried or could have tried). These results indicate a potential barrier to effective human-machine collaboration.

Why Multimodal Models Struggle with Spatial Reasoning: Insights from Human Cognition

Multimodal models excel in tasks requiring semantic integration of language and vision but struggle with spatial cognition. Using a visual perspective-taking task inspired by cognitive science, we find these models fail when the image and reference view differ, reflecting spatial cognition comparable to a two-year-old child. To explore these disparities further, we analyze internal representations using a human action fMRI dataset and voxelwise encoding models, revealing key differences between AI and human spatial encoding. This work provides new benchmarks and insights into bridging artificial and biological cognition.

U.S. adults' beliefs and explanations about health disparities

Public health data highlight disease outcome disparities corresponding to age, social class, and race, which are due to biological, behavioral, and/or structural factors (depending on the disparity). We examined whether adults are aware of these disparities and how they explain them. This study recruited U.S. adults (N = 241) through Mechanical Turk and examined whether they thought that there was a relation between social categories and illness. We examined their judgments and explanations for transmitting and contracting COVID-19 or the common cold. We found that adults thought that older adults and poor people were more likely than younger adults and rich people to get sick, whereas younger adults were more likely than older adults to transmit disease. People relied on biological explanations for disparities due to age, and structural explanations for disparities due to social class. However, the results for race were more mixed, suggesting that people do not always assume that social categories are related to illness.

Cognition in Action: The relation between physical and mental paper folding in young children

Physically folding paper is a common activity performed by many children, but it is not mastered until middle childhood. Paper folding ability has been the focus of studies of motor development. There has been a long history in cognitive science of assessing spatial skills through mental paper folding tests. Despite the similarities between physical and mental paper folding, it is currently unknown whether there is a relation between physically and mentally folding paper. This study examined both skills in 107 children aged 3 to 8 years. Our results show that children of all ages were able to physically fold paper, but became more accurate with age. Additionally, we found a significant relation between physical and mental paper folding, and this relation was robust to different statistical controls and model specifications. We discuss how these results influence our understanding of the co-development of cognitive and motor skills.

Neural Signatures of Semantic and Perceptual Memory Formation Become More Similar Across Development

In adults, the contribution of the prefrontal cortex and hippocampus to memory encoding varies depending on the type of information being learned. Because these regions are still developing in children, their contribution to the formation of memories for different types of associations may differ from that of adults. Here, we examined how semantic and perceptual similarity between items affects memory behaviour and neural engagement in children (6-7 years) and adults. Participants completed a pair learning task during functional magnetic resonance imaging, in which pairs were perceptually or semantically related. Memory was tested outside the scanner with cued recall. Semantic similarity facilitated recall in both age groups, but more so in adults. Neurally, semantic pairs elicited broad frontoparietal activity while perceptual pairs engaged ventral visual and lateral prefrontal areas. Children showed more distinct neural responses to semantic versus perceptual pairs than adults, as well as more engagement in anterior hippocampus for semantic than perceptual pairs. These findings suggest that semantic similarity provides a powerful scaffold for memory across development, with age-related changes in memory encoding marked by a shift toward reliance on more integrated neural systems.

Decompose, Deduce, and Dispose: A Memory-Limited Metacognitive Model of Human Problem Solving

Many real-world problems are defined by complex systems of interlocking constraints. How people are able to solve these problems with such limited working memory capacity remains poorly understood. We propose a formal model of human problem-solving that uses metacognitive knowledge of its own memory limits and imperfect reasoning to guide subproblem choice. We compare our model to human gameplay in two experiments using a variant of the classic game Minesweeper. In Experiment 1, we find that participants' accuracy was influenced both by the order of subproblems and their ability to externalize intermediate results, indicative of a memory bottleneck in reasoning. In Experiment 2, we used a mouse-tracking paradigm to assess participants' subproblem choice and time allocation. The model captures key patterns of subproblem ordering, error, and time allocation. Our results point toward memory limits and strategies for navigating those limits—including the careful choice of subproblems and memory offloading—as central elements of human problem-solving.

Children learn the meaning of ambiguous evidence from third-party belief revision

A person who changes their mind signals that they have encountered new information that prompted their belief shift. Can children use their developing understanding of third-party belief revision to take advantage of this signal for their own learning? Children (5;0-9;11 years) played a Whodunit-style game in which detectives updated their beliefs in response to different clues. The clues in isolation were meaningless to participants. In simple cases, children accurately inferred the meaning of the clues based on how they changed others' beliefs. With age, children more readily integrated changes in agents' certainty to guide these inferences. These findings suggest that children can draw reverse inferences about evidence by leveraging a causal understanding of how it impacts an agent's beliefs. Thus, children may learn world knowledge indirectly by observing its effects on others' minds.

No Evidence for Cost-Benefit Arbitration Between Social Learning Strategies

When learning a task by observing another person performing it, an individual can either focus on imitating the other's behavior (policy imitation), or attempt to infer the other's goals and beliefs and adjust their own behavior accordingly (goal emulation). Imitation is considered to be computationally cheap but less accurate, while emulation is considered to be computationally costly but more accurate. Drawing upon research on computational resource rationality, we ask whether individuals incorporate cost-benefit considerations when choosing whether to imitate actions or emulate goals. To answer this question, we used an observational-learning extension of a two-step bandit task, and manipulated the reward at stake. Participants' behavior was best fit by a dual-process model of goal emulation and one-step imitation, consistent with findings from previous research. However, contrary to our hypothesis and inconsistent with cost-benefit arbitration, we found no evidence that rewards at stake influenced participants' social learning strategies.

The Role of Spatial Frequency in Cuteness Discrimination of Infant Faces: An EEG Study

Infant facial cuteness serves as an evolutionary mechanism to enhance survival prospects by eliciting caregiving behaviors in adults. Spatial frequency (SF) processing is the basic mechanism for visual analysis. However, how different SFs contribute to the neural mechanisms underlying cuteness discrimination of infant faces remains poorly understood. To address this question, this study investigated how low SF (LSF) and high SF (HSF) differently modulate cuteness discrimination using behavioral accuracy measurement, event-related potential (ERP), time-frequency, and functional connectivity analyses of recorded electroencephalogram (EEG) signals. Thirty participants performed a paired-comparison task in which they selected the cuter of two infant faces filtered with broad SF (BSF), LSF-only, or HSF-only content. The behavioral results indicated that participants' cuteness discrimination ability in the BSF condition was higher than in the LSF and HSF conditions. Time-domain analysis revealed that LSF faces elicited larger P1 amplitudes, while HSF faces evoked enhanced N170 and P300 components. Time-frequency and functional connectivity analyses further identified stronger theta-band oscillations and increased theta-band connectivity in posterior areas when HSF faces were presented. These findings provide novel insights into the neural mechanisms underlying cuteness discrimination and highlight the importance of SF integration in social face processing.

Dynamics of topic exploration in conversation

Conversations are intricately structured forms of social interaction in which talkers move through interconnected topics with nested levels of semantic specificity. What principles govern how conversational partners jointly navigate an expansive topic space? To characterize these dynamics, we introduce a new dataset of annotated topic shifts from N=1,505 annotators on 200 distinct video call conversations between strangers (Reece et al., 2023). Conversational dyads made stochastic but systematic transitions between topics, and within individual topics, we find that dyads begin concentrated in semantic space before dispersing to more idiosyncratic regions as topics progress. The same dispersion pattern also holds over entire conversations, providing quantitative evidence for nested levels of increasing specificity over conversations. Overall, our findings suggest that strangers get to know one another through systematic exploration of topic space, revealing hierarchical structure in idle talk.
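
One simple way to operationalize the dispersion measure described above, assuming each utterance has already been embedded as a vector (the embedding model is left unspecified), is sketched here.

    import numpy as np

    def dispersion(embeddings):
        """Mean Euclidean distance of utterance embeddings from their centroid."""
        E = np.asarray(embeddings)
        centroid = E.mean(axis=0)
        return float(np.linalg.norm(E - centroid, axis=1).mean())

    def dispersion_over_topic(embeddings, n_bins=3):
        """Split a topic's utterances into temporal bins and measure how
        semantic dispersion changes from the topic's start to its end."""
        bins = np.array_split(np.asarray(embeddings), n_bins)
        return [dispersion(b) for b in bins]

Comparing the bin-wise values within a topic (or across an entire conversation) would show whether dyads begin concentrated and disperse as talk progresses.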

Modelling compounding across languages with analogy and composition

Compounding is a common word formation process in many languages around the world. Previous semantic analyses of compounding suggest that analogy and composition are crucial cognitive processes that underlie the formation of new compounds, but these processes are typically considered separately. Here, we formulate a computational model of compounding that integrates both analogy and composition. Compared to simpler baselines, we show that the model combining both processes achieves the best performance in predicting the constituents of attested compounds in English, Chinese, and German. Our work extends previous semantic-based accounts of compounding via a computational approach that can be evaluated using large-scale crosslinguistic data.

Facilitating Human-AI Coordination through Computational Theory of Mind

How can an AI teammate implicitly coordinate with a human? We address this question by integrating Instance-Based Learning (IBL), a cognitive theory of learning and decision making, with the level-k Theory of Mind framework. We hypothesize that coordination emerges when partners adopt complementary k-levels and when the higher k-level agent has an accurate model of their partner's cognitive processes. To test this hypothesis, we introduce a simultaneous-choice, multi-attribute task, where outcomes depend on interactions between choice features and agent decisions. Simulations of pairs of IBL-based agents at different k-levels support the hypothesis that complementary k-levels enhance coordination. However, empirical results from an experiment reveal no advantage of [human, IBL-L2] pairs over [human, IBL-L1] pairs, even when participants are restricted to operate as L1 agents. Post-hoc simulations show that model fitting recovers some advantage for [human, IBL-L2] teams by enabling the IBL-L2 agent to more accurately predict their human partner's actions.

Setting and Adjusting Thresholds in an Optimal Stopping Task: Model Predictions and Empirical Results

Researchers have proposed that people set thresholds to decide when to stop searching in optimal stopping tasks with full information, where option values are known. Most models assume that individuals set internal thresholds to guide their stopping decisions. However, whether humans actually set and adjust thresholds with experience remains unexamined. This experiment investigates how people set and adjust thresholds and whether this affects search behavior and learning over time. We designed an optimal stopping task where participants either report a threshold before seeing the option's value or proceed without setting one. In addition, we varied whether the set threshold was binding for stopping decisions. Our findings, based on model predictions and empirical data, suggest that setting thresholds leads to more errors and lower accuracy. Accuracy is lowest when thresholds are non-binding. Participants often deviate from their set thresholds and perform better when they do so. These results challenge the assumption that people rely on thresholds for stopping decisions. Instead, they seem to learn from experience to improve accuracy and reduce errors, offering new insights into sequential decision making.
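
As background on threshold accounts, the sketch below derives position-dependent thresholds by backward induction for a toy full-information task with Uniform(0, 1) option values; the experiment's actual value distributions, payoffs, and procedures are not assumed here.

    import numpy as np

    rng = np.random.default_rng(2)

    def optimal_thresholds(n_options):
        """Backward-induction thresholds that maximize the expected chosen value
        when option values are drawn i.i.d. from Uniform(0, 1)."""
        v = 0.5                          # expected value if forced to take the last option
        thresholds = [0.0]               # last position: accept anything
        for _ in range(n_options - 1):
            thresholds.append(v)         # accept if the current value beats continuing
            v = v + (1 - v) ** 2 / 2     # E[max(value, v)] for a Uniform(0, 1) value
        return thresholds[::-1]

    def run_trial(thresholds):
        values = rng.uniform(size=len(thresholds))
        for value, thr in zip(values, thresholds):
            if value >= thr:
                return value
        return values[-1]

    thresholds = optimal_thresholds(10)
    mean_payoff = np.mean([run_trial(thresholds) for _ in range(10_000)])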

Event Boundedness Affects Attention Allocation during Online Event Processing

The human mind can segment continuous streams of activity in the world into meaningful, discrete units known as events. However, not all events are created equal. We draw a distinction between bounded events (e.g., folding a handkerchief) that have a predictable structure that develops in distinct stages (i.e., a beginning, middle, and end) and a well-defined endpoint, and unbounded events (e.g., waving a handkerchief) that lack such a well-defined structure and endpoint. We predict that event boundedness (bounded vs. unbounded) will affect attention allocation patterns over the course of the event. Here, we tested this prediction using a dwell time paradigm by measuring the time participants spent on each still frame of an activity. We found that event endpoints attracted increased attention compared to midpoints; importantly, this increase was significantly greater when people viewed bounded events, compared to unbounded events. In addition, event endpoints attracted increased attention compared to event beginnings, but this pattern also interacted with event boundedness. We conclude that abstract internal event structure (specifically, event boundedness) affects attention allocation during online event apprehension.

Large language model tokens are psychologically salient

Large language models segment words into chunks called tokens, using compression algorithms that ignore semantics. We investigated whether tokenization corrupts representations of word meanings in 17 languages. We found that GPT-4o and Llama 3 inflate the similarity of words that share tokens. However, tokens turned out to be good predictors of orthographic priming, such that people recognize a target word faster after reading a prime that ends with the same token. This boost in priming far exceeds what other overlapping strings of letters explain, which suggests that tokenization selectively identifies functional subword units. The pattern extends to the production of word associates in English: Tokens capture phonologically motivated associations, while other strings of letters do not. So, tokenization does influence semantic representations, but because tokens correspond to psychologically salient orthographic and/or phonological constituents, they may endow large language models with human-like language networks and facilitate alignment with human word processing.

Joint Action and Reward-Seeking in a Social Probability-Learning Task

Despite the prevalence of social learning in humans, the cognitive mechanisms underlying social transmission of behavior are not fully understood. We refine and expand a recently-developed paradigm, the Social Multi-Armed Bandit (SMAB) task, to systematically manipulate social information and measure its effects on human decision-making during predictive learning. We compared a dyadic task, in which social influence is maladaptive, to a control task in which participants receive no task-related social information. We found that misleading social information resulted in more maladaptive choices in the Dyadic than the Control condition, confirming findings from a prior study (Adrian, Siddharth, Baquar, Jung, & Deák, 2019). Although the maladaptive Dyadic social effect attenuated across trials and between two "games" (100 trials each), correlations between partners' higher-order response patterns persisted across games. These correlated response patterns between individuals within dyads suggest a tendency to emulate higher-order patterns (i.e., heuristics or strategies). The results imply that adults sometimes emulate decision strategies even when the outcome is disadvantageous. They also suggest that social learning may be reflected in higher-order response patterns even after people learn that imitating specific actions is maladaptive.

Referential Form, Word Order, and Implicit Causality in Turkish Emotion Verbs

Pronouns are unique in discourse, as their meaning depends almost entirely on context. Early theories provided simple accounts of how meaning is determined, but research has revealed complex influences across syntax, semantics, discourse, and pragmatics. Evaluating theories is challenging due to methodological inconsistencies and a focus on English, limiting generalizability. Here, we take a step towards a clear empirical foundation for theory, with a tightly controlled study of comprehension of overt and null pronouns in Turkish. We show that pronoun resolution in Turkish is influenced by verb type, word order, and referential form, though not always in ways predicted by existing theories. Our findings highlight the need for further cross-linguistic research to refine models of pronoun interpretation and better account for the interaction of syntactic and discourse factors.

Beyond words and actions: what implicit measures reveal in preschoolers' performance on the RMTS task

This study investigates relational reasoning in preschoolers using the Relational-Match-To-Sample (RMTS) task, which tests their ability to match "same" and "different" relations. We investigate (1) whether 4-year-old children can succeed in the RMTS task and (2) whether verbal justifications of relational language predict success. Forty-nine children participated (Mage=54.97 months), and their performance was measured both behaviourally and through eye-tracking. Results show children identified relational matches above chance. Children who used relational language selected relational matches more often. Eye-tracking data revealed distinct temporal looking patterns during relational and non-relational choice trials, with children preferring relational matches after a brief comparison phase. A cluster-based analysis confirmed that children looked longer at relational than non-relational matches. These findings suggest that relational reasoning in preschoolers involves a dynamic comparison process, and eye-tracking provides valuable insight into this implicit cognitive process.

Play fair: Humans prefer an equal division of labor in a joint multiple object tracking task

In daily life, humans perform tasks in teams, in which the labor division is decided by one individual in the team and not jointly by the team members (e.g., if an employer delegates a task to an employee). In the present study, we tested how the labor is divided if one team member decides the labor division in a joint multiple object tracking (MOT) task. We found that humans preferred to equally split the number of tracked targets in the joint MOT task. When comparing the data with our previous study, in which participants performed the same task with a computer partner, we found that this preference for an equal labor division is specific to interactions with a human but not with a computer partner. Moreover, participants also tended to take into account the tracking difficulty of the delegated targets more with a human compared to a computer partner.

The Role of Feedback in Cognitive Offloading during Human-Computer Collaboration

Cognitive offloading refers to the usage of physical actions to reduce the cognitive load of a task. This study investigates cognitive offloading in a multiple object tracking (MOT) task, in which participants could decide whether they wanted a computer partner to track targets on their behalf. Depending on the experimental condition, participants either received team performance feedback or not. In all conditions, participants knew that the computer partner would track targets flawlessly. We hypothesized that feedback would change participants' metacognitive assessments such that they would maximize performance (i.e., changing their strategy by offloading more or all targets to the computer partner). We find that participants do offload targets to the computer partner in all conditions. Yet, performance feedback did not increase the extent of cognitive offloading. While these findings dovetail with previous findings, future studies are needed to definitively rule out whether performance feedback has an effect on cognitive offloading.

Exploring Neural Synchronization with EEG Using Fractal Animations

This study explores inter-subject neural synchronization, measured via inter-subject correlation (ISC), using EEG during fractal animation observation. Fractal animations, characterized by iterative self-similarity and visual complexity, provide a controlled stimulus devoid of semantic or emotional content, facilitating analysis of core sensory and predictive mechanisms. Fifteen participants watched fractal animations while their EEG was recorded. Following preprocessing, including artifact removal and spatial filtering, ISC was calculated using correlated component analysis. Results showed robust synchronization in occipital regions, linked to early visual processing, and frontal areas, associated with attentional control. Two control manipulations—phase randomization and temporal shuffling—reduced ISC by 65% and 70%, respectively (p < 0.001), confirming that the synchronization was stimulus-driven. Peak synchronization aligned with heightened visual complexity and abrupt transitions. A correlation between self-reported focus and ISC highlighted top-down modulation. These findings endorse fractal animations as a powerful paradigm for studying neural responses, yielding insights into multisensory integration and cognitive processing.
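
The authors compute ISC with correlated component analysis (a generalized eigenvalue method); as simpler background intuition rather than the study's pipeline, ISC can be illustrated per channel as the mean pairwise correlation of subjects' time courses.

    import numpy as np

    def isc_per_channel(eeg):
        """Simplified inter-subject correlation: for each channel, the mean
        Pearson correlation of that channel's time course across all subject
        pairs. eeg has shape (subjects, channels, time)."""
        n_subj, n_chan, _ = eeg.shape
        isc = np.zeros(n_chan)
        for c in range(n_chan):
            r = np.corrcoef(eeg[:, c, :])           # subjects x subjects correlation matrix
            upper = r[np.triu_indices(n_subj, k=1)]  # unique subject pairs
            isc[c] = upper.mean()
        return isc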

The Role of Worldview Congruence in Misinformation Correction: A Bayesian Approach to Belief Updating

Misinformation poses a growing challenge in society, particularly as people seem reluctant to revise discredited information. However, emerging research suggests that people's persistence in believing discredited information is sometimes rational, with individuals applying their own assumptions in a consistent way. In this experimental study, we employ a political vignette with a false accusation to show that even in politically charged contexts, people can correct misinformation and return their beliefs to baseline. Our results demonstrate that participants' belief updating generally aligns with Bayesian predictions, although with a more conservative approach. With regards to source evaluation, participants were more likely to downgrade the reliability of an accuser's claim when it conflicted with their political views but returned to their initial assessments after the correction. This suggests that while worldview can affect source evaluation at the individual level, this effect does not necessarily translate into a broader erosion of institutional credibility. This study enhances our understanding of how worldview influences belief revision and source evaluation, especially in politically sensitive contexts.

Political Polarization and Fractionalisation from Rational Values-Based Inference in an Agent-Based Graph Network

The rise in political polarization disrupts political consensus and causes individual harm. We build on a theoretical framework of political polarization that emerges from uncertain political identity inference and signaling mediated by moral values. The current computational model extends this framework with rational inference tools and graph theory to better capture the complex dynamics of value-based inference and group formation. We find that minimally constrained signaling and promiscuous inference and updating of moral values leads to general network homogeneity. This contrasts with previous models using the same overarching theoretical framework and highlights the influence of model implementation, which should be further explored to triangulate the necessary causes of polarization. We discuss future extensions to the model to explore what facilitates political polarization as found in previous studies and the real world.

Method for Quantification of the Process of Collaborative Creativity: Visualization of the Dynamics by C2RQA

This study proposes an analytical method to visualize and quantify the process of collaborative creativity. While many studies have theoretically emphasized the importance of process in creativity, its complex nature—characterized by emergence and revisitability, representation and embodiment, and conscious and unconscious aspects—has made it difficult to quantify in a standardized way. We introduce an extended version of cross-recurrence quantification analysis, C2RQA, as a suitable method. C2RQA is applicable to various data types, including continuous, categorical, and binary, and can visualize correspondences between two time series, thereby revealing interaction dynamics. We applied C2RQA to two creative activities: an idea generation task and an insight problem. The results suggest that C2RQA effectively captures broad dynamic transitions in ideas and the underlying subconscious processes involved in collaborative creativity.
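
Cross-recurrence quantification analysis underlies C2RQA; as background, the minimal sketch below builds a binary cross-recurrence plot for two categorical time series and computes its recurrence rate. The extensions that define C2RQA (handling of mixed data types and the specific quantification measures) are not reproduced here.

    import numpy as np

    def cross_recurrence(seq_a, seq_b):
        """Binary cross-recurrence plot for two categorical time series:
        cell (i, j) is 1 when seq_a[i] matches seq_b[j]."""
        a, b = np.asarray(seq_a), np.asarray(seq_b)
        return (a[:, None] == b[None, :]).astype(int)

    def recurrence_rate(crp):
        """Proportion of recurrent points in the plot."""
        return float(crp.mean())

    crp = cross_recurrence(["idea1", "idea2", "idea1"], ["idea2", "idea1", "idea1"])
    print(recurrence_rate(crp))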

Sense-Making, Cultural Scripts, and the Inferential Basis of Meaningful Experience

Cognitive science has made great progress in understanding how we explain and make sense of a complex world. We lack, however, an account of a deeper notion: how an experience that makes sense can become one that is meaningful. We present an account of how explanation and sense-making can lead to meaning-making, by the use—and, crucially, re-use—of a small set of cultural scripts: explanatory complexes that can be shared across domains by members of a social group. We explain meaning-making as a process of inference in which an individual leverages these cultural scripts to segment their full, unstructured set of experiences into a form that can be understood and endorsed as significant. Our account suggests how cultural artifacts (particularly stories in the form of novels, plays, and movies) are crucial for the transmission of these scripts. We present a mathematical model of this inferential process that can account for a range of phenomena which typically resist formalization. This includes the importance of narratives in meaning-making, the difficulty of articulating meaning separately from experiences that encapsulate it, and the ways in which the standard interpretation of a stable cultural artifact can change radically over time.

Segmentation can drive the cultural evolution of the statistical properties of language

Language is passed from one generation of learners to the next via cultural transmission. This process has been shown to give rise to core properties of language that enhance its learnability. Recent experimental work shows that statistical properties of language can also emerge through cultural transmission: specifically, the statistical coherence of words, and the Zipfian distribution of word frequencies. It has been proposed that these properties emerge because they facilitate segmentation. However, it is not clear whether segmentation is necessary for their emergence. We use a computational iterated learning model to simulate the cultural transmission of unsegmented sequences under different assumptions about the nature of learning. We show that segmentation indeed promotes the emergence of these statistical properties, whereas tracking of unigram statistics does not. In addition, we show that tracking sequential statistics alone can also promote their emergence.
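
A highly simplified sketch of an iterated learning chain in which each generation extracts chunks from the previous generation's output and produces new sequences from them; the chunking heuristic and all parameters are illustrative assumptions, not the authors' model.

    import random
    from collections import Counter

    random.seed(0)

    def learn_chunks(stream, n_chunks=20):
        """Toy segmentation-based learner: keep the most frequent adjacent
        syllable pairs from the input stream as reusable chunks."""
        pairs = Counter(zip(stream, stream[1:]))
        return [pair for pair, _ in pairs.most_common(n_chunks)]

    def produce(chunks, length=400):
        """Generate output for the next learner by sampling stored chunks."""
        out = []
        while len(out) < length:
            out.extend(random.choice(chunks))
        return out

    syllables = list("abcdefgh")
    stream = [random.choice(syllables) for _ in range(400)]  # generation 0: random
    for generation in range(10):                             # iterated transmission chain
        chunks = learn_chunks(stream)
        stream = produce(chunks)

    # inspect the chunk frequency distribution in the final generation's output
    print(Counter(zip(stream[::2], stream[1::2])).most_common(5))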

Bootstrapping in Geometric Puzzle Solving

We explore how people "bootstrap", or reuse chunked action sequences, to tackle complex problems in a novel puzzle task. In this task, participants perform sequences of actions to recreate target shapes. In our experimental condition, participants are trained on problems whose best solutions share a distinct abstract action sequence, or schema. Meanwhile, a control group is trained on tasks of commensurate difficulty whose solutions do not conform to this pattern. We find that experimental-condition participants outperform controls on a set of more difficult test puzzles whose solutions are compositional generalizations of the experimental group's training tasks. Notably, the experimental group outperformed controls even in "far transfer" tasks that lacked surface similarity to training in both their target shape and solution sequence. Our results provide a compelling demonstration of the human ability to cache and reuse abstract patterns, offering new insights into how humans approach complex problems that, naively, seem to demand prohibitive amounts of planning or trial and error.

The Low Prevalence Effect with Random Motion Stimuli in a Visual Search Task

The low prevalence effect (LPE), a decrease in target detection performance as target prevalence decreases, is a concern in real-world visual search tasks, such as baggage screening. Unfortunately, much of the research into the LPE and its potential countermeasures may not represent the challenges of other real-world search tasks, such as sonar and security monitoring, in which objects in the search environment exhibit movement. Additionally, target (e.g., submarine) and non-target (e.g., merchant ship) movement may interact with target prevalence to further decrease target detection performance, but these factors have not been systematically manipulated to determine their effects. In Experiment 1, high and low prevalence targets occurred in search conditions with static or randomly-moving objects. Although there was no significant interaction effect detected between target prevalence and motion, these conditions independently contributed to significantly lower hit rates. In Experiment 2, a higher prevalence target was included as an attempted LPE countermeasure. Observers searched for a relatively high and a low prevalence target in the moving search task. The addition of a higher prevalence target improved target detection overall in a search environment with moving objects, serving as a possible countermeasure to the LPE.

Why Models of Scientific Communication Disagree

Agent-based models (ABMs) have become a valuable tool in social epistemology for addressing a fundamental question: How should scientists communicate? Yet, ABMs often yield conflicting results—some suggest that high levels of communication decrease group accuracy, while others find it beneficial. Why do models differ so dramatically? We argue that these discrepancies arise from a simple fact: different models use different conceptions of "communication," and model qualitatively different phenomena. To demonstrate the effects of such differences, we integrate three paradigmatic conceptions of communication from the literature—direct evidence sharing, belief averaging, and testimony exchange—into a unifying simulation. This allows us to test them under identical conditions. Our findings suggest that communication is generally beneficial. However, the effects of communication vary significantly by which conception one adopts, even under identical conditions. This lack of robustness highlights a critical issue: outcomes depend heavily on how communication is modeled.
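
To make the contrast concrete, a toy comparison of two of the three conceptions is sketched below, with agents on a complete network estimating a coin bias; the agents, network, and update rules are illustrative assumptions rather than the paper's implementation.

    import numpy as np

    rng = np.random.default_rng(3)

    def evidence_sharing(n_agents=10, rounds=50, p_true=0.6):
        """Agents each flip a coin privately, then share the raw outcomes with
        everyone; beliefs are Beta-posterior means over all observed evidence."""
        heads = np.zeros(n_agents)
        trials = np.zeros(n_agents)
        for _ in range(rounds):
            flips = rng.binomial(1, p_true, n_agents)
            heads += flips.sum()           # complete network: everyone sees all flips
            trials += n_agents
        return (heads + 1) / (trials + 2)  # posterior mean under a uniform prior

    def belief_averaging(n_agents=10, rounds=50, p_true=0.6):
        """Agents update only on their own flip, then average beliefs with the group."""
        beliefs = np.full(n_agents, 0.5)
        for t in range(1, rounds + 1):
            flips = rng.binomial(1, p_true, n_agents)
            beliefs += (flips - beliefs) / (t + 1)   # running-average private update
            beliefs[:] = beliefs.mean()              # DeGroot-style averaging step
        return beliefs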

CNNs Generalize Numerosity Across Naturalistic Stimuli Without Single-Unit Selectivity

Previous studies observed that neural network models develop numerosity-selective units when trained to perform object classification, without explicit training on numerosity. However, this emergentist view was challenged by the finding that selectivity disappears with larger sample sizes for model evaluation. Here, we investigate whether this finding was due to the qualitative visual mismatch between training and evaluation data. We present experiments with three types of neural networks, optimized either for object classification, numerosity, or both. Using a novel dataset in which both training and evaluation images include daily-life objects, we analyze layer and single-unit selectivity across a range of conditions, varying the visual properties of our evaluation images. Our results suggest that numerosity classification performance is exclusive to numerosity-trained networks. Moreover, we observe a discrepancy between single-unit numerosity selectivity and overall network performance. This suggests that numerosity may be represented through different encoding patterns than previously assumed.

The Dynamics of Collective Creativity in Human-AI Hybrid Societies

Generative AI is shaping an increasingly hybrid society, where ideas and cultural artefacts are created both by humans and intelligent machines. Human creativity is influenced in complex, nonlinear ways by the actions of AI-driven agents within their social networks, but these influences are difficult to measure using traditional methods. This study examines how human-AI interactions shape the evolution of collective creation within large-scale social network experiments, where human and AI participants collectively create stories. Participants (either humans or AI) joined 5×5 grid-based networks in which stories were selected, modified, and shared over many iterations. Initially, AI-only networks showed greater creativity (rated by a separate group of human raters) and collective diversity of stories than human-only and human-AI networks. However, over time, hybrid human-AI networks became more diverse in their creations than AI-only networks. In part, this is because AI agents retained little from the original stories, while human-only networks preserved continuity. These findings highlight the value of experimental social networks in understanding human-AI hybrid societies.

Do perceivers contribute to object perception?

In this paper, we argue that the contributions of perceivers to object perception can substantially affect what objects are represented in perceptual experience. To capture the scalar nature of these perceiver-contingent contributions, we discuss two grades of subject-dependency in object perception. The first grade, weak subject-dependency, concerns attentional changes to perceptual content, for instance when a perceiver turns her head, plugs her ears, or has her attention primed for a particular cue. The second grade, strong subject-dependency, concerns generating perceptual objects whose existence depends upon their perceivers' sensory contributions. We offer evidence from the future-directed anticipation of perceptual experts and from the feature binding of synesthetes to exemplify this nonstandard, subject-dependent form of object perception. We conclude that strongly subject-dependent perceptual objects are more than mere material objects, but are rather a necessary combination of material objects with the contributions of a perceiving subject.

Spatial Terms in English Plus Twelve Languages: Evidence for Functional and Geometric Classes

A long-standing open question concerns the universal properties of the representations of spatial expressions, specifically, the question of whether they fall into two separate classes, functional/force-dynamic vs. geometric (Landau, 2025). Recently Viechnicki et al. (2024) proposed a new method for examining spatial expressions across languages by filtering massive parallel text corpora for basic locative constructions (BLCs); that study showed promising but tentative and limited evidence for the two classes of spatial expressions across languages. The current study replicates and extends those overall findings using a larger corpus and a more effective filtration technique. Experiment 1 analyzes cross-linguistic variational patterns from a corpus of parallel BLCs from 12 languages, finding distinct patterns for functional and geometric spatial terms. Experiment 2 examines the semantics of ground objects from a large corpus of English BLCs and reveals additional evidence for two underlying classes of spatial relations. The two experiments strengthen the evidence from web-scale corpus linguistics supporting distinct universal cognitive representations of functional vs. geometric spatial terms.

Infants' Expectations About the Kinds of Distal Effects Communicative Actions Can Induce

Previous studies investigated infants' ability to recognize turn-taking exchanges of signals that can serve communicative information transfer and draw pragmatic inferences from them. Here we investigate 13-month-olds' expectations about the distal effects of communicative versus non-communicative actions and explore their understanding of the epistemic causal mechanisms through which communicative signals modify their addressees' consequent intentional actions. In four looking time experiments (Ntotal = 80), we found that infants understand that communicative signals cannot bring about non-intentional state changes in other entities and expect their distal effects to be limited to inducing intentional behavioral reactions in recipient agents. These results indicate that human infants possess cognitive mechanisms to understand the unique causal affordances of ostensive communicative actions. Coupled with their evolved pragmatic inferential capacities and communicative mindreading skills, these abilities form a specialized cognitive system for interpreting ostensive communicative information exchange between communicating social partners.

Flooding the Zone: An Agent-based Exploration

Online public discourse faces many threats such as human and bot networks spreading disinformation or harassment campaigns aimed at excluding certain voices. One such threat is the strategy of 'flooding the zone': intentionally pumping into the discourse information that is irrelevant to, or distracting from, an important issue. This technique is employed by both individual and state actors with seeming success. How and why that technique is successful, by contrast, is less well understood. In this paper we use agent-based modelling to help elucidate the disruptive impact of flooding the zone on communication itself. Specifically, we probe the ways in which flooding hampers the spread of relevant information and show consequences of this even for idealized, rational actors.

"Can you tell I used ChatGPT?" How Perceived AI-Mediation Affects Workplace Email Persuasiveness— A Bayesian Approach

Large Language Models like ChatGPT are becoming everyday writing partners in the workplace. This study asked: how does simply knowing an email was "edited by ChatGPT" affect its persuasiveness and the perceived credibility of the sender? We collected data from 308 professionals using experimental vignettes that simulated realistic workplace emails. Some emails were described as entirely human-written, while others were labeled as AI-edited, with variations in the sender's reliability (who is sending the message) and strength of the argument (how well the content is constructed). A Bayesian Model of Argumentation provided normative predictions for how reliability and argument quality should influence persuasion. We found that when an email was labeled as "edited by ChatGPT," receivers saw it as less persuasive overall. However, AI-mediation did not diminish the relative influence of source reliability and argument quality. In other words, while the AI-edited label reduced overall persuasiveness, it didn't change how recipients inherently evaluated credibility. They still adjusted their beliefs primarily based on who sent the message and how strong the argument was. To our knowledge, this is the first study to apply a Bayesian framework to understanding how people process AI-mediated communication.

Structural Alignment Across Visual and Linguistic Modalities: A Developmental Refinement Perspective

This study investigates the structural alignment of conceptual representations accessed from visual and linguistic modalities in 8-9-year-old children and adults. Using a Spatial Arrangement (SpAM) task, participants organized familiar items from two categories – household items (HHI) and vegetables & fruits (VF) – presented separately as images and written words on a two-dimensional grid. Representational dissimilarity matrices were computed based on item distances within each modality and analyzed for structural alignment across modalities. Results showed significantly lower cross-modal alignment in children compared to adults, suggesting ongoing developmental changes in the structure of conceptual representations. Additionally, cross-modal alignment was higher for VF than for HHI categories in both age groups, indicating category-specific variations in conceptual organization and its refinement. These findings provide insights into the gradual refinement of the structure of the lexical-conceptual system, extending beyond item-level lexical learning.

Classification Versus Observation through Within- and Between-Category Comparison

Inductive concept learning requires making inferences about target categories based on specific examples. Two factors which influence this process are the type of learning task and the nature of the items available for comparison. However, the literature remains inconsistent on which combination of factors best facilitates concept learning. Moreover, much of the present literature focuses on artificial categories with arbitrary boundaries, leaving open the question of how best to improve learning for natural categories. We report two experiments on natural category learning, which cross learning mode (classification vs. observation) with comparison type (match vs. contrast vs. control). Across both experiments, we find evidence of an observation advantage and some evidence for a contrast advantage (Experiment 1). These findings offer evidence against the classification advantage during natural category learning that some studies have reported, and highlight the critical need for investigating the factors that impact the efficacy of classification and observation learning.

Semantic-Pragmatic Adaptation to Variable Use of Temporal Expressions

Previous work has shown that listeners rapidly update their interpretations of vague expressions such as quantifiers and expressions of uncertainty when they observe a speaker's usage of such terms. However, previous studies focused on instances involving two reasoning steps: inferring a world state from a visual scene and communicating the world state. Based on these experiments, it has been argued that listeners infer speaker-specific mappings between world states and vague expressions rather than listeners making inferences about how the speaker infers the world state from a visual scene. Here, we extend the work on semantic-pragmatic adaptation to a new class of expressions, namely vague temporal expressions, such as for a bit and for a while, and employ an experimental paradigm in which the inference from the visual scene to the world state is deterministic. We replicate previous findings in this setting, suggesting that adaptation indeed targets semantic representations.

What Almost Happened? Using Close-Counterfactuals to Prime a Simulation Mindset in Children

Counterfactual reasoning, the ability to reason about how events could have turned out differently, helps individuals understand the causes of events and prepare for the future. The simulation mindset hypothesis posits that exposure to counterfactual scenarios stimulates the generation of imaginary alternatives, enhancing planning, problem-solving, and behaviour adjustment. This study investigated whether close-counterfactual scenarios prime a simulation mindset in children, leading to better problem-solving abilities. Ninety 6- and 8-year-olds were assigned to either a counterfactual condition, with storybooks featuring close-counterfactual events, or a control condition, with storybooks describing factual events. Participants then completed two problem-solving tasks requiring the generation of alternative solutions. Results showed that 8-year-olds exhibited better problem-solving abilities than 6-year-olds. Counterfactual scenarios did not significantly affect older children's problem-solving skills; however, they showed benefits for the younger children. These findings provide emerging evidence that engaging in counterfactual reasoning can enhance divergent thinking and problem-solving skills in children.

Physical reasoning during motor learning aids people in transferring mass, but not motor control mappings

When people interact with objects, they show incredible flexibility in learning novel motor control mappings or adapting their known control mappings to variables like object mass. Such motor learning can benefit from intuitive physical reasoning, as novel contexts of object interaction could be a new combination of a previously experienced control mapping with a different object with known mass. In this work, we present a novel object interaction paradigm in which subjects learned to slide pucks at targets by releasing kinetic energy from a compressed spring in a computer game. Participants needed to learn how their motor actions related to the final positions of the puck, while also adapting to the mass of different pucks. With a Bayesian regression model, we inferred participants' beliefs about object mass and control mappings, and show that they could transfer information about previously experienced puck mass but not the motor mappings of the springs.

Exploring Causal and Compositional Reasoning in Large Language Models

Large Language Models (LLMs) have shown surprising capabilities in reasoning tasks despite lacking direct physical experience with the world. We examine LLMs' ability to reason about object affordances through a tool innovation task where one must select unconventional objects to replace typical tools. In a study comparing GPT-3.5-turbo and GPT-4o with human participants (N=100), we found that while GPT-3.5 performed significantly worse than humans (38.7% vs. 85.8%), GPT-4o with chain-of-thought prompting achieved human-level performance (85.0%). Qualitative analysis revealed that both models could identify causally relevant object properties, but GPT-4o was superior in flexibly applying these properties in novel contexts. We argue that this success relies on compositional reasoning—the ability to decompose objects into abstract properties and recombine them for novel uses. Our findings suggest that LLMs' ability to reason about object affordances has progressed substantially, highlighting the need for further mechanistic research to characterise LLMs' underlying abilities.

Language and the Algebraic Mind: Learning Unfamiliar Rules

Natural language is often considered fundamental to mathematical thinking, a view supported by research on language-of-training effects in bilinguals. However, these effects have been primarily examined in arithmetic. While previous research has explored language effects in algebra, it has largely focused on well-established rules. This study extends prior work by investigating whether similar effects apply to the learning of algebra with unfamiliar rules. Thirty-nine Chinese-English bilingual undergraduates were trained to solve arithmetic and algebraic problems in Chinese or English and were later tested on both old and novel problems in both languages. Consistent with previous findings, results revealed a clear dissociation between arithmetic and algebra. Bilinguals responded faster in the trained versus the untrained language for arithmetic problems and solved old arithmetic problems faster than novel ones. However, these effects were absent in algebra, suggesting that algebraic learning does not necessarily depend on natural language encoding.

For GPT-4 as with Humans: Information Structure Predicts Acceptability of Long-Distance Dependencies

It remains debated how well any LM is able to understand natural language or even generate reliable metalinguistic judgments. Moreover, relatively little work has demonstrated that LMs can represent and respect subtle relationships between form and function proposed by linguists. We here focus on a particular such relationship established in recent work: English speakers' judgments about the information structure of canonical sentences predict independently collected acceptability ratings on corresponding "long distance dependency" (LDD) constructions, across a wide array of base constructions and multiple types of LDDs. To determine whether any LM captures this relationship, we probe GPT-4 on the same tasks used with humans and new extensions. Results reveal reliable metalinguistic skill on the information structure and acceptability tasks, replicating a striking interaction between the two, despite the 0-shot, explicit nature of the tasks, and little to no chance of contamination (Studies 1a, 1b). Study 2 manipulates the information structure of base sentences and confirms a causal relationship: increasing the prominence of a constituent in a context sentence increases the subsequent acceptability ratings on an LDD construction. The findings suggest a tight relationship between natural and GPT-4-generated English, and between information structure and syntax, which calls for further exploration.

A Neural Network Model of Complementary Learning Systems: Pattern Separation and Completion for Continual Learning

Learning new information without forgetting prior knowledge is central to human intelligence. In contrast, neural network models suffer from catastrophic forgetting: a significant degradation in performance on previously learned tasks when acquiring new information. The Complementary Learning Systems (CLS) theory offers an explanation for this human ability, proposing that the brain has distinct systems for pattern separation (encoding distinct memories) and pattern completion (retrieving complete memories from partial cues). To capture these complementary functions, we leverage the representational generalization capabilities of variational autoencoders (VAEs) and the robust memory storage properties of Modern Hopfield networks (MHNs), combining them into a neurally plausible continual learning model. We evaluate this model on the Split-MNIST task, a popular continual learning benchmark, and achieve accuracy close to the state of the art (~90%), substantially reducing forgetting. Representational analyses empirically confirm the functional dissociation: the VAE underwrites pattern completion, while the MHN drives pattern separation. By capturing pattern separation and completion in scalable architectures, our work provides a functional template for modeling memory consolidation, generalization, and continual learning in both biological and artificial systems.
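
To make the architecture concrete, here is a minimal sketch of how a VAE encoder and a modern-Hopfield-style memory (softmax retrieval, as in Ramsauer et al., 2020) could be combined; the layer sizes, dimensions, and usage below are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): a VAE encoder provides generalizable
# latent codes, and a modern-Hopfield-style memory stores and completes them.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU())
        self.mu, self.logvar = nn.Linear(256, z_dim), nn.Linear(256, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(),
                                 nn.Linear(256, x_dim), nn.Sigmoid())

    def encode(self, x):
        h = self.enc(x)
        return self.mu(h), self.logvar(h)

class HopfieldMemory:
    """Stores latent codes; retrieval is a softmax-weighted readout
    (a modern Hopfield update)."""
    def __init__(self, beta=8.0):
        self.beta, self.patterns = beta, []

    def store(self, z):                      # pattern separation: keep distinct codes
        self.patterns.append(z.detach())

    def retrieve(self, z_query):             # pattern completion from a partial cue
        X = torch.stack(self.patterns)       # (N, z_dim)
        attn = F.softmax(self.beta * X @ z_query, dim=0)
        return attn @ X                      # (z_dim,)

# Usage sketch: encode an item, store its code, later complete it from a noisy cue.
vae, memory = VAE(), HopfieldMemory()
x = torch.rand(784)                          # stand-in for one MNIST image
mu, _ = vae.encode(x)
memory.store(mu)
completed = memory.retrieve(mu + 0.1 * torch.randn_like(mu))
reconstruction = vae.dec(completed)          # decode the completed memory
```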

LLM-Generated Semantic Networks Predict Semantic Priming Effects on Human Reaction Times in a Word-Recognition Task

A well-known empirical result in human linguistic processing finds that humans are quicker to correctly recognize a string of letters as a word when they are first shown a word that is semantically related to the word they must recognize. This is known as the "semantic priming effect." Since Collins and Loftus (1975), it has been widely theorized that this effect is due to graphical storage of words in memory and a "spreading activation" model of priming. On this theory, words are related to one another in human semantic memory via a graphical structure encoding semantic relationships between words, with participants more likely to quickly recognize a word when they are primed with one that is graphically nearby; the prime word "activates" the node of a participant's semantic memory network representing the prime word, and this activation "spreads" to words at nearby nodes. Today, large language models increasingly excel at generating structured data representations, like graphs, when prompted to do so (Ghanem & Cruz, 2024; Dagdelen et al., 2024). In the current paper we investigate whether a language model can be prompted to represent a set of words as a semantic graph, and whether human reaction times in a word recognition task are predicted by the minimum path length between words in such an LLM-generated semantic graph. Using two versions of the Gemini language model, we use a prompting strategy to generate semantic graphs relating all words used in a large semantic priming experiment conducted by Hutchinson et al. (2013), under a variety of different temperatures and settings for the maximum number of output tokens. We find that, for all LLM-generated semantic graphs produced in our experiments, the minimum path length between two words predicts the reaction time with which a person primed by one word recognizes the other; this effect is most pronounced for graphs generated by a smaller version of the model. It is under these conditions, we find, that LLMs produce the dense graphs that are more predictive of human semantic priming effects in lexical decision tasks.
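
As a rough illustration of this analysis pipeline, the sketch below assumes an LLM has already returned an edge list for a word set and relates graph distance to lexical-decision reaction times; the edges, RT values, and variable names are hypothetical, not the study's data.

```python
# Minimal sketch (assumed pipeline, not the authors' code): given edges that an
# LLM returned for a word set, relate graph distance to lexical-decision RTs.
import networkx as nx
from scipy.stats import pearsonr

# Hypothetical LLM-generated edge list and prime-target RT data (ms).
edges = [("cat", "dog"), ("dog", "bone"), ("cat", "whiskers"), ("bone", "skeleton")]
rt_data = [("cat", "dog", 540.0), ("cat", "bone", 590.0), ("cat", "skeleton", 640.0)]

G = nx.Graph(edges)

path_lengths, rts = [], []
for prime, target, rt in rt_data:
    if nx.has_path(G, prime, target):
        path_lengths.append(nx.shortest_path_length(G, prime, target))
        rts.append(rt)

# A positive correlation would mean longer graph distances go with slower recognition.
r, p = pearsonr(path_lengths, rts)
print(f"r = {r:.2f}, p = {p:.3f}")
```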

Emotional Consensus Matters: Impact on Toddlers' Visual Exploration Behaviors

Emotional feedback plays a critical role in guiding early behaviors, yet relatively little is known about how toddlers integrate emotions from multiple informants. The present study investigated how the consistency of emotional feedback influenced toddlers' visual exploration of novel objects. A total of 74 toddlers (12–24 months) viewed videos of eight adult informants displaying happy, sad, or neutral emotions toward unfamiliar objects. In Experiment 1, toddlers who received consistently sad feedback demonstrated reduced exploration. In Experiment 2, toddlers exposed to inconsistent emotional cues (e.g., 50% happy and 50% sad) exhibited greater exploration compared with those presented with consistent feedback. These findings suggested that toddlers' visual exploratory behaviors were shaped not only by the valence of emotional signals but also by the degree to which these signals were consistent. In particular, a mixture of emotional feedback may enhance toddlers' engagement with novel objects.

Is it OK if Alexa doesn't know?: Children and adults' beliefs about smart speaker virtuous ignorance

This study examines 124 5- to 10-year-old children's and their parents' beliefs about smart speakers that provide ignorant or exact responses to knowable and unknowable questions. More specifically, we explored whether participants value virtuous ignorance (i.e., the admission of not knowing the answer to a question because the information is unknowable) when considering explanations provided by smart speakers. With age, children and adults increasingly indicate that smart speakers that provide virtuously ignorant responses are more credible than smart speakers that provide exact responses to unknowable questions. Interestingly, children, but not adults, indicated that speakers with virtuously ignorant responses are better for unknowable future-event-related questions than unknowable number-related questions. Children's evaluations of technology's virtuous ignorance changed with age and question type, perhaps reflecting changes in children's expectations about the kinds of responses devices can reasonably provide. Implications for children's and adults' learning and understanding from smart speakers are discussed.

Musical and emotional features individually and interactively predict perceived similarity of popular songs

Judging similarity between pieces of music is critical for interacting with music in everyday life. But how do musical and emotional features drive our subjective judgments of similarity? Much of the previous work has focused on low-level features and has largely ignored the impact of lyrics and emotion on perceived similarity. Here, we tested the influence of a comprehensive set of musical and emotional features on similarity, using original popular songs and cover versions to match clips on lyrics and melody. We found that tempo most strongly predicts lower similarity ratings, but key, voice type, and timbre differences predict similarity in an interactive manner. While emotional arousal did not predict similarity above and beyond tempo, emotional valence did. Together, these results suggest that both musical and emotional factors influence judgments of similarity, shedding light on the fine-grained explanatory mechanisms of listeners' everyday impressions of popular music.

Model Human Learners: Computational Models to Guide Instructional Design

Instructional designers face an overwhelming array of design choices, making it challenging to identify the most effective interventions. To address this issue, I propose the concept of a Model Human Learner, a unified computational model of learning that can aid designers in evaluating candidate interventions. This paper presents the first successful demonstration of this concept, showing that a computational model can accurately predict the outcomes of two human A/B experiments---one testing a problem sequencing intervention and the other testing an item design intervention. It also demonstrates that such a model can generate learning curves without requiring human data and provide theoretical insights into why an instructional intervention is effective. These findings lay the groundwork for future Model Human Learners that integrate cognitive and learning theories to support instructional design across diverse tasks and interventions.

Generics revisited: Analyzing generalizations in children's books and caregivers' speech

Generics, general statements about categories, are believed to transmit essentialist beliefs---the idea that things have a hidden true nature. Research suggests that people essentialize natural (biological and non-living) and social kinds, but not artifacts. Previous studies using small datasets found that generics are often used to describe animate beings in speech to children. Using a larger corpus of children's books and parent speech, we examined a wider range of kinds and generalizing statements (habituals and universals). Our results show that generics are more likely to be used for biological kinds than for artifacts and that their use increases in parent speech as children age. However, generics were not more likely to be used for non-living or social kinds than for artifacts. Habituals, at least in speech, were more likely to be used for social kinds than for artifacts. Generalizing statements were more likely to be used for non-living natural kinds than for artifacts. These findings inform the debate over whether generics transmit essentialist beliefs.

The Development of Decision Making: The Role of Objective Uncertainty and Perceptual Novelty

Human decision-making is influenced by many factors, including uncertainty and novelty. Although widely studied, prior findings on these two secondary effects remain mixed, partly due to the confounding effect of reward value and the natural co-occurrence of uncertainty and novelty, which makes it challenging to disentangle them. To examine the unique contributions of uncertainty and novelty across age groups, we designed a two-armed bandit task that carefully controlled reward value and subjective uncertainty. On each trial, participants chose between two options varying systematically in perceptual novelty and objective uncertainty. Participants included 38 children (ages 4–6) and 37 undergraduates. By holding one factor constant while manipulating the other and applying a computational model to trial-by-trial choices, we found that children's decisions were primarily driven by perceptual novelty, while adults were guided by aversion to objective uncertainty. These findings highlight developmental changes in decision-making and offer directions for future research.

Measuring and predicting variation in the difficulty of questions about data visualizations

Understanding what is communicated by data visualizations is a critical component of scientific literacy in the modern era. However, it remains unclear why some tasks involving data visualizations are more difficult than others. Here we administered a composite test composed of five widely used tests of data visualization literacy to a large sample of U.S. adults (N=503 participants). We found that items in the composite test spanned the full range of possible difficulty levels, and that our estimates of item-level difficulty were highly reliable. However, the type of data visualization shown and the type of task involved only explained a modest amount of variation in performance across items, relative to the reliability of the estimates we obtained. These results highlight the need for finer-grained ways of characterizing these items that predict the reliable variation in difficulty measured in this study, and that generalize to other tests of data visualization understanding.

Typicality biases in interpreting unmarked sentences: an artificial language learning experiment on differential argument marking

Many languages exhibit Differential Argument Marking (DAM), where nouns are marked for their grammatical role only in certain contexts. DAM is subdivided into Differential Object Marking (DOM) and Differential Subject Marking (DSM), both of which are conditioned by factors such as animacy across languages and hypothesized to arise via ambiguity avoidance and tendencies to mark atypical situations. While previous studies have primarily focused on production, this study employs artificial language learning to investigate comprehension in both DOM and DSM, examining whether learners rely on typicality biases for grammatical role assignment in unmarked sentences, and whether the type of marking learned affects their interpretation. Results indicate a strong givenness effect and a smaller animacy effect, with no significant differences between DOM and DSM conditions. These findings suggest that typicality biases play a key role in shaping DAM systems and that their emergence is best understood as a communicative phenomenon.

Are there different types of error monitoring? A microstates analysis of error-related brain activity across three tasks

Error monitoring is commonly studied using various inhibitory control tasks, involving response withholding, response cancellation, or response selection. However, it remains unclear whether there is a common neural mechanism underlying error monitoring across these tasks or if it is specific to distinct types of cognitive failures. To identify both similarities and differences in the neural processing of errors across go/no-go, stop-signal, and flanker tasks, we employed microstate analysis. This method allows us to study the dynamically evolving topographical patterns of neural activity throughout the brain. Our results revealed that the early phase of error monitoring in all tasks predominantly engages the supplementary motor area. In addition, we observed task-specific neural activity encompassing visual and motor areas in the go/no-go task, the dorsal part of the anterior cingulate cortex in the stop-signal task, and its ventral part in the flanker task. These findings suggest that error monitoring involves a collection of interconnected cognitive processes, rather than a uniform mechanism across tasks.

WalkMore: Exploring the Role of a Personalized Humorous Nudge Architecture on People's Walking Behavior

Approximately 31% of the global population is insufficiently active, leading to 3.2 million deaths annually. Sedentary behavior increases the risk of cancer, heart disease, and metabolic disorders. To improve adherence to behavior change interventions, this study introduces personalized humorous nudging for walking behavior. Following a Randomized Control Trial (RCT) design, participants (N=100) received either humorous or non-humorous nudges via a Telegram-based chatbot, "WalkMore," encouraging daily walks for 7 days. We used the AICTP smart nudge framework (Activity, Influence, Content, Time frame, Presentation) and added humor to the presentation component. Results showed a significant relationship between walking and humor (F=6.48, p<0.05). Manipulation check items reported the nudges' average funniness rating as 3.48 in the experimental group and the average personalization rating as 3.64 in the control group. The findings indicate the efficacy of the proposed augmented nudge framework for behavior change.

When Teaching A Robot, People Employ Different Feedback Strategies: Some Are More Effective Than Others

To investigate the effects of human feedback strategies on machine learning (ML), we collected data from participants (N=36) as they evaluated a robot with numeric feedback during a card game. We found that participants employed different partial credit feedback strategies for robot failures during the task (i.e., participants varied in how they scored the same robot failure actions). We then used the feedback from each participant to generate extrapolated feedback strategies. In simulations, we found that training a supervised ML model with these different extrapolated feedback strategies influenced how well the model was able to learn the task. Models trained with labels from some reasonable strategies significantly outperformed models trained with labels from other reasonable strategies. Participants' familiarity with ML, artificial intelligence, and the task did not significantly affect how well their extrapolated feedback strategy trained the model. These findings have implications for transferring learning algorithms into the real world.

Picture Book Features Influence the Use of Complex Modifiers During Shared Number Book Reading

In English, adjectives typically appear in a position where the modifier precedes the noun (e.g., "the blue car," "three ducks"). Utterances in which the modifier comes after the noun (e.g., "ducks, we have three") are far less common. Despite this, recent studies suggest that hearing number words after the groups of objects they describe helps children learn the meaning of number words (Ramscar et al., 2011; Gibson et al., 2020). The current study explored how specific features of number books might influence the frequency of number-after-noun utterances during a shared book reading session between parent-child dyads. We hypothesized that number books that vary the category of objects counted across sets (e.g., one puppy, two lambs, three kittens) would encourage the number-after-noun construction. We used data from an existing study in which parent-child dyads (n = 157; child's Mage = 44 months; 88 girls, 69 boys; 91.72% of parents self-reported as white) were randomly assigned to read two number books. Results revealed that parent-child dyads who read number books that change the referent category across sets use more number-after-noun utterances (e.g., "Oh look, a ball. We have three.") than those who read books with the same referent category across sets (one puppy, two puppies, three puppies).

Labels Facilitate Categorical Perception Effects during Novel Category Learning

Previous work has suggested that category labels can facilitate faster rates of category learning. To better understand the role that labels play in this phenomenon, the present study investigates whether this labeling advantage coincides with a warping of representational space that is indicative of categorical perception (CP). To this end, we collected behavioral and EEG data during two tasks: an approach-avoid task in which category boundaries are learned and a same-different task. The behavioral results replicate the labeling advantage in category learning and suggest that CP effects are strengthened by the influence of labels. Representational similarity analyses of EEG brain activity collected during the approach-avoid task provide additional support for this theory, showing that stimulus representations exhibit patterns of representational warping that are characteristic of CP when the labeling advantage is most prominent. Together, these findings contribute to a richer understanding of how labels facilitate category learning.

Interaction of language-specific and cross-linguistic strategies during agreement computation - Evidence from Hindi

Errors during sentence production have revealed crucial insights about the cognitive underpinnings of language processing. One such widely studied error is the agreement attraction error. Such errors occur when the subject-verb agreement, a crucial linguistic dependency, falters such that the verb shows the features of a 'distractor' noun rather than those of the target subject. Previous work on agreement attraction has established similar cross-linguistic patterns, such as the number mismatch asymmetry effect. Such research suggests that the underlying mechanism might be universal. Recent studies, however, indicate that Hindi employs a language-specific strategy during agreement processing that is not reported in other languages. This raises an important question: do the cross-linguistic patterns observed in agreement processing also manifest in Hindi? Our experiment addresses this gap by using a preamble repetition task to elicit errors. Based on the nature of mismatch asymmetry and the structure of Hindi nouns, we hypothesize that if number mismatch asymmetry occurs, it should be limited to feminine nouns in Hindi. Our findings confirm the presence of mismatch asymmetry in Hindi but exclusively for feminine nouns. This suggests that while agreement mechanisms are indeed universal, they are influenced by language-specific configurations and strategies. Overall, our results can be interpreted better within a cue-based retrieval framework.

The Impact of Immediate and Elaborative Feedback on Second Grade Students' Equation Solving and Understanding of the Equal Sign

Understanding mathematical equivalence is critical for students' success in algebra. Despite its importance, many students misinterpret the equal sign due to early exposure to operational patterns in arithmetic, leading to entrenched misconceptions. This study investigates how feedback can correct these misconceptions through the implementation of an online version of the Improving Children's Understanding of Equivalence materials. The study used A/B testing to evaluate accumulating and diminishing feedback on second graders' understanding of mathematical equivalence. Students made significant improvements in equation solving and conceptual understanding across both feedback types, underscoring the value of immediate, elaborative feedback. However, no significant difference was observed between feedback conditions. The present study informs the development of effective instructional interventions for early mathematics education that can be delivered at scale. By refining automated feedback and addressing student-specific learning needs, educators and researchers can strengthen foundational mathematical understanding and better prepare students for future algebraic success.

Efficiency in Writing Systems: Testing Zipf's Law of Abbreviation Across Letters, N-Grams, and Words

This study investigates Zipf's Law of Abbreviation (ZLA) across 155 writing systems, analyzing how visual complexity optimizes with information content at three linguistic levels: letters, n-grams, and words. Using perimetric and skeleton-length complexity metrics, we demonstrate that letters exhibit the strongest correlation (correlations of 0.2–0.4 in most languages), confirming their role as primary units of efficiency optimization. Larger units (n-grams/words) show weaker effects due to structural constraints. While alphabetic scripts (e.g., Latin-based) align robustly with ZLA, logographic (e.g., Chinese) and abugida (e.g., Kannada) systems reveal exceptions—some with near-zero or negative correlations—highlighting script-specific pressures like distinctiveness or historical preservation. Our findings refine ZLA by emphasizing visual (not just length-based) effort minimization and underscore letters as the fundamental locus of abbreviation effects. Limitations in script diversity and complexity metrics suggest future directions, including phylogenetic controls and perceptual complexity measures. This work advances the cross-linguistic study of writing system evolution under efficiency pressures.
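
A minimal sketch of the kind of test ZLA implies, assuming one already has per-letter frequencies and perimetric complexity scores for a script; the values and variable names below are made up for illustration. ZLA predicts a negative rank correlation (frequent letters should be visually simpler).

```python
# Illustrative check of Zipf's Law of Abbreviation for one script (not the
# paper's pipeline): correlate letter frequency with visual complexity.
from scipy.stats import spearmanr

# Hypothetical per-letter values: corpus frequency and perimetric complexity.
letter_frequency  = {"e": 0.127, "t": 0.091, "a": 0.082, "q": 0.001, "z": 0.0007}
letter_complexity = {"e": 11.2,  "t": 8.4,   "a": 12.1,  "q": 15.3,  "z": 13.8}

letters = sorted(letter_frequency)
freqs = [letter_frequency[l] for l in letters]
complexities = [letter_complexity[l] for l in letters]

rho, p = spearmanr(freqs, complexities)
print(f"Spearman rho = {rho:.2f} (ZLA predicts rho < 0), p = {p:.3f}")
```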

Human Learning of Non-Markov Structures

From comprehending language to learning new dance moves, extracting complex relationships between sequences of input is a key feature of human cognition. Prior studies have predominantly explored the cognitive mechanisms of structure learning using Markov sequences, where each element depends only on the previous one. Real-world experience, however, is rife with complex dependencies beyond Markov processes. Here, we study the effects of non-Markov dependencies on sequence learning by leveraging graph learning approaches. We introduce a motor sequence task in which transitional probabilities between pairs of stimuli are identical from a Markov perspective, but differ on higher-order non-Markov dependencies. We find that participants are better able to anticipate stimuli with higher non-Markov probabilities, providing corroboratory evidence that humans are sensitive to statistical structure beyond Markov dependencies. Further, their behavior differed from that of participants trained only on Markov sequences. Overall, this work demonstrates that humans can rapidly learn and represent statistical dependencies beyond the Markov regime.
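
To illustrate the distinction the task relies on, the sketch below estimates first-order (Markov) and higher-order (non-Markov) transition probabilities from a toy stimulus sequence; the sequence and helper function are illustrative, not the study's materials. Two sequences can be matched on the first-order statistics while differing on the higher-order ones.

```python
# Illustrative only: estimate order-k transition probabilities from a sequence.
from collections import Counter

def transition_probs(seq, order=1):
    """P(next | previous `order` items), estimated from counts."""
    context_counts, pair_counts = Counter(), Counter()
    for i in range(order, len(seq)):
        context = tuple(seq[i - order:i])
        context_counts[context] += 1
        pair_counts[(context, seq[i])] += 1
    return {(c, nxt): n / context_counts[c] for (c, nxt), n in pair_counts.items()}

seq = ["A", "B", "A", "C", "A", "B", "A", "C", "A", "B"]
print(transition_probs(seq, order=1))  # first-order (Markov) statistics
print(transition_probs(seq, order=2))  # conditioned on the two previous items
```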

Contact angle uncertainty influences perceived causality in launching events

Humans can perceive causality when viewing collision-like interactions between plain, two-dimensional shapes. To what extent subjective causality reports arise from the physical plausibility of events as predicted by Newtonian physics remains an open question. Here we measured each participant's perceptual judgments about the contact point angle ($\alpha$) and their causality judgments about the target ball's trajectory offset ($\beta$) using Michotte's launching paradigm. Our results showed that subjective causality reports decreased as the target's trajectory deviated further from the physically plausible angle, confirming the important role of physical plausibility in causal perception. Moreover, as uncertainty about $\alpha$ increased, participants were more likely to report events with larger $\beta$ as causal, and their causality reports were more sensitive to changes of $\beta$, consistent with a mental physics model that incorporates $\alpha$. These findings align with predictions of causality perception under a Noisy Newton framework, and provide experimental evidence using individualized uncertainty estimates.
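
A rough Noisy-Newton-style sketch of the idea: treat a causal report as support for a physically lawful interpretation, marginalizing over perceptual noise on the contact angle. The mapping from contact angle to expected offset and all noise values below are placeholders, not the authors' fitted model.

```python
# Illustrative Noisy-Newton-style computation (placeholder physics and noise).
import numpy as np

def beta_physical(alpha):
    # Placeholder: assume the physically plausible trajectory offset mirrors the
    # contact angle (a real model would use collision geometry).
    return alpha

def causal_support(alpha_perceived, beta_observed, sigma_alpha, sigma_beta, n=10_000):
    """Relative support for a physically lawful (causal) interpretation of the
    observed offset, averaging over perceptual uncertainty about alpha."""
    rng = np.random.default_rng(0)
    alpha_samples = rng.normal(alpha_perceived, sigma_alpha, n)
    predicted = beta_physical(alpha_samples)
    # Gaussian likelihood of the observed offset under each sampled prediction.
    lik = np.exp(-0.5 * ((beta_observed - predicted) / sigma_beta) ** 2)
    return lik.mean()

# With more uncertainty about alpha, a large offset receives more causal support.
print(causal_support(20.0, 45.0, sigma_alpha=5.0,  sigma_beta=10.0))
print(causal_support(20.0, 45.0, sigma_alpha=20.0, sigma_beta=10.0))
```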

Goldilocks Pattern of Learning after Observing Unexpected Physical Events

Infants learn better following expectancy violations. Yet it is unknown whether this surprise-induced learning operates across development, is all-or-none or graded, and whether surprise directly mediates it. We addressed these questions by showing adults events depicting varying numbers of violations. In Experiments 1 and 2, adults saw events with 0 to 3 physical violations, then heard a novel verb for the presented action. Adults learned better after observing violations; notably, their learning exhibited a Goldilocks pattern—initially increasing with number of observed violations, then declining. Experiment 3 asked whether this learning enhancement was driven by surprise itself, or by the search for explanations for the surprising events. Adults saw events with different numbers of violations, then rated their surprise and generated candidate explanations. Whereas surprise increased monotonically with violations, explanation-generation exhibited a Goldilocks pattern like that in Experiments 1-2. This suggests that surprise-induced learning may reflect the search for explanations.

Zero-Shot Cross-Situational Learning for Building Word-Referent Mappings

Statistical learning (SL) mechanisms drive learning across several domains, including language acquisition. Can SL mechanisms like cross-situational word learning, which rely on the accumulation of statistical evidence, also account for the rapid acceleration in children's word learning? This study examined whether learners could track and integrate several statistical regularities concurrently and use this information to map words to objects that never co-occurred during training—a behavior known as zero-shot learning. Experiment 1 showed that learners leveraged their acquired knowledge of word and object categories to map words to objects that did not co-occur during training. Experiment 2 extended this finding by demonstrating that learners integrated their newly acquired statistics with prior knowledge to map novel words to perceptually similar, yet previously unseen, objects. These findings suggest that integrating several types of statistical relationships between words and objects can accelerate new learning, making SL an efficient mechanism for early word learning.

Unmasking political deception: Investigating the Discernment and Emotional Impact of Deepfake Political Speeches Featuring American Presidential Candidates

Deepfake videos challenge the quality of information in deliberative democracies. In a mixed-methods study, we examine the role of emotions in the detection of political deepfakes by focusing on trust, empathy, and inspiration to assess how deepfakes influence public perception and engagement with political communication. The research unfolds in two phases: an initial qualitative investigation through 3 focus groups (N = 13), followed by a quantitative survey (N = 261) in which focus group insights informed the design and interpretation of the quantitative study. Participants were exposed to real, ChatGPT-generated, and historical speeches presented in modern contexts to gauge perceived authenticity and emotional responses, including trust, empathy, and inspiration. Results indicate no significant difference in perceived authenticity between real and deepfake content, with both eliciting comparable emotional reactions. The quantitative analysis reveals a marginal negative correlation between exposure to deepfakes and trust in political communication. Qualitative findings emphasize the influence of contextual cues and pre-existing biases, showing participants often prioritized emotional resonance over technical accuracy when evaluating content. The study highlights the intricate relationship between AI-generated media and public perception, underscoring the necessity for nuanced regulatory policies and improved media literacy to mitigate the impact of deepfakes on public trust.

Impact of sequential reports with different source dependencies

How we update our beliefs when encountering new evidence is the basis of evidential reasoning. Often, this will involve weighing up multiple pieces of evidence communicated to us by several sources (i.e., testimony). However, the testimonies of multiple sources are rarely truly independent; they may have used the same data or evidence, have the same training or background, or simply be repeating the same story as another. The nature of these dependencies among our evidence items is normatively impactful on the conclusions we should draw. Here we investigate whether participants are sensitive to such complex, yet impactful, influences on their reasoning. We find a general preference for source diversity that broadly aligns with normative prescriptions. To our knowledge, this is the first paradigm that integrates shared background, shared evidence, and corroboration in the same design. We discuss the challenges of developing and testing paradigms that capture these intricacies.

What does it mean to be healthy?

The concept of health has long been debated in philosophy and medicine, with discussions often centering on whether health is merely the absence of disease (negativism) or requires the presence of some positive state or ability (positivism). Empirical studies on the folk concept remain scarce and inconclusive. This paper investigates the folk concept of health through implication and contradiction tests. Our findings reveal that while people often infer that health entails both a disease-free state and lifestyle-related factors, interpretations of 'health' vary significantly depending on context, with participants associating health primarily with the absence of disease in medical settings while emphasizing lifestyle factors like diet and activity in personal training scenarios. These results suggest that the meaning of the folk concept of health is strongly context-dependent.

The Effects of Congruent and Systematic Grapheme-Phoneme Correspondences on Novel Word Learning

Previous studies observed a robust effect of grapheme-phoneme correspondences (GPCs) when learners are exposed to orthographic input while learning novel words. Specifically, if the novel words share the same GPC mapping with learners' native language, then this Congruency effect helps learning. Likewise, if the GPC is in a one-to-one mapping relation, this effect of Systematicity improves learning too. However, no studies have looked at the interaction of Congruency and Systematicity on word learning or explored both consonant and vowel stimuli. Here, we found no significant orthographic effect when the consonantal GPC mappings were manipulated, likely because performance was close to ceiling. Thus, while congruent and systematic GPC mappings can facilitate learning, the effects were not as robust as the literature had suggested. Nevertheless, the effect of Systematicity was significant in the Vowel Sets, likely due to greater exposure to the stimuli.

Causal and Counterfactual Reasoning about Gradual and Abrupt Events

Determining what caused an event is common in everyday life, yet little is known about what aspects of real-world events affect causal attribution. Causes may unfold at multiple timescales, with gradual events (e.g., steady weight loss) and abrupt ones (e.g., acute illness) contributing to an outcome. We investigated causal attribution in real-world contexts (e.g., finance, health) where both types of events contribute to a positive or negative outcome. In Study 1, participants gave higher causal ratings to abrupt causes in negative scenarios about the environment or finance. Conversely, we found higher causal ratings for gradual causes of physical health regardless of outcome valence, and no significant differences in mental health scenarios. Further, participants' counterfactual responses were mostly consistent with their causal attributions. Study 2 suggests that the preference for abrupt causes may be explained by their temporal proximity to the outcome. We discuss explanations for these findings and their implications.

A Common Language? Analyzing the Use of Health-Related Vocabulary Between Laypeople and Medical Professionals

The meaning of being healthy is widely debated, with many suggesting it is a multidimensional concept encompassing key dimensions such as the absence of disease, the presence of well-being, and a healthy lifestyle. While recent studies indicate that lifestyle may be a dominant dimension, it remains unclear whether this holds true across populations or if significant differences exist, particularly between laypeople and healthcare professionals. Our studies reveal a difference, but surprisingly, in the opposite direction of what the literature would predict: medical professionals are substantially more likely than laypeople to frame discussions of "healthy" and "unhealthy" in lifestyle contexts. This result challenges prevailing assumptions about the biomedical focus of healthcare professionals and has implications for improving health communication.

Children Prioritize Age over Gender when Evaluating Adults' Technological Knowledge

The current study examines 120 5- to 10-year-old children's beliefs about adults' abilities to use and fix tablet technology when those adults belong to varying gender (man, woman) and age (young, old) categories. The results indicate that, overall, children appear to prioritize age over gender when judging adults' technological knowledge, with children choosing younger adults as more competent at using and fixing tablets than older adults. In addition, when evaluating adults of the same age category (e.g., a young man and a young woman), children show in-group gender-based preferences where girls choose women and boys choose men. This in-group preference is more pronounced in children's selections of adults when determining who would be better at fixing tablets than who would be better at using these devices. Implications for children's developing ability to consider intersectional identities based on gender and age, and for their STEM learning, are discussed.

Searching for Events: Rapid visual extraction of language-compatible event representations

How does the visual system recognize human interactions, and what is the nature of these representations? Past work suggests observers can automatically recognize event category (e.g., kicking) and event role information (Agent, Patient) from brief displays, so-called rapid gist extraction of event structure. Questions remain though about how quickly event representations are computed and when they might interface with linguistic/cognitive systems. We explored these issues using linguistically guided visual search. Participants' eye movements were recorded as they heard spoken input (e.g., "The red person is kicking the blue person") and searched for the matching image. By manipulating visual preview time prior to hearing the critical verb, we can estimate an upper bound of when visually recognized event information is available to ongoing linguistic processes. And by manipulating the posture of the humans in these images, we can help clarify how event representations are recognized and refined over time.

Leave a trace: Recursive reasoning about deceptive behavior

How do people reason about others when planning deceptive actions? How do detectives infer what suspects did based on the traces their actions left behind? In this work, we explore deception in a setting where agents steal others' snacks and try to determine the most likely thief. We propose a computational model that combines inverse planning with recursive theory of mind to select misleading actions and reason over evidence arising from such plans. In Experiment 1, we demonstrate that suspects strategically modify their behavior when acting deceptively, aligning with our model's predictions. Experiment 2 reveals that detectives show increased uncertainty when evaluating potentially deceptive suspects—a finding consistent with our model, though alternative explanations exist. Our results suggest that people are adept at deceptive action planning, but struggle to reason about such plans, pointing to possible limits in recursive theory of mind.

A Visual Complexity Measurement Method Based on Monte Carlo Sampling

Conventional visual complexity measurement faces challenges in efficiency, accuracy, and alignment with human perception. To address these, this paper presents a novel Monte Carlo-based method for visual complexity measurement, the random line segment width sampling algorithm (RLSWSA). RLSWSA employs local stochastic sampling for efficient estimation of symbol perimeter complexity. By discarding global scanning in favor of local sampling, RLSWSA significantly enhances computational efficiency while maintaining high accuracy and robustness. Experimental results show rapid convergence with just 24 samples, yielding high consistency with traditional methods (correlation coefficient > 0.9). Furthermore, RLSWSA's Spearman correlation with human subjective ratings is 0.74, demonstrating its strong correlation with human perceptual complexity. This study offers an efficient and reliable solution for rapid symbol visual complexity calculation, with strong potential for applications like symbol recognition and design optimization.
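
The abstract does not spell out RLSWSA's internals, so the following is only a generic illustration of local Monte Carlo boundary sampling: short random probes estimate how much figure/ground boundary a symbol contains, avoiding a global scan. The function, parameters, and toy images are illustrative assumptions, not the paper's algorithm.

```python
# Illustrative Monte Carlo sketch of local boundary sampling (not the exact RLSWSA):
# estimate perimeter complexity of a binary symbol image by counting how often
# short random probes straddle a figure/ground boundary.
import numpy as np

def boundary_crossing_rate(img, n_samples=24, probe_len=5, seed=0):
    """img: 2D binary array (1 = ink). Returns the fraction of random short
    horizontal probes that contain both ink and background -- a proxy for
    perimeter density."""
    rng = np.random.default_rng(seed)
    h, w = img.shape
    crossings = 0
    for _ in range(n_samples):
        r = rng.integers(0, h)
        c = rng.integers(0, w - probe_len)
        probe = img[r, c:c + probe_len]
        crossings += int(probe.min() != probe.max())
    return crossings / n_samples

# Toy example: a hollow square has more boundary per unit area than a filled one.
filled = np.zeros((32, 32), dtype=int); filled[8:24, 8:24] = 1
hollow = filled.copy(); hollow[12:20, 12:20] = 0
print(boundary_crossing_rate(filled, n_samples=1000),
      boundary_crossing_rate(hollow, n_samples=1000))
```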

Primitive Linguistic Compositionality in a Hebbian Neural Network

Humans have a powerful ability to generate novel compositional representations. For example, imagining a *pink banana* requires compositional mappings between signifiers *pink* and *banana* and the perceptual referents of these signifiers. This essential cognitive faculty remains challenging to model in a biologically plausible way. Here, we present a model that implements signifier-referent compositional associations using Hebbian associative learning. The model satisfies the following constraints: (1) once associated, both signific and referential inputs can activate the shared representation, and (2) when signific and referential inputs are compositional, the model should generalize to novel compositional combinations. When trained on MNIST, the model successfully learns to associate number labels with corresponding images. On colored MNIST, the model learns signific-referential associations for both digits and colors, with somewhat successful generalization to new digit-color combinations. This work serves as a proof of concept for biologically plausible models of signifier-referent association.
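As a rough illustration of the kind of Hebbian binding described above, the sketch below associates signifier vectors with referent vectors through outer-product weight updates, so that either input can retrieve an approximation of the other. It is a toy one-layer associator under assumed vector encodings, not the paper's full model.

```python
import numpy as np

def hebbian_bind(pairs):
    """Bidirectional Hebbian associator (a minimal sketch, not the paper's
    model): W accumulates outer products of signifier and referent vectors,
    so either input can retrieve (an approximation of) the other."""
    d_sig = len(pairs[0][0])
    d_ref = len(pairs[0][1])
    W = np.zeros((d_sig, d_ref))
    for sig, ref in pairs:
        W += np.outer(sig, ref)          # Hebbian co-activation update
    return W

# toy one-hot "labels" paired with random "image features"
rng = np.random.default_rng(0)
labels = np.eye(3)
features = rng.standard_normal((3, 8))
W = hebbian_bind(list(zip(labels, features)))

recalled = labels[1] @ W                   # label -> referent direction
print(np.allclose(recalled, features[1]))  # True: orthogonal keys recall exactly
```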

The Need for Speed? Exploring the Contribution of Motor Speed to Expertise in a Complex, Dynamic Task

We explore how time pressure affects accuracy at various stages of learning in a complex, dynamic task: the game of Tetris. We emulate human decision-making processes under time pressure in several reinforcement learning models by training them under the same time pressures imposed on humans. Subsequently, we compare the performance and behavior of human players against those of AI players of equivalent skill. At the surface level, the AI models are able to achieve human-like performance levels at different stages of expertise. However, when probed at lower levels, we find that their behavior and strategies are considerably different from those employed by human experts. Examining why and how the models differ from humans highlights the promise of using AI models to study the nuances of human decision-making in dynamic tasks, along with the need to explain both human and AI performance at multiple levels for accurate understanding.

Locating strongly informative utterances in conversation using multimodal cues

Interaction theories argue that mutual understanding between speakers in natural conversations arises from building shared knowledge (common ground), but no model specifies what information is retained or under what conditions. Previous studies have used Information Theory metrics to quantify the dynamics of information exchanged between participants but lack an efficient way to identify which information becomes common ground. These attempts furthermore limited themselves to the study of conversation transcripts, overlooking nonverbal cues like visuals and intonation. To address this, we propose a method for annotating new corpora using models trained on a subset of annotated utterances. Results show fair applicability (κ ≈ 0.3) across corpora, though this is strongly modulated by the conversational task being investigated.

Learning to Plan from Actual and Counterfactual Experiences

Our ability to plan and make effective decisions depends on an accurate mental model of the environment. While learning generally requires novel external observations, people can also improve their understanding by reasoning about past experiences. In this work, we examine whether counterfactual simulation enhances learning in environments where planning is straightforward but encoding new information is challenging. Across two studies, participants navigated gridworlds, learning to avoid hazardous tiles. Some participants were asked to engage in counterfactual simulation, constructing alternative plans after observing navigation outcomes. Others learned purely from experience. While counterfactual paths contained fewer hazards than actual paths, we found reliable evidence across both studies that counterfactual simulation conferred no measurable advantage in either navigation performance or explicit environment learning. These findings shed new light on the scope of learning by thinking -- suggesting that the mechanism by which counterfactual reasoning enhances learning might not be by encouraging deeper encoding of past experiences.

Preliminary Evidence that Infants and Children Use Accents to Inform Relational Expectations

Prior research suggests that infants and children prefer people who speak with the same accent as their parents. Here, we investigated whether 13- to 18-month-old monolingual, American English-learning infants (N=39) and 5- to 8-year-old children from diverse linguistic backgrounds (N=87) use accent to predict social interactions. Participants were familiarized to novel blob-like, English-speaking characters. The central protagonist and one side character shared an accent, while the other side character did not. We varied whether the central character had an American or Chinese accent. In Study 1, when the protagonist in distress had an American accent, but not when the protagonist had a Chinese accent, infants looked first toward the side character with the different accent. In Study 2, 5- to 8-year-old children with American-accented parents were more likely to say that a Chinese-accented character caused distress in an American-accented character. These preliminary findings suggest that infants and children may use accent as a social cue to make inferences about antisocial intent.

The Paradox of Certainty: When Graphed Ensembles Convey Averages Better than Graphed Averages

Data visualizations often display averages without raw data to simplify communication and enhance understanding, especially for lay audiences. However, the theory that such simplification improves understanding remains untested. Here, we test this theory's most basic prediction—that at minimum, the average itself is conveyed better by plotted averages than by plotted raw data. Remarkably, we find the opposite: under a wide range of conditions, overall accuracy of average estimation is higher with raw data. This is due to frequent, severe misinterpretations of both bar and line graphs depicting averages. In contrast, raw data yields some variability but few outright errors; notably, the observed variability is comparable to the uncertainty captured by confidence intervals. We conclude that plotted raw data provides valuable context that helps prevent misunderstandings of the average. Our findings challenge the notion that plotted averages alone yield enhanced understanding and emphasize the value of raw data in communicating evidence.

How do we get to know someone? Diagnostic questions for inferring personal traits

When first meeting somebody, we're faced with the challenge of "getting to know them." Why do some questions seem to enable this better than others? In Experiment 1, participants (N=185) evaluated a large bank of conversational questions. We found that questions varied along a reliable latent dimension of interpersonal depth ranging from "small talk" to "deep" questions. In Experiment 2 (N=188), participants answered a subset of these questions along with a number of self-report personality scales. Using a language model to estimate how informative participants' free responses were, we find that individualized personality predictions were more accurate when incorporating free responses; furthermore, responses to deeper questions supported more accurate personality inferences than small talk. Taken together, results suggest not only that responses contained the statistical information necessary to make abstract social inferences, but also that people have accurate intuitions about which conversational topics enable learning about and connecting with others.

Coordination Games with Sequential Stochastic Learning and Language Emergence

The Lewis signaling game (LSG) and similar coordination games have been used to model the emergence and evolution of language. However, both Nash equilibria and learning or evolutionary dynamics often result in suboptimal signaling systems. We present a sequential reinforcement learning (SRL) model based on a novel sequential binary decision process. SRL has low cognitive demands and a low parameter count, and it exhibits lateral inhibition without additional assumptions. We prove that all scenarios converge to an optimal signaling system in all N-state, N-signal LSGs with arbitrary state probabilities, and we further explore the model's properties with numerical simulations. Next, we model a signaling game with agents who both speak and hear while using one state of learning (instead of two, as is common). Agents have a probability distribution over meanings in a given context. Speaking agents use the distribution to choose a meaning and use the SRL model to choose a signal. Hearing agents use Bayes' rule to combine their state of learning with their meaning distribution to guess a meaning. An agent's state of learning is reinforced through the habit of speaking and guessing meanings. Numerical simulations indicate that both agents converge to the same optimal system without external reinforcement, as happens in language acquisition.
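For readers unfamiliar with learning dynamics in Lewis signaling games, the baseline sketch below shows a simple urn-style reinforcement learner in an N-state, N-signal game; such learners coordinate often but not always, which is the kind of suboptimality the SRL model is designed to avoid. The update rule here is a generic Roth-Erev-style reinforcement, not the paper's sequential binary decision process.

```python
import numpy as np

def lewis_game_rl(n=4, trials=20000, seed=0):
    """Generic reinforcement learning in an N-state, N-signal Lewis signaling
    game (a baseline sketch; the paper's SRL model differs). Sender and
    receiver each keep urn-like propensities and reinforce on success."""
    rng = np.random.default_rng(seed)
    sender = np.ones((n, n))    # state -> signal propensities
    receiver = np.ones((n, n))  # signal -> action propensities
    for _ in range(trials):
        state = rng.integers(n)
        signal = rng.choice(n, p=sender[state] / sender[state].sum())
        action = rng.choice(n, p=receiver[signal] / receiver[signal].sum())
        if action == state:                 # coordination succeeded
            sender[state, signal] += 1      # reinforce both choices
            receiver[signal, action] += 1
    return sender.argmax(axis=1), receiver.argmax(axis=1)

print(lewis_game_rl())  # often, but not always, a one-to-one signaling system
```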

Sublexical ARTifacts: Bottom-up Interference in a Lexical Category Search

How listeners adapt to unfamiliar talkers and accents is a central question in psycholinguistics. In this study, we used novel eye-tracking methods to explore how listeners dynamically shift mappings from acoustic information to mental representations after hearing a new talker. We tested a prediction from Adaptive Resonance Theory (ART) that an anomaly in the signal (in this case, a change in talker) increases the influence of bottom-up relative to top-down information, creating an environment where sublexical competitors (e.g. 'Arch' within 'Archer') would be more likely to interfere with lexical access for the target. In two experiments (Exp. 1: General American English [GA] talkers; Exp. 2: GA and Spanish-accented [SP] talkers), this prediction was supported via analyses of accuracy, latency, and gaze. In Exp. 2, we found that the effect replicated but did not differ based on the accent of the talker. The data suggest new paths forward in speech adaptation research.

Humans and convolutional neural networks prioritize similar visual features in intuitive physics judgments

Humans reliably infer complex physical relationships between objects in everyday scenes, yet the mechanisms underlying these judgments remain unclear. We explored whether convolutional neural networks (CNNs) can approximate intuitive physical reasoning by capturing statistical regularities in visual experience. We trained a CNN (Inception-v4) to predict tower stability and tested how well its outputs aligned with human judgments (N = 500). CNN predictions more closely matched human judgments (r = 0.718, p < 0.001, accuracy = 81%) than ground-truth predictions from physics simulations (r = 0.406, p = 0.002, accuracy = 68%), suggesting that both CNNs and humans rely on visual heuristics. Eye-tracking data revealed that CNN importance maps overlapped significantly with human gaze patterns, indicating shared attention to features statistically predictive of physical outcomes in intuitive physical judgments. Our findings show that CNNs trained on visual data capture perceptual cues used in human intuitive physics, highlighting their value as models of heuristic reasoning.

Collective Emotions: Appraisal-based similarity in emotion attributions to individuals and groups

Humans' capacity for Theory of Mind (ToM) allows us to reason about and infer others' mental states, including their emotions. While ToM has been extensively studied in interpersonal contexts, how people attribute mental states, particularly emotions, to collective entities (e.g., corporations) remains underexplored. The current work examines whether and how people ascribe emotions to collectives using the appraisal theory framework. Participants were randomly assigned to scenarios designed to elicit a specific emotional inference about either an individual (e.g., a lawyer) or a collective (e.g., a law firm). We then collected and compared emotion attributions and appraisal judgments of the situations across both conditions. Our results suggest that people attribute emotions to individuals and collectives in remarkably similar ways, with subtle differences in event appraisals. The results pave the way for a deeper understanding of collective ToM, with implications for studying moral judgments and decision-making in societal contexts.

Understanding Task Representations in Neural Networks via Bayesian Ablation

Neural networks are powerful tools for cognitive modeling due to their flexibility and emergent properties. However, interpreting their learned representations remains challenging due to their sub-symbolic semantics. In this work, we introduce a novel probabilistic framework for interpreting latent task representations in neural networks. Inspired by Bayesian inference, our approach defines a distribution over representational units to infer their causal contributions to task performance. Using ideas from information theory, we propose a suite of tools and metrics to illuminate key model properties, including representational distributedness, manifold complexity, and polysemanticity.
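One crude way to approximate this style of analysis is to sample random keep/drop masks over hidden units, score task performance under each mask, and regress performance on the masks to estimate each unit's contribution; the sketch below does exactly that. The `forward(inputs, mask)` callable, the Bernoulli mask distribution, and the linear read-out of unit importance are all assumptions standing in for the paper's Bayesian formulation.

```python
import numpy as np

def ablation_importance(forward, inputs, targets, n_units, n_masks=200,
                        p_keep=0.8, seed=0):
    """Crude sketch of mask-based ablation analysis (not the paper's exact
    Bayesian framework): sample Bernoulli keep-masks over hidden units,
    measure accuracy under each mask, and regress accuracy on the mask to
    estimate each unit's causal contribution to task performance."""
    rng = np.random.default_rng(seed)
    masks = rng.random((n_masks, n_units)) < p_keep
    scores = np.array([
        # `forward` is a hypothetical model call returning class scores
        (forward(inputs, mask).argmax(axis=1) == targets).mean()
        for mask in masks
    ])
    # least-squares weights: how much keeping each unit raises accuracy
    X = np.column_stack([np.ones(n_masks), masks.astype(float)])
    coef, *_ = np.linalg.lstsq(X, scores, rcond=None)
    return coef[1:]   # one importance value per hidden unit
```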

Retrieval of Hierarchically-Organized Concepts in a Recurrent Memory System

Exemplar models have been criticized for lacking mechanisms to explain key conceptual phenomena such as the hierarchical organization of concepts. Here, we offer a potential solution. We show that a broad class of exemplar models can be viewed as a special case of global matching models of memory, and that global matching models are themselves discrete-time approximations of Dense Associative Memories (DAMs), a type of recurrent network. Interpreted this way, exemplar models retrieve hierarchical prototypes by modulating competition during retrieval. We demonstrate this ability using artificial data and pretrained GloVe and Word2Vec embeddings. Our results suggest that exemplar models remain viable candidates for a broader theory of concepts and provide a natural algorithmic account of attractor-like retrieval in the hippocampus, highlighting their relevance in learning theory and cognitive neuroscience.
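The link to Dense Associative Memories can be illustrated with a modern-Hopfield-style retrieval step, in which a query is repeatedly replaced by a softmax-weighted blend of stored exemplars; lowering the inverse temperature blends many exemplars (prototype-like retrieval), while raising it retrieves a single trace. This is a generic sketch of that family of models under assumed parameters, not the authors' specific formulation.

```python
import numpy as np

def dam_retrieve(query, exemplars, beta=2.0, steps=10):
    """Dense-Associative-Memory-style retrieval (a sketch of the general
    idea, not the paper's exact model). Low beta blends many exemplars
    (prototype-like output); high beta retrieves a single stored trace."""
    x = query.copy()
    for _ in range(steps):
        sims = exemplars @ x                      # global match to every trace
        w = np.exp(beta * (sims - sims.max()))    # softmax competition
        x = (w / w.sum()) @ exemplars             # weighted recall as next state
    return x

rng = np.random.default_rng(0)
exemplars = rng.standard_normal((50, 16))
noisy = exemplars[0] + 0.3 * rng.standard_normal(16)
print(np.corrcoef(dam_retrieve(noisy, exemplars, beta=5.0), exemplars[0])[0, 1])
```

The inverse temperature beta plays the role of the competition-modulating parameter described in the abstract.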

Function shapes form: Compositionality emerges from communicative needs, not environmental structure alone

Human languages are compositional, combining smaller units of meaning to express more complex ideas. To explain the emergence of compositionality, researchers have appealed to functional pressures from communication. However, languages may merely inherit the component structure found in the environment. We designed a reference game to explicitly disentangle these possibilities; pairs of participants (N = 450) communicated about sets of shapes that were assembled from component parts. Critically, we manipulated whether shapes that shared the same parts were competitors within each trial or were distributed across different trials. We found that participants successfully developed efficient conventions for referring to the shapes. However, participants who needed to distinguish shapes that shared components within the same context were more likely to develop compositional systems. When shared components appeared in separate contexts, participants favored non-compositional conventions. These results suggest compositional language structure most readily emerges from immediate communicative pressures rather than environmental structure alone.

How the Stroop Effect Arises from Optimal Response Times in Laterally Connected Self-Organizing Maps

The Stroop effect refers to cognitive interference in a color-naming task: When the color and the word do not match, the response is slower and more likely to be incorrect. The Stroop task is used to assess cognitive flexibility, selective attention, and executive function. This paper implements the Stroop task with self-organizing maps (SOMs): Target color and the competing word are inputs for the semantic and lexical maps, associative connections bring color information to the lexical map, and lateral connections combine their effects over time. The model achieved an overall accuracy of 84.2%, with significantly fewer errors and faster responses in congruent compared to no-input and incongruent conditions. The model's interference effect is a side effect of optimizing the speed-accuracy tradeoff, and can thus be seen as a cost associated with overall efficient performance. The model can further serve as a tool for studying neurologically inspired cognitive control and related phenomena.

First Contact: Children's Emerging Sensitivity to Causality in Second-Order Learning

The present study investigated young children's causal generalizations by examining their inductions from second-order learning—where learned correlations between two pairs of features (A–B and A–C) are generalized to non-contiguous features (B–C). Specifically, we asked whether 3- to 6-year-olds could engage in such learning across two canonical causal event types—Blicket detector and Michottian launching—while manipulating contact as a perceptual cue for causality. We replicated Benton, Rakison, and Sobel (2021), finding that children consistently applied second-order learning to infer which objects would produce causal outcomes. Crucially, children's responses did not differ overall between event types. However, there was a significant interaction between age and task condition: younger children learned better when objects did not make contact, whereas older children learned better from contact events. Results are discussed with respect to implications for the development of children's causal expectations.

Deep Vision Models Follow Shepard's Universal Law of Generalization

Shepard's (1987) universal law of generalization holds that the probability of generalizing between two stimuli decays as a concave function of their distance in psychological space. While there is widespread evidence for the law in human perception, its relevance to artificial neural networks remains unclear, despite the importance of generalization for these systems. Here, we find that the representational spaces of models that vary in their architecture, objective, and training data yield a concave generalization gradient with respect to human judgments of naturalistic images (Peterson et al., 2018), consistent with Shepard's law. Our results suggest that the representational spaces of deep vision networks serve as compelling, but imperfect, proxies for classic psychological spaces derived from behavioral data. This highlights the strengths and weaknesses of deep vision models as contributors to cognitive theories of perceptual generalization, while adding further evidence for the generality of Shepard's law.
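As a worked example of how such a gradient can be read off a network's representational space, the sketch below computes pairwise embedding distances and maps them through Shepard's exponential decay, g(d) = exp(-k d); comparing these values against human similarity judgments is one way to probe for the concave shape. The decay constant and the use of Euclidean distance are assumptions for illustration, not the paper's analysis pipeline.

```python
import numpy as np

def generalization_gradient(embeddings, decay=1.0):
    """Shepard-style gradient over a model's representational space (a
    minimal sketch): generalization between items i and j decays
    exponentially, and hence concavely, with their embedding distance."""
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)      # pairwise distances
    return np.exp(-decay * dists)                # Shepard (1987): g(d) = exp(-k d)

# toy usage with random "image embeddings"; real analyses would compare
# these values to human similarity judgments, e.g., per distance bin
emb = np.random.default_rng(0).standard_normal((10, 32))
print(generalization_gradient(emb).shape)        # (10, 10) generalization matrix
```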

Dissecting the Ullman Variations with a SCALPEL: Why do LLMs fail at Trivial Alterations to the False Belief Task?

Recent empirical results have sparked a debate about whether or not Large Language Models (LLMs) are capable of Theory of Mind (ToM). While some have found LLMs to be successful on ToM evaluations such as the False Belief task, others have shown that their performance is not robust against trivial alterations to stimuli. In this paper, we introduce SCALPEL---a technique to incrementally modify stimuli to test different specific hypotheses about why LLMs fail---and apply this method to the `transparent-access' modification of the unexpected contents task. Our results suggest that LLMs often do poorly because they fail to make essential common-sense inferences, such as that seeing a transparent container implies recognizing its contents. We conclude that while modern LLMs go beyond mere pattern matching, they still fall short of robust human-like ToM. We argue that SCALPEL can help cognitive scientists examine LLMs' capabilities in finer detail and provide insight into alternative mechanisms by which tasks that are used to assess human cognition might be completed.

Human Action Classification from Naturalistic Videos

It has long been known that human observers can identify actions based on how people move, even from very impoverished motion depictions such as Point Light Displays (PLDs). This study investigates how humans classify actions, and what types of motion information they use to do so. Using a newly available technique (OpenPose) for extracting human joint locations from natural video, we created three types of reduced displays: PLDs, stick figures, and motion flow videos. Participants identified actions in these videos through verbal responses, and these responses were analyzed for semantic similarity using a Natural Language Processing model. A Hierarchical Bayesian Model further compared semantic similarities across video conditions. Results showed the highest intersubjective agreement (a proxy for proportion correct) for stick figures, followed by PLDs, and the lowest for motion flow videos. These results suggest that dynamic pose representations are crucial for accurate action classification, with motion flow supporting only coarse classification. The same pattern held across different action categories, such as instrumental versus locomotion and upper versus lower limb actions.

Influence of Task Complexity on Visuomotor Adaptation

Recent work has shown that visuomotor adaptation is supported by both implicit recalibration, which appears to be highly constrained, and explicit strategies, which seem more flexible and capable of delivering rapid performance gains. However, explicit strategies appear to have strict capacity constraints, bearing remarkable similarity to the limits observed with spatial working memory, which could limit their usefulness in more complex learning problems. Here, we sought to determine the ability of both explicit strategies and implicit recalibration to overcome a complex set of visuomotor perturbations. We find that implicit recalibration is unable to track multiple perturbations, in line with prior findings. In contrast, explicit strategies are effective when task complexity is within the capacity of working memory. These findings highlight the constraints that working memory imposes on visuomotor adaptation and suggest that motor skill learning may be limited by the demands placed on working memory.

How constraints on editing affect cultural evolution

When is it beneficial to constrain creativity? Creativity thrives with freedom, but when people collaborate to create artifacts, there is tension between giving individuals freedom to revise, and protecting prior achievements. To test how imposing constraints may affect collective creativity, we performed cultural evolution experiments where participants collaborated to create melodies and images in chains. With melodies, we found that limiting step size (number of musical notes that can be changed) improved pleasantness ratings. Similar results were observed in cohorts of musicians, and with different selection regimes. This outcome was due to the tendency to overcrowd melodies. Interestingly, limiting step size in creating images consistently reduced pleasantness. These conflicting findings suggest that in domains such as music, where artifacts can be easily damaged, collective creativity may benefit from imposing small step sizes or limiting overcrowding. We discuss parallels with search algorithms and the evolution of conservative birdsong cultures.

Reasoning within and between collective action problems

Understanding cooperation in social systems is challenging because the ever-changing rules that govern societies interact with individual actions, resulting in intricate collective outcomes. In virtual-world experiments, we allowed people to make changes to the systems that they were making decisions within and investigated how they weigh the influence of different rules in decision-making. When choosing between worlds differing in more than one rule, a naïve heuristics model predicted participants' decisions as well as, and in some cases better than, game earnings (utility) or the subjective quality of single rules. In contrast, when a subset of engaged participants made instantaneous ("within-world") decisions, their behavior aligned very closely with objective utility and not with the heuristics model. The findings suggest that, whereas choices between rules may deviate from rational benchmarks, the frequency of real-time cooperation decisions can provide feedback that is a reliable indicator of the objective utility of these rules.

Can We Extend the Reverse Cohesion Effect to Programming Contexts?

Existing research has drawn parallels from the comprehension of text to the comprehension of source code. In this study, we attempt to develop this analogy by positing and testing a notion of code cohesion, analogous to text cohesion. We also attempt to extend a known effect in text comprehension research, the reverse cohesion effect, to code contexts. Our findings provide some corroboration for code cohesion, but fail to find robust evidence for a reverse cohesion effect. This reinforces similarities between text and code comprehension but also suggests that everyday comprehension processes of code and text might differ in meaningful ways.

Reexamining Mass/Count Flexibility in the Nominal Domain: A Real-Time Comprehension Study

Are the 'portioning' readings ('several beers') and 'grinding' readings ('a bit of pear') the result of lexical derivations with real-time processing effects? The evidence is inconclusive. While Frisson and Frazier (2005) argue that these readings are visible as a processing cost, Lima (2019) reports no such effects. We address this inconsistency through two English self-paced reading experiments. Experiment I, testing the 'portioning' reading ('several pears' vs. 'several beers'), reveals no additional processing effects. By contrast, Experiment II, testing the 'grinding' reading ('a bit of beer' vs. 'a bit of pear'), reveals higher reading times for the 'grinding' condition one word after the critical noun. These results provide empirical support for an asymmetrical relationship between individuated 'count' and non-individuated 'mass' readings: whereas the former represent the default conceptual representation, resulting in no cost, the latter result from a conceptual expansion to include a container-containee conceptualization, which is carried out in real time, resulting in cost.

Novel Goal Creation and Evaluation in Open-Ended Games

How do people generate and decide between the wide array of potential goals available to them at any given moment? We study this question in Minecraft, a game environment that is both open-ended enough to support a diverse array of goals and structured enough to facilitate quantitative evaluation of the goal features that may shape how people respond to different goals. Specifically, we explore the role of goal familiarity, concreteness, and complexity, which we operationalize both through linguistic analyses and by converting human-generated goals into a programmatic domain-specific language. Our results highlight the unique ways in which game environments like Minecraft can facilitate research into how humans engage in open-ended and creative behaviors.

Prioritized memory can explain the effect of value on category representation

Category representations are often assumed to reflect the statistical distribution of individual category members. However, recent work shows that people's category representations tend to be biased toward high-value category members. We propose that this bias stems from prioritized memory: when learning about a category, people devote more cognitive resources to remembering important or desirable items, leading to their overrepresentation in category-level representations. We test key predictions of this account behaviorally and computationally. Behaviorally, we find a strong correlation between the features people prioritize in memory and the features that dominate their spontaneous recall of category members. Computationally, we use variational autoencoders to show that when statistical learning prioritizes accuracy for certain items, these items are overrepresented when sampling from the learned category distribution. Together, these findings suggest that prioritized memory plays a key role in shaping category representations.
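The computational idea can be summarized with a value-weighted training objective: if reconstruction error on high-value exemplars is penalized more heavily, those exemplars are encoded more faithfully and later dominate samples from the learned distribution. The sketch below shows one assumed form of such a loss; the variational autoencoder architecture itself is left abstract here, and the weighting scheme is illustrative rather than the paper's exact objective.

```python
import numpy as np

def weighted_recon_loss(x, x_hat, item_value, alpha=2.0):
    """Sketch of 'prioritized' statistical learning: per-item reconstruction
    error is up-weighted by the item's value, so valuable exemplars are
    encoded more faithfully and later dominate samples from the learned
    category distribution. Assumed form, not the paper's exact loss."""
    per_item_error = ((x - x_hat) ** 2).mean(axis=1)   # MSE per exemplar
    weights = 1.0 + alpha * item_value                 # value-based priority
    return float((weights * per_item_error).mean())
```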

When to speak up: How children reason about group dynamics

Disagreements arise across many social situations, from families to teams to workplaces. This paper explores how children think about the strategies people use when they disagree with their groups. Specifically, we ask how egalitarian and hierarchical group dynamics influence whether children expect others to speak up about their disagreement, go along with disliked decisions, or leave their groups. We found that 6- to 8-year-olds hold a strong initial expectation that disagreers will speak up, despite believing that the kind of group they are in determines how effective it is to do so. These expectations were dynamic: when given evidence that speaking up did not work, children deferred to other strategies. These results suggest that children update their expectations based on both what has worked in the past and on group dynamics.

Integration of Language and Experience via the Instructed Bandit Task

Humans learn by interacting directly with their environments and by communicating via language. In this project, we explore this interaction between language and experiential learning through a novel sequential decision-making task, the "instructed bandit task" (IBT). In the IBT, agents make choices and receive rewards sampled from unknown Gaussian distributions after being given linguistic hints. The IBT assesses how linguistic input and experienced reward values combine to determine choice behavior. We additionally propose a novel Bayesian reinforcement learning model that combines Bayesian updating from experience with propositional constraints that capture the meaning of the linguistic hints. As a point of comparison, we evaluate both human participants and Centaur, a LLaMA-based model fine-tuned to mimic human behavior, on the IBT. Our results show that all agents' choices converge toward those of the Bayesian model, and granular differences in choice sequences reveal the varied role instruction plays in decision-making tasks.
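A minimal sketch of how experience and a linguistic hint might be combined appears below: each arm's mean reward receives a conjugate Normal-Normal update from observed outcomes, and the hint is treated as a propositional constraint that filters posterior samples. The hint encoding, priors, and rejection-sampling step are illustrative assumptions, not the paper's model or Centaur.

```python
import numpy as np

def posterior_mean(rewards, prior_mean=0.0, prior_var=100.0, noise_var=1.0):
    """Conjugate Normal-Normal update for one arm's mean reward."""
    n = len(rewards)
    var = 1.0 / (1.0 / prior_var + n / noise_var)
    return var * (prior_mean / prior_var + sum(rewards) / noise_var), var

def choose_arm(rewards_a, rewards_b, hint_a_better=True, n_samples=5000, seed=0):
    """Sketch of combining experience with instruction (assumed form):
    sample arm means from their posteriors, keep only samples satisfying
    the propositional hint, then pick the arm with the higher mean."""
    rng = np.random.default_rng(seed)
    (ma, va), (mb, vb) = posterior_mean(rewards_a), posterior_mean(rewards_b)
    a = rng.normal(ma, np.sqrt(va), n_samples)
    b = rng.normal(mb, np.sqrt(vb), n_samples)
    keep = (a > b) if hint_a_better else (b > a)   # linguistic constraint
    if keep.any():
        a, b = a[keep], b[keep]
    return "A" if a.mean() > b.mean() else "B"

# with sparse experience, the hint dominates the choice
print(choose_arm([0.2, 0.1], [0.4], hint_a_better=True))
```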

Finding motifs in mental representations of faces, places, and objects

Cognitive science has long relied on the assumption that mental representations are universal across healthy adults, treating individual differences as random noise. We challenge this assumption by proposing that representations instead conform to a limited set of organizational motifs—systematic patterns shared across subgroups of individuals. Using triadic comparisons and embedding analysis, we examine how people conceptually organize a set of DALL-E-generated faces, places, and objects that systematically vary in five attributes of interest per domain. We show that individuals cluster into distinct "representational motifs" when organizing faces, places, and objects. Logistic regression analyses show that these motifs differ in the relative use of image attributes to form mental representations. Our findings demonstrate that variability in conceptual organization is not merely noise, but rather reflects meaningful patterns of shared representational frameworks that emerge naturally.

Influences of Language Expressions in Group Decision Making: Exploring Verbal Probability Expressions in Group Discussions with Conversational Agents

This study examined how verbal probability expressions (VPEs) used in group decision-making discussions influence individuals' decision-making processes. An online experimental task was developed to investigate how biased decisions emerge depending on the type of expression used during discussions. Scripted conversational agents were employed to experimentally manipulate the VPEs used during group discussions. A total of 440 participants took part in the online experiment, which controlled two factors: (1) the type of VPEs used by group members during discussions and (2) the type of anchoring group decision. The results revealed an interaction between these two factors when participants perceived the agents as human-like. Specifically, confirmation bias occurred more quickly when positive VPEs were used by the agents and the anchoring probability of the group decision was low (20%). These findings provide valuable insights into the influence of VPEs on probability decision-making during group discussions, highlighting the advantages of utilizing multiple conversational agents for investigation.

Goal Inference using Reward-Producing Programs in a Novel Physics Environment

A child invents a game, describes its rules, and in an instant, we can play it, judge progress, and even suggest new variations. What mental representations enable such flexible reasoning? We build on recent work formalizing naturally expressed goals as a type of program, grounding linguistic descriptions into precise scoring systems. To support this notion, we study human-created objectives in a physics game environment. We leverage the formal representations to quantitatively analyze relationships between reward geometry, goal complexity, and perceived difficulty. We then propose a proof-of-concept of a computational goal inference method using these program representations and behavioral demonstrations, offering a concrete proposal of how humans reason about others' goals.

Adults hold two parallel causal frameworks for reasoning about people's minds, actions and bodies

Understanding other people involves making sense of their physical actions, mental states, and physiological experiences, yet little is known about the causal beliefs we hold across these domains. Across two exploratory studies, we measured these beliefs and their use in social cognition. In Study 1 (N = 50, M age = 39.44y), US adults (1) freely sorted and (2) reported causal beliefs about events of the mind, body, and actions. Representational similarity analysis (RSA) revealed two causal frameworks: one representing the 3 distinct latent categories, and another expressing causal relationships across them. Study 2 (N = 100, M age = 39.95y) demonstrated that adults flexibly apply either framework depending on the task, using the latent causes for trait inference, and causal beliefs to plan interventions on other agents. These findings suggest that intuitive theories of other people include both a sense of which capacities "go together" and their causal connections within and across domains.

The Odyssey of the Fittest: Can Agents Survive and Still Be Good?

As AI models grow in power and generality, understanding how agents learn and make decisions in complex environments is critical to promoting ethical behavior. This study introduces the Odyssey, a lightweight, adaptive text-based adventure game, providing a scalable framework for exploring AI ethics and safety. The Odyssey examines the ethical implications of implementing biological drives—specifically, self-preservation—into three different agents: a Bayesian agent optimized with NEAT, a Bayesian agent optimized with stochastic variational inference, and a GPT-4o agent. The agents select actions in each scenario to survive, adapting to increasingly challenging scenarios. Post-simulation analysis evaluates the ethical scores of the agents' decisions, uncovering the trade-offs they navigate to survive. Specifically, the analysis finds that when danger increases, agents' ethical behavior becomes unpredictable. Surprisingly, the GPT-4o agent outperformed the Bayesian models in both survival and ethical consistency, challenging assumptions about traditional probabilistic methods and raising questions about the source of LLMs' probabilistic reasoning.

What is addiction? Substance-specific biases in human beliefs and LLMs

Understanding how individuals conceptualize addiction is an important approach to the study of substance use etiology. In a large free-response study of intuitive conceptualizations of addiction among alcohol and cannabis users and co-users, we asked participants to share their beliefs about the benefits and harms of alcohol and cannabis, and to explain in simple terms what it means to be addicted. Using a frontier language model (ChatGPT-4o), we extracted structured representations of people's beliefs and explanations, assessing the extent to which responses represented 11 clinically relevant diagnostic symptoms from the DSM-5 section on Substance Use Disorders. People's beliefs showed clear substance-specific biases, attributing more clinically relevant symptoms to alcohol than cannabis. A prompt-context manipulation that contextualized participants' substance-neutral explanations as relevant to either cannabis or alcohol revealed substance-specific biases imposed by the ChatGPT annotation process itself, sometimes in a similar direction and sometimes in the opposite direction.

The Role of Context Gating in Predictive Sentence Processing

Prediction is a core computation in language, as humans use preceding context to implicitly make predictions about the upcoming word in a sentence. In order to do this, humans need some memory for context that is selective and adaptive. We take inspiration from the existing prefrontal cortex literature, in which computational models feature a biologically plausible gating mechanism that can actively maintain and rapidly update task-relevant information to improve performance on cognitive flexibility and working memory tasks. Here, we investigate the potential role of such gating mechanisms in maintaining context for prediction during real-time language processing. Using EEG data from a naturalistic story listening task, we first replicate previous findings that words of low predictability based on preceding context (high in surprisal) elicit larger N400 effects than predictable words. To study how gating may play a role in next-word prediction, we use a performance difference metric between language models with and without gating, which we show is sensitive to word-by-word working memory demand. We find that this gating metric is correlated with EEG amplitude in several later time windows after word onset, providing suggestions concerning the time course of context gating.

Another Brick in the Wall? A Null Effect of Boundaries on Spatial Memory Judgments in a Novel, Highly Immersive, First-Person Object-Placement Task

Spatial memory is evolutionarily important for many animals. Boundaries can systematically distort spatial memory in line with hierarchical models, while other evidence supports more metric path-integration theories. To test these competing theories, we developed a novel, highly immersive Object Placement Task (OPT) in virtual reality to examine the effect of boundaries on human spatial memory. Our task decomposes spatial memory into 3 components: 1) object-placement errors, 2) distance judgments, and 3) angular judgments (i.e., our single task provides multidimensional information about spatial memory). Thirty participants used the OPT to recall object positions in environments with or without a central boundary. Bayesian analyses showed that distance influenced memory performance, but boundary presence had no significant effect (Bayes factors favored the null models). These findings suggest a set of conditions in which boundaries may not impact memory. We discuss potential modulators of boundary effects and present our open-source task for future research.

Surprise isn't symmetrical: Adults' looking suggests non-perceptual considerations during dishabituation

People of all ages explore the world through looking. Recently, Raz, Cao et al. (2025) built an image-computable model (RANCH) that predicts adults' and infants' looking behavior to a large stimulus set, including graded responses to changes in pose, animacy, and number. This model succeeded despite having only a perceptual embedding space of stimuli. However, looking may be influenced by non-perceptual considerations. Using the same data, we found that adults' behaviors challenge a key assumption of a perceptual-only account: since the perceptual distance between two items is symmetrical, behavior guided only by perceptual space should also be symmetrical. Yet, adults did not treat changes in different directions as mere reciprocal transformations. For instance, adults looked longer at a magical appearance than at a disappearance. We suggest that image-computable models of looking behavior would benefit from representations of objects, in addition to perceptual features of images.

Emotional Parameters in Cognitive Architecture: Examination Through Simple Memory Performance

A key challenge in cognitive modeling is capturing how emotional states modulate internal cognitive parameters. While cognitive architectures such as ACT-R (Adaptive Control of Thought-Rational) provide a principled framework for simulating memory and decision-making, their emotional components remain underexplored. This study examines how individual differences in emotional states, particularly anxiety and affective valence, are reflected in core memory-related parameters of ACT-R. Across two experiments using a digits recall task, we introduced emotional variation via affective stimuli and applied a model-fitting procedure to estimate individual values for the mismatch penalty and activation threshold. Results from the second experiment revealed significant correlations between state anxiety and both parameters, suggesting that emotional traits systematically shift memory retrieval dynamics. Our findings offer empirical support for integrating emotion into cognitive architectures without introducing ad hoc modules, and contribute to broader efforts to align the Common Model of Cognition with affective science. This work highlights the potential of inverse modeling as a tool for understanding the emotion–cognition interface and opens new avenues for modeling individual differences in affect-sensitive cognitive systems.

Language experience and prediction across the lifespan: evidence from diachronic fine-tuning of language models

Humans predict upcoming language input from context, which depends on prior language experience. This suggests that older adults' predictions may differ from those of young adults, due to longer language exposure. Here we use sentence completion data from two age cohorts (YA = 18-35 y.o.; OA = 50-80 y.o.) and language models fine-tuned to particular decades of a diachronic corpus of American English to examine the relationship between changes in language statistics and differences in linguistic prediction across different age groups. We observed greater consistency in contextual probabilities within age groups compared to across age groups, indicating that YA and OA make subtly different predictions given identical context. Next-word prediction performance for the fine-tuned models decreased as the temporal distance between the fine-tuning and testing decade increased, indicating that language usage statistics changed over the span of a few decades. Further, GPT-2 surprisal values are more predictive of YA than OA contextual probabilities, suggesting that the language statistics, as captured by a model trained largely on internet text, align more with YA's internal model than OA's. However, both age groups' data are better fit by models fine-tuned on more recent corpus decades.

Who and when gets the race? Two processing routes for the advantages and penalties of pronominal ambiguity resolution

The current study investigates how pronominal ambiguity is resolved in real-time, focusing on the role of referent bias and task context. In two self-paced reading experiments, we tested whether ambiguity leads to processing benefits or costs modulated by the presence of a biased referent and the task manipulations. Experiment 1 showed that the ambiguity advantage emerges only when a biased referent is not selected, supporting reanalysis-based accounts such as the unrestricted race model (Van Gompel et al., 2000, 2001, 2005). Experiment 2, however, revealed a delayed ambiguity penalty, suggesting task-induced shifts in processing strategy that better fit a delayed interpretation account. These findings highlight that pronominal ambiguity resolution may involve two processing mechanisms shaped by the parser's evaluation space and the timing of selection.

Written in Stone: Lay intuitions about the emergence of formal rules

Humans navigate social life through shared expectations that guide behavior. Some expectations become formalized into explicit rules, but little is known about people's intuitions regarding why and when rules become formal. Across three experiments with U.S. adults, we examined these intuitions. Participants consistently predicted that formal rules would arise in contexts involving internal coordination challenges (e.g., large or diverse groups), but not in contexts involving external threats (e.g., natural disasters, resource scarcity). In Experiment 2, participants judged formal rules more likely when defiance or costly violations were expected—but these beliefs did not explain when participants actually inferred formalization. In Experiment 3, expectations of formality were better predicted by perceived group dynamics: low interpersonal closeness and high disagreement ("quibbling"). These dynamics predicted formality more strongly than concerns about defiance, ignorance, or cost. Our findings suggest that people see formalization less as a mechanism for enforcing norms, and more as a strategy for managing group interaction dynamics.

Gaze Insights into Partially-Encoded Representations of Objects and Categories

Studies of category learning have revealed individual differences in decision-making, such that the same stimulus may be categorized differently across individuals. Modeling accounts have explained these differences in terms of how attention weights are distributed across stimulus dimensions that distinguish between category responses. These weights are typically assumed to reflect an individual's beliefs about which dimensions are most relevant to their goals. The current work investigates the possibility that instead of being purely strategic, attention weights are constrained by what was encoded into memory during learning. Participants (N=120, age 18-25) completed a category learning task while gaze was recorded as an exogenous measure of attention. Model-based analyses using gaze to predict behavior revealed that accounting for partially-encoded representations was necessary for predicting individual differences in feature memory and categorization.

Can Large Language Models Predict Associations Among Human Attitudes?

Prior work has shown that large language models (LLMs) can predict human attitudes based on other attitudes, but this work has largely focused on predictions from highly similar and interrelated attitudes. In contrast, human attitudes are often strongly associated even across disparate and dissimilar topics. Using a novel dataset of human responses toward diverse attitude statements, we found that a frontier language model (GPT-4o) was able to recreate the pairwise correlations among individual attitudes and to predict individuals' attitudes from one another. Crucially, in an advance over prior work, we tested GPT-4o's ability to predict in the absence of surface similarity between attitudes, finding that while surface similarity improves prediction accuracy, the model was still highly capable of generating meaningful social inferences between dissimilar attitudes. Altogether, our findings indicate that LLMs capture crucial aspects of the deeper, latent structure of human belief systems.

Thinking through the past, present, and future: Language convergence-entropy is influenced by when you think of and how you feel

Under construal-level theory, psychologically distant concepts such as a far away land are generally seen more abstractly compared to concepts viewed as closer in time, space, or identity. As we mentally travel through time, dynamics of language contained in streams-of-consciousness may provide a look into how we drift through topical space. Here, we investigated self-convergence and entropy using a BERT-based method to see how language drift over the course of typed streams-of-consciousness may be shaped by temporal framing. We applied this method to a dataset where undergraduate students during COVID-19 shared their thoughts imagining life before, during, and after the pandemic. We find that post-pandemic, future-directed thoughts showcase greater drift compared to past and present thoughts, suggesting greater exploration. Interestingly, past thoughts showed the least drift, suggesting there may be differences in concreteness depending on the direction in time you travel and the ability to have impact over temporally-tethered events.
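As an illustration of the kind of drift measure involved, the sketch below scores a stream of consciousness by one minus the cosine similarity between consecutive sentence embeddings; higher values indicate more topical drift. The study's exact convergence-entropy metric is not reproduced here, and the embedding source (e.g., a BERT encoder) is assumed to be supplied externally as a precomputed array.

```python
import numpy as np

def self_drift(embeddings):
    """Sketch of a drift measure over a typed stream of consciousness
    (assumed form, not the authors' exact convergence-entropy metric).
    `embeddings` is an (n_sentences, dim) array from any sentence encoder."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    cos = (e[:-1] * e[1:]).sum(axis=1)   # similarity of sentence t to t+1
    return float((1.0 - cos).mean())     # higher values = more topical drift
```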

Leveraging prediction to investigate the mental lexicon: Evidence from an agglutinating language

While prediction has previously been considered a strong primitive in Subject-Object-Verb (SOV) languages, it is relatively unknown how prediction might manifest in morphologically complex languages with verbs that potentially undergo morphological decomposition into their constituent verb stems and affixes during online sentence comprehension. In this reading study, we looked at how such top-down (prediction) and bottom-up processes (decomposition) interact in an agglutinating language like Malayalam. We investigate two questions simultaneously: (i) whether predicting a suffix actually confers any processing advantage during lexical access in real-time, similar to predicting a verb stem, and (ii) whether this can reveal something about the representations of verb stems and suffixes of words within the mental lexicon. We find that when predicted correctly, suffixes pattern similarly to verb stems, which seems to suggest that suffixes can have independent representations alongside verb stems, thereby aiding visual word recognition by acting as access points to activate lexical entries.

When Seating Matters: Modeling Graded Social Attitudes as Bayesian Inference

Humans can quickly infer social relationships from minimal cues, such as where people choose to sit in a meeting room. We investigated how people make graded, context-sensitive judgments about social attitudes beyond simple proximity-based heuristics. Using controlled seating scenarios, we compared participants' judgments to the predictions of Bayesian models: the interaction-probability model, which captures how one person's seat choice affects the probability that another person will initiate the conversation, and the interaction-cost model, which accounts for the effort required based on how far apart they sit from each other. Results showed that participants' inferences aligned best with the interaction-cost model, indicating sensitivity to effort and moving trajectory, rather than relying solely on proximity. Our findings suggest that higher-order cognition refines perceptual cues, enabling nuanced, graded social reasoning essential for complex social interactions.

How does multilingualism interact with early number concepts?

This study explores the impact of multilingualism on early number concepts in preschool children. We compared the performance of mono- and multilingual preschoolers on four tasks that test early number concepts. The tasks were counting, Give-N, magnitude judgements, and number identification. The sample included 59 Australian children aged 3.5 to 5.5 years, with 25 multilingual children exposed to at least one language in addition to English. Our results indicated that age was a significant predictor across all tasks, and mono- versus multilingual experience alone did not have a substantial effect on these tasks. However, there was a significant age-by-language experience interaction in the counting task, where older multilingual children counted significantly higher than the older monolinguals. These findings lend new insights into the nuanced role that multilingualism plays in the development of number concepts.

Visual Processing of Arabic-English Code-Switching: An Eye-Tracking Analysis

This study examines the visual processing of Arabic-English code-switching using an eye-tracking experiment, focusing on determiner-noun switching. Arabic-English bilinguals (L1 Arabic, L2 English) read English-framed sentences that were either monolingual or contained a code-switched noun phrase with an English determiner and an Arabic noun (e.g., John bought the كتاب) or an Arabic determiner and an English noun (e.g., John bought ال book). The Matrix Language Frame (MLF) model (Myers-Scotton, 1993, 2002) predicts that the matrix language (ML) determines the selection of functional elements, requiring determiners to come from English. However, eye-tracking results revealed the opposite pattern: code-switching was more disfavored if the determiner came from English than if it came from Arabic, contradicting the MLF model's predictions. These findings suggest that grammatical constraints alone cannot fully explain code-switching patterns and that orthographic differences, script directionality, and switching costs play a role in bilingual sentence processing.

Adaptation to noisy language input in real time: Evidence from ERPs

Language comprehension often deviates from the literal meaning of the input, particularly when errors resemble more plausible alternatives. Such non-literal interpretations have been associated with a reduced N400 and increased P600, but it remains debated whether these effects reflect perceptual misrepresentation of the input or error correction. One way to tease apart these accounts is to examine how comprehenders adapt to a noisy linguistic environment. A perceptual error account predicts that increased exposure to noise leads to habituation to errors and more misperception, resulting in reduced N400 and P600 responses. In contrast, an error correction account predicts that comprehenders perform more error correction in noisy environments, leading to increased P600s, and potentially modulated N400s depending on the timing of the correction. In this study, we manipulated the proportion of errors in non-critical exposure sentences and measured ERP responses to different types of anomalies. The results replicated prior findings of reduced N400s for recoverable errors. Results in the P600 window were not replicated, and it remains an open question which framework (error correction vs. perceptual error) best accounts for the data. Further, the results revealed substantial individual differences in processing words that may contain errors, with implications for how participants adapted to additional noise in the environment.

Neural Representations of Social Interactivity: A Perceptual and Language Model Analysis

When given the opportunity, humans naturally engage in anthropomorphism, which may reflect a bias to engage in mentalistic attributions in understanding social interactions. In this experiment, we evaluate whether neural activity in social perceptual brain regions can be explained by perceptual cues of agency and interactivity, or by semantic models of written descriptions of Heider-Simmel style animations. Models were compared in representational similarity space using variance partitioning of the neural response from the STS, TPJ, and PCC. The right STS and TPJ were best explained by perceptual models of distance between the agents, an indicator of interactivity, and separately by the similarity structure of the free responses, which captured both action and interaction terms. Together, these results underscore the importance of contextual framing, either through perceptual features of interactivity or social context as implied by the nature of interactions, as a defining feature of neural representations of interactivity.

Predicting Human Choice Between Textually Described Lotteries

Predicting human decision-making under risk and uncertainty is a long-standing challenge in cognitive science, economics, and AI. While prior research has focused on numerically described lotteries, real-world decisions often rely on textual descriptions. This study conducts the first large-scale exploration of human decision-making in such tasks using a large dataset of one-shot binary choices between textually described lotteries. We evaluate multiple computational approaches, including fine-tuning Large Language Models (LLMs), leveraging embeddings, and integrating behavioral theories of choice under risk. Our results show that fine-tuned LLMs, specifically GPT-4o, outperform hybrid models that incorporate behavioral theory, challenging established methods in numerical settings. These findings highlight fundamental differences in how textual and numerical information influence decision-making and underscore the need for new modeling strategies to bridge this gap.

A Crosslinguistic Investigation on the Correlation between Functional Load of Tone and Tone-Melody Correspondence

Linguistic features like stress and tone are often reflected in how lyrics are set to music. Intuitively, the motivation behind this phenomenon is to ensure listeners can accurately understand the lyrics in a musical environment, which raises the question: if a phonological component is more useful for accurately understanding speech in a language, is it more likely to be reflected in text-setting? This study explores this question with a focus on tone and tone-melody correspondence. Functional load of tones and degrees of tone-melody correspondence were obtained for three languages that use pitch contrastively: Cantonese, Mandarin, and Japanese. The functional load of tones and the degree of tone-melody correspondence did not correlate across all three languages. Since Cantonese and Japanese alone exhibit the correlation, we discuss why Mandarin breaks the possible pattern. This study is a look into how linguistic grammar and experience interact with musical grammar in a behavior that simultaneously involves language and music.

Role of Sensory Processing Sensitivity in driving Maladaptive Music Use

Sensory processing sensitivity (SPS), a trait marked by heightened reactivity to stimuli, is linked to emotional dysregulation and stress-related problems. Its subscales include Ease of Excitation, which reflects emotional sensitivity to internal and external demands; Low Sensory Threshold, which reflects susceptibility to sensory overload; and Aesthetic Sensitivity, which denotes appreciation for subtleties. While music can sometimes amplify maladaptive outcomes (rumination, avoidance), SPS's role in such behaviours, especially in non-Western contexts, remains underexplored. This study examines how SPS drives maladaptive music listening in 673 Indian adults. Network analysis and structural equation modelling revealed that sensitivity to external demands elevated psychological distress, which in turn predicted maladaptive music use. Reactivity to internal demands appeared to reduce maladaptive music use but exacerbated it when mediated by external demands, reflecting a preference for avoidant coping in the Indian context. Findings emphasize SPS's role in maladaptive behaviours, demonstrating how sensory reactivity interacts with traits to shape distress-driven music use as an emotional regulation mechanism.

Visual Imagery Vividness Predicts the Complexity of Induced Hallucinations

The current study utilizes the Ganzflicker paradigm—a flickering stimulus that induces visual hallucinations—to provide insight into the internally-generated visual experiences that correlate with individual differences in visual imagery. Here, we analyzed rich narrative descriptions of Ganzflicker hallucinations from 4,365 participants using natural language processing, sensorimotor norms, and AI visualizations. We find that overall perceptual richness and visual detail in descriptions increase with imagery vividness. Examining the specific content of these descriptions reveals that vivid imagers report more face and hand-related content than those with weaker imagery. Exploratory AI-generated visualizations of these descriptions provide additional insights, as those with weak imagery report patterns of simple visual features, like colors and geometric forms, while strong imagers' hallucinations are filled with complex real-world stimuli. These findings suggest imagery differences may lie not in early visual processing but in the integration of basic visual features into complex object- and scene-level representations.

Minding the Politeness Gap in Cross-cultural Communication

Misunderstandings in cross-cultural communication often arise from subtle differences in interpretation, but it is unclear whether these differences arise from the literal meanings assigned to words or from more general pragmatic factors such as norms around politeness and brevity. In this paper, we report three experiments examining how speakers of British and American English interpret intensifiers like "quite" and "very," finding support for a combination of semantic and pragmatic factors. To better understand these differences, we developed a computational cognitive model where listeners recursively reason about speakers who balance informativity, politeness, and utterance cost. A series of model comparisons suggests that cross-cultural differences in intensifier interpretation stem from (1) different literal meanings and (2) different weights on utterance cost. These findings challenge accounts based purely on semantic variation or politeness norms, demonstrating that cross-cultural differences in interpretation emerge from an intricate interplay between the two.

Investigating the Relationship Between Rumination and Executive Functions: The case of Inhibition and Switching

The present study aimed to examine the relationship between rumination and two core executive functions, inhibition and switching, through an experimental design. Undergraduate participants (N=153) were randomly assigned to a rumination induction, a negative mood induction, or an abstract distraction control. Participants completed a task-switching paradigm before and after induction, providing inhibition and switch-cost indices. The Ruminative Response Scale (RRS) and the Leiden Index of Depression Sensitivity (LEIDS-R) were administered to measure trait rumination and cognitive reactivity, respectively. Significant three-way interactions were observed, with participants low in trait rumination and high in cognitive reactivity in the negative mood induction group demonstrating the smallest switching costs. Interestingly, no significant differences emerged among those with either high or low levels of both traits across all conditions. The study paints a complex picture of the relationship between rumination and cognitive reactivity, suggesting that the impact of rumination on executive function performance depends on reactivity levels.

Evolution on the Lexical Workbench: Disentangling Frequency, Centrality, and Polysemy in Language Evolution

How do words evolve in their usage and meaning over time? We investigate the relationship between word frequency, semantic richness, and network centrality through longitudinal analysis of the Corpus of Historical American English (1820–2019). Using measures of semantic richness and network position, we find that a word's betweenness centrality—its tendency to bridge different semantic domains—consistently predicts both its future semantic richness and frequency of use. This relationship strengthens over longer time intervals, with the strongest effects observed across a 100-year span. Notably, while frequency and semantic richness are correlated, as established in the literature, our results indicate no directional relationship between the two, whereas network centrality exerts a significant influence on both. Our results suggest that a word's position within the semantic network might play a crucial role in its evolution: words that bridge different semantic domains are more likely to develop new meanings and change in frequency over time. These findings offer new insights into the mechanisms driving language change.
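
As a toy illustration of the centrality measure involved, the sketch below computes betweenness centrality on a small hypothetical word graph with networkx; the words and edges are invented for illustration and are unrelated to the COHA analysis.

```python
# Toy sketch of the centrality idea: words that bridge semantic neighborhoods
# receive high betweenness centrality in a word graph. Illustrative data, not COHA.
import networkx as nx

# Hypothetical semantic network for one historical time slice
edges = [
    ("bank", "river"), ("bank", "money"), ("money", "loan"),
    ("river", "water"), ("loan", "interest"), ("interest", "curiosity"),
    ("curiosity", "science"), ("bank", "interest"),
]
G = nx.Graph(edges)

centrality = nx.betweenness_centrality(G, normalized=True)
for word, c in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{word:10s} betweenness = {c:.3f}")
# In a longitudinal analysis, such scores at time t would be entered as predictors
# of semantic richness and frequency at t + 10, ..., t + 100 years.
```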

Naturalistic action sampling as foraging in the option space

Human decision-making involves navigating unbounded spaces of possible goals, subgoals, and action sequences. Yet, computational models typically assume pre-defined option sets. This creates a critical gap between the algorithms developed in cognitive science research on decision-making and the open nature of real-world decisions. We propose that option generation in open-ended settings operates as a search through structured decision space. Drawing on foraging theory, we hypothesized that option generation follows Lévy flight distributions, a pattern observed in both spatial foraging and memory retrieval. We found that the inter-generation time between consecutive responses in open-ended option generation problems approximated a Lévy distribution, while semantic distances demonstrated properties of heavy-tailed distributions. These findings reveal connections between action planning, information search, and memory retrieval, suggesting shared computational principles in how humans explore unbounded decision spaces.
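
A minimal sketch of how a heavy-tailed (power-law) exponent can be estimated from inter-response times is given below, using the standard continuous maximum-likelihood estimator; the data are synthetic and the threshold xmin is an illustrative assumption.

```python
# Sketch: estimate a power-law exponent for inter-response times via the standard
# continuous MLE, alpha_hat = 1 + n / sum(ln(x / xmin)).
# Synthetic data; exponents in roughly (1, 3] are consistent with Levy-like search.
import numpy as np

rng = np.random.default_rng(1)
xmin = 1.0
alpha_true = 2.0
# Draw heavy-tailed samples by inverse-transform sampling of a Pareto tail
u = rng.uniform(size=2000)
x = xmin * (1 - u) ** (-1 / (alpha_true - 1))   # stand-in for inter-generation times (s)

alpha_hat = 1 + len(x) / np.sum(np.log(x / xmin))
print(f"estimated exponent: {alpha_hat:.2f} (true: {alpha_true})")
```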

Improving Interpersonal Communication by Simulating Audiences with Large Language Models

How do we communicate with others to achieve our goals? We use our prior experience or advice from others, or construct a candidate utterance by predicting how it will be received. However, our experiences are limited and biased, and reasoning about potential outcomes can be difficult and cognitively challenging. In this paper, we explore how we can leverage Large Language Model (LLM) simulations to help us communicate better. Based on ideas from cognitive science such as the Rational Speech Act model, we propose the Explore-Generate-Simulate (EGS) framework, which takes as input any scenario where an individual is communicating to an audience with a goal they want to achieve. EGS (1) explores the solution space by producing a diverse set of advice relevant to the scenario, (2) generates communication candidates conditioned on subsets of the advice, and (3) simulates the reactions from various audiences to determine both the best candidate and advice to use. We evaluate this framework on eight scenarios spanning a range of interpersonal communication settings. For each scenario, we collect a dataset of human evaluations across candidates and baselines, and show that our framework's chosen candidate is significantly preferred over popular generation mechanisms for LLMs. Finally, we demonstrate the generality of our framework by applying it to real-world scenarios described by users on web forums.
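
The sketch below shows the shape of such an explore-generate-simulate loop in schematic form. The llm function is a stub standing in for any chat-completion call, and the prompts, parameters, and scoring are hypothetical rather than the paper's implementation.

```python
# Schematic of an explore-generate-simulate loop. `llm` is a placeholder for any
# chat-completion call; names and prompts are hypothetical, not the paper's code.
from itertools import combinations

def llm(prompt: str) -> str:
    """Stub so the sketch runs end-to-end; swap in a real model call."""
    return f"[model output for: {prompt[:40]}...]"

def egs(scenario: str, goal: str, n_advice: int = 4):
    # (1) Explore: produce diverse advice relevant to the scenario
    advice = [llm(f"Give advice #{i} for: {scenario} Goal: {goal}") for i in range(n_advice)]
    # (2) Generate: candidate utterances conditioned on subsets of the advice
    candidates = [llm(f"Write a message for: {scenario}\nUse advice: {a} | {b}")
                  for a, b in combinations(advice, 2)]
    # (3) Simulate: predicted audience reactions, used to rank candidates
    scored = [(c, llm(f"As the audience, rate 1-10 how well this achieves '{goal}': {c}"))
              for c in candidates]
    return scored

for cand, reaction in egs("Asking a mentor for a deadline extension", "get the extension")[:2]:
    print(cand, "->", reaction)
```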

Visual attention and cross-linguistic effects in reading: Simulations with BRAID-Acq, a probabilistic model of reading

Theories of reading are mostly based on English, which is rather atypical among alphabetic orthographies due to its inconsistent orthography-phonology mappings. Differences in length effects have been observed between languages. Psycholinguistic characteristics of the orthography, such as orthographic depth, seem to have an impact on reading strategies and could be correlated with different visual-attentional profiles. However, no computational model has yet demonstrated the impact of language-dependent visual-attentional mechanisms on reading. This study explores these effects using BRAID-Acq, a probabilistic reading model with a visual-attentional module. We simulated word and pseudoword reading in English, German, and French to examine how orthographic depth and visual attention shape processing. Our simulation results suggest an effect of the orthography on processing time. In particular, English requires a larger attentional quantity for efficient processing of words and pseudowords, offering a novel interpretation of difficulties in reading acquisition in English.

Less Than the Sum of its Parts: Complex Models of Cognition Struggle to Capture Regional Activity within Otherwise Well-Fitting Model Structures

Dynamic Causal Modeling is a widely-used method for examining brain connectivity. Most commonly, it is applied to brain regions showing strong responses to experimental tasks, comparing different network configurations based on the temporal dynamics of the neural signals. It can further be applied to models employing a theory-driven selection of brain regions, showing a weaker experimental effect. However, it is unclear if these effects provide sufficient temporal information for Dynamic Causal Modeling to reliably identify the best-fitting model. This study investigated the regional predictive fit in a theory-driven model which has been found to consistently outperform alternatives using Dynamic Causal Modeling. Results revealed issues with the fit of some regions and subjects, raising concerns regarding the reliability of model comparisons using Dynamic Causal Modeling with regions selected based on theory instead of a strong experimental effect.

Framing in context: Disabling conditions and alternative causes in health communication

Should a health campaign emphasise the potential gains from compliance (e.g., "If you quit smoking, you'll reduce your risk of lung cancer") or the potential losses from non-compliance (e.g., "If you don't quit smoking, you won't reduce your risk of lung cancer")? A large literature on so-called goal framing, or message framing, assumes that such messages are equivalent, but their persuasiveness may vary, for instance, depending on the perceived risk associated with the recommended behaviour. As no existing hypothesis received conclusive empirical support, we propose a novel theoretical approach. We argue that goal frames must be analysed as arguments interpreted in context. We report an experiment showing the effect of the participants' background beliefs about disabling conditions and alternative causes on the persuasiveness of positive and negative frames recommending detection behaviour.

Capturing Curiosity: Task-Based Differences in Children's Exploratory Behavior

This study explored differences in children's information seeking across two exploration tasks aligned with proposed curiosity frameworks. One task used an open-ended, unlimited information-seeking design assessing the frequency of exploration attempts across similar options; the second used a constrained information-seeking design with limits on how much could be explored, focusing instead on what children chose among varying levels of uncertainty. Children's information seeking was not related across the two tasks, and children gave different explanations for their motivation for seeking information that aligned with the different designs; in the open-ended task, children's exploration was motivated by more superficial and perceptual features, while in the constrained task they described desiring information and mentioned uncertainty and mystery. Potential implications of the results are discussed.

The illusion of credibility: How the pseudosciences appear scientific

The pseudosciences often bear a striking resemblance to the sciences. Using a mimicry account as a framework, this paper investigates how the appearance of social media posts influences people's perception of the content of such posts as scientific. We present the results of two empirical studies. The first, preparatory study identifies typical characteristics of "scientificness" in social media posts to inform feature manipulation within the main study. The main study then examines what happens when these features are systematically manipulated. The findings support the hypothesis that pseudoscientific digital content benefits from using features of scientificness. We discuss implications for understanding the appeal and persistence of pseudoscience.

Modeling Cue-Based Retrieval and Prediction – One Morpheme at a Time

This work proposes an extension to the cue-based retrieval theory of sentence processing: memory retrieval and prediction processes during sentence comprehension take place at the morpheme level instead of at the word level. We illustrate this proposal by extending an existing cue-based retrieval model from word-level to morpheme-level processing, and show that our model better captures the interactions between memory retrieval and predictive processing. Specifically, we extend the model reported in Patil and Lago (2021), which accounted for the interaction between retrieval and prediction during the processing of German possessive pronouns, but failed to generalize to structures involving determiners (Oltrogge, Veríssimo, Patil, & Lago, accepted). Our results show that modeling at the morpheme level captures retrieval-prediction interactions more precisely. The model successfully predicts the pattern of prediction onsets across German possessive pronouns and determiners. The proposed morpheme-by-morpheme model is further supported by psycholinguistic evidence suggesting that humans naturally decompose words into their constituent morphemes.

Convolutional Neural Networks Can (Meta-)Learn the Same-Different Relation

While convolutional neural networks (CNNs) have come to match and exceed human performance in many settings, the tasks these models optimize for are largely constrained to the level of individual objects, such as classification and captioning. Humans remain vastly superior to CNNs in visual tasks involving relations, including the ability to identify two objects as `same' or `different'. A number of studies have shown that while CNNs can be coaxed into learning the same-different relation in some settings, they tend to generalize poorly to other instances of this relation. In this work we show that the same CNN architectures that fail to generalize the same-different relation with conventional training are able to succeed when trained via meta-learning, which explicitly encourages abstraction and generalization across tasks.

Looking beyond parental reports: systematic biases in early word recognition assessment

This study examines convergence between parental reports and behavioral measures in assessing early word knowledge of twenty-eight 14-month-old Korean infants. We compared infants' word recognition patterns with parental reports using full and shorter versions of the Korean MacArthur-Bates Communicative Development Inventories (MCDI-K). Our analyses revealed three key patterns. First, while parents showed consistent judgment between the full CDI and the target-word checklist, the checklist demonstrated better convergence with eye-tracking measures, which accounted for baseline looking biases. Second, parents' reporting accuracy varied systematically with item difficulty: for early-acquired words, parents showed higher agreement with eye-tracking than for later-acquired words. Third, exploratory analyses suggested a possible asymmetry in word category recognition, with infants showing stronger recognition of nouns than verbs in the eye-tracking task, contrasting with more balanced verb-noun knowledge in parental reports. These findings show that assessment methods capture different aspects of early word knowledge.

Label Similarity and Stimulus Similarity Interact in Categorization

When learning to categorize stimuli, do we assume similar things should have similar labels? Are people more likely to respond with closer labels (e.g. 2-1 vs 2-4) when stimuli are more similar to each other? Across five experiments, we report evidence of such a bias and demonstrate that it can surface across a wide range of stimulus modalities and features, and persists regardless of participants' prior knowledge of the dimensions relevant for categorization. We also characterize some of the limits of this effect: it appears sensitive to the specific configuration of label-stimulus mappings, and may depend on overt similarity relations in label space. At minimum, our findings indicate the need to consider label-stimulus configurations when designing categorization experiments. They also hint more broadly at how label-to-stimulus mappings may affect how we structure novel categories.

The Role of Siblings on Infant Language Exposure in Daylong Audio Recordings

The language children hear every day is important for language learning. Different sets of speakers may lead to different kinds of speech: speech directed at a child, or speech directed to another adult or a sibling. We aimed to quantify how the presence of a sibling, specifically, affects the dynamics of speech in a home. Using daylong audio recordings from homes in the United States, we measured the amount of target-child-directed and overheard speech in the homes of infants in families with and without older siblings. Infants with an older sibling experienced significantly more overheard speech and significantly less speech directed to them than infants without an older sibling. However, families with and without older siblings did not differ in the total amount of child-directed input directed to either sibling. These findings suggest a more complicated relationship between overheard speech and language learning.

Short Interventions during Sustained Attention Tasks Preserve Performance Without Reducing Mind-Wandering

Despite extensive evidence that performance declines over time in sustained attention tasks, questions remain about whether interventions such as short breaks can address this issue. This study investigated the effects of a rest-break vs. task-switch during the Sustained Attention to Response Task (SART) on task performance and mind-wandering (MW) frequency. In addition to measuring performance via accuracy and reaction time (RT), drift-diffusion modeling (DDM) was implemented to assess performance, and the effectiveness of the interventions was evaluated across different mental states (on-task vs. off-task). Results showed that both interventions preserved accuracy compared to the no-break group. The rest-break intervention maintained stable no-go drift rates, while the task-switch intervention showed an increase in boundary separation. However, neither intervention reduced rates of MW. Additionally, the effects of interventions were more pronounced when participants reported being on-task. These findings suggest that short breaks can help sustain performance, yet do not necessarily halt MW.
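
For readers unfamiliar with drift-diffusion modeling, the following minimal simulator shows how drift rate and boundary separation jointly shape accuracy and response time; the parameter values are illustrative and are not the fitted estimates from this study.

```python
# Minimal drift-diffusion simulator: larger boundary separation -> slower, more
# cautious responses; larger drift -> faster, more accurate ones. Illustrative only.
import numpy as np

def simulate_ddm(drift, boundary, n_trials=2000, dt=0.001, noise=1.0, non_decision=0.3):
    rng = np.random.default_rng(0)
    rts, correct = [], []
    for _ in range(n_trials):
        x, t = 0.0, 0.0
        while abs(x) < boundary / 2:            # unbiased start, bounds at +/- boundary/2
            x += drift * dt + noise * np.sqrt(dt) * rng.normal()
            t += dt
        rts.append(t + non_decision)
        correct.append(x > 0)
    return np.mean(rts), np.mean(correct)

for label, drift, a in [("stable drift, standard boundary", 1.5, 1.0),
                        ("stable drift, wider boundary", 1.5, 1.4)]:
    rt, acc = simulate_ddm(drift, a)
    print(f"{label}: mean RT = {rt:.3f} s, accuracy = {acc:.3f}")
```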

Epistemic Monocultures and the Effect of AI Personalization

It has been argued that when scientists employ algorithmic tools to assist in problem-solving, epistemic monocultures may emerge in which research tools, topics, findings, etc. are homogenized. As a result, fertile areas of research might be left unexplored, impeding scientific progress. To explore the nature of these epistemic monocultures, we develop an agent-based model where agents have the ability to query an AI system to assist in their search of an epistemic (NK) landscape. In general, we find that AI use negatively affects the community of researchers by reducing heterogeneity, but both the rate of AI queries and how AI is used impact the ultimate success of the community. We then implement a potential solution suggested in the literature, AI personalization, and find somewhat mixed results on its potential for mitigating homogenization in research communities.
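
A compact version of the kind of NK landscape such agent-based models search is sketched below, together with a simple hill-climbing agent; N, K, and the search rule are illustrative assumptions rather than the paper's exact setup.

```python
# Compact NK landscape: each of N loci contributes a fitness value that depends on
# its own state plus K neighboring loci. Parameters and search rule are illustrative.
import numpy as np

rng = np.random.default_rng(42)
N, K = 10, 3
# Random contribution table: one value per locus per configuration of (self + K neighbors)
contrib = rng.uniform(size=(N, 2 ** (K + 1)))

def fitness(genome):
    total = 0.0
    for i in range(N):
        idx = 0
        for j in range(K + 1):                  # locus i and its K right neighbors (wrapping)
            idx = (idx << 1) | genome[(i + j) % N]
        total += contrib[i, idx]
    return total / N

# Simple hill-climbing agent: accept single-bit flips that improve fitness
genome = rng.integers(0, 2, size=N)
for step in range(200):
    i = rng.integers(N)
    candidate = genome.copy()
    candidate[i] ^= 1
    if fitness(candidate) > fitness(genome):
        genome = candidate
print("final fitness:", round(fitness(genome), 3))
```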

An Effective Strategy to Reduce Interference: Effects of Unitization on the Development of Memory Binding

Memory binding is the ability to bind together multiple components in a memory trace. Previous studies have shown that one way to improve memory binding is to integrate multiple components into the same item in a meaningful way – a strategy known as unitization. To further investigate unitization as a source of development of memory binding, in terms of whether a unitization strategy has different effects on different memory structures and different age groups, the current study incorporated a semantic unitization strategy into a memory binding task and recruited multiple age groups for the experiment. Results showed that unitization led to a marked improvement in three-way binding for 7-year-olds and adults, but not for 5-year-olds, suggesting that semantic unitization could be a driving force for the development of complex memory binding in childhood and even into adulthood, but its effect in young children might still be limited due to their immaturity.

Parents underestimate young children's abilities, which may undermine their parenting practices

Parents' beliefs about children's abilities shape their parenting practices. But how accurate are parents at estimating what children are truly capable of? Here, we test the hypothesis that U.S. parents underestimate young children's abilities to complete challenging, multi-step tasks, and in turn, intervene beyond children's developmental needs (a behavior known as "overparenting"). In Studies 1A and 1B, parents (N = 130) of preschool-aged children underestimated their children's abilities, especially on practical (vs. academic) and novel (vs. familiar) multi-step tasks. In Studies 2A and 2B, we found that parents' (N = 109) underestimation has potential negative consequences: Parents who believed their child was less capable were more likely to take over tasks and provided less encouragement for independent actions. These findings suggest that parents underestimate young children's abilities, which may hinder the development of children's learning and autonomy.

When Default Options Explain Away Preferences: A Causal Reasoning Account of Mental State Reasoning from Default Options

People often infer that those who actively switch away from a default option have stronger preferences than those who passively accept it (termed asymmetric preference inferences). We test whether this classic effect reflects rational causal inference about how defaults provide alternative explanations for others' mental states. This account predicts that asymmetric inferences should occur only when accepting the default provides an alternative explanation for choice (e.g., following a recommendation), and that asymmetry should diminish or disappear when it does not (e.g., a default licensing indulgence in a preferable option). In a pre-registered study (N=120), participants showed this effect: They made asymmetric inferences only when the default provided an alternative explanation for preference, and made symmetrical inferences when it did not. These findings suggest this classic effect reflects rational causal inference, providing a framework for predicting when people make asymmetric preference inferences from defaults.

Efficient communication drives the semantic structure of kinship terminology

Semantic distinctions are encoded variably in kinship terminology, the set of words that denotes family members. Nonetheless, it has been suggested that kinship terminology, like other linguistic domains, is constrained by opposing pressures to be simple yet expressive. Here, we use this insight to explore how the meaning space for kinship is structured cross-linguistically. Under the assumption that kinship systems map forms to meanings in a compressible, structure-preserving manner, we designed a metric for identifying which semantic features are most important for distinguishing individuals in a kinship system. For 1229 kinship systems, we calculated the correlation between semantic similarity (the weighted sum of shared semantic features between individuals) and wordform similarity (the edit similarity between terms). We then identified the optimal weight for each semantic feature in each language, confirming that kinship systems vary in which semantic features they encode, and that the features themselves vary in the extent to which they are encoded. Additionally, we identified that semantic features are encoded hierarchically; simpler and more informative features are generally weighted highest. By identifying this constraint on the distribution of forms and meanings in kinship terminology, our results provide new insights into how kin terms are structured for efficient communication.
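
To make the metric concrete, the toy sketch below correlates weighted semantic-feature overlap between kin types with the edit similarity of their terms; the English-like kin terms, feature coding, and weights are invented for illustration.

```python
# Toy sketch of the metric: correlate weighted semantic-feature overlap between kin
# types with the edit similarity of their terms. Illustrative data only.
import numpy as np
from scipy.stats import spearmanr

def edit_similarity(a, b):
    """1 - normalized Levenshtein distance."""
    m, n = len(a), len(b)
    d = np.zeros((m + 1, n + 1), dtype=int)
    d[:, 0], d[0, :] = np.arange(m + 1), np.arange(n + 1)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i, j] = min(d[i-1, j] + 1, d[i, j-1] + 1, d[i-1, j-1] + (a[i-1] != b[j-1]))
    return 1 - d[m, n] / max(m, n)

# Kin types as (generation, sex, lineal) feature vectors, with hypothetical weights
kin = {"mother": (1, 0, 1), "father": (1, 1, 1), "aunt": (1, 0, 0),
       "uncle": (1, 1, 0), "sister": (0, 0, 1), "brother": (0, 1, 1)}
weights = np.array([1.0, 0.5, 0.8])   # e.g., generation weighted most heavily

sem_sims, form_sims = [], []
names = list(kin)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        f1, f2 = np.array(kin[names[i]]), np.array(kin[names[j]])
        sem_sims.append(np.sum(weights * (f1 == f2)))
        form_sims.append(edit_similarity(names[i], names[j]))

print("semantic-form correlation:", spearmanr(sem_sims, form_sims)[0])
```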

Latent speech representations learned through self-supervised learning predict listeners' generalization of adaptation across talkers

Unfamiliar accents can pose a challenge to speech recognition. However, listeners often adapt quickly to novel accents, and even generalize this adaptation across talkers with the same accent. We investigate how such cross-talker generalization---critical to effective speech perception---is achieved. We take advantage of advances in automatic speech recognition to test whether comparatively simple similarity-based inferences can explain cross-talker generalization in human listeners. We use the latent perceptual space learned by the HuBERT model---shaped by the statistics of the speech signal and the objective to recognize speech---to meaningfully measure the similarity between talkers' pronunciation. We find that word-level similarity in this latent space predicts listeners' ability to successfully generalize across talkers. We discuss consequences for theories of adaptive speech perception. In particular, our results explain why cross-talker variability is not a prerequisite for cross-talker generalization (contrary to influential accounts).

Learning From ‘What Might Have Been': A Bayesian Model of Learning from Regret

Regret is a common emotion that might either catalyze or impair decision-making. What determines whether regret will be helpful or harmful in a given situation? We test the hypothesis that regret is more likely to hinder decision-making during the early stages of learning, when information is limited, but help during later stages of learning, when the learner has a better understanding of the environment. We introduce a Bayesian model of learning from regret, in which the "counterfactual weight" parameter – reflecting how strongly individuals update their beliefs about foregone outcomes – predicts both learning outcomes and the intensity of subjective regret. We find that probing regret early in the learning phase leads to worse performance than probing regret later or not at all. This work has important implications for both cognitive and affective science, shedding light on the appraisal mechanisms by which regret influences decision-making.
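
A simplified, delta-rule stand-in for the model's core idea is sketched below: a "counterfactual weight" parameter scales how strongly the value of the foregone option is updated in a two-armed bandit. The parameters and choice rule are illustrative assumptions, not the paper's Bayesian implementation.

```python
# Simplified learner in a two-armed bandit where `w_cf` (counterfactual weight)
# controls how strongly the foregone option's value is updated. A delta-rule
# stand-in for a Bayesian model of learning from regret; parameters are illustrative.
import numpy as np

def run_agent(w_cf, n_trials=200, lr=0.2, seed=0):
    rng = np.random.default_rng(seed)
    p_reward = [0.3, 0.7]                              # true reward probabilities
    values = np.array([0.5, 0.5])
    earned = 0
    for _ in range(n_trials):
        # epsilon-greedy choice
        choice = int(values[1] > values[0]) if rng.uniform() > 0.1 else rng.integers(2)
        outcomes = rng.uniform(size=2) < p_reward      # both outcomes realized
        earned += outcomes[choice]
        # factual update on the chosen option
        values[choice] += lr * (outcomes[choice] - values[choice])
        # counterfactual update on the foregone option, scaled by w_cf
        other = 1 - choice
        values[other] += w_cf * lr * (outcomes[other] - values[other])
    return earned / n_trials

for w in (0.0, 0.5, 1.0):
    print(f"w_cf = {w}: mean reward = {run_agent(w):.2f}")
```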

The functional view of intuitive etymological explanations

People routinely make up stories to explain why things are called what they are called. These "intuitive etymological explanations" (IEEs) show up in children and adults, and even become cultural narratives shared across generations. Yet, they're typically wrong. As a result, scholars have historically ignored them or treated them as mere curiosities that are irrelevant to our linguistic competence and even interfere with our theories of language evolution. Contrary to this view, we propose that IEEs may be a functional activity that people engage in to learn and maintain a massive, ever-changing lexicon. In Experiment 1, we find preliminary evidence that IEEs, whether self-generated or culturally-transmitted, can support word learning in comparison to control conditions in which participants engage with contextual word use (both self-generated and culturally-transmitted). In Experiment 2, we find that, despite being incorrect, culturally-transmitted IEEs can support word learning more than true etymologies. Across two preregistered experiments, our results suggest that intuitive etymological explanations, though typically incorrect, may facilitate language use by building structures of form and meaning out of our linguistic experience.

Children Spontaneously Design Curricula to Tackle Challenging Tasks

We study how children develop a causal curriculum to achieve a challenging goal that is not solvable at first. Adopting the Procgen environments that include various challenging game tasks, we found that 5- to 7-year-old children actively used their competence at the current level to determine their next step in the curriculum and, as a result, made improvements to their performance during this process. This suggests that children treat their level-specific competence as an intrinsic reward, and are motivated to master easier levels in order to do better at a more difficult one, even without explicit reward. However, our findings also suggest that children's self-designed curricula may not always be the most effective design. Rather, repeatedly practicing on the difficult target task may be sufficient. Notably, when constrained to stay on the target task instead of crafting their own curricula, more children actually succeeded and made greater progress in the game, suggesting that children perceive a curriculum as beneficial even when focusing on a single difficulty level might prove more effective.

Numbers and counting: Silent gesture and artificial language learning do not always reflect typological patterns

In classifier languages, a sequence consisting of a noun (N), a numeral (Num), and a numeral classifier (CL) could in principle occur in one of six possible word orders. However, the cross-linguistic distribution of these word orders is highly uneven. Specifically, classifier languages tend to use Num-CL orders and, furthermore, N-medial orders are completely unattested in the world's languages. We use an artificial language learning paradigm (Experiment 1) and a silent gesture paradigm (Experiment 2) to test the hypothesis that typological patterns arise from cognitive biases at the level of individual speakers. In contrast to studies that examined coarser-grained ordering effects, our results do not align with typological preferences. We consider the possibility that cognitive biases might not play a role in "finer-grained" ordering phenomena involving units such as classifiers, whose role in an utterance is more about grammatical well-formedness than a strong contribution to meaning.

Few-Shot Learning of Visual Compositional Concepts through Probabilistic Schema Induction

The ability to learn new visual concepts from limited examples is a hallmark of human cognition. While traditional category learning models represent each example as an unstructured feature vector, compositional concept learning is thought to depend on (1) structured representations of examples (e.g., directed graphs consisting of objects and their relations) and (2) the identification of shared relational structure across examples through analogical mapping. Here, we introduce Probabilistic Schema Induction (PSI), a prototype model that employs deep learning to perform analogical mapping over structured representations of only a handful of examples, forming a compositional concept called a schema. In doing so, PSI relies on a novel conception of similarity that weighs object-level similarity and relational similarity, as well as a mechanism for amplifying relations relevant to classification, analogous to selective attention parameters in traditional models. We show that PSI produces human-like learning performance and outperforms two controls: a prototype model that uses unstructured feature vectors extracted from a deep learning model, and a variant of PSI with weaker structured representations. Notably, we find that PSI's human-like performance is driven by an adaptive strategy that increases relational similarity over object-level similarity and upweights the contribution of relations that distinguish classes. These findings suggest that structured representations and analogical mapping are critical to modeling rapid human-like learning of compositional visual concepts, and demonstrate how deep learning can be leveraged to create psychological models.
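
The weighting idea at the heart of the similarity computation can be illustrated with a few lines of code; the numbers below are invented, and PSI itself derives these terms from structured (graph) representations rather than scalars.

```python
# Toy illustration of a similarity score that mixes object-level and relational
# similarity, with a weight that can be shifted toward relations. Illustrative only.

def combined_similarity(obj_sim: float, rel_sim: float, w_rel: float) -> float:
    """w_rel in [0, 1]: how much relational similarity dominates the comparison."""
    return (1 - w_rel) * obj_sim + w_rel * rel_sim

# Two example pairs: one shares objects but not relations, the other the reverse
pairs = {"same objects, different relations": (0.9, 0.2),
         "different objects, same relations": (0.2, 0.9)}

for w_rel in (0.2, 0.8):
    scores = {k: round(combined_similarity(o, r, w_rel), 2) for k, (o, r) in pairs.items()}
    print(f"w_rel = {w_rel}: {scores}")
# As w_rel increases, relationally matched pairs win the comparison, mirroring a
# strategy that upweights relations that distinguish classes.
```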

Step-by-step analogical reasoning in humans and neural networks

Both humans and large language models (LLMs) perform better on some reasoning tasks when they are encouraged to think step by step. However, it is unclear whether these performance gains are based on similar principles. In this work, we investigate two hypotheses: (1) that these benefits arise due to the presence of local statistical structure in the training data, where intermediate steps of reasoning may be common but any specific reasoning trajectory is rare, and (2) that sequential processing improves reasoning by mitigating interference. Using LLMs and transformers trained on a synthetic dataset, we show how analogical distance effects previously observed in humans and LLMs may be explained by the presence of local statistical structure. Testing both humans and LLMs on a novel word analogy task, we find that interference caused by semantic similarity can hurt performance and drives humans to engage in a sequential reasoning process. Our findings show that both locality structure and interference may be key principles underlying the benefits of step-by-step thinking.

Evaluating testimony from multiple witnesses: exploring qualitative intuitions

This study further explored a novel reasoning error. When faced with evidence from multiple sources, a substantial number of lay reasoners inaccurately integrate cues of reliability and report number, particularly when further reports are less reliable than the initial (highly reliable) reports. When evaluating the added value of supplementary corroborative reports, we find that, in most instances, participants are equally likely to provide correct or incorrect qualitative judgements. When using a sequential presentation and explicitly prompting participants to consider the impact of additional credible evidence, 36.7%-45% indicate that their beliefs should remain the same and 10% or less indicate that their beliefs should decrease. Only a third correctly believed that each instance of corroborating evidence should increase the likelihood of the target hypothesis. Qualitative judgements also significantly impacted the accuracy of belief estimates; deviations from normative (Bayesian) predictions at the group level are explained by sub-groups with incorrect qualitative intuitions.

Infants and Toddlers Expect Others Will Shun the Previously Excluded and Instead Approach the Previously Included

Navigating social affiliation adaptively is a critical task of human life. If parsing the social world into affiliative groups forms a core, generative mechanism of the evolved human mind, even infants may differentiate between minimal depictions of inclusion and exclusion. Furthermore, whether groups include or exclude others may cue their individual value as social partners. If so, infants may expect third-party observers to continue avoiding those others exclude and prefer those they include, further perpetuating discrimination of the already marginalized. Here, we show that 10-18 m.o. infants (n=96) look longer when a neutral observer approaches a novel agent whom an abstract group previously excluded, rather than included, in an animated violation-of-expectation paradigm. We found no effect of participant age. Movements were identical across scenarios, differing only in a delay between the excluded agent and the group. These findings indicate that even infants infer that observed exclusion versus inclusion will generalize to other interactions with new social partners.

Intuitions about prosocial backfiring: Four- to seven-year-olds' understanding of when helping might cause offense

Children are attuned to prosocial behavior from early in development and engage in helpful and cooperative behaviors. However, helping is not always helpful. Decades of research have shown that unsolicited offers of help can threaten the self-esteem of recipients, especially to the degree that recipients perceive themselves as competent. We know that young children view helping positively and appreciate the benefits of helpful actions. To what extent are they also aware of possible harms? Are young children aware that unsolicited offers of help may upset others, especially to the degree that the intended beneficiary is able to perform the task alone? Here, we show that both older (N = 30, mean age: 7.02 years; range: 6-7.97) and younger (N = 30, mean age: 4.95 years; range: 4.02-6) children understand that unsolicited offers of help are more likely to upset highly competent than less competent recipients.

When Less is More: Students' Use of Diagrams and their Perception of Diagram Use in an AI Tutor for Algebra Learning

It is critical to understand how students' monitoring activities are related to their actions during learning. In particular, studies have not fully explored how students' spontaneous use of visual representations relates to their perception of its usefulness and to their learning outcomes, especially in interactive learning environments. This study, using a math intelligent tutoring system, examines the relations between students' perceptions of the usefulness of diagrammatic scaffolding and their actual patterns of spontaneous diagram use for secondary-school algebra. Results show that students who evaluated diagrams as useful used diagrams more frequently but showed smaller learning gains, compared to those who evaluated diagrams as not useful and did not use diagrams frequently. We discuss implications of this finding by connecting it with prior work that focuses on drawing as diagram use. This study shows the importance of understanding how spontaneous use of diagrams might or might not help student learning.

Exploring Associations Among AI Usage, Anthropomorphism, and Perceived Human Uniqueness in Adolescents

The growing prevalence of artificial intelligence (AI) prompts reflections on the nature of human identity, particularly regarding perceptions of human uniqueness. Adolescents today interact with AI more frequently than any previous generation, yet little is known about the psychological implications of AI on their development. This study explores the associations among AI usage, anthropomorphism, and perceived human uniqueness in adolescents. Through a survey with 487 adolescents aged 13 to 19, we found 1) older adolescents perceived less agency and experience in humans compared to younger ones, whereas no age-related differences were observed in AI usage, anthropomorphic tendency, and perceptions of AI; 2) higher AI usage and anthropomorphic tendency were associated with reduced perceptions of human uniqueness in both agency and experience; and 3) anthropomorphism could serve as a psychological mechanism linking AI usage and perceived human uniqueness. This study contributes to broader philosophical and societal discussions about AI and human uniqueness.

Toddlers Use Vocal Cues to Infer Male Dominance Over Females in Right-of-Way Conflict

Men hold more power than women, a phenomenon that is globally evident and widely documented. Preverbal infants mentally represent social dominance, and by the age of 18 months can distinguish male and female voices and associate them with faces of the corresponding gender. Here, we investigated whether 18- to 24-month-old toddlers (N=48) expected male-voiced agents to prevail over female-voiced ones in a right-of-way conflict. Using a violation-of-expectation paradigm, we found that toddlers looked longest when female-voiced characters prevailed in dominance conflicts, suggesting they expected that male-voiced agents would prevail instead. Acoustic features of male voices that perceptually suggest greater formidability/physical size (i.e., lower fundamental frequency) may account for this effect. Alternatively, vocal indicators of speaker sex might trigger very early conceptualizations about gender and dominance.

Continuity and discontinuity in children's number acquisition

Recent research has suggested that some children, after they learn the meaning of "four," proceed to learn the next few numbers one at a time (Krajcsi & Fintor, 2023). This claim is in direct contrast with previous theories that argued for an inductive leap after learning "four" (Carey, 2009). We assessed children's number knowledge using an adapted Give-N method that captured higher set sizes, and tests of executive function and working memory as an exploratory window onto the changes across stages. First, results support the claim that children exhibit partial-number knowledge beyond "four". Second, executive function was associated with children's jump from knowing numbers up to "four" (subset-knower stages) to knowing numbers above four, while working memory was associated with the change from partial to putatively full knowledge of number word meanings. These findings offer new insight into the conceptual change process and suggest a potential two-fold sequence in children's induction of cardinality.

Analogical Relatedness Between Exemplars of Schema-Governed Categories

The traditional perspective on analogical thinking has shown that relational similarity is key in determining analogical relatedness, outweighing entity similarity. However, evidence supporting this perspective comes from studies where the combination of the elements composing the compared facts does not activate schema-governed categories whose mismatch could compete with similarity between relations during the evaluation of analogical relatedness. In Experiment 1, we assessed the relative impact of common category membership and relational similarity on judgments of analogical relatedness. Pairs of events where only a common category was present received higher scores than pairs where neither a common category nor similar relations were present, and also than pairs maintaining only similar relations. In Experiment 2, we examined the extent to which judgments of analogical relatedness were affected by whether the compared situations fared similarly along relevant dimensions of the schema-governed categories to which they belonged. Ratings were higher for pairs where the analogs matched compared to pairs where they did not match. We concluded that in comparisons in which at least one of the events activates a schema-governed category, people assess analogical relatedness through criteria that depart from those postulated by traditional studies.

Hierarchical Abstraction Enables Human-Like 3D Object Recognition in Deep Learning Models

Both humans and deep learning models can recognize objects from 3D shapes depicted with sparse visual information, such as a set of points randomly sampled from the surfaces of 3D objects (termed a point cloud). Although deep learning models achieve human-like performance in recognizing objects from 3D shapes, it remains unclear whether these models develop 3D shape representations similar to those used by human vision for object recognition. We hypothesize that training with 3D shapes enables models to form representations of local geometric structures in 3D shapes. However, their representations of global 3D object shapes may be limited. We conducted two human experiments systematically manipulating point density and object orientation (Experiment 1), and local geometric structure (Experiment 2). Humans consistently performed well across all experimental conditions. We compared two types of deep learning models, one based on a convolutional neural network (DGCNN) and the other on visual transformers (point transformer), with human performance. We found that the point transformer model provided a better account of human performance than the convolution-based model. The advantage mainly results from the mechanism in the point transformer model that supports hierarchical abstraction of 3D shapes.

Neglect zero: evidence from priming across constructions

Recent studies use semantic structural priming to show that various cases of linguistic strengthening happen through a common mechanism: generation of implicatures through alternative-based (scalar) reasoning. In this paper, we used priming to investigate another group of cases, where strengthening is postulated to follow from the tendency to systematically neglect structures that verify a sentence by virtue of an empty configuration (neglect-zero): empty-set quantifiers ('at most/fewer than') and disjunction under a universal quantifier. We report data indicating semantic priming between these two structures, but not between them and scalar 'some'. We propose that (1) there is a common mechanism in use for strengthening constructions postulated to follow from the neglect-zero tendency, and that (2) this mechanism is different from the one involved in alternative-based reasoning.

Chain of Thought Still Thinks Fast: APriCoT Helps with Thinking Slow

Language models are known to absorb biases from their training data, leading to predictions driven by statistical regularities rather than semantic relevance. We investigate the impact of these biases on answer choice preferences in the Massive Multi-Task Language Understanding (MMLU) task. Our findings show that these biases are predictive of model preference and mirror human test-taking strategies even when chain of thought (CoT) reasoning is used. To address this issue, we introduce Counterfactual Prompting with Agnostically Primed CoT (APriCoT). We demonstrate that while Counterfactual Prompting with CoT alone is insufficient to mitigate bias, APriCoT effectively reduces the influence of base-rate probabilities while improving overall accuracy. Our results suggest that mitigating bias requires a slow thinking process which CoT alone may not provide as it tends to reinforce fast thinking model bias under some prompting methodologies. APriCoT is a step toward developing more robust and fair language models that can think slow.
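
The sketch below conveys the counterfactual-prompting intuition in schematic form: the same question is posed with its options permuted across labels and the answers are aggregated, so that a preference for a particular option label cancels out. The llm stub, prompts, and aggregation rule are illustrative and do not reproduce APriCoT's exact procedure.

```python
# Schematic of the counterfactual-prompting idea for MMLU-style items: permute the
# answer options across labels, query the model for each ordering, and aggregate,
# so that pure label (base-rate) bias washes out. Stub and prompts are illustrative.
from collections import Counter
from itertools import permutations

def llm(prompt: str) -> str:
    return "A"   # stub: a maximally label-biased model that always answers "A"

def counterfactual_vote(question, options):
    votes = Counter()
    for perm in permutations(options):
        labels = dict(zip("ABCD", perm))
        prompt = (question + "\n"
                  + "\n".join(f"{k}. {v}" for k, v in labels.items())
                  + "\nThink step by step, then answer with one letter.")
        answer_label = llm(prompt).strip()[0]
        votes[labels.get(answer_label, "?")] += 1      # map label back to option text
    return votes.most_common()

q = "Which gas makes up most of Earth's atmosphere?"
opts = ["Nitrogen", "Oxygen", "Carbon dioxide", "Argon"]
print(counterfactual_vote(q, opts))                     # label bias alone yields a tie
```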

Reading comprehension involves adding simple and composite discourse referents to a mental model

During reading, the mind continuously builds and updates a discourse model—a cumulative mental world representing the gleaned information. A key operation in this process is establishing novel discourse referents—entities in the model that can be picked out. For instance, ‘She bought wool, sponges and steel' establishes three simple referents (italicized). By contrast, ‘She bought sponges of steel wool…' establishes two simple referents forming a composite, itself referenceable (e.g., ‘…and glued them together.'). Here, in an online reading study, we target the cognitive basis of establishing simple and composite referents in the discourse model. Participants (n=43) read 72 five-sentence English stories sentence by sentence. The fourth, critical sentence featured: (i) three simple referents (‘wool, sponges and steel'; simple3); (ii) two simple referents (‘steel wool and sponges'; simple2), or (iii) two simple referents forming a composite referent (‘sponges of steel wool'; composite). Crucially, across conditions, critical sentences had identical lengths and lexical items, while remaining sentences were fully identical. A true/false comprehension task followed each story. We hypothesized that additional referents would increase reading times (RTs) beyond syntactic, semantic, and lexical factors. Multiple regression analyses showed significant effects of adding simple (RTsimple3>RTsimple2) and composite (RTcomposite>RTsimple2) referents. The effects appeared only on critical (but not on subsequent) sentences, possibly reflecting the cognitive operations of establishing, rather than maintaining, novel referents. Our findings pave the way for future work investigating hypotheses of hierarchical structure-building in mental discourse models.
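
The regression logic can be sketched as follows with toy data: reading times are regressed on condition dummies (simple2 as the baseline) plus a nuisance covariate. The covariate and the use of plain OLS are illustrative simplifications of the study's analysis.

```python
# Sketch of the regression logic: regress (log) sentence reading times on condition,
# with simple2 as baseline, plus a stand-in lexical covariate. Toy data; the actual
# analysis would use the study's covariates and likely mixed-effects models.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 300
df = pd.DataFrame({
    "condition": rng.choice(["simple2", "simple3", "composite"], size=n),
    "word_freq": rng.normal(size=n),                  # stand-in lexical covariate
})
effect = df["condition"].map({"simple2": 0.0, "simple3": 0.15, "composite": 0.12})
df["log_rt"] = 6.0 + effect - 0.05 * df["word_freq"] + rng.normal(scale=0.3, size=n)

model = smf.ols("log_rt ~ C(condition, Treatment('simple2')) + word_freq", data=df).fit()
print(model.summary().tables[1])
```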

Sketching with generative AI: verbal but not visual inspiration mitigates cognitive fixations

Symbolic visual sketching is a hallmark of human creativity, enabling the externalization of abstract concepts through figurative representations. Yet, creative expression can be constrained by pervasive conceptual associations—culturally learned mappings between abstract ideas and standard visual forms (e.g., a dove symbolizing peace). Generative AI has the potential to loosen such fixations, given its access to a broad range of content and ideas, but it remains unclear whether and how inspiration from verbal or visual modalities better mitigates fixations. Here, we hypothesized that the verbal modality induces greater conceptual divergence than the visual modality by bypassing perceptual constraints, whereas the visual modality may reinforce perceptually familiar mappings of visual representations. Participants generated sketches of abstract concepts (e.g., "time") before and after receiving GPT-4-generated verbal or visual inspiration. Drawings were analyzed using deep neural networks—by comparing perceptual features (VGG16-based) and semantic-perceptual content (CLIP-based)—as well as both human and GPT-4 scoring for creativity. We found that verbal inspiration significantly increased semantic distance and uniqueness, whereas visual inspiration led to minimal semantic divergence from the initial sketches. Importantly, low-level perceptual features remained unchanged across conditions, indicating that verbal prompts primarily influenced high-level conceptual framing of the sketches rather than their visual features. These findings demonstrate the effect of modality on mitigating cognitive fixations, with the verbal modality enhancing more unconventional visual sketching.

The causal role of counterfactuals in responsibility ascriptions to ignorant agents

It is now well-established that counterfactual reasoning takes place when people make moral judgments. Less is known about which counterfactuals lead to stronger moral judgments, especially when judging agents who unknowingly produce negative consequences. We explored the relationship between counterfactual salience and responsibility ascription in two experiments. In Experiment 1, we asked people to spontaneously produce counterfactual alternatives to a vignette they read. We manipulated whether agents who produced harm knew the relevant information beforehand and what the reasons for the possible ignorance were. The counterfactual type that people first came up with (e.g., related to external factors or the agent's actions) mediated the relationship between the condition and responsibility ratings. Experiment 2 investigated the causal connection between certain counterfactual types and responsibility ascription. We show that guiding people to consider alternatives to the perpetrator's actions leads to a higher tendency to ascribe responsibility than considering alternatives to the victim's actions.

Language assessment for multilingual children in Germany - developmental factors, environmental influences, and individual differences

Assessing language abilities in multilingual children is critical for identifying necessary support. The SPEAK project (German acronym for "Language assessment of multilingual children") validates a comprehensive test battery for 4- to 8-year-olds in Germany. This study reports on data from 207 multilingual children with 50 first languages other than German. The battery includes German versions of internationally established tools: Nonword Repetition Task (phonological complexity), Cross-Linguistic Lexical Task (vocabulary comprehension and production), and Sentence Repetition Task (grammar), alongside a parental questionnaire. Results show that age strongly predicts task performance, with earlier exposure to German improving phonology, vocabulary and grammar. Parental education also consistently predicts outcomes. Suspected Developmental Language Disorder negatively affects receptive vocabulary and grammar. Findings highlight the complex interplay of factors in multilingual language development.

TableCritic: Refine Table Reasoning via Self-Criticism and Tool Library

The rapid development of large language models (LLMs) has spurred their application to tabular data. Prior research has investigated prompt engineering and external tool integration (e.g., code interpreters) to enhance LLMs' table comprehension, yet existing approaches struggle to generalize across diverse table-based reasoning tasks due to task-specific fixed workflows. Additionally, LLMs' inherent limitations, including hallucination and unreliable reasoning, necessitate iterative refinement mechanisms for robust performance. Inspired by cognitive problem-solving strategies, in which iterative reflection and tool augmentation improve decision-making, we propose TableCritic, a functionally grounded computational framework for adaptive table reasoning. Following the functional-structural model taxonomy, our framework operates at the algorithmic level, implementing a dynamic feedback loop of (1) LLM-driven self-critique of intermediate outputs and (2) tool-guided error correction. This task-agnostic architecture achieves performance transparency through modular tool composition while avoiding neurobiological implementation constraints. Experiments show that TableCritic significantly improves accuracy and outperforms baselines across various table-based tasks.

Games Agents Play: Towards Transactional Analysis in LLM-based Multi-Agent Systems

Multi-Agent Systems (MAS) are increasingly used to simulate social interactions, but most frameworks miss the underlying cognitive complexity of human behavior. In this paper, we introduce Trans-ACT (Transactional Analysis Cognitive Toolkit), an approach that embeds Transactional Analysis (TA) principles into MAS to generate agents with realistic psychological dynamics. Trans-ACT integrates the Parent, Adult, and Child ego states into an agent's cognitive architecture. Each ego state retrieves context-specific memories and uses them to shape responses to new situations. The final answer is chosen according to the underlying life script of the agent. Our experimental simulation, which reproduces the Stupid game scenario, demonstrates that agents grounded in cognitive and TA principles produce deeper and more context-aware interactions. Looking ahead, our research opens the way to a variety of applications, including conflict resolution, educational support, and advanced social psychology studies.

Using transfer learning to identify a neural system's algorithm

Algorithms generate input-output mappings through operations on representations. In cognitive science, we use algorithms to explain cognition. For example, we use tree-search algorithms to explain planning, reinforcement learning algorithms to explain exploration, and Bayesian algorithms to explain categorization. There are often many cognitive science algorithms consistent with a subject's performance on a task. How are we supposed to choose? It is natural to think of algorithms as causal models of brain processes. Thus, a natural method for choosing an algorithm is to look for parts in the brain corresponding to the steps of the algorithm. However, we haven't found many cognitive science algorithms using this method. This has led some to view cognitive science algorithms as merely normative, indicating the ideal input-output mapping without attributing any particular operation to the brain. It has led others to view cognitive science algorithms as merely useful fictions; useful insofar as they allow us to predict behavior, but fictional insofar as they inaccurately describe the causes of that behavior. They recommend explaining cognitive processes using other frameworks, such as dynamical systems theory. As an alternative, we suggest identifying a neural system's algorithm by assessing how quickly it learns alternative input-output mappings, that is, its transfer learning profile. The basic idea is that, depending on which algorithm is being used, different input-output mappings will be easier to learn, allowing us to recover its original algorithm from its transfer learning profile. We use artificial neural networks to demonstrate that this proposal productively applies to multiple networks and tasks. We conclude that transfer learning is a promising approach for integrating algorithms with neural networks and thus for integrating cognitive science with systems neuroscience and machine learning.
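
A toy illustration of the proposal, using a small logistic model rather than the paper's networks: pretrain on one input-output mapping, then record how many gradient steps are needed to reach criterion on alternative mappings. The resulting vector of learning speeds is the transfer profile; the mappings, model, and criterion below are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))

def make_mapping(w):
    # A linearly separable input-output mapping defined by weight vector w.
    return (X @ w > 0).astype(float)

def steps_to_criterion(w_init, y, lr=0.5, criterion=0.95, max_steps=2000):
    # Train a logistic model from w_init and count steps to reach criterion.
    w = w_init.copy()
    for step in range(1, max_steps + 1):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))       # logistic predictions
        if np.mean((p > 0.5) == y) >= criterion:
            return step, w
        w -= lr * X.T @ (p - y) / len(y)          # gradient step
    return max_steps, w

w_original = rng.normal(size=8)
_, w_pretrained = steps_to_criterion(np.zeros(8), make_mapping(w_original))

# Transfer profile: learning speed on alternative mappings, one similar to the
# original and one unrelated. The pattern of speeds is diagnostic of what the
# pretrained solution has come to represent.
alternatives = {"similar": w_original + 0.3 * rng.normal(size=8),
                "unrelated": rng.normal(size=8)}
profile = {name: steps_to_criterion(w_pretrained, make_mapping(w_alt))[0]
           for name, w_alt in alternatives.items()}
print(profile)
```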

Theory of Mind and Social Anxiety in Emotional Attachment to AI Chatbots in Individuals with Autistic Traits

As conversational AI systems like ChatGPT become increasingly adept at socially engaging interactions, users are more likely to form emotional attachments to these technologies. This study explores the relationship between autistic traits and emotional attachment to ChatGPT, emphasizing the mediating roles of Theory of Mind and social anxiety. A sample of 286 participants completed the study. The structural equation modeling analysis revealed that Theory of Mind partially mediated the relationship between autistic traits and emotional attachment to the chatbot, while social anxiety did not show a significant mediating effect. These findings underscore the critical role of individual differences in shaping attachment to AI, suggesting opportunities for personalized designs and raising questions about the psychological implications of such bonds.

The effect of physical and psychological distances in everyday memory retrieval across older and young adults

This study examined how episodic memory performance in young and older adults is influenced by both the physical and psychological representations of locations in everyday life. Over five weeks, participants' GPS location data were collected via a smartphone app and later used in a memory recall test and post-survey. Results showed that both physical and psychological sparsity (i.e., the degree to which a location was spatially or psychologically distinct from others) positively affected memory performance in both age groups. However, only young adults exhibited an interaction effect between physical and psychological sparsity on response accuracy. This difference may stem from older adults' narrower GPS sparsity distribution and fewer location points, suggesting that their narrower range of visited locations was insufficient to reveal this interaction. Our study offers a novel contribution by quantitatively utilizing a psychological measure of memory representation through personalized data and analyzing its relationship with a physical indicator.

Towards Cognitive Synergy in LLM-Based Multi-Agent Systems: Integrating Theory of Mind and Critical Evaluation

Recently, the field of Multi-Agent Systems (MAS) has gained popularity as researchers try to develop artificial intelligence capable of efficient collective reasoning. Agents based on Large Language Models (LLMs) perform well in isolated tasks, yet struggle with the higher-order cognition required for adaptive collaboration. Human teams achieve synergy not only through knowledge sharing, but also through recursive reasoning, structured critique, and the ability to infer others' mental states. Current artificial systems lack these essential mechanisms, limiting their ability to engage in sophisticated collective reasoning. This work explores cognitive processes that enable effective collaboration, focusing on adaptive theory of mind (ToM) and systematic critical evaluation. We investigate three key questions. First, how does the ability to model others' perspectives enhance coordination and reduce redundant reasoning? Second, to what extent does structured critique improve reasoning quality by identifying logical gaps and mitigating biases? Third, can the interplay of these mechanisms lead to emergent cognitive synergy, where the collective intelligence of the system exceeds the sum of its parts? Through an empirical case study on complex decision making, we show that the integration of these cognitive mechanisms leads to more coherent, adaptive, and rigorous agent interactions. This article contributes to the field of cognitive science and AI research by presenting a structured framework that emulates human-like collaborative reasoning in MAS. It highlights the significance of dynamic ToM and critical evaluation in advancing multi-agent systems' ability to tackle complex, real-world challenges.

Cooperation, Deception and Theory of Mind in a Cyclic Game with Inter-Player Signalling

In the Mod game, actions are laid out on a circle. Each round, players choose an action simultaneously and gain points for each player they are one step ahead of in the clockwise direction. Cooperation is rarely used. This article facilitates cooperation and deceit by adding a signalling phase in which one player signals which action they will play. Our novel Mod-Signal game lets players cooperate by adhering to their signal, but they can also lie by playing a different action. In our experiment, humans play the two-player 24-action Mod-Signal game with an agent and with each other. While cooperative play is faster and yields more points, players predominantly lie and play non-cooperatively. Furthermore, our participants usually use no more than second-order theory of mind. While the Mod game is mainly played competitively, our Mod-Signal game can also be used to investigate cooperation and deception in the context of theory of mind.

Does cooking involve 'math'?: The relationship between math conception and math anxiety in Indian elementary and middle-school students

Math is all around us, but the propensity to notice the role it plays in everyday life may differ from person to person. Here, we test whether children with broader conceptions of math experience lower levels of math anxiety. In Study 1, we gathered data from 98 Indian middle schoolers in Vadodara, Gujarat. Children who categorized more activities in a provided list as "math" demonstrated more positive attitudes towards math on a math anxiety scale. We also found that the breadth of children's math category predicted how skilled they believed themselves to be at the activities they included in their math conception. In Study 2, we explore when these effects emerge. We tested 94 children aged 7-10. We found that while children in this range exhibit significant variability in math conception, their breadth of math conception does not predict their math anxiety. We discuss implications of our findings for interventions to mitigate math anxiety in children.

Sparse coding generates efficient representations for autoassociative memories

We propose a two-layer computational neuroscience model for storing and retrieving sensory patterns in memory. The first layer, sparse coding, generates condensed yet explicit representations adapted to the statistics of natural scenes. The second layer, a complex-valued associative memory model, can store patterns generated by the first layer and recover partial or corrupted versions of them. We demonstrate the model's collective effectiveness at denoising and recalling sensory patterns from a dataset of natural images, with both layers providing complementary contributions to improving the peak signal-to-noise ratio. In addition, the invariance of the model to pairwise phase differences allows for partial generalization to similar scenes. Collectively, these principles are consistent with prior theory and experiments in neuroscience, and lead to potential predictions about inference mechanisms in biological neural networks.
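
A minimal sketch of the first layer only (sparse coding via ISTA-style inference with a fixed random dictionary) is shown below; the learned, natural-scene-adapted dictionary and the complex-valued associative memory layer from the paper are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(1)
D = rng.normal(size=(64, 256))                  # dictionary: 64-dim patches, 256 atoms
D /= np.linalg.norm(D, axis=0, keepdims=True)   # unit-norm atoms

def sparse_code(x, lam=0.1, n_iter=200):
    """Infer a sparse coefficient vector a such that D @ a approximates x (ISTA)."""
    a = np.zeros(D.shape[1])
    L = np.linalg.norm(D, 2) ** 2               # Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = a - grad / L
        a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)   # soft threshold
    return a

x = rng.normal(size=64)                         # stand-in for an image patch
a = sparse_code(x)
print("active coefficients:", int(np.sum(a != 0)),
      "reconstruction error:", float(np.linalg.norm(D @ a - x)))
```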

Modeling the effect of cortical magnification on feature detection and individuation

Some researchers have argued that visual representations in the periphery differ qualitatively from those in the fovea (e.g., Balas et al., 2009; Freeman and Simoncelli, 2011; Rosenholtz et al., 2012). Consistent with this proposal, He et al. (1997) showed that crowding in the periphery disrupts the ability to individuate features but doesn't disrupt feature detection. We hypothesized that He et al.'s demonstration could be accounted for simply in terms of cortical magnification alone. We tested this hypothesis by presenting He et al.'s stimuli to a neurally realistic model of V1 (Heaton & Hummel, 2022) that incorporates cortical magnification but posits no other differences between foveal and peripheral early visual representations. The model's performance captured He et al.'s findings, suggesting that cortical magnification alone is sufficient to account for the differences between foveal and peripheral visual perception.

The origin of the possible: 12-month-olds' understanding of certain, likely, and unlikely events

To predict and prepare for near-future outcomes, infants must be sensitive to variability in the probability of those outcomes. Adults achieve this with modal concepts that quantify over multiple possibilities, but whether and how infants can do the same is unclear. In two preregistered habituation experiments, we asked whether infants can distinguish outcomes based on physical probability level (100% vs. 66% in Experiment 1; 66% vs. 33% in Experiment 2). 12-month-olds were habituated to events with 66% probability, and their proportion of looking at 100% (Exp. 1, N=35) and 33% (Exp. 2, N=24) events was measured before (i.e., baseline) and after habituation (i.e., test). We found that infants' proportion of looking at events with 33% probability (in Exp. 2), but not at events with 100% probability (in Exp. 1), increased from baseline to test. Thus, 12-month-olds distinguish likely events from unlikely ones but not from necessary events.

Judging the Judges: Displacing and Inverting the Turing test to Investigate the Interrogator

The Turing test typically evaluates machine intelligence by asking whether a human judge can distinguish between human and AI conversational behavior. But the test also serves as an evaluation of the judge, upon whose discriminative capabilities the merit of the test depends. We investigate this dependency by replicating two variations of the Turing test: (1) a displaced test, where human participants judge transcripts of previously conducted interrogations, and (2) an inverted test, where AI systems make similar judgments. Comparing these with traditional interactive tests, we find that displaced judges perform similarly to interactive judges, and LLM judges perform significantly worse than humans. This challenges assumptions about the importance of real-time interaction, and suggests that accuracy is not significantly impacted by displacement, but may be impacted by differences in a judge's model of human vs. AI behavior. Our results have implications for societal risks of AI, as systems that can consistently deceive both interactive and passive observers could enable large-scale online impersonation and manipulation.

Folk epistemological attitudes toward using virtual reality (VR) to learn about others

Virtual reality (VR) simulations purport to provide a uniquely immersive means of understanding experiences different from our own, potentially serving as "empathy machines". The utility of such simulations, however, is controversial. We examined people's preferences for learning about others' experiences of visual impairment and sexual harassment through VR versus firsthand testimony. We find that people have a general preference for VR over testimony, expecting VR to provide a strong understanding of others' experiences. The preference for VR over testimony was more pronounced for learning about visual impairment than sexual harassment, and prior experience with sexual harassment reduced the perceived value of VR relative to testimony. These findings raise concerns about epistemic justice, as reliance on VR may undermine deference to firsthand accounts.

What's in the Box? Reasoning about Unseen Objects from Multimodal Cues

People regularly make inferences about objects in the world that they cannot see by flexibly integrating information from multiple sources: auditory and visual cues, language, and our prior beliefs and knowledge about the scene. How are we able to so flexibly integrate many sources of information to make sense of the world around us, even when we cannot observe it directly? In this work, we propose a neurosymbolic model that uses neural networks to parse open-ended multimodal inputs and then applies a Bayesian model to integrate different sources of information to evaluate different hypotheses. We evaluate our model with a novel object guessing game called "What's in the Box?", where humans and models watch a video clip of an experimenter shaking boxes and then try to guess the objects inside the boxes. Through a human experiment, we show that our model correlates strongly with human judgments, whereas unimodal ablated models and large multimodal neural model baselines showed poor correlation.

Adaptive Social Learning using Theory of Mind

Social learning is a powerful mechanism through which agents learn about the world from others. However, humans don't always choose to observe others, since social learning can carry time and cognitive resource costs. How do people balance social and non-social learning? In this paper, we propose a rational mentalizing model of the decision to engage in social learning. This model estimates the utility of social learning by reasoning about the other agent's goal and the informativity of their future actions. It then weighs the utility of social learning against the utility of self-exploration (non-social learning). Using a multi-player treasure hunt game, we show that our model can quantitatively capture human trade-offs between social and non-social learning. Furthermore, our results indicate that these two components allow agents to flexibly apply social learning to achieve their goals more efficiently.
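
A schematic of the utility comparison such a model performs might look like the following; the probabilities, costs, and reward values are illustrative placeholders rather than the paper's fitted model.

```python
# Schematic trade-off between observing another agent and exploring alone.
# All quantities below are illustrative placeholders.

def value_of_observing(p_goal_relevant, p_actions_informative, observation_cost):
    # Observation pays off only if the other agent pursues a relevant goal and
    # their upcoming actions are actually informative about it.
    return p_goal_relevant * p_actions_informative - observation_cost

def value_of_exploring(p_find_alone, exploration_cost):
    return p_find_alone - exploration_cost

def choose(p_goal_relevant, p_actions_informative, p_find_alone,
           observation_cost=0.1, exploration_cost=0.2):
    social = value_of_observing(p_goal_relevant, p_actions_informative, observation_cost)
    solo = value_of_exploring(p_find_alone, exploration_cost)
    return "observe" if social > solo else "explore"

print(choose(p_goal_relevant=0.8, p_actions_informative=0.7, p_find_alone=0.4))
```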

Reasoning about similar causal structures among mechanical systems

Across two experiments (N = 256), we test children's ability to recognize similar causal structures among mechanical systems. In Experiment 1, 4- to 7-year-olds were shown unique sets of three machine types (a causal chain, a common effect, and a common cause) and asked to judge which machines were most similar. We find that 6- to 7-year-olds, but not 4- to 5-year-olds, spontaneously match machines that share the same causal structure. However, all children relied primarily on timing cues when making similarity judgments. In Experiment 2, we control for timing cues, instead asking children to discriminate causal structure by observing an intervention on each machine. We find that, in the absence of perceptual cues, only 8- and 9-year-olds successfully matched machines based on structural similarity. We discuss potential explanations for these findings and consider ways to support recognition of common causal structure in the learning environment.

On the Separability of Human Navigational Behaviors in Virtual Reality

Human navigation is shaped by cognitive strategies, spatial awareness, and learned heuristics, yet existing models struggle to capture individual differences in wayfinding. To investigate the cognitive basis of navigational behavior, we conducted a virtual reality experiment where participants maneuvered around a human obstacle in a controlled, static environment. Using trajectory-based features, we classified participants with PartNet, a neural network that outperformed ElasticNet and Random Forest classifiers. While PartNet captured subtle yet consistent behavioral patterns, its interpretability was limited. To address this, we developed an analysis pipeline revealing key behavioral factors, showing that navigational styles differ primarily in midline adherence and speed. Clustering and embedding analyses further demonstrated participant separability, highlighting both individual distinctions and shared tendencies. By identifying structured variability in navigation, our work advances cognitive models of spatial decision-making, informing theories of wayfinding, predictive modeling of human movement, and applications in assistive navigation and urban design.

Validating Generative Agent-Based Models of Social Norm Enforcement: From Replication to Novel Predictions

As large language models (LLMs) advance, there is growing interest in using them to simulate human social behavior through generative agent-based modeling (GABM). However, validating these models remains a key challenge. We present a systematic two-stage validation approach using social dilemma paradigms from psychological literature, first identifying the cognitive components necessary for LLM agents to reproduce known human behaviors in mixed-motive settings from two landmark papers, then using the validated architecture to simulate novel conditions. Our model comparison of different cognitive architectures shows that both persona-based individual differences and theory of mind capabilities are essential for replicating third-party punishment (TPP) as a costly signal of trustworthiness. For the second study on public goods games, this architecture is able to replicate an increase in cooperation from the spread of reputational information through gossip. However, an additional strategic component is necessary to replicate the additional boost in cooperation rates in the condition that allows both ostracism and gossip. We then test novel predictions for each paper with our validated generative agents. We find that TPP rates significantly drop in settings where punishment is anonymous, yet a substantial amount of TPP persists, suggesting that both reputational and intrinsic moral motivations play a role in this behavior. For the second paper, we introduce a novel intervention and see that open discussion periods before rounds of the public goods game further increase contributions, allowing groups to develop social norms for cooperation. This work provides a framework for validating generative agent models while demonstrating their potential to generate novel and testable insights into human social behavior.

People use mixed strategies to make efficient but structured inferences about agents in roles

Roles are a pervasive part of our social landscape, but little is known about the mental models people use to reason about agents who occupy roles. In this paper, we test three computational models for role-based reasoning against participant performance in a social inference task. We find evidence that people exhibit mixed approaches that broadly track the computational efficiency of simpler models while still retaining the structure of Bayesian inference models. These findings shed light on the mechanics of this important social cognitive system and pave the way for future work in this area.

Does Connecting the Processes and Products of Science Facilitate Learning? A Schema-Based Approach

Understanding the processes of science and understanding its products are both core components of science literacy, but do these types of knowledge interact during learning? We propose that Nature of Science (NoS) understanding can act as a schema to facilitate comprehension of science content. Across two experiments, we tested whether NoS lessons about theory change improve students' comprehension of psychology lessons centered around theory development. In Experiment 1, undergraduates who watched a NoS lesson showed improved NoS understanding, but this understanding did not lead to better comprehension of a matched psychology lesson compared to a control. In Experiment 2, three NoS lessons were experimentally integrated into a college psychology course, preceding content lessons involving theory change. While this intervention did not improve learning, we found several relationships between science beliefs and academic performance. This work contributes to our limited understanding of how these distinct components of science knowledge interact during learning.

Do Large Language Models Recognize and Utilize Non-Mandated Pragmatic Enrichments?

Large language models (LLMs), despite being trained primarily on a word prediction task, show remarkable language production and comprehension abilities. Whereas larger and more recent models have achieved partial success on various pragmatic tasks, most have only been evaluated on their ability to draw "mandated" pragmatic inferences (e.g., implicature, presupposition) in which the felicity of a sentence is at stake. In this study, we focus on conversational elicitures (Cohen & Kehler, 2021), a type of non-mandated pragmatic inference that, in the class of cases considered here, involves the potential inference of a causal relation between a proposition denoted by a matrix clause and one derived from a relative clause associated with a direct object (e.g., in sentences like "Melissa detests the children who are arrogant and rude", the inference that the detesting is a result of the arrogance/rudeness). We investigate whether LLMs are able to draw such inferences and use them in downstream syntactic processing. Our results suggest that larger and more recent models do in fact exhibit these capabilities, at least to some degree.

Socially Situated Navigation: Social Rank and Sex Influence Spatial Navigation Strategies in Japanese Macaques

Primates' social interactions are grounded in temporal and spatial relationships, with physical proximity commonly used to assess affiliation, dominance, and tolerance. Yet proximity is often treated as a static, categorical measure rather than a dynamic, continuous process. Here, we combine computer vision and environmental markers to precisely quantify short-range social distances in two groups of Japanese macaques housed in large outdoor enclosures. Our social tolerance test results show that, when entering a food-baited circle, macaques positioned themselves at greater-than-chance distances from conspecifics, particularly to dominants. Furthermore, lower-ranking individuals tended to follow more indirect paths before approaching the food resource, suggesting they weigh social risks alongside physical positioning. By treating social proximity as a dynamic process, our study provides new insights into how primates navigate social and physical environments. This illustrates the potential of our method for more nuanced measures of group organization, tolerance, and decision-making.

Neural basis of individual differences in tonal effects on perceived duration

Studies in speech perception have consistently found that the perceived duration of a syllable is significantly influenced by the dynamics of the contour of its fundamental frequency (f0). Syllables with a dynamic f0 contour are perceived as longer than those with a flat f0, even though their acoustic duration is identical; high f0 syllables are perceived as longer than low f0 syllables of the same acoustic duration. Yet, while some listeners exhibit the expected perceptual normalization patterns, others show no f0-induced perceptual adjustments. This study investigates the neural foundation for this individual variability by examining listeners' scalp-recorded frequency-following response (FFR), a measure of phase-locked auditory encoding in humans that has been used to study subcortical processing in the auditory system. Our findings reveal that the FFR predicts listeners' duration estimation performance in different f0 contexts. Additionally, the FFR predicts the magnitude of the f0 influence on perceived duration, which highlights the complex interaction between sensory processing and speech perception.

Estimating Intuitive Physical Parameters using Markov Chain Monte Carlo with People

A central question in cognitive science is the degree to which human and animal brains have adapted to and internalized the physical laws that govern the motion of objects. In this project, we propose a new method to estimate aspects of our intuitive sense of physical laws. Rather than assuming that humans internalize the form of Newtonian physics as found on Earth, we instead designed a procedure which allowed us to estimate which forms of physical laws feel most natural and intuitive to human participants. Our approach combines Markov chain Monte Carlo with People (MCMCP) and a custom parameterized physics engine. Each proposal of the MCMCP chain instantiated a world with new physical parameters and participants judged which of two scenes seemed more natural. Preliminary results show that this approach can quantify the precision of people's estimate of the direction and strength of gravity.
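
A bare-bones version of the MCMCP loop can be sketched as follows, assuming a single scalar physical parameter and a simulated participant whose two-alternative choices follow a Luce-style rule around an internal "intuitive" value; the custom physics engine and scene rendering from the paper are abstracted away.

```python
import numpy as np

rng = np.random.default_rng(2)
true_intuitive_value = 9.8          # value that feels most natural to the (simulated) participant

def participant_chooses_proposal(current, proposal, noise=1.0):
    # Probability of preferring the proposed scene over the current one
    # (Luce choice rule over distances from the participant's intuitive value).
    d_cur = abs(current - true_intuitive_value)
    d_prop = abs(proposal - true_intuitive_value)
    p = np.exp(-d_prop / noise) / (np.exp(-d_prop / noise) + np.exp(-d_cur / noise))
    return rng.random() < p

samples, state = [], 5.0
for _ in range(5000):
    proposal = state + rng.normal(scale=1.0)      # new candidate physical parameter
    if participant_chooses_proposal(state, proposal):
        state = proposal                          # participant's choice acts as acceptance
    samples.append(state)

burned = np.array(samples[1000:])
print("estimated intuitive value:", burned.mean(), "spread:", burned.std())
```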

Investigating implicit and explicit expectations in perceptual decision making

Expectations, or the prior probability of a choice outcome, are powerful sources of evidence for improving decision making under uncertainty. Most expectations in the real world are learned implicitly on the basis of statistical properties of observers' environments. However, most studies investigating effects of expectations on perceptual decisions explicitly instruct observers on prior probabilities within the experiment, and thus fail to capture the experience-dependent uncertainty of real-world expectation learning. Here, we report data from a novel expectation-guided perceptual decision making task specifically designed to address this gap. Human observers (n=21) learned, through experience, probabilistic relationships between cues and images. Then, they explicitly reported both an estimate of each cue's prediction and a confidence rating in that estimate before performing a cued perceptual decision task. We find that, although these measurements are highly correlated, confidence in an explicit report is the primary factor that interacts with implicit expectations to shape perceptual decisions.

Solving strategic social coordination via Bayesian learning

Repeated social coordination is a crucial aspect of daily life in which individuals strategically distribute labor and resources, often to accomplish complex tasks and goals. However, social coordination is also very challenging because humans often have competing interests, especially when successful coordination persistently leaves one party better off, entrenching inequality. Here we use a novel task, the Asymmetric Social Exchange (ASE) Game, to study how individuals learn to coordinate with different kinds of social partners and how individual trait variability on key social dimensions related to negative evaluation (i.e., social anxiety) impacts compliance with disadvantageous conventions (N = 675). Using two kinds of Bayesian models, one that learns from experience and one that builds a causal model of others' hidden motivations, we show that differences in coordination strategies arise from both individual learning differences and from expressed social preferences. Further, we find that social anxiety increases compliance with disadvantageous payoffs.
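
As an illustration of the experience-based learner only (the causal mentalizing model is not sketched), a minimal Beta-Bernoulli agent that best-responds to its posterior over the partner's action can end up complying with a convention that favors the partner; the payoffs and the partner's policy below are illustrative placeholders.

```python
import numpy as np

rng = np.random.default_rng(4)
alpha, beta = 1.0, 1.0                       # uniform Beta prior over P(partner plays A)
payoff = {("A", "A"): 1.0, ("B", "B"): 2.0,  # coordinating on B would be better for me
          ("A", "B"): 0.0, ("B", "A"): 0.0}

choices = []
for trial in range(50):
    p_partner_A = alpha / (alpha + beta)
    # Expected payoff of each of my actions under the current belief.
    ev_A = p_partner_A * payoff[("A", "A")]
    ev_B = (1 - p_partner_A) * payoff[("B", "B")]
    my_action = "A" if ev_A >= ev_B else "B"
    choices.append(my_action)
    partner_action = "A" if rng.random() < 0.8 else "B"   # a stubborn partner who prefers A
    alpha += partner_action == "A"
    beta += partner_action == "B"

# The agent drifts toward the partner's preferred (disadvantageous) convention.
print("my last five choices:", choices[-5:])
print("estimated P(partner plays A):", round(alpha / (alpha + beta), 2))
```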

Parental Broad Autism Phenotype Traits and Their Influence on Early Social Interaction and Attention

Parental mental health subclinical features, such as stress, anxiety, and depression, have been reported to significantly influence the dynamics of parent-infant interaction, which sets the stage for early attention, learning, and social communication development. However, less is known about the influence of cognitive and social features, such as those related to broad autism phenotype (BAP) traits, despite their documented impact on attention control and sensory processing. The present study examines how parental BAP traits may relate to parent-infant interaction by focusing on their behaviors, using head-mounted eye-tracking to provide objective measures. Results indicated that BAP traits were related to rates of parent sustained attention and object handling but did not predict infants' sustained attention during the interaction. The findings of variability in parental play behaviors based on BAP traits raise important questions regarding the direct impact of parental characteristics on early social interaction, infants' potential adaptations to their learning environments, and the significance of BAP traits in infant development.

The Folk Ethics of Self-Defense: An Empirical Study on the Moral Permissibility of Killing Apparent Threats

Philosophers of self-defense debate whether it can be morally permissible to kill an aggressor who only appears to threaten you. In developing moral theories of self-defense, these philosophers sometimes make (untested) conjectures about what most people believe about self-defense. This paper aims to explore lay judgments on this issue. To do so, we conduct three pre-registered experiments manipulating the actuality of a threat. Across abstract and concrete scenarios as well as within-subjects and between-subjects designs, results consistently show that laypeople judge certain self-defensive killings morally permissible regardless of whether the aggressor poses a genuine threat or a merely apparent threat. These findings oppose Objectivist views on self-defense, which hold that self-defense is only permissible when facing a genuine threat. Instead, they support Subjectivism and what we call the "It's Complicated View", both of which hold that apparent threats can justify lethal self-defense (albeit with possible variation in permissibility ratings).

Training Methods in Categorization: A Comparison of Classification and Observation on Rule Adoption and Rule Consistency

This study compares classification and observation training in categorization tasks involving multiple rules, including an optimal XOR rule and suboptimal uni-dimensional rules. Participants (N = 192) were assigned to either condition; classification involved active categorization with feedback, while observation involved studying category label and item pairs together. Results showed that classification participants outperformed observation participants in accuracy and exhibited greater consistency in strategy use. Bayesian modeling revealed no significant difference in rule adoption between conditions, but classification led to fewer strategy switches and lower error rates. These findings suggest that classification training enhances performance by fostering stronger commitment to adopted strategies. The study highlights the importance of strategy commitment in categorization and questions the reliance on overall accuracy alone as a performance metric.

Improving Unsupervised Task-driven Models of Ventral Visual Stream via Relative Position Predictivity

Based on the concept that the ventral visual stream (VVS) mainly functions for object recognition, current unsupervised task-driven methods model VVS by contrastive learning and have achieved good brain similarity. However, we believe the functions of VVS extend beyond object recognition. In this paper, we introduce an additional function involving VVS, named relative position (RP) prediction. We first explain theoretically why contrastive learning may be unable to yield the capability of RP prediction. Motivated by this, we subsequently integrate RP learning with contrastive learning and propose a new unsupervised task-driven method to model VVS, which is more in line with biological reality. We conduct extensive experiments, demonstrating that: (i) our method significantly improves downstream performance on object recognition while enhancing RP predictivity; (ii) RP predictivity generally improves the model's brain similarity. Our results provide strong evidence for the involvement of VVS in location perception (especially RP prediction) from a computational perspective.
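
One way to picture the combined objective (not the authors' code) is a contrastive InfoNCE loss plus a relative-position classification head on pairs of inputs; the encoder, input sizes, label scheme, and loss weighting below are illustrative placeholders.

```python
import torch
import torch.nn.functional as F

class Model(torch.nn.Module):
    def __init__(self, feat_dim=128, n_rp_classes=8):
        super().__init__()
        # Toy encoder; a realistic model of VVS would be a deep convolutional network.
        self.encoder = torch.nn.Sequential(
            torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, feat_dim), torch.nn.ReLU())
        self.rp_head = torch.nn.Linear(2 * feat_dim, n_rp_classes)

    def forward(self, patch_a, patch_b):
        za, zb = self.encoder(patch_a), self.encoder(patch_b)
        rp_logits = self.rp_head(torch.cat([za, zb], dim=1))
        return za, zb, rp_logits

def info_nce(za, zb, temperature=0.1):
    # Standard contrastive loss treating matched rows as positive pairs.
    za, zb = F.normalize(za, dim=1), F.normalize(zb, dim=1)
    logits = za @ zb.t() / temperature
    targets = torch.arange(za.size(0))
    return F.cross_entropy(logits, targets)

model = Model()
patch_a, patch_b = torch.randn(16, 3, 32, 32), torch.randn(16, 3, 32, 32)
rp_labels = torch.randint(0, 8, (16,))    # e.g., which of 8 neighboring positions patch_b occupies
za, zb, rp_logits = model(patch_a, patch_b)
loss = info_nce(za, zb) + 0.5 * F.cross_entropy(rp_logits, rp_labels)
loss.backward()
```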

A "p < .05" Boundary Effect in the Encoding and Retrieval of p-values from Scientific Texts

‘Statistical significance' is more than just a label. Cognitive psychological theories suggest it may represent a mental concept of a category of p-values due to the pervasive practice of dichotomous interpretation of p-values. This paper builds on previous research identifying categorical boundary effects in the initial information processing of p-values by examining the encoding and retrieval of p-values embedded in the context of scientific abstracts. A sample of 30 U.S. graduate students in the psychological sciences read blocks of abstracts, then were prompted to recall certain details, including p-values. Results show that memory for p-values was skewed away from the .05 boundary, suggesting that training in dichotomous ‘p < .05' thinking may lead to categorical biases in memory for p-values. These results set up experiments to test mechanistic hypotheses of boundary effects on statistical cognition as well as the efficacy of teaching interventions to address and ameliorate these categorical biases.

Abstracts with Poster Presentation

Homogeneity Bias in Small Number Word Comprehension

Number words are often used in noun phrases referring to objects (e.g., "two balls"). This usage may suggest to children that numbers quantify same-kind sets. We investigated this hypothesis using a novel task that assessed English-speaking children's (aged 3-11 years) preference for same-kind sets when asked for two or three objects. Across four experiments (N=167), children's preference for same-kind sets was evaluated at the basic level (cows, horses, etc.), the subordinate level (grey horses, brown horses), and the superordinate level (animals, food). Children did not show any bias at the subordinate level, but they tended to produce same-kind sets when presented with items that differed at the basic or superordinate level. At the basic level, counting fluency was associated with children's production of same-kind sets. Our results suggest that children interpret number words as referring to same-kind sets and that this bias is particularly strong in children with limited number knowledge.

The Ritualization of Complex Skill Acquisition: Cultural Transmission of Mastery of Indian Miniature Paintings

Cumulative cultural transmission preserves traditional knowledge and practices for generations, yet this phenomenon is understudied in the context of Asian artistic traditions. Indian Miniature painting provides an ideal context for studying the cultural transmission of mastery due to its apprenticeship model embedded in history and tradition. The learning and practice of Indian miniature painting provide an opportunity to examine whether these artistic processes resemble ritualized transmission, characterized by procedural rigidity, adherence to established norms, and religious symbolism. We interviewed 262 artists, ranging from novices to experts from Rajasthan, India, to explore the functions of ritualized transmission in acquiring mastery. The results revealed a strong consensus among artists regarding adherence to structural flow in art production with minimal deviation. Most artists agreed that novices should conform to traditional aesthetics more than experts. Mastery of technical skills and knowledge of cultural traditions were deemed essential before innovation, underscoring how ritualization requires strict adherence to established practices. For many artists, the process resembled a religious act, emphasizing customs, purity, and reverence. Our research highlights the functions of ritualization in acquiring mastery to ensure the preservation of distinctive aesthetics, uniformity in learning, and integration of spiritual and religious beliefs in the cultural transmission of traditional arts.

Finding structure in logographic writing with library learning II: Grapheme, sound, and meaning systematicity

Writing systems are structured to depict the various facets of human language, from sounds to meanings. Chinese writing, as a logographic system, offers a distinctive opportunity to study the structural relationships between written forms and their sounds and meanings all at once. In this companion paper to Jiang et al. (2024), we explore a computational model based on library learning that can capture the compositional structure of Chinese characters and their relationship to sound and meaning. We extend the written-only library learning framework from Jiang et al. (2024) by incorporating written-sound joint compression and distributional semantic representations. The joint compression component allows the model to uncover structural relationships between a character's graphical components and its pronunciation, mirroring the function of phonetic and semantic radicals in Chinese orthography. With distributional semantics, the model also learns systematic links between the graphical structure and the meaning of characters, enabling it to predict the meanings of unseen characters based on their constituent parts. Moreover, our model allows us to explore historical shifts in how written Chinese has represented spoken language. We anticipate that our library learning model will serve as a unified computational account of writing's interaction with the multi-level structures of human language. Full paper available at https://jiang.gy/assets/pdf/jiang2025grapheme.pdf

Speakers strategically adjust their descriptions based on perceived memorability

When talking about the world in front of us, humans are remarkably efficient communicators. Our referential expressions help listeners efficiently find what we're talking about by strategically adding color or material words. But most conversations involve things not physically in front of us. In these cases, do we also use language to efficiently help a listener retrieve an item from memory? Across two experiments, we asked participants to describe images to help a listener recall them. In Experiment 1 (n = 600), participants spontaneously incorporated expectations about memorability by providing more description for images they expected to be less memorable. People's descriptions aligned more with subjective memorability estimates rather than objective, empirically-derived metrics. In Experiment 2 (n = 300), we replicated this pattern even when participants had no access to their listener's prior experience. Together, this work provides evidence that speakers spontaneously guide listeners' mental processes to effectively facilitate memory recall.

Repetitive Negative Thinking Naturally Emerges in a Model that Learns to Gate Affective Content into Working Memory

Why do we sometimes fall into repetitive negative thinking (RNT) patterns, such as rumination and worry? To address this question, I trained a meta-control model (Todd et al., 2008) on affective working memory (WM) content by providing a reward signal when items (representing thoughts) were gated into WM. The model still facilitated adaptive motor action, as in the original implementation, yet it also exhibited the defining characteristics of RNT. Specifically, its thought became repetitive (because it learned to selectively gate items into working memory); negative (because it persisted in, and only slowly extinguished, gating into WM negatively valenced on-theme and distractor items, respectively); and difficult to control (because, after extensive learning, it had a high chance of selecting the most probable WM gating strategy, irrespective of the value of an exploration parameter). Germane to clinically relevant RNT, such as pathological rumination and worry, the model's thinking was more unproductive when the valence of one thought sequence strongly biased the next one. This work helps to establish meta-control as a formal algorithmic framework for RNT. It may catalyze clinical research on RNT by providing a bridge to computational findings.

Dendrophilia versus continuity in hierarchical reasoning

Hierarchical reasoning might be qualitatively unique to humans or alternatively may arise from quantitative differences in cognitive resources. We tested hierarchical reasoning abilities across adults, children, crows, and monkeys, evaluating two hypotheses: the Strong Dendrophilia Hypothesis, which posits human uniqueness, and the Continuity Hypothesis, which attributes differences to variations in information-processing capacity. Using Bayesian modeling, we found that hierarchical reasoning (dendrocapacity) is not exclusive to humans, though the tendency to engage in it (dendroproclivity) varies across age and species. Adults exhibited the strongest dendroproclivity, while children and non-humans showed graded performance influenced by cognitive resource demands and task complexity. Hierarchical mechanisms such as Last-In-First-Out (LIFO) stacks were prevalent across groups. These findings challenge human uniqueness in hierarchical reasoning and suggest its emergence through incremental increases in cognitive capacities across development and evolution.

CoRe: Cognitive Reasoning Framework for Zero-Shot Table Understanding and Reasoning

While Large Language Models (LLMs) have demonstrated remarkable capabilities in natural language understanding, they still struggle with reasoning over table-based structured data, particularly in zero-shot settings. Tasks like question answering (QA), SQL generation, and numerical reasoning often fail due to insufficient task-specific training. To address these challenges, we propose CoRe (Cognitive Reasoning Framework for Structured Data), inspired by cognitive science principles of hierarchical and iterative reasoning. CoRe structures reasoning into multiple stages, allowing LLMs to better navigate the intricacies of table-based data. Evaluations using advanced LLMs, including Qwen-Plus, GPT-4o mini, and GLM-4-Plus, on datasets like HybridQA, BIRD, and DocMath-Eval show consistent performance improvements. CoRe outperforms zero-shot state-of-the-art (SOTA) on HybridQA and BIRD by 11.4% and 1.8% in Exact Match (EM) respectively, and achieves the best accuracy on DocMath-Eval's complong subset with 37.0%, approaching RAG-based SOTA. These results highlight CoRe's effectiveness as a robust framework for zero-shot reasoning over structured data.

Generation and Evaluation in the Human Invention Process through the Lens of Game Design

Humans do not just follow rules and solve problems created by others: we modify those rules, set new goals, and create new problems, acting as inventors and innovators. Creating a good rule or a good problem, however, depends not just on the ideas you come up with but on how you evaluate such proposals. Here, we study invention through the lens of game design. We focus particularly on the early stages of novice, "everyday" game creation, where the stakes are low. We draw on a dataset of over 450 human-created games and conduct a model-based analysis of how people invented new games based on prior experience. We consider two different cognitive mechanisms that may be at work during the early processes of intuitive game invention: an associative proposal mechanism based on previous games one has seen, and evaluation based on simulations of play. In particular, we aim to understand two possible evaluation schemes (model-free and model-based) that a commonsense game creator may use to refine their initial draft proposals. We find that the generated games are best described by a model which incorporates both rapid model-free evaluations and slower, model-based estimates of game quality at a population level. Our work serves as a step toward understanding the proposal and evaluation process in human invention. See https://sites.google.com/view/gen-eval-game-creation for additional details and preprint.

Pareto optimality reveals the core computations of the human brain

The human brain supports complex behaviors through diverse functional connectivity patterns. We propose Pareto optimality as a novel framework to understand this functional organization. According to Pareto theory, systems optimizing multiple competing goals do so by balancing trade-offs along a low-dimensional "Pareto front" defined by archetypes that each optimize a single goal. Applying Pareto analysis to resting-state fMRI data (HCP, N=1200), we found that individual connectomes lie on a low-dimensional triangle. The three archetypes represent core computational goals: minimizing energetic cost, supporting cognitive control and goal-directed behavior, and enabling internal processing and memory. These goals are reflected in connectivity patterns, network topology, information flow, and behavioral and clinical associations. The framework generalizes beyond rest to task-based brain states, and a simple neural model illustrates the trade-offs' computational basis. Pareto optimality offers a principled approach to decompose brain function into core computations across conditions, populations, and stages of life.

Quantifying Recursive Mentalizing Depth for Social Navigation

Effective social navigation requires individuals to infer others' intentions and adjust their movements accordingly to avoid collisions. A key aspect of this process is recursive reasoning, where individuals anticipate that others are also inferring their intentions. In this study, we quantitatively measured whether humans exhibit spontaneous mentalizing during navigation and developed a computational model to demonstrate the basic principle. Then, we introduced a novel framework for quantifying the recursive depth of human mentalizing during social navigation. Using a Doors-choosing task within a VR environment, participants navigated between two doors while avoiding a virtual human. Analyzing choice patterns, confidence levels and walking trajectories, we found that participants engaged in one or two levels of recursion, with respective probabilities of 80% and 20%. This study provides a quantitative estimation of the recursive depth of mentalizing in navigation and establishes a foundation for integrating human recursive reasoning into socially intelligent agents.
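
A toy level-k recursion for the two-door choice illustrates what one versus two levels of mentalizing look like computationally; the distances, collision penalty, and depths below are illustrative placeholders rather than the study's fitted model.

```python
# Level-k recursion for a two-door choice with a collision penalty.
# All numeric values are illustrative placeholders.

DOORS = ("left", "right")
COLLISION_PENALTY = 5.0

def utility(door, my_dist, other_door):
    return -my_dist[door] - (COLLISION_PENALTY if door == other_door else 0.0)

def choose(my_dist, other_dist, level):
    if level == 0:
        # Level-0: ignore the other agent and take the nearer door.
        return min(DOORS, key=lambda d: my_dist[d])
    # Level-k: predict the other agent's level-(k-1) choice, then best-respond.
    predicted_other = choose(other_dist, my_dist, level - 1)
    return max(DOORS, key=lambda d: utility(d, my_dist, predicted_other))

me = {"left": 2.0, "right": 3.0}
other = {"left": 2.5, "right": 4.0}
for k in (0, 1, 2):
    print(f"level {k}:", choose(me, other, k))
```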

Learning Hidden Causal Factors from Psychometrics Data Using Distributional Information

Understanding latent variables and their causal mechanisms is central to psychological theory, yet most latent variable models in psychology have largely remained correlational. This work attempts to address three pivotal issues: identifying useful information in observational data that reveals latent causal factors, developing algorithms to leverage this distributional information, and ensuring the identifiability of the recovered latent factors and their causal structure. We introduce a generalizable framework for discovering hidden causal structures from observed distributions in psychometric data. Applied to survey datasets on personality traits, teacher burnout, and multitasking behavior, our method uncovers hidden causal factors and their intricate interactions. Additionally, our findings offer an alternative perspective on psychometric scoring, grounded in the strength of the learned causal relations. These insights contribute to behavioral modeling and measurement and await further confirmatory studies to validate their implications for psychological science.

Memory reconsolidation: Exploring how to reactivate and modulate episodic memories

According to the reconsolidation hypothesis, consolidated memories can be reactivated and become labile for a period of time, followed by a re-stabilization process that allows them to be strengthened, weakened, or updated. Novel research has begun to investigate whether music-based interventions (MBI) could modulate memory reconsolidation. The present study aimed to address this issue through two experiments. Experiment 1 tested a paradigm for reactivating episodic emotional memories by comparing different reactivation tasks. It was found that a reactivation task with incomplete reminders was able to reactivate emotional memories and to strengthen them through reconsolidation. Experiment 2 assessed whether an MBI delivered after the reactivation of such memories was able to modulate their reconsolidation. We found that listening to arousing music after memory reactivation interfered with the reconsolidation process, reducing retrieval. Possible underlying mechanisms are discussed to continue the path toward application of the findings in clinical and educational settings.

Increasing effective charitable giving with personalized LLM conversations

Despite substantial charitable giving, donations often fail to maximize impact. While a variety of persuasive strategies can increase donations to effective charities, their success depends on individual differences. Large Language Models (LLMs) offer a powerful solution to this problem by dynamically personalizing persuasive strategies. In a pre-registered experiment (N=1952), we tested whether personalized LLM conversations could increase donations to the Against Malaria Foundation (AMF), rated one of the world's most effective charities. Participants allocated $1 between their favorite charity and AMF after being assigned to one of three conditions: (1) a personalized persuasive LLM conversation, (2) a static LLM-generated persuasive message, or (3) a control conversation. Personalized LLM conversations significantly increased donations to AMF by 46.6%, outperforming the static message (28.7% increase). Personalized LLMs also shifted moral attitudes about charitable giving. Our findings highlight the potential of AI-driven personalization to enhance effective giving and provide new insights into the psychology of charitable persuasion.

Understanding the heterogeneity in learning mental actions

Individuals who can easily learn from the consequences of their mental actions are more likely to benefit from Cognitive Behavioral Therapies (CBT), which seek to replace maladaptive mental behavior with adaptive mental behavior. Although learning the optimal mental (cognitive) action is generally harder than learning the optimal motor (overt) action, not all individuals show this trend (Hitchcock & Frank, 2024). In this study, our goal is to understand the source of this heterogeneity so as to facilitate the targeted deployment of CBT to individuals who are most likely to benefit from it. To address our goal, the original task (Hitchcock & Frank, 2024) was modified with the aim of improving internal consistency, allowing individual differences to be measured more robustly. We also collected data on candidate individual differences (working memory capacity, and traits such as perseverative thinking, the need for cognition, inattentiveness, and cognitive ability) to investigate whether these measures could predict differences in performance. Split-half reliability improved for most conditions in the current task and remained the same for one condition compared to the original task. Although participants were better at learning the optimal overt action compared to the optimal cognitive action in both the training (p < 0.001) and test phases (p = 0.006), replicating the original results, performance in the cognitive condition was numerically higher than in the overt condition for 37% of participants in the training phase and 38% of participants in the testing phase, indicating similar heterogeneity across participants as in the original study. The need for cognition measure predicted higher accuracy in the cognitive (relative to the overt) condition in the training phase (p = 0.046) and in the test phase (p = 0.001). This finding, if replicated, has important clinical implications, as it could help identify patients who are most likely to benefit from interventions that rely on learning adaptive cognitive actions.

Testing Explanations for Why Math Anxiety Predicts Poor Math Performance

Math anxiety (MA) reliably predicts poor math performance (MP). Several theoretical mechanisms have been proposed to explain this relation, and numerous interventions have been designed to mitigate it. However, it is unclear whether these hypothesized mechanisms are distinct from one another, and thus it is also unclear whether different interventions target distinct or overlapping mechanisms. We developed indices for the four leading candidate mechanisms in the literature and used a competing mediation framework to test their unique and combined explanatory capacity. Combined, they accounted for 64% of the MA-MP relation, and 3 of 4 candidates provided unique explanatory capacity. While on the whole the current literature seems to have the right of it, our results also indicate there is not a single, unifying key to unlocking the MA-MP relation, and there is unlikely to be a single ‘magic bullet' intervention. Further theoretical and practical implications are discussed.

Using Gesture and Language to Establish Multimodal Conventions in Collaborative Physical Tasks

A quintessential feature of human intelligence is the ability to create ad-hoc conventions over time to accomplish shared goals efficiently. Prior research has primarily focused on unimodal communication or communication mediated by a 2D screen. We study how multimodal communication using gesture and language changes during physical collaboration. We paired human participants and used augmented reality to isolate voice and hand gestures. One participant saw a 3D virtual tower and provided instructions to the other participant, who constructed the physical tower. Participants became faster and more accurate by forming conventions that made use of both gestural and linguistic abstractions. Redundancy was used to emphasize a change in the established convention. Based on these findings, we extend unimodal probabilistic models of convention formation to multimodal settings while capturing various modality preferences. Our work serves as a building block for convention-aware and physically situated intelligent agents.

The influence of scene discontinuity on the relation between mind wandering and event boundaries

Mind wandering during film viewing can impede comprehension and learning, and prior studies have reported conflicting results regarding its occurrence at event boundaries in narrative versus educational films. This study examines whether scene discontinuity – changes in time, place, character, and action – affects mind wandering differently across film genres. Replicating previous findings, we found that in the narrative film, less mind wandering occurred at event boundaries compared to other parts of the film. Conversely, in the educational film, more mind wandering was reported at event boundaries. Our analysis revealed that the narrative film exhibited higher scene discontinuity at event boundaries than the educational film. Importantly, across both film types, greater scene discontinuity at event boundaries was associated with decreased mind wandering. The differing levels of scene discontinuity between narrative and educational films may explain the contrasting patterns of mind wandering observed, highlighting the influence of film structure on cognitive engagement.

Emotional Dynamics in Art Appreciation: Aesthetic Engagement with Realist and Surrealist Artworks

In a temporally extended perceptual encounter with an artwork, an ideal challenge presents a manageable degree of unpredictability. This unpredictability triggers emotional satisfaction derived from resolving perceptual uncertainties, which, in turn, leads to rewarding experiences that motivate further engagement. Here we simulated and amplified perceptual unpredictability across three stages: a blurred version of the artwork, a clear version, and prolonged exposure to the clear artwork, and we measured viewers' dynamic emotional responses throughout this unfolding process. Our findings reveal significant associations between individuals' aesthetic experiences and shifts in their emotional responses, specifically concerning pleasure and arousal. Additionally, participants exhibited stronger emotional changes related to pleasure and dominance when unfolding realist paintings (certainty expected) than when unfolding surrealist artworks (uncertainty expected). Overall, the results suggest that viewers generally appreciate unpredictability and experiences that transcend their preconceived expectations, motivating deeper scientific inquiry into predictive processing in human aesthetic appreciation.

Changing Response Patterns: The Cognitive Mechanisms behind Stereotype Threat Effects on Women

Stereotype Threat describes the negative impact on cognitive performance caused by the activation of negative social stereotypes. This study investigates the underlying cognitive mechanisms of gender-specific Stereotype Threat effects through a diffusion model analysis. An online experiment was conducted with 612 men and women, randomly assigned to either a Stereotype Threat or a control condition. Their performances in a mathematical task and an emotion recognition task were compared. To model the underlying cognitive processes, parameters of the drift diffusion model were estimated. Results showed no differences in accuracy as an effect of Stereotype Threat. However, women in the Stereotype Threat condition exhibited a higher threshold separation when performing the mathematical task, indicating more conservative response tendencies, compared to men and the control groups. The study addresses the need for potential interventions such as stereotype awareness to mitigate these effects and calls for further research into cultural and social influencing factors.
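
As a rough illustration of the diffusion model analysis described above, the following minimal Python simulation shows how a wider threshold separation trades speed for caution; the parameter values are illustrative only, not the study's fitted estimates.

import numpy as np

def simulate_ddm(drift, threshold, noise=1.0, dt=0.001, max_t=5.0, rng=None):
    """Simulate one drift-diffusion trial with an unbiased start point.

    Evidence starts midway between the boundaries and accumulates with the
    given drift rate plus Gaussian noise until it crosses the upper
    (correct) or lower (error) boundary, or time runs out.
    """
    rng = rng or np.random.default_rng()
    x, t = threshold / 2.0, 0.0
    while 0.0 < x < threshold and t < max_t:
        x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    return ("correct" if x >= threshold else "error"), t

rng = np.random.default_rng(0)
for a in (1.0, 2.0):  # illustrative threshold separations (wider = more conservative)
    trials = [simulate_ddm(drift=1.5, threshold=a, rng=rng) for _ in range(2000)]
    accuracy = np.mean([outcome == "correct" for outcome, _ in trials])
    mean_rt = np.mean([rt for _, rt in trials])
    print(f"threshold separation {a}: accuracy = {accuracy:.2f}, mean RT = {mean_rt:.2f} s")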

Tracking Uncertainty During Uncertain Tracking

Multiple object tracking is often studied in settings where objects are largely observable. However, tracking often occurs in settings with much greater uncertainty. Objects can frequently go in and out of view, requiring us to constantly update our estimates of where things might be and assess whether or not something new has appeared. To accomplish this, people need to rely on top-down inferences to fill in the gaps of uncertainty. To study this phenomenon, we introduce a novel "firefly tracking" paradigm, in which people need to estimate the quantity and dynamics of an unknown number of objects under highly sparse observations. We model human behavior on this task and demonstrate how probabilistic inference in a generative model captures human uncertainty during challenging tracking tasks. Paper available at https://yonifriedman.com/publications/CogSci2025_Tracking-Uncertainty.pdf

Willingness for social sharing of emotion with conversational AI and humans in mediated communication: A comparison across different interfaces and motives

This study investigates how interface differences affect willingness for social sharing of emotion with conversational AI depending on motives (cognitive support vs. social-affective support vs. capitalization), while comparing social sharing of emotion with humans. Perceived impressions (warmth and competence) are examined as correlates of willingness ratings. Data from 195 Japanese undergraduates were analyzed. The results showed that for social-affective and cognitive support motives, participants preferred text-based modality over voice-based modality, particularly text-based modality without an avatar. For the capitalization motive, participants preferred interfaces with avatars. Moreover, perceived warmth was positively related to willingness for social sharing with AI for social-affective support and capitalization motives, whereas perceived competence was positively related for cognitive and social-affective support motives. A different pattern of results was found for social sharing of emotions with humans. This study provides novel insights that contribute to the design of conversational AI interfaces.

The Influence of Indirect Social Relationships on Human Social Navigation

In social navigation, individuals adjust their behavior based on their social environment, often modifying proximity to others according to the strength of their relationships. While people generally maintain greater physical distance from strangers and allow closer proximity to familiar individuals, social relationships often extend beyond direct connections, forming complex networks of indirect ties. This study investigates how these indirect social relationships influence human navigation behavior. We quantified participants' relationships through self-reported ratings and constructed a social network. Our findings revealed that human navigation was shaped not only by direct familiarity but also by the broader structure of social networks. We developed a quantitative model integrating both direct familiarity and indirect social distance to predict human navigation and verified its robustness by showing how human navigation dynamically adjusts to variations in network structure. This study highlights the significance of social relationships and demonstrates their role in shaping how individuals navigate social environments.

The Vigor of Punishment: Control of Movement Vigor in Social Decision-Making

Individuals typically respond faster and with greater velocity when pursuing rewarding options. However, people sometimes forgo personal gain to punish unfair behavior. How vigorously do they engage in such costly punishment? We introduce a novel framework linking neuroeconomic decision-making to movement vigor. In Study 1, using a motor version of the Ultimatum Game, we found that movement vigor increased with offer value for accepted offers but decreased with offer value for rejected ones (costly punishments). In Study 2, we examined the factors driving this reversal. Using a social economic exchange task, we found that punishment vigor was not driven by either the self-incurred cost or the absolute cost inflicted on the other, but rather by the efficiency of the punishment, that is, the ratio of other-cost to self-cost. These findings suggest that when people incur personal costs to punish, movement vigor accurately tracks the weighting of other-inflicted costs against self-costs.

The Role of Eye Movement Consistency in Aging-Related Decline in Face Recognition

Face recognition has been shown to involve a prolonged developmental process in which one gradually develops a more consistent but idiosyncratic visual routine with increasing age. In contrast to this developmental trend, here we found that aging was associated with decreased eye movement consistency during face recognition, which in turn contributed to a decline in face recognition performance. Although aging was not associated with changes in eye movement pattern, a less eyes-focused eye movement pattern, together with lower eye movement consistency and older age, predicted poorer face recognition performance in older adults, suggesting that the idiosyncratic eye movement patterns acquired during early adulthood continue to account for their face recognition performance. Their decreased eye movement consistency was associated with declines in selective attention and inhibitory control, suggesting difficulties in the execution of learned visual routines. These findings have important implications for ways to facilitate older adults' face recognition to promote healthy aging.

Architecture and the Self: Empirical Inquiry on Christopher Alexander's Theory of Living Structure

The Biophilia hypothesis emphasizes humanity's intrinsic desire to connect with nature, which in turn nurtures authenticity. Architectural designs that echo natural patterns can evoke feelings of wholeness, inspiration, and comfort, as proposed by architectural theorist Christopher Alexander in his Theory of Living Structure. This study empirically examined the connection between architecture and the authentic self. Participants engaged with Chinese philosophical texts from Chuang Tzu and The Analects of Confucius to explore their authentic selves. They then evaluated image pairs, assessing preference, liveliness, and self-connection, with one image exemplifying living structures characterized by multiple scales, varied patterns, and interconnected centers. Our participants exhibited a strong preference for living structures. Notably, individuals with lower susceptibility to external influence, an essential component of authenticity, were more likely to perceive living structures as self-connected. Our findings offer valuable insights into human-centered architectural design aimed at fostering authentic living.

Do infants use cues of saliva sharing to infer close relationships? A replication of Thomas et al. (2022)

In their 2022 study, Thomas and colleagues found that when observing third-party interactions, infants, toddlers, and children might use saliva-sharing as a cue for inferring relationship thickness. The present study is the first external attempt to replicate their key findings with infants aged 8.5 to 10 months (n = 50). We used the original stimuli and the original design, but instead of running the study online and coding gaze direction manually from video recordings, we tested infants in a laboratory and measured gaze behavior with an eye tracker. Our study successfully replicated one of the main findings of the original study (longer looking at the saliva-sharing actress) while failing to replicate the other one (looking first at the saliva-sharer). These findings confirm that infants rely on certain behavioral cues for mapping social relationships among third-party individuals.

Developmental Trajectories of Working Memory Updating from Early Childhood to Adolescence: A Meta-Analysis

Working memory updating, the process of replacing outdated information with new data, is a crucial cognitive function for learning that develops significantly throughout childhood and adolescence. However, the developmental trajectory and task-specific patterns remain inadequately understood. This meta-analysis examined 64 studies (N = 22,572 participants) investigating working memory updating performance across five age groups (3–17 years) using various updating paradigms. Results revealed a significant positive developmental trend, with the most substantial improvements occurring between the ages of 3 and 8 years. Additionally, task-specific analyses demonstrated various developmental patterns, with keep track tasks—the selective updating of relevant information from specific categories while discarding irrelevant data—showing the most pronounced age-related improvement. Together, these findings suggest that working memory updating follows a systematic developmental progression, influenced by task-based variations, offering valuable insights into cognitive development and potential applications for educational practices.

Preschoolers can form conventional pacts with each other to communicate about novel referents

Learning language requires learning not only the content of language, but also how to use language to communicate. Iterated reference games provide a window into such skills, requiring rich communication as participants converge on mutually understandable names for initially novel referents. Some early experiments are interpreted as evidence that 4-5-year-old children cannot converge to the mutually understandable names needed to communicate in an iterated reference game. Here, we revisit young children's referential communicative abilities using a simpler, child-friendly paradigm. Across 51 pairs of children, we found that 4-5-year-olds successfully established reference with each other. Children were 85% accurate, and they often used descriptions similar to their partner's. These findings suggest that children's capacity to construct effective referring expressions in novel contexts emerges earlier than once thought, consistent with the view that children show early pragmatic competence in supportive contexts.

Dissecting the interplay between corpus properties, algorithm, and word segmentation performance

This study investigates how corpus-level properties in Korean child- and adult-directed speech shape word segmentation across four algorithms: Transitional Probability (TP), Diphone-Based Segmentation (DiBS), PUDDLE, and Adaptor Grammar (AG). Utterance length consistently impacts segmentation, with shorter utterances improving performance, particularly for PUDDLE, DiBS, and AG. Word length affects transitional probability algorithms, while hapax legomena introduce challenges for forward TP and AG. Interjections negatively influence AG, but not the others, and larger corpus size benefits PUDDLE. Register effects are limited, with forward TP and PUDDLE performing better on child-directed speech. These patterns highlight algorithm-specific sensitivities, with utterance length emerging as the most consistent factor. Our findings underscore the importance of considering both input properties and algorithm design when studying word segmentation in Korean. Future work should explore cross-linguistic comparisons, larger balanced corpora, and the role of multimodal cues in segmentation.
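
To make the transitional-probability criterion concrete, here is a minimal Python sketch that places word boundaries at local minima of forward TP over syllable sequences; it illustrates the general idea only and is not the specific implementation evaluated in this study.

from collections import Counter

def forward_tp_segment(utterances):
    """Insert boundaries ("|") at local minima of forward transitional probability.

    utterances: list of syllable lists, e.g. [["ba", "bi", "go", "la", "tu"], ...].
    TP(a -> b) = count(ab) / count(a); a boundary is hypothesized wherever a
    transition's TP is lower than that of both neighboring transitions.
    """
    unigrams, bigrams = Counter(), Counter()
    for utt in utterances:
        unigrams.update(utt)
        bigrams.update(zip(utt, utt[1:]))

    def tp(a, b):
        return bigrams[(a, b)] / unigrams[a] if unigrams[a] else 0.0

    segmented = []
    for utt in utterances:
        tps = [tp(a, b) for a, b in zip(utt, utt[1:])]
        out = [utt[0]]
        for i in range(1, len(utt)):
            has_neighbors = 1 <= i - 1 < len(tps) - 1
            if has_neighbors and tps[i - 1] < tps[i - 2] and tps[i - 1] < tps[i]:
                out.append("|")  # local TP minimum -> hypothesized word boundary
            out.append(utt[i])
        segmented.append(out)
    return segmented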

Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination

Zero-shot coordination (ZSC)—the ability to adapt to new partners in a cooperative task—is critical for human-compatible AI. While prior work has focused on training agents to cooperate on a single task, these specialized models fail to generalize to new tasks, even if similar. We study how reinforcement learning on a distribution of environments with a single partner induces general cooperative skills that support ZSC with many new partners on many new problems. We introduce two Jax-based procedural generators that create billions of solvable coordination challenges. We develop a new paradigm called Cross-Environment Cooperation (CEC), and show that it outperforms baselines quantitatively and qualitatively when collaborating with real people. Our findings suggest that learning to collaborate across diverse scenarios encourages agents to develop general norms effective for collaboration. Together, our results suggest a new route toward designing generalist cooperative agents that interact with humans without requiring human data.

Irrational Speaker or Wonky World: Modeling Prior Revision and Prior Update

Pragmatic accommodation is a key factor in maintaining smooth real-life communication, yet it is largely overlooked in pragmatic reasoning models such as the Rational Speech Act (RSA) model. The current study explores ways of extending the basic RSA model: revising the listener's belief of common ground, adjusting the listener's belief of speaker rationality, and doing both simultaneously. We evaluated model predictions for utterances varying in utility levels (i.e., how useful an utterance is for state updating) and sentence polarities. We find that (i) contra prior findings, low utility does not always trigger the expected extra inferences, but high utility does, (ii) higher utility is associated with higher speaker rationality, and (iii) our combined model predicts that lower speaker rationality and lower wonkiness co-occur. Theoretical implications of these findings are discussed.
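
For reference, the baseline model that this abstract describes extending can be sketched in a few lines of Python. In the vanilla RSA listener below, lowering alpha corresponds to assuming a less rational speaker, and swapping the flat prior for a skewed one corresponds to revising the assumed common ground (a "wonky world"); the toy lexicon and priors are illustrative, not the study's materials.

import numpy as np

def rsa_listener(lexicon, state_prior, alpha=1.0):
    """Vanilla RSA pragmatic listener.

    lexicon[u, s] = 1 if utterance u is literally true of state s, else 0.
    state_prior   = prior over states (the assumed common ground).
    alpha         = speaker rationality (lower = noisier speaker).
    """
    l0 = lexicon * state_prior                        # literal listener (unnormalized)
    l0 = l0 / l0.sum(axis=1, keepdims=True)
    with np.errstate(divide="ignore"):
        s1 = np.exp(alpha * np.log(l0))               # pragmatic speaker
    s1 = s1 / s1.sum(axis=0, keepdims=True)
    l1 = s1 * state_prior                             # pragmatic listener
    return l1 / l1.sum(axis=1, keepdims=True)

lexicon = np.array([[1, 1, 0],                        # utterance u0 true of states s0, s1
                    [0, 1, 1]])                       # utterance u1 true of states s1, s2
flat_prior  = np.array([1/3, 1/3, 1/3])
wonky_prior = np.array([0.6, 0.3, 0.1])               # revised belief about common ground

print(rsa_listener(lexicon, flat_prior, alpha=4.0))   # rational speaker, default prior
print(rsa_listener(lexicon, wonky_prior, alpha=1.0))  # less rational speaker, revised prior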

The Role of Caregiver Linguistic Input in Infant Joint Attention and Early Language Development: A Longitudinal Study

This study investigated the relationships among infants' attention-following (AF) abilities, caregiver linguistic input, and early language development. Using longitudinal data from home observations (6–9 months) and structured lab assessments (9–12 months), we found that AF performance at 9–12 months was significantly correlated with receptive and expressive language scores at 12 months, though these associations weakened by 18 and 22 months. Caregiver attention-directing utterances were positively associated with both 6-month AF performance and 12-month language outcomes, although their frequency declined from 6 to 9 months, suggesting caregivers adjust their strategies as infants' AF skills develop. Although moderation analyses did not reach statistical significance, trends indicate that higher levels of caregiver attention-directing speech might enhance the relationship between AF and language outcomes. These findings highlight the dynamic role of caregiver linguistic input in shaping infants' early attention sharing and language development. The results also inform intervention approaches for infants at risk for language delays.

Theories of Mind as Languages of Thought for Thought about Thought

What kind of thing is a theory of mind? How can we formalize the various theories of mind we study as cognitive scientists? In this paper, we argue that it is valuable to think of a theory of mind as a kind of programming language: one that is specialized for setting up and reasoning about problems involving other minds. Drawing on ideas from the theory and history of programming languages, we show how this perspective can help us formalize concepts in a theory of mind, precisely articulate differences between multiple theories of mind, and reason about how we develop our theories of mind over time. (The full version of this paper is available at: https://kach.github.io/memo/tomalot)

Seeing through Occlusion: Uncertainty-aware Joint Physical Tracking and Prediction

Humans can track objects and predict their motion even when they are temporarily occluded. How does the absence of changing visual evidence alter predictive beliefs about a moving object? In our study, participants were tasked with continuously anticipating the destination of a simulated ball in occluded and un-occluded 2.5D environments. Our findings reveal that humans actively update their judgments throughout the period of occlusion while making predictions grounded in physical realism, even as occlusion impairs accuracy. To model this behavior, we integrate perception with physical reasoning, unifying tracking and prediction. This is implemented via massively parallel probabilistic inference in a hierarchical generative model for the motion of intermittently visible objects, represented using the GenJAX probabilistic programming platform. This model predicts time-varying human judgments more accurately than alternative models, suggesting that humans integrate perception and physics to reason about occluded motion. Paper is available at https://arijit-dasgupta.github.io/jtap/.
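
The paper's hierarchical GenJAX model is not reproduced here, but the underlying idea of uncertainty-aware tracking through occlusion can be sketched with a generic bootstrap particle filter in plain NumPy: particles are propagated every frame but reweighted and resampled only when the object is visible, so positional uncertainty grows during occlusion. All names and values below are illustrative.

import numpy as np

rng = np.random.default_rng(1)

def particle_filter_track(observations, n_particles=5000,
                          process_noise=0.05, obs_noise=0.1):
    """Bootstrap particle filter over a 1-D constant-velocity object.

    observations: observed positions per frame, with None while occluded.
    Returns the posterior mean and spread of position at each frame.
    """
    pos = rng.uniform(0.0, 1.0, n_particles)   # position hypotheses
    vel = rng.normal(0.0, 0.2, n_particles)    # velocity hypotheses
    estimates = []
    for obs in observations:
        # Predict: advance each particle under noisy constant-velocity dynamics.
        pos = pos + vel + rng.normal(0.0, process_noise, n_particles)
        # Update: reweight and resample only when the object is visible.
        if obs is not None:
            w = np.exp(-0.5 * ((obs - pos) / obs_noise) ** 2)
            w /= w.sum()
            idx = rng.choice(n_particles, size=n_particles, p=w)
            pos, vel = pos[idx], vel[idx]
        estimates.append((pos.mean(), pos.std()))  # spread grows during occlusion
    return estimates

frames = [0.10, 0.15, 0.21, None, None, None, 0.41]   # None = occluded frames
for t, (mean_pos, spread) in enumerate(particle_filter_track(frames)):
    print(f"frame {t}: position {mean_pos:.2f} +/- {spread:.2f}")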

Empathy in Explanation

Why do we give the explanations we do? Recent work has suggested that we should think of explanation as a kind of cooperative social interaction between a why-question-asker and an explainer. Here, we apply this perspective to consider the role emotion plays in this social interaction. We develop a computational framework for modeling explainers who consider the emotional impact an explanation might have on a listener. We test our framework by modeling human intuitions about how a doctor should explain to a patient why they have a disease, depending on the patient's propensity for regret. Our model predicts human intuitions well and outperforms ablated variants, suggesting that people do indeed reason about emotion when giving explanations. See https://sites.google.com/view/empathy-in-explanation for further details and pre-print.

Ensemble Physics: Perceiving the Mass of Groups of Objects is More Than the Sum of Its Parts

Imagine pouring a box of granola into a bowl. Are you considering hundreds of individual chunks or the motion of the group as a whole? Human perceptual limits suggest we cannot be representing the individuals, implying we simulate ensembles of objects. If true, we would need to represent group physical properties beyond individual aggregates, similar to perceiving ensemble properties like color, size, or facial expression. Here we investigate whether people do hold ensemble representations of mass, using tasks in which participants watch a video of a single marble or set of marbles falling onto an elastic cloth and judge the individual or average mass. We find first that people better judge average masses than individual masses, then find evidence that the better ensemble judgments are not just due to aggregating information from individual marbles. Together, this supports the concept of ensemble perception in intuitive physics, extending our understanding of how people represent and simulate sets of objects.

Surprisal and developmental sentence processing: exploring the role of language exposure through neural language models

Children's language processing differs from adults' in idiosyncratic ways. Within adult data, various incremental processing phenomena have been shown to be predicted by neural language models (LMs) using surprisal as the linking hypothesis, where processing effort is determined by a word's log inverse conditional probability. Since LMs appear to lack strong inductive biases for natural-language-specific structures, with their word predictions determined by training on naturalistic data, these results potentially support exposure-based theories. However, it remains unclear how well LMs explain the developmental trajectory of human language processing. Here we evaluate how well LMs trained on developmentally realistic data predict six established child language processing phenomena, including cases where child and adult patterns differ. Our LMs correctly predict four of the six, but fail in cases involving thematic-role and pragmatic knowledge. Our results highlight the limitations of language-exposure-based theories and call for further empirical research on human language processing patterns throughout development.
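
As a concrete illustration of the surprisal linking hypothesis, the snippet below computes per-token surprisal from an off-the-shelf causal LM via the Hugging Face transformers library; GPT-2 is used purely for illustration, whereas the study trains LMs on developmentally realistic corpora.

import math
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Surprisal(w) = -log2 P(w | preceding context), estimated from a causal LM.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

ids = tokenizer("The dog chased the cat", return_tensors="pt").input_ids
with torch.no_grad():
    log_probs = torch.log_softmax(model(ids).logits, dim=-1)

for i in range(1, ids.shape[1]):               # the first token has no context
    token = tokenizer.decode(ids[0, i])
    surprisal = -log_probs[0, i - 1, ids[0, i]].item() / math.log(2)
    print(f"{token!r}: {surprisal:.2f} bits")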

Reverse-Engineering an Intuitive Psychology of Power

Humans readily make inferences of the social power dynamics at play across a wide range of environments. This ability requires people to possess an underlying intuitive theory of power. We tested 3 candidate formal models as hypotheses of how people judge which of two players has more power across 30 different economic games: Relative Expected Utility (the difference in expected resources), Relative Control over Resources (difference in control over the other player's resources) and Relative Choice (the difference in the amount of options each player can choose from). Our results across 3 human experiments reveal that human power judgments are best captured by combining Relative Expected Utility and Relative Choice models as joint predictors. This finding suggests that people perceive social power by considering not only who is expected to achieve their desired outcomes but also the extent of control each person holds within their environment.

Change blindness and cross-linguistic spatial relationships: potential effects of language on attention

Languages vary in how they categorize spatial relationships, yet the extent to which these linguistic distinctions shape cognition remains unclear. Using a change blindness paradigm, we examined whether linguistic categories affect change detection or whether certain spatial changes are universally more visually salient. In two experiments we presented images with changes in spatial relations that varied as a function of distinctions made in the languages tested (Experiment 1) and the extent to which the changes were within the same spatial relation category or between spatial categories (and physically 'possible' or impossible; Experiments 1 and 2). In Experiment 1 (English speakers) there was limited evidence for perceptual 'syntactic' spatial violation as a predictor of detection. In Experiment 2 (English, Dutch, Spanish, Japanese participants) we tested if cross-linguistic differences in spatial categorization influence change detection. While no systematic effects of linguistic categorization were found, results suggest that changes between categories of spatial relations are detected faster. Our results also highlight the importance of considering individual language use when investigating the effects of language on cognition.

Noisy template matching: A mechanistic model of approximate number perception

The approximate number system (ANS) enables humans to rapidly estimate numerical quantities without relying on counting, yet the computational processes underlying the ANS remain mysterious. Here, we propose a mechanistic model for the ANS, built on the core idea that when presented with a stimulus array, observers first form a template representation through ensemble averaging and subsequently engage in a template-matching process, whereby each array item is compared to the template. With a limited set of plausible assumptions about the inherently noisy computational processes, our model naturally accounts for Weber's law, the main hallmark of number psychophysics, systematic numerosity underestimation and the coherence illusion. We further test two novel predictions about stimulus factors that modulate the strength of the coherence illusion and demonstrate high fidelity between predicted and obtained data. This model offers new insight into the computational mechanisms by which ANS representations are derived from visual input.
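
To make the proposed two-stage pipeline concrete, here is a toy Python simulation of ensemble averaging followed by noisy template matching; it illustrates only the structure of the account, not the specific noise assumptions from which the authors derive Weber's law, underestimation, and the coherence illusion.

import numpy as np

rng = np.random.default_rng(0)

def template_matching_estimate(items, encoding_noise=0.15, match_noise=0.2,
                               criterion=0.5):
    """Two-stage sketch: ensemble-average a template, then match items to it.

    items: array of feature values (e.g., dot sizes) in the display.
    Each item is noisily encoded; the template is their average; an item is
    counted when its noisy match signal exceeds the criterion.
    """
    encoded = items + rng.normal(0.0, encoding_noise, items.shape)
    template = encoded.mean()                            # ensemble-averaged template
    similarity = np.exp(-np.abs(encoded - template))     # match of each item to template
    detected = similarity + rng.normal(0.0, match_noise, items.shape) > criterion
    return detected.sum()

display = rng.uniform(0.8, 1.2, size=20)                 # 20 roughly similar items
estimates = [template_matching_estimate(display) for _ in range(100)]
print(f"true numerosity = 20, mean estimate = {np.mean(estimates):.1f}")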

GPT-4o Lacks Core Features of Theory of Mind

Do Large Language Models (LLMs) possess a Theory of Mind (ToM)? Research into this question has found that LLMs succeed on a range of benchmark tasks. However, these evaluations do not test for the actual representations posited by ToM: namely, a causal model of mental states and behavior. Here, we use a cognitively-grounded definition of ToM to develop and test a new evaluation framework. Specifically, our approach probes whether LLMs have a coherent, abstract, and consistent model of how mental states cause behavior—regardless of whether that model matches a human-like ToM. We test our evaluation against GPT-4o and find that even though it succeeds in approximating human judgments in a simple ToM paradigm, GPT-4o fails at a logically-equivalent task and exhibits low consistency between its action predictions and corresponding mental state inferences. As such, these findings suggest that GPT-4o's social proficiency is not the result of a ToM.

Prosody in the Age of AI: Insights from Large Speech Models

Prosody affects how people produce and understand language, yet studies of how it does so have been hindered by the lack of efficient tools for analyzing prosodic stress. We fine-tune OpenAI Whisper large-v2, a state-of-the-art speech recognition model, to recognize phrasal, lexical, and contrastive stress using a small, carefully annotated dataset. Our results show that Whisper can learn distinct, gender-specific stress patterns to achieve near-human and super-human accuracy in stress classification and transfer its learning from one type of stress to another, surpassing traditional machine learning models. Furthermore, we explore how acoustic context influences its performance and propose a novel black-box evaluation method for characterizing the decision boundaries used by Whisper for prosodic stress interpretation. These findings open new avenues for large-scale, automated prosody research with implications for linguistic theory and speech processing.

Children use both controllability and variability for generalization

Humans build causal models to navigate their environments, act effectively, and pursue goals. Prior work has examined causal controllability and variability separately, showing that even young children are capable causal learners who seek novelty, surprise, and confounded evidence. However, it remains unclear whether they prioritize controllability and variability when both are available. We presented children (ages 5–10) and adults with three virtual machines: one offering controllability without variability, one offering variability without controllability, and one combining both properties through systematic input-output relationships. Across age groups, participants overwhelmingly preferred the machine with both controllability and variability when asked to perform various new tasks, generalizing and applying its abstract functional structure to different inputs and modalities. For further details, please refer to our Philosophical Transactions A paper titled "Empowerment Gain and Causal Model Construction: Children and adults are sensitive to controllability and variability in their causal interventions."

Spatial separation impedes encoding of the whole

Proportional reasoning is a ubiquitous part of human experience. Despite this ubiquity, proportion appears to be more difficult to think about in some contexts than others. Specifically, people seem to struggle more with proportional information represented by visually separated parts versus parts integrated into a whole. Why is this? One possibility is that spatial separation deters people from treating the whole as a singular unit that they can use in reasoning about a proportion. Here, we report the results of a study that tested this hypothesis by probing participants' ensemble perception of the wholes that they are exposed to in a concurrent proportion comparison task. Study results support the hypothesis that spatial separation impedes encoding of the whole.

Interjections as Tools for Sharing Mental States

Humans are intuitive mindreaders. We use our Theory of Mind to infer other people's mental states based on how they behave. Yet, humans are also motivated to ensure that others can infer their mental states easily and accurately. However, to act on this motivation, we must have tools to help others efficiently understand our minds, particularly when our behavior could be misunderstood. We propose that interjections—simple vocalizations like oh, oops, and ew—are an important set of linguistic devices designed to reveal our mental states quickly and efficiently. We provide initial evidence for this account, showing that people believe that interjections ought to be used as if they were designed to broadcast mental states and that people spontaneously produce these interjections significantly more often in the presence of an observer. Our work sheds light on how humans are not only proficient mindreaders, but may also be adept mindsharers.

Abstracts with Poster Presentation (accepted as Abstracts)

Modulating categorization skills: The impact of transcranial Direct Current Stimulation (tDCS) on the Prototype Effect

We present two studies utilizing tDCS to investigate the impact of anodal stimulation at the Fp3 site on categorization learning indexed by the prototype effect. This phenomenon is characterized by superior categorization for unseen category prototypes compared to both seen and unseen category exemplars. In our double-blind experimental design, participants were randomly assigned to one of two groups: anodal tDCS or sham/control. In Experiment 1a, we observed a pronounced prototype effect in sham/control, demonstrating significantly enhanced categorization performance for unseen category prototypes over 'old' (previously seen) exemplars. Critically, the application of anodal tDCS diminished this effect, hindering performance on prototype stimuli. Experiment 1b provided further validation of this finding, indicating that anodal tDCS disrupts the prototype effect concerning old exemplars. Interestingly, this significant reduction in the prototype effect was not replicated with 'new' (unseen) category exemplars. We contextualize our results within the framework of the tDCS and perceptual learning literature.

An Enactivist Interpretation of Aphantasia

Aphantasia is a phenomenon in which the aphantasic subject fails to voluntarily conduct sensory imagination. That is to say, the mental images aphantasic subjects generate during sensory imagination are much less vivid than what typical individuals report being able to generate. This paper proposes an enactivist interpretation of aphantasia, suggesting that aphantasia results from a disrupted balance between top-down processing by brain activity and bottom-up processing by body-environment interaction, which leads to: 1) the conceptual content emerging in body-environment interaction overwhelming the phenomenal characteristics of the mental image generated within the brain, and 2) the "phenomenon absence" of the generated mental image signified by real-time body-environment interaction becoming so profound that the mental image generated in sensory imagination fails to be conceptualized as vivid from the first-person perspective.

A Computational Model of Chinese Word Processing

Chinese has many distinct characteristics that differentiate it from alphabetic languages, making it necessary to develop specific theories and models for its study. Previous research on Chinese lacks systematic computational models for lexical and semantic processing. To address this gap, we built a computational model that simulates the processing of Chinese words presented in isolation. The model can process both single-character and multi-character words. Moreover, it can simulate orthographic, phonological, and semantic processing of words, as well as their interactions. These research findings will help clarify the cognitive mechanisms of Chinese reading and deepen our understanding of human language processing. The established model can guide experimental research and is of considerable theoretical significance.

The Oscillatory Dynamics of Narrative Structure-Building

While many theoretical models describe the process of structure building during the construction of a mental model of a story, neurophysiological explorations of this process are limited. Here, we use time-frequency analysis of EEG data to explore the oscillatory dynamics associated with narrative comprehension of comics. Using an existing dataset wherein participants viewed each of six comic panels serially, we performed spectral decomposition from theta to gamma bands over the full extent of narrative processing (10+ seconds). Power incrementally decreased in both alpha (8–12 Hz) and low beta (12.5–20 Hz) frequency bands as narratives unfolded. These results are contextualized in the attentional literature, where some suggest that alpha and low beta frequency bands act as suppression and enhancement mechanisms to modulate attention. This model also aligns with theoretical models of narrative structure building. Study findings are consistent with changes in alpha and low beta power reflecting domain-general narrative structure building processes during discourse comprehension.

Long-Term Cognitive Trajectory Prediction in a Chinese Cohort of Middle-Aged and Older Adults Using Causal Machine Learning

As global life expectancy increases, cognitive impairment and dementia are becoming increasingly common. With limited effective treatments available, early identification of cognitive decline markers is essential for timely intervention. While many studies focus on predicting cognitive impairment, forecasting the trajectory of cognitive development offers greater foresight, enabling interventions even before diagnostic thresholds are reached. This study identifies 10 key determinants of cognitive trajectories and introduces a novel 2-stage model, CoTTA (Cognitive Trajectory Tracking Algorithm). CoTTA integrates causal inference with predictive modeling to forecast cognitive trajectories using data from 9,345 participants aged 50–80 at baseline from CHARLS, a nationally representative sample of middle-aged and older Chinese adults. By leveraging causal features, CoTTA predicts the risk of consistently low cognitive function over an 8-year period, outperforming baseline models, particularly in recall and F1 score. This approach offers a scalable solution for early intervention and long-term cognitive health management.

CoEmo: Modeling Cognitive Processes in Facial Expression Recognition through Action Units and Gender Perspectives

Facial expression recognition lies at the intersection of computer science and cognitive psychology, yet the cognitive structure underlying facial action unit (AU) and emotion processing remains unclear. Are AUs and emotions processed in parallel or sequentially? Does gender influence this process? We constructed a 3D face dataset annotated with AU amplitudes and emotion labels. To model cognitive processing hypotheses, we implemented parallel and sequential architectures via multi-task learning and pipelined CNNs. Gender-specific models were compared using representational similarity analysis (RSA) with theoretical emotion spaces. The parallel model (F1 = 42.9%) outperformed the sequential one (F1 = 17.1%), supporting the parallel processing hypothesis. RSA revealed that females' emotion recognition aligned with social distance between emotions, while males' performance was selectively influenced by anger representations. These findings suggest sex-specific representational structures in emotion processing and support parallelism as a plausible cognitive mechanism in facial expression recognition.
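
Representational similarity analysis of the kind used here can be sketched in a few lines: build a representational dissimilarity matrix (RDM) from each representation and correlate their condition-pair distances. The inputs below are random stand-ins for the paper's model activations and theoretical emotion-space coordinates.

import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rsa_score(model_embeddings, theoretical_coords):
    """Correlate a model RDM with a theoretical-space RDM (Spearman).

    Each input is a (conditions x features) array; pdist returns the
    upper triangle of the pairwise dissimilarity matrix.
    """
    rdm_model = pdist(model_embeddings, metric="correlation")
    rdm_theory = pdist(theoretical_coords, metric="euclidean")
    rho, p = spearmanr(rdm_model, rdm_theory)
    return rho, p

rng = np.random.default_rng(0)
model_emb = rng.normal(size=(6, 32))   # e.g., 6 emotions x 32 model features (illustrative)
theory = rng.normal(size=(6, 2))       # e.g., coordinates in a theoretical emotion space
print(rsa_score(model_emb, theory))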

Measuring Belief Expectancy Violations in Psychotherapy with EEG

In psychological counseling, how individuals respond to statements that align with or contradict their personal beliefs can significantly influence therapeutic engagement and outcomes. While prior research has investigated belief processing in decision-making and judgment, its neural underpinnings in therapeutic contexts remain underexplored. This study addresses this gap by using electroencephalography (EEG) to examine the neural correlates of belief expectancy violations, focusing on the N400 event-related potential (ERP) component. In an experimental paradigm simulating counselor–patient interactions, participants were presented with statements that either confirmed or contradicted their preexisting beliefs. Our results show that belief-incongruent statements elicited significantly larger N400 amplitudes compared to belief-congruent ones. These findings suggest that the N400 may serve as a neural marker of belief expectancy violations in counseling-relevant contexts. This work advances our understanding of the cognitive and neural mechanisms underlying belief processing in psychotherapy and highlights the potential of EEG-based measures in informing future psychotherapeutic approaches.

Agency vs. Guan: The role of parental beliefs in shaping children's agency

Agency is an important yet elusive concept in cognitive development. In China, parents commonly hold the traditional belief of Guan, which literally means control but aims to cultivate agency—being self-disciplined and proactive in learning. To understand how these two opposite notions coexist, Study 1 developed a questionnaire to assess both in parents and revealed a U-shaped relationship: moderate Guan beliefs align with the lowest agency beliefs, while high and low Guan beliefs are associated with greater agency beliefs. Study 2 used a vignette-based task to test how these beliefs shape children's perceived agency in decision-making. In learning contexts, control-oriented Guan belief negatively predicts children's perceived agency, while agency beliefs positively predict it. Additionally, children's perception of agency in the low-agency scenario predicts their strategic responses to maternal control. These findings highlight children's active meaning-making role and suggest that shifting from control to agency beliefs may better support children's agency development. Keywords: children's agency; Guan beliefs; agency beliefs; Chinese parenting; daily decision-making

Exploring the Impact of Cognitive and Sensorimotor Activity on Arousal in an Embodied Learning Environment

Embodied cognition theory posits cognition as fundamentally situated within and enacted through the affective and physical body and environment. Grounded in this theoretical perspective, we investigate learners' fluctuations in arousal state in the context of a math pedagogical tool, Balance Board Math, that invites children to explore concepts through bodily movement on rockable boards. Balance Board Math's design invites bodily movement as both a means to explore mathematical concepts and a means to provide affectively regulating sensory input to the vestibular (balance) sense. In this pilot analysis, we explore data collected with electrodermal activity wristbands (N = 9017 from 6 participants) to examine how arousal states varied in relation to learners' cognitive-affective-physical activity as they explored and learned concepts through Balance Board Math movement-based activities. Using a mixed-effects regression model, we analyze how bodily rocking movements with different situated meanings within learners' problem-solving, as well as reaching fluency within each activity, affected their arousal. We found that rocking movements undertaken to generate graphs had the opposite impact on arousal from non-instrumental rocking movements, and that reaching fluency with enacting and explaining solutions to movement-based math problems was associated with reduced arousal. These findings highlight the interplay of cognitive and physical drivers of arousal regulation in embodied learning environments.

Learning Trajectories and Generalization: Converging markers for Rule- and Similarity-based Processes in Grammar Learning?

In several cognitive tasks involving categorization, function learning, or artificial grammar learning, judgements can be made on the basis of rules, i.e. abstract and general information about structure, or on the similarity of current instances to previously encountered instances. In this paper, we examine two behavioural markers for rule- and similarity-based processes in language learning: learning trajectories and generalisation performance. A rule-based process is expected to generate a sudden increase in performance after the correct rule is acquired. A similarity-based process is expected to generate a gradually increasing trajectory. Generalisation is operationalized as the difference in performance between test items that were encountered in training and new items. Performance under a rule-based process is expected to be the same for both types, as an extracted rule can be applied equally well. Better performance for old instances compared to new ones implies a similarity-based process. We utilise data from a study by Menks et al. (2022) in which participants were implicitly exposed to grammatical rules of Icelandic and performed grammaticality judgement tasks across several sessions. We apply a Bayesian latent mixture model to analyse individual trajectories and classify them as step-wise or gradual. We find individual differences in how performance changes over time, as well as differences in generalisation performance. However, the underlying processes suggested by participants' learning trajectories and by their generalisation performance do not converge: the form of an individual's learning trajectory was not related to their generalisation ability. Future research is required to test whether a step-wise or gradual learning trajectory is exclusive to a rule-based or similarity-based process.

Happy Faces, Faster Stops: The Cognitive Benefits of Dance in Emotional Contexts

Dance is more than physical exercise; it integrates cognitive, emotional, and motor skills. This study investigated the role of dance in modulating response inhibition in emotional and non-emotional contexts. We compared dancers (N = 15) and non-dancers (N = 21) on two response inhibition tasks: the non-emotional stop-signal task (NESST) and the emotional stop-signal task (ESST), which examined inhibition in the presence of emotional distractors. Inhibitory control was similar between dancers and non-dancers in the non-emotional stop-signal task. However, a significant interaction between group and emotion was observed in the ESST, which may indicate that irrelevant emotional information modulates inhibitory control differently in the two groups. More specifically, stop signals with irrelevant happy faces (compared to angry and neutral faces) facilitated inhibitory control in dancers only. These findings suggest that dance training is associated with enhanced cognitive control in emotionally salient contexts, particularly when processing positive emotional stimuli.

Promoting Actions to Conserve Biodiversity: A Cognitive Constraints Approach

We demonstrate that inducing the construction of a coherent, biodiversity-conserving moral narrative about one's place in the world can have a lasting impact on pro-biodiversity behaviors. Across two studies (n = 447 and n = 509), one-time under-40-minute interventions leveraging two basic cognitive constraints — coherence and causal invariance — led to increased intentions to take biodiversity-conserving actions (Phase 1) and subsequent self-reports of engagement in these actions assessed a year later (Study 2 Phase 2, n = 344). This sustained impact contrasts sharply with the typically short-lived (< 2 weeks) effects of pro-environmental messaging. Participants completed exercises implementing the constraints to foster an expanded sense of self. Results show that the expanded self (e.g., agreement with "I imagine myself to be part of a larger cyclical process of living") mediated reports of engagement in biodiversity-supporting actions (e.g., donating to biodiversity organizations). These effects held across political ideologies, suggesting the approach's broader applicability to other persuasion topics.

The Effect of Task Features on Children's Number Ordering Performance

The development of numerical cognition is a critical foundation for children's later mathematics performance, with number ordering as a key predictor of advanced competence. This study examines how children's number ordering performance varies by number adjacency and size. A sample of 104 kindergartners arranged triplets of numbers in ascending order, with trials featuring adjacent or non-adjacent and small (1–10) or large numbers. Logistic regressions showed that adjacency, size, and their interaction significantly predicted performance, ps<.05. Children were more accurate with adjacent versus non-adjacent numbers and small versus large numbers. The interaction revealed the adjacency effect was driven by a higher accuracy on small adjacent versus non-adjacent numbers (p<.001); response accuracy was similar for large adjacent versus non-adjacent numbers (p=.27). These results, significant after controlling for verbal counting, ps<.001, underscore the importance of number adjacency and size in designing tasks to measure numerical skills.

Benchmarking LLMs for Mimicking Child-Caregiver Language in Interaction

Child-directed speech (CDS) is characterized by its adaptive nature: Caregivers not only talk to children, but engage in dynamic interactions with them. The adaptive/interactive nature of this type of language is understudied in computational modeling research, particularly given the limited availability of naturalistic data. While recent advances in large language models (LLMs) have demonstrated potential for generating viable synthetic dialogue data in various domains, their ability to capture the dynamics of child-caregiver communication remains unexplored. This paper introduces a systematic framework for evaluating LLMs' capacity to generate developmentally appropriate CDS in interaction, examining both static linguistic features and dynamic conversational patterns. We evaluated state-of-the-art LLMs (GPT-4o and Llama 3) against natural interactions from the CHILDES dataset using both single- and multi-turn testing approaches. In single-turn evaluation, models generated responses to individual child utterances, enabling direct comparison with actual caregiver responses. Multi-turn testing assessed sustained interaction capabilities through simulated child-caregiver dialogues. Our results show that while LLMs can successfully approximate surface-level linguistic patterns after few-shot prompting, they struggle with higher-level communicative aspects, with excessive alignment and reduced diversity compared to natural interactions. Our benchmarking framework elucidates both the potential and limitations of LLMs in generating data that preserves the essential properties of child-caregiver language in interactions.

Enhancing Educator Support in MOOC Forums: A Multi-Task Model for Detecting Learning Confusion

Despite the popularity of MOOCs, only a small percentage of course participants complete the course. Learners' confusion is one of the factors that impacts the overall learning process and ultimately leads to course attrition. However, the extensive exchange of reviews often creates chaos, and 'confusion' posts can easily be overlooked. To this end, we create a labeled dataset for post-type classification and use the public Stanford MOOCPosts dataset to detect learning confusion. We propose a hierarchical multi-task learning framework that combines post-type and confusion-degree classification using fine-tuned BERT models and virtual adversarial training. Our model achieves an accuracy of 89% and an F1 score of 88%. Additionally, we integrate interpretability techniques to further enhance model transparency. This framework equips instructors with tools to identify and address learning confusion effectively.

Augmented Proof: Examining Structures to Support Geometric Proof Comprehension

To understand even a modest geometric proof, students must process an interwoven combination of symbolic, diagrammatic, geometric, and logical information. This information density presents a daunting management task that students are known to perform poorly on. To address this challenge, we propose a two-column proof interface that structures the information management task according to diagram configuration schemas (DCS). We evaluated our design by comparing secondary school students' performance on proof comprehension tasks using the DCS-augmented interface with their performance using a typical two-column proof. Students using the DCS-augmented interface demonstrated improved overall reasoning and accuracy in geometric proof tasks compared to the traditional two-column format. They were also significantly better at identifying and correcting mistakes in proofs. These results suggest that managing complex information by integrating it in a coherent schema, such as DCS, can support student understanding of proof.

Unified Fusion Network Model for EEG Signals

Advancements in brain-computer interface (BCI) technology emphasize the need to understand brain signals in emotions and social interactions. Electroencephalograms (EEG) are essential for analyzing brain activity and diagnosing neurological disorders but suffer from low signal-to-noise ratios and high noise levels, hindering accurate interpretation. To address this, we propose a unified information-theoretic framework for optimizing EEG signal representation and fusion learning. Guided by this framework, we developed EEG-FNN, a Fusion Neural Network model that integrates raw EEG data with Gramian Angular Field (GAF) transformations through an innovative attention-driven fusion technique. This approach captures diverse neural activity patterns, significantly improving the ability to distinguish neurological states. Experimental validation on short-duration BCI and long-duration clinical datasets demonstrates that EEG-FNN outperforms existing methods, achieving higher accuracy and robustness, thus confirming its potential as a reliable tool for EEG analysis.
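
The Gramian Angular Field transform mentioned above is a standard way of turning a 1-D time series into an image-like 2-D input; a minimal version is sketched below. The paper's attention-driven fusion of this representation with raw EEG is not reproduced here, and the toy signal is only a stand-in for a real EEG epoch.

import numpy as np

def gramian_angular_field(signal, eps=1e-8):
    """Gramian Angular (Summation) Field of a 1-D signal.

    Rescales the signal to [-1, 1], maps samples to angles via arccos, and
    returns the matrix cos(phi_i + phi_j).
    """
    x = np.asarray(signal, dtype=float)
    x = 2 * (x - x.min()) / (x.max() - x.min() + eps) - 1    # rescale to [-1, 1]
    x = np.clip(x, -1.0, 1.0)
    phi = np.arccos(x)                                       # polar encoding
    return np.cos(phi[:, None] + phi[None, :])               # GASF matrix

# Toy EEG-like segment: the transform yields an image-like 2-D input for a CNN.
t = np.linspace(0, 1, 128)
epoch = np.sin(2 * np.pi * 10 * t) + 0.1 * np.random.default_rng(0).standard_normal(128)
print(gramian_angular_field(epoch).shape)   # (128, 128)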

Priming Effects on Verbal Analogy Performance Across IQ Levels in Humans and AI

It is well known that verbal analogy (VA) performance correlates with IQ; we hypothesize high IQ improves VA performance due to more comprehensive relation representation, and therefore priming with congruent (incongruent) relations should especially benefit (hurt) lower-IQ individuals. We find a significant overall priming effect (p<0.001), IQ effect (p<0.001), and a quadratic IQ effect on priming (p=0.012): mid-IQ shows greater priming effect than high-IQ, but low-IQ participants show no effect – due to lower priming accuracy (p<0.001) and the positive contribution of priming accuracy to priming effect (p=0.04). Separately, LLMs show similar priming and IQ effects (when prompted to be low- or high-IQ), with a change in error pattern between congruent and incongruent priming just like humans, suggesting possibly similar mechanisms. Overall, these findings deepen our understanding of the role of relation representation in VA and the influence of IQ.

Are You Sure About That? The Impact of Semantic Relatedness on Learning Through Testing, JOLs, and Passive Restudy

Recent work has shown that producing memory ratings during study may lead to greater retention than practice testing in some circumstances (Higham, 2023). This may be related to a phenomenon called judgment of learning (JOL) reactivity, in which making immediate JOLs during study can enhance later recall. However, JOLs and testing have not been directly compared in a typical testing effect (TE) paradigm. This study compared passive restudy, study with immediate JOLs, and testing in a TE paradigm. In Experiment 1, we found no clear TE and only tentative JOL reactivity when word pairs were not semantically related. In Experiment 2, the associative strength of the word pairs was increased. A robust TE emerged along with weak JOL reactivity. Importantly, testing significantly outperformed JOL and passive restudy. These findings are among the first to suggest that semantic relatedness is crucial for the TE and clarify how JOLs compare to testing.

Cognitive Representation in Large Language Models: Formalizing Psychological Constructs for Automated Questionnaire Generation

Understanding how psychological constructs are formalized into measurable representations remains a central challenge in cognitive science. This study examines large language models (LLMs) as cognitive artifacts capable of decomposing and formalizing such constructs. We evaluate LLMs' conceptual alignment with human cognition across three tasks: predicting psychological attributes from text, generating theoretical construct explanations, and creating psychometrically valid assessment items. Results show strong representational fidelity for personality traits (average r = 0.582), moderate alignment for depression (r = 0.515), and emerging capabilities for anxiety (r = 0.259). Expert evaluations confirm that LLMs produce theoretically coherent construct explanations (mean rating = 4.33/5), while generated questionnaires exhibit convergent validity with established measures. Our findings position LLMs as effective tools for exploring cognitive representation systems and suggest that theory-driven prompting enables reliable, automated questionnaire development. This work bridges AI formalization techniques with core questions in cognitive science, offering a scalable framework for studying and constructing psychological measurement tools.

Probing Mechanical Reasoning in Large Vision Language Models

Mechanical reasoning is a hallmark of human intelligence, defined by its ubiquitous yet irreplaceable role in human activities ranging from routine tasks to civil engineering. Embedding machines with mechanical reasoning is therefore an important step towards building human-level artificial intelligence. Here, we leveraged 155 cognitive experiments to test the understanding of system stability, gears and pulley systems, the leverage principle, inertia and motion, and fluid mechanics in 26 vision language models. Results indicate that VLMs consistently perform worse than humans on all domains, demonstrating particular difficulty in reasoning about gear systems and fluid mechanics. Notably, their performance on these tasks does not improve as the number of parameters increases, suggesting that current attention-based architectures may fail to grasp certain underlying mechanisms required for mechanical reasoning, particularly those pertaining to mental simulation.

Curiosity as a Morphism of Interpretation

Curiosity is often viewed as information-seeking to fill knowledge gaps arising from expectation violations, typically measured statistically. However, natural systems can be represented as binary relational graphs consisting of entity types, instances, and collections, all of which must maintain relational consistency to ensure a coherent system. This consistency is assured through closure, making its interpretation central to behaviors such as curiosity. When curiosity is defined solely in statistical terms, it fails to guarantee relational consistency - a critical limitation. In contrast, interpreting the three components through closure offers a robust mechanism for verifying relational coherence. Closure is achieved when all relations are in agreement; any deviation signals inconsistency. In the absence of closure, agents seek to achieve closure by bringing in evidentiary support from memory and reinterpreting the relational structure. In this view, curiosity signals act by morphing interpretation to achieve relational closure, enabling the acquisition of coherent knowledge.

Sparse distributed memory constraints drive representational change as a function of temporal learning sequence

Prior work suggests that different learning sequences—blocking (spacing out overlapping information) vs. interleaving (intermixing related content)—bias memory representations toward integration or separation (e.g., overlapping or distinct representations) to support different functions. However, findings on how sequences influence memory representations remain inconsistent. We propose that individual differences in memory capacity, encoding style, and their interaction govern the balance between memory integration and separation. Using feedforward neural networks, we modeled inference performance while varying memory capacity and encoding sparsity versus distributedness. We find that blocked training promotes integration when memory capacity is low, while interleaved training enhances integration when capacity is high. Sparse representations benefit from blocked schedules by orthogonalizing related information, whereas distributed representations favor interleaved schedules that promote overlap and integration. These results highlight the critical role of individual differences in memory capacity and encoding constraints in shaping the effects of training sequences on memory representations.
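
To make the capacity and sparsity manipulation concrete, here is a minimal numpy sketch, not the authors' model: two overlapping items are passed through a random feedforward layer whose size stands in for memory capacity, and a top-k constraint stands in for sparse encoding, with representational overlap read off as cosine similarity. The layer sizes, the top-k rule, and the overlap measure are all illustrative assumptions, and the training-schedule manipulation itself is not implemented.

```python
import numpy as np

rng = np.random.default_rng(0)

def hidden_rep(x, W, k=None):
    """Project an input through a random feedforward layer.
    If k is given, keep only the k strongest units (sparse encoding);
    otherwise keep all units (distributed encoding)."""
    h = np.tanh(W @ x)
    if k is not None:
        weakest = np.argsort(h)[:-k]   # indices of all but the top-k units
        h = h.copy()
        h[weakest] = 0.0
    return h

def overlap(a, b):
    """Cosine similarity as a crude proxy for representational integration."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

# Two related items: shared features plus item-specific features.
shared = rng.normal(size=20)
item_a = np.concatenate([shared, rng.normal(size=10), np.zeros(10)])
item_b = np.concatenate([shared, np.zeros(10), rng.normal(size=10)])

for capacity in (16, 128):                     # hypothetical low vs. high memory capacity
    W = rng.normal(size=(capacity, item_a.size)) / np.sqrt(item_a.size)
    dist = overlap(hidden_rep(item_a, W), hidden_rep(item_b, W))
    sparse = overlap(hidden_rep(item_a, W, k=max(2, capacity // 8)),
                     hidden_rep(item_b, W, k=max(2, capacity // 8)))
    print(f"capacity={capacity:3d}  distributed overlap={dist:.2f}  sparse overlap={sparse:.2f}")
```

As the abstract suggests, the sparse code tends to orthogonalize the two related items, while the distributed code preserves their overlap.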

Empathy or Utility: Children's Reasoning for Moral Consideration of AI

Do children believe AI deserves to be treated as morally as humans, and if so, why? In this study, eighty children aged four to eight were randomly assigned to interact with a human or AI partner in a naturalistic storytelling activity, after which their moral considerations and rationale were elicited. The results indicated that although children believed they should treat both AI and humans morally, the motives behind their beliefs differed. Children's moral considerations for humans were driven by empathy for others' needs and feelings, whereas their considerations for AI were motivated by the desire to preserve its functional utility. These motives were corroborated by children's denial of AI's capacity to have feelings or basic physical needs. This study contributes to the literature on children's moral development by demonstrating that while their moral reasoning extends to non-human entities, their justifications reflect a distinct, domain-specific understanding rather than an anthropocentric confusion.

How the logic of bargaining shapes moral judgments about resource divisions

On recent contractualist accounts of moral cognition, moral judgments should coincide with what rational agents would agree to in a negotiation, accounting for their relative bargaining positions. But past research documents widespread egalitarian moral intuitions; impartiality may also require abstracting away from power asymmetries. How can these perspectives be reconciled? We suggest a key difference lies in whether the logic of bargaining drives the interaction, turning existing asymmetries into bargaining power differences. In Study 1, two parties engage in a take-it-or-leave-it negotiation. In Study 2, they can trade with a third party. In both cases, third-party moral judgments about the morally best split of a fixed amount overwhelmingly favor the advantaged party. These judgments can be precisely predicted using classic models from bargaining theory. By contrast, moral intuitions are completely reversed—reflecting redistributive or egalitarian concerns—in a donation setting where the logic of bargaining does not apply.

Can LLMs model meaning in restorying interventions?

While cognitive science has made great progress in modeling a range of psychological phenomena, the processes underlying how people interpret the meaning of their own experiences have mostly resisted formalization. In this paper, we explore a method for using large language models (LLMs) to simulate the effects of this interpretive process. We compare our LLM-based simulations to extant data on restorying interventions, which show that people are more likely to endorse their life stories as meaningful after being prompted to reflect on how their experiences fit into the narrative structure of a hero's journey. Across three simulations, we show that (1) LLMs are capable of modeling the effects of these restorying interventions, (2) they are sensitive to signals from restorying interventions other than the hero's journey, and (3) this pattern of results is broadly—though not entirely—consistent across several different LLMs. Ultimately, these simulations point towards how LLM-based computational models might generate novel predictions about the effects of restorying interventions on meaning in human participants.

MEKiT: Multi-source Heterogeneous Knowledge Injection Method via Instruction-Tuning for Emotion-Cause Pair Extraction

Although large language models (LLMs) excel in text comprehension and generation, their performance on the Emotion-Cause Pair Extraction (ECPE) task, which requires reasoning ability, often falls short of smaller language models. The main reason is the lack of auxiliary knowledge, which limits LLMs' ability to effectively perceive emotions and reason about causes. To address this issue, we propose a novel Multi-source hEterogeneous Knowledge injection meThod, MEKiT, which integrates heterogeneous internal emotional knowledge and external causal knowledge. Specifically, for these two distinct aspects and structures of knowledge, we apply the approaches of incorporating instruction templates and mixing data for instruction-tuning, which respectively facilitate LLMs in more comprehensively identifying emotions and more accurately reasoning about causes. Experimental results demonstrate that MEKiT provides a more effective and adaptable solution for the ECPE task, exhibiting an absolute performance advantage over compared baselines and dramatically improving the performance of LLMs on the ECPE task.
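
As a purely schematic illustration of what injecting heterogeneous knowledge through an instruction template could look like (field names and wording are our own, not MEKiT's actual templates or data format):

```python
# Schematic only: fold internal emotional knowledge and external causal knowledge
# into an instruction-tuning example for emotion-cause pair extraction.
def build_example(document, emotion_knowledge, causal_knowledge):
    instruction = (
        "Identify every (emotion clause, cause clause) pair in the document.\n"
        f"Internal emotional knowledge: {emotion_knowledge}\n"
        f"External causal knowledge: {causal_knowledge}\n"
        f"Document: {document}"
    )
    return {"instruction": instruction, "output": ""}   # output filled in from gold labels

example = build_example(
    document="I finally passed the exam, so I was thrilled all evening.",
    emotion_knowledge="'thrilled' expresses joy",
    causal_knowledge="passing an exam commonly causes joy",
)
print(example["instruction"])
```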

A Brain-Inspired Multimodal Sentiment Analysis Framework via Rationale-Guided Representation

Multimodal sentiment analysis (MSA) recognizes human sentiments from various data modalities. Existing works primarily focus on efficient feature extractors for each modality or on multimodal fusion frameworks. However, they do not exploit the sentiment-related prior knowledge in the human brain, which limits their performance when sentiment cues are implicit and ambiguous. To address this problem, we propose a rationale-guided prompt learning optimization framework (RaPo) inspired by the sentiment chains of biological brains. Specifically, we adopt a chain-of-thought prompt to analyze images with a large visual-language model, generating the corresponding contextual captions and rationales, which are then combined to derive the sentiment prompt. Finally, the prompt is used to optimize the RaPo framework through the designed rationale-guided prompt tuning. Experiments on several MSA tasks consistently show that RaPo outperforms several state-of-the-art methods, with an average increase in accuracy of 2.8%.

Relationship between comprehension and blink synchronization

Blink synchronization may reflect a person's internal states. An interested listener unconsciously synchronizes their blinks to a speaker; however, it is unclear whether this is related to their comprehension of what is being said. This study examined the association between listeners' blink synchronization with a speaker and their interest, comprehension, and empathy. Participants viewed a video clip explaining a research topic, as a simulation of actual communication that requires comprehension, and their blinking patterns were measured using web cameras. Aspects of their internal states were assessed using a questionnaire. The findings revealed that participants with high comprehension synchronized their blinks to the speaker with a delay of a few hundred milliseconds. Additionally, levels of comprehension were significantly correlated with blink synchronization among the listeners. Therefore, blink synchronization may reflect comprehension levels in situations that require comprehension. Communication could be improved by utilizing these insights.

A Cognitive Framework for Timely AI Communication

AI systems and technologies that can interact with humans in real time face a communication dilemma: when to offer assistance and how frequently. Overly frequent or contextually redundant assistance can cause users to disengage, undermining the long-term benefits of AI assistance. We introduce a cognitive modeling framework based on Partially Observable Markov Decision Processes (POMDPs) that addresses this timing challenge by inferring a user's latent cognitive state related to AI engagement. A key component is counterfactual reasoning: the AI considers how well the user would perform independently and weighs the potential boost in performance against the risk of diminishing engagement with the AI. This adaptive strategy outperforms baseline policies where assistance is always provided or never provided. Our results highlight the importance of balancing short-term decision accuracy with sustained user engagement, showing how communication strategies can be optimized to avoid alert fatigue while preserving the user's receptiveness to AI guidance.
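
A minimal one-step sketch of the counterfactual trade-off described above, not the full POMDP: the assistant compares the expected accuracy gain from intervening against the expected engagement cost of another alert. All parameter names and values are illustrative assumptions.

```python
def should_assist(p_correct_alone, p_correct_with_help, belief_engaged,
                  engagement_drop_per_alert=0.1, value_of_correct=1.0,
                  value_of_engagement=2.0):
    """Intervene only if the counterfactual accuracy gain outweighs the
    expected cost of eroding the user's engagement with the assistant."""
    accuracy_gain = value_of_correct * (p_correct_with_help - p_correct_alone)
    engagement_cost = value_of_engagement * engagement_drop_per_alert * belief_engaged
    return accuracy_gain > engagement_cost

# The user is likely to err alone, so assistance is worth the engagement risk.
print(should_assist(p_correct_alone=0.55, p_correct_with_help=0.90, belief_engaged=0.8))  # True
# The user is already likely to be right, so stay quiet and preserve engagement.
print(should_assist(p_correct_alone=0.85, p_correct_with_help=0.90, belief_engaged=0.8))  # False
```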

Cognitive Strategies in Solving Indeterminate Term Series Problems: A Comparison with Total Order and Symmetric Partial Order Strategies

Term series problems have been used to study deductive reasoning based on propositions by asking for the order relations of given terms. Some tasks, known as indeterminate term series (ITS), leave certain order relations unspecified. This study explored computational models of how people solve ITS, examining the mental models they form and the way they derive answers. We tested two strategies: a total order strategy, which constructs all satisfying total order mental models, and a symmetry strategy, which collapses interchangeable terms into a total-order-like symmetric model. Statistical analyses of experimental data suggest that participants used total order strategies more often for easier ITS problems and symmetry strategies more for harder ones. This finding supports the hypothesis that encoding the order relations by symmetric structures can reduce cognitive load when solving ITS problems.

Self-Persuasion: A Novel Cognitive Approach to Effective LLM Jailbreaking

Large Language Models (LLMs) have been proven useful for various tasks but remain vulnerable to malicious exploitation. Attackers can bypass LLM safety restrictions (the "jail") through carefully crafted "jailbreaking" prompts. To evaluate LLMs' security, researchers have proposed various jailbreak techniques based on optimization, obfuscation, or persuasive strategies. However, these methods treat LLMs as passive persuasion targets, which overlooks LLMs' ability to reason actively. We propose Persu-Agent, a novel jailbreak framework based on Greenwald's Cognitive Response Theory. We focus more on LLMs' internal cognitive processing of a prompt than on the prompt itself. Persu-Agent uses the self-persuasion strategy to guide LLMs in generating justifications and rationalizing responses to harmful queries. The experimental results on advanced open-source and commercial LLMs revealed that Persu-Agent achieved an average jailbreak success rate of 84%, surpassing existing SOTA methods. Our work provides valuable insights into understanding LLMs' cognitive traits and contributes to developing safer LLMs.

SpikeBERT: A Language Understanding Spiking Neural Network Learned from BERT with Knowledge Distillation

Spiking neural networks (SNNs) provide a biologically inspired approach to deep learning, yet existing SNN models for language tasks remain simplistic and shallow, limiting their simulation of complex brain activity. This work investigates how deep SNNs process language, potentially advancing understanding of human language cognition. We introduce SpikeBERT, a pure SNN architecture adapted from spiking Transformers for text processing, coupled with a novel two-stage knowledge distillation method. First, SpikeBERT is pre-trained via knowledge distillation from BERT using unlabeled text. Second, task-specific fine-tuning occurs by distilling knowledge from BERT models trained on labeled data. Experiments demonstrate SpikeBERT outperforms state-of-the-art SNNs and achieves BERT-level performance on language understanding tasks with significantly lower energy consumption. The study bridges computational neuroscience and AI, offering insights into neuromorphic mechanisms of language processing. This energy-efficient framework advances SNN applications in NLP while providing a computational model to explore biological language cognition.
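
For reference, the generic logit-level knowledge-distillation objective that underlies this kind of teacher-student training looks like the numpy sketch below; SpikeBERT's actual two-stage recipe involves spiking dynamics and additional feature-level and task-specific terms not shown here, so this is only an illustration of the basic idea.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2,
    i.e. the standard logit-distillation objective."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = (p_t * (np.log(p_t + 1e-9) - np.log(p_s + 1e-9))).sum(axis=-1)
    return float(kl.mean() * T * T)

teacher = np.array([[4.0, 1.0, 0.5]])    # e.g., logits from the BERT teacher
student = np.array([[2.5, 1.2, 0.8]])    # e.g., logits from the spiking student
print(f"distillation loss: {distillation_loss(student, teacher):.4f}")
```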

Exploring the role of visual attention in relational reasoning across two cultures

Relations like "same" and "different" are among the earliest abstract concepts, and this type of reasoning may be central to human cognition (Gentner, 2003). While relational reasoning is a ubiquitous part of human experience and a key feature of cognitive development, the path that young learners follow as they begin to reason about relations varies across cultures, with children showing differences in both relational matching performance and preference for relational solutions in an ambiguous context as early as 3 years (Carstensen et al., 2019; Carstensen et al., 2023). Which features of the environment prompt the early emergence of these cross-cultural differences? Several prominent accounts implicate early-appearing differences in visual attention, which could facilitate or impair relational reasoning by highlighting the relevant relational context. We explore the role of visual attention in relational reasoning through attention priming with 381 children in two cultural contexts, the US and South Korea.

A Framework for Modeling Cognitive Processes in Intelligent Agents Using Behavior Trees

Advances in deep multi-agent reinforcement learning (MARL) enable sequential decision making for a range of exciting multi-agent applications. However, the black-box character of MARL restricts the safe and scalable deployment of decision models in practice, and existing interpretability methods for deep reinforcement learning are not suitable for addressing the challenges posed by multi-agent environments, often proving inadequate for generating logical sequential decisions. We present an innovative framework called BT4MARL, which introduces the behavior tree structure to explainable MARL. The proposed method clusters the state space by aggregating temporally related states and divides agents into several groups within each clustered state. Based on these clustered states and agent groups, we construct behavior tree structures. In this way, we use an exploration technique based on pairing a combined behavior tree with the target model. We empirically show that our framework is effective in four benchmark MARL domains. Moreover, the results of a user study show that the generated explanations significantly improve performance and satisfaction. This work represents a significant stride towards addressing the challenges of explainability and performance in MARL applications.

Cross-Linguistic Similarities and Differences in People's Understanding of the Structure of Number Words

Number words follow a set of principles that some have proposed to be universal. According to Hurford (1975), these principles include phrase-structure rules along with additional arithmetic constraints. However, languages vary in how transparently their number systems reflect these principles; for example, the English teens (e.g., twelve) and Turkish decades (e.g., yirmi) deviate from them. How do such irregularities influence speakers' understanding of number word structure? To find out, we examined grammaticality judgments for novel number words among speakers of Chinese (with a transparent number system) and English and Turkish (with less transparent systems). While all groups demonstrated a better understanding of the phrase-structure rules than the arithmetic constraints, Chinese speakers showed more sensitivity to both the rules and the constraints than English or Turkish speakers. The results pinpoint ways in which the structure of number words is available to all speakers and ways in which it varies.

Beyond Emotion: Unraveling the Limited Role of Sentiment in Extended-Format Communication

Human communication is shaped by various factors, including linguistic structure, social context, and cognitive capacity. Among these, emotion plays a pivotal role in significantly influencing message delivery and reception. While emotional impact is prominent in social media posts, its effect in extended-format, information-rich communication, such as TED Talks, is less understood. This study focuses on six basic emotions (anger, disgust, fear, joy, sadness, and surprise) and examines their effects on TED Talk popularity using the NRC Emotion Lexicon and a BERT-based sentiment analysis model. Our findings reveal a stark contrast between social media and TED Talks: most emotions, including high-arousal emotions, have no significant effect on TED Talk viewership, and in some cases, intense emotional expressions negatively impact views. This study highlights the limited role of emotions in extended-format communication and underscores the importance of appropriate emotional expressions, shaped by context and audience expectations. By integrating transparent dictionary-based methods with contextually aware deep learning approaches, we provide a comprehensive framework for analyzing emotion-driven engagement in diverse communication settings.

An Experience-First Approach to Autistic Pragmatics

Pragmatic atypicality is widely considered to be a central characteristic of autism. This is often explained as a consequence of Theory of Mind deficits. However, this account is flawed and biased. In this paper, we revisit the Double Empathy Problem and provide an experience-first approach to autistic pragmatics. We start by proposing a mechanistic explanation of a link between experiential differences and intentionality understanding in linguistic contexts using the Interpretive Sensory Access theory. Then, we explain how theories of common ground in communication involve factors beyond intention recognition and even beyond cooperation, highlighting how the egocentric nature of communication is relevant to one's attention and experiences. Taken together, we put forward an experience-based approach to understanding autistic pragmatic atypicalities. This view is compatible with many other non-linguistic characteristics well-documented in autism, and prioritizes the experience of autistic people instead of framing autism as a communication disorder with a "mind-reading failure".

Adaptive Multiview Fusion Transformer for EEG-Based Emotion Recognition

Emotion recognition is crucial for applications like human-computer interaction, safe driving, and remote education, with EEG-based methods providing more authentic insights than facial or speech-based approaches. In this paper, we propose an Adaptive Multiview Fusion Transformer (AMFT) that effectively fuses Differential Entropy (DE) and Power Spectral Density (PSD) features in EEG signals. AMFT consists of three main components—a Multi-Perspective Embedder (MPE), a Dual Cross Attention Module (DCAM), and a ClsTransformerEncoder (CTE)—which use multiview projection and iterative cross-attention to combine DE and PSD features. Experiments on the SEED-VII dataset show that AMFT achieves higher accuracy and F1 scores in both multi-class and binary-class emotion recognition tasks, with improved stability compared to existing methods. Ablation studies confirm the key role of the multiview embedder and cross-attention module in boosting performance, offering new insights for EEG-based emotion recognition and biomedical signal analysis. Our code is available at https://github.com/sizhiyier/AMFT.
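
A bare-bones numpy sketch of the core idea of one feature view attending to the other is given below; AMFT's Dual Cross Attention Module additionally uses learned projections, multiple heads, and iterative fusion, so the shapes and the single-head form here are illustrative assumptions only.

```python
import numpy as np

def cross_attention(query_feats, key_value_feats):
    """Single-head cross-attention: one feature view (e.g., DE) queries the
    other (e.g., PSD) and returns an attention-weighted mixture of it."""
    d = query_feats.shape[-1]
    scores = query_feats @ key_value_feats.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ key_value_feats

rng = np.random.default_rng(0)
de_feats = rng.normal(size=(62, 5))    # hypothetical: 62 channels x 5 bands of DE features
psd_feats = rng.normal(size=(62, 5))   # matching PSD features
fused = np.concatenate([de_feats, cross_attention(de_feats, psd_feats)], axis=-1)
print(fused.shape)  # (62, 10): DE features augmented with PSD-attended context
```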

The Impact of Short-Term Model Familiarity on Two-Year-Olds' Word Learning

Children's word learning occurs in rich social environments. Prior research suggests that young children prefer familiar social partners, facilitating imitation learning. However, the extent to which short-term familiarity influences word learning and generalization remains unclear. This study investigated whether two-year-old children learn and generalize novel object labels differently when taught by a familiar versus an unfamiliar experimenter. Familiarity was established through a brief play session before the word-learning task. Unexpectedly, the results revealed no differences in word learning between familiar and unfamiliar partners. In contrast, vocabulary size significantly predicted word generalization performance. These findings suggest that while social familiarity influences certain types of learning, word learning may depend more on cognitive and linguistic abilities than on familiarity with the speaker. This study contributes to our understanding of early word learning by highlighting the robustness of children's ability to learn and generalize language in diverse social contexts.

Modelling the Effects of Emotional States on Driving Speed and Crashes

Driving is a complex task requiring immense coordination between an individual's mental and physical faculties. Though it becomes automatic with practice and experience, the driver must constantly process stimuli from the environment and react accordingly. An individual's emotional state, both in terms of arousal and valence, plays a part in how drivers interact with the variables on the road while driving, which may significantly impact control during driving. The current study explores the influence of emotions, particularly pleasant, neutral, and unpleasant, on incidents of crashes and average driving speed. Emotions, particularly negative emotions, potentially impact decision-making and may lead to lower risk perception, leading to higher average speed and an increased number of crashes. The hypothesis anticipates that the unpleasant emotional states of drivers may result in higher speed and an increased number of crashes. For emotion induction, 95 drivers were exposed to three sets of images (pleasant, neutral, and unpleasant) from the International Affective Picture System (IAPS). They were instructed to drive on a driving simulator while navigating challenging scenarios such as pedestrian crossings and taking a right turn while judiciously judging gaps in oncoming traffic. Data analysis was done using linear mixed models, and the results suggested that emotions significantly impact the number of crashes and average speed. The results also indicated a notable difference in the number of crashes and speed between pleasant and unpleasant states, aligning with the available literature that claims negative emotions can lead to more risk-taking behaviour and, thus, higher speed and more crashes. This study can be used to predict drivers' behaviour under different emotional states, and interventions can also be provided to enhance driving safety. In summary, the study emphasizes the pivotal role of emotions in influencing road safety. Keywords: Traffic Psychology, Emotions, Accident Prevention

A Computational Account of Epistemic Vigilance: Learning from Selective Truths through Bayesian Reasoning

Strategic actors often manipulate others' beliefs not by lying outright, but through selective truth-telling—also known as lying by omission or paltering—by withholding crucial details while avoiding falsehoods. For example, a pharmaceutical-funded investigator might truthfully report that some patients improved, while omitting that most did not. To guard against such selective disclosures, listeners must engage in epistemic vigilance: critically evaluating information in light of the speaker's potential agenda. In this work, we develop a Bayesian computational model of this process. We present three key findings: (1) credulous listeners who assume informative intent learn quickly in cooperative settings but are highly susceptible to persuasion; (2) vigilant listeners who account for potential bias more accurately recover the underlying true world states, even from purely persuasive speakers—albeit with slower convergence; and (3) this robustness stems from their ability to discount biased framings by reasoning about alternative utterances the speaker could have chosen otherwise.
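
The contrast between credulous and vigilant listeners can be illustrated with a toy Bayesian update (our own simplification, not the paper's exact model): both listeners hear k reports that "a patient improved" out of n patients, but the vigilant listener assumes a persuasive speaker who would report improvements whenever any exist, so the reports carry far less information about the underlying rate.

```python
import numpy as np

thetas = np.array([0.1, 0.5, 0.9])       # candidate true improvement rates
prior = np.ones_like(thetas) / len(thetas)
k, n = 3, 20                              # 3 reported improvements among 20 patients

# Credulous listener: treats each report as an independent random sample.
credulous_like = thetas ** k
# Vigilant listener: a persuasive speaker reports improvements whenever some exist,
# so the reports are only informative about whether any patient improved at all.
vigilant_like = 1.0 - (1.0 - thetas) ** n

for name, like in [("credulous", credulous_like), ("vigilant", vigilant_like)]:
    post = prior * like
    post /= post.sum()
    print(name, np.round(post, 3))
```

The credulous posterior concentrates on the highest rate, while the vigilant posterior stays close to the prior, mirroring the robustness-versus-speed trade-off the abstract describes.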

Quantifying Movement Coordination in Human-Robot Interaction

Human-robot collaboration necessitates effective coordination strategies to optimize joint task performance. This study investigates how robot morphology and collaboration structure influence coordination dynamics during a shared assembly task within a virtual reality environment. Employing analytical methods from dynamical systems theory—specifically, Recurrence Quantification Analysis (RQA) and Cross-Recurrence Quantification Analysis (cRQA)—we examine temporal patterns of interaction between human participants and robotic partners. Participants engaged in two modes of collaboration: sequential, where actions alternate between partners, and simultaneous, where actions occur concurrently. Findings indicate that sequential collaboration fosters more predictable coordination patterns, whereas simultaneous collaboration, despite initial instability, leads to enhanced efficiency as participants adapt over time. Furthermore, robots with anthropomorphic features, such as Baxter, facilitate smoother and more stable coordination but do not consistently improve task completion speed. Conversely, less human-like robots enable faster task execution, albeit with reduced initial coordination quality. These results underscore the trade-offs between coordination stability and task efficiency, offering valuable insights for the design of adaptive and effective human-robot teaming strategies.
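
As a simplified illustration of the cross-recurrence idea, the sketch below computes only the cross-recurrence rate between two one-dimensional movement signals with a fixed radius; the study's RQA and cRQA analyses involve state-space embedding and additional measures (e.g., determinism) not shown here.

```python
import numpy as np

def cross_recurrence_rate(x, y, radius=0.1):
    """Fraction of (i, j) time-point pairs at which the two signals fall within
    `radius` of each other, i.e. the simplest cRQA measure (recurrence rate)."""
    d = np.abs(x[:, None] - y[None, :])   # pairwise distance matrix
    return float((d < radius).mean())

t = np.linspace(0, 10, 500)
human = np.sin(t)                          # toy human movement signal
robot = np.sin(t - 0.3)                    # robot lagging the human slightly
print(f"cross-recurrence rate: {cross_recurrence_rate(human, robot):.3f}")
```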

Self-Association Makes It Easier to Track Multiple Objects But Within Capacity Limitations: A Multiple-Object Tracking Investigation

The Self-Prioritization Effect (SPE) suggests that associating stimuli with the ‘self' affords processing advantages in perception (Sui & Humphreys, 2012), attention (Keyes & Brady, 2010), long-term memory (Turk et al., 2008; Klein, 2012), working memory (Yin et al., 2019; Roy et al., 2023), and decision-making (Polman, 2012; Hu et al., 2019). An important aspect of working memory is the ability for continuous tracking, maintenance, and updating. The Multiple-Object Tracking (MOT) paradigm involves constant attention, maintenance in working memory, and online decision-making. The current study was designed to compare tracking performance for self-associated stimuli (neutral shapes and colored shapes) with stranger-associated stimuli (neutral shapes and colored shapes) in an MOT paradigm. Additionally, we used three set-sizes (6, 8, 10), in which half of the shapes were targets and half were distractors, to check for boundary conditions with respect to working memory load. We found that tracking performance decreased significantly with increasing set-size. Whether targets and distractors belonged to the same category (homogeneous) or different categories (heterogeneous) also moderated tracking performance, as tracking becomes more difficult when targets and distractors are of the same type, i.e., the homogeneous condition. Tracking of self-associated target shapes was significantly better than tracking of stranger-associated target shapes when targets and distractors were of different categories (heterogeneous condition), but only up to set-size 8, consistent with working memory limits. Additionally, tracking of self-associated shapes and colored shapes was significantly poorer when targets and distractors were of the same category (homogeneous condition) within working memory limits. These results suggest that individuals allocate more attention to self-associated stimuli, which are maintained in working memory with a better focus of attention, despite limited-capacity working memory resources. Keywords: self-association, tracking, multiple object tracking, working memory, visual working memory, identity, location.

To Be Right or To Belong – Prediction & Reward in Social Conformity

To test whether conformity reflects an intrinsic reward signal, we devised a multiplayer economic choice task that pitted monetary gain against group consensus. Specifically, we assessed whether hedonic valence elicited by majority alignment would transfer to contextual stimulus features. Contrary to the characterization of social conformity as reflecting an intrinsic utility of consensus, we did not find evidence of reinforcement of contextual stimuli based on decision unanimity. This failure cannot be attributed to unsuccessful majority alignment manipulation, since stay/switch behavior reflected an integration of consensus and monetary rewards, nor can it be attributed to a failure to obtain reinforcement of contextual stimuli, since such effects were observed for monetary payoffs. Intriguingly, individual differences in social anxiety predicted the influence of social alignment on contextual reinforcement and stay/switch behavior, suggesting that the utility of conformity is modulated by social affect. The results reflect an imitative basis of normative conformity.

A Preliminary Study to Assess Self-Regulated Learning and Academic Emotional Regulation of College Students Using Smartphones

Many of the self-regulated learning (SRL) skills and academic emotional regulation (AER) strategies of student life remain hidden. The iSense continuous sensing app provides a novel method to monitor and assess the impact of social interactions, sleep patterns, app usage, and mobility on SRL skills and AER strategies. In a year-long study involving 211 university students, iSense collected behavioral data from smartphones continuously and unobtrusively. The results revealed significant correlations between passive sensor data and SRL/AER self-reports, including task strategies, time management, situational selection, and social support. These findings enhance our understanding of students' learning and emotional regulation behaviors while offering a foundation for personalized learning support tools based on mobile sensing. By drawing on dynamic behavioral data, iSense demonstrates its potential for advancing contextualized learning analytics and psychological intervention systems.

Institutional preferences in the laboratory

Designing effective social and policy systems is a vital and forbidding challenge made more difficult because, in real-world settings, individuals don't just passively accept the static environments imposed upon them: they act both within and upon the social systems that structure their interactions. Should we expect player-driven changes to the "rules of the game" to benefit cooperation, as agents tweak their environment toward non-zero-sum games — or hinder it because of the challenges of constant change? We introduce a laboratory setting to test whether groups can guide themselves to cooperative outcomes by manipulating the strategic environment that structures their interactions. By offering players "first-order" choices within an economic game (agency over behavior) along with "second-order" choices between games (agency over the rules of the game), we aim to understand emergent cooperation in naturalistic settings in which the rules of the game are themselves dynamic and subject to choice.

How the Systematicity of Relational Language Affects the Learning of a Compositional System via Changing Attention

It is well established that relational language with different structures (e.g., "top, middle, bottom" used together are more systematic than "on, in, under") can lead to differences in learners' relational representations. However, the underlying mechanism or processes are less specified. We advance the "language systematizes attention hypothesis" through two eye-tracking experiments in which undergraduate students learn to map novel spoken artificial language (object names + relational terms) to novel visual configurations (shapes + relative locations). Participants either heard more systematic relational terms (consistently referring to relative spatial locations) or less systematic relational terms (probabilistically referring to relative spatial locations). Results confirmed that more systematic relational language elicits more selective and sustained attention patterns in learners, compared to less systematic language. However, the benefit in behavioral accuracy depends on the task difficulty. Findings have implications for how to structure language to guide attention and enhance learning outcomes.

Chosen or Assigned: Exploring the Notion of Choice in the Manifestation of the Ingroup Bias

Social categorization processes often rely on observable, psychologically salient attributes, including ascribed characteristics (e.g., age, gender, nationality) and chosen affiliations (e.g., profession, ideology). These dimensions shape how individuals classify others into ingroups and outgroups, influencing perception, evaluation, and behavior. This study examined whether the strength of ingroup bias varies depending on whether group membership is endowed (inherent and unchosen) or agentic (self-selected). Using a within-subjects design, participants completed a perceptual matching task across two experiments: one comparing endowed groups (INDIA vs CHINA), the other comparing agentic groups (COGSCI vs GEOLOGY). Ingroup bias was measured through reaction times and accuracy. Results showed significantly stronger bias toward endowed groups, with a notable interaction between group type and task context. These findings suggest that the origin of group membership fundamentally shapes the intensity of social bias and offer new insights into the mechanisms underlying intergroup behavior and the dynamics of social categorization.

Existing Models May Not be Able to Explain Letter-Position Encoding in Hindi: Evidence from a Priming Study

Letter-position encoding is one of the constituent processes in visual word recognition. While existing models attempt to explain letter-position encoding in English and other European languages written using the Roman script, letter-position encoding in Hindi, written using the Devanagari script, has not been studied in detail. Given that Hindi is spoken and read by over 520 million people in India, and given the unique properties of the Devanagari script (Vaid & Gupta, 2002; Kandhadai & Sproat, 2010; Share et al., 2015), the current study sought to investigate letter-position encoding in Hindi. Sixty-six participants performed a lexical decision task that employed six prime conditions to compare hypotheses from (a) the position-specific coding scheme, (b) the local context-sensitive coding scheme, and (c) the position overlap coding scheme. Interestingly, the results showed that none of the aforementioned coding schemes could satisfactorily explain the obtained data. These findings may be used to question the generalizability of the extant letter-position encoding schemes to relatively understudied languages such as Hindi, which use scripts different from the Roman script.

Children's emotion vocabulary learning discloses a growing understanding of specific concepts

Emotion categories are complex and fuzzy concepts that children must learn to identify and differentiate in themselves and others. While prior research has shown that children's emotion-related vocabulary evolves from broad to narrow as they age, the role of metrics such as word specificity within the development of emotion vocabulary remains under-explored. We use WordNet, a hierarchically-organized lexical database, to study word specificity in interview data collected from children on emotion labeling. We show that as children's age increases, they tend to use increasingly specific emotion words and we also analyze this in the context of concept learning. Further, we show that young children sometimes use words that are typically thought of as not being acquired until an older age, which are selected strategically for the given context. These findings provide new insights into understanding vocabulary and concept learning changes over age that contribute to the learning of fine-grained emotion category labels.

Susceptibility to Semantic Illusions: Attentional Consequences of Other Errors

The output of language processing is fallible and nonveridical, as illustrated by the meaning-based Moses Illusion: When asked "how many animals of each kind did Moses bring on the Ark?," many people say ‘two' even when they know the biblical story is about Noah, not Moses. Susceptibility to such semantic illusions (the failure to notice the semantic error) is often attributed to people favoring detailed representations generated by top-down processing over representations constructed bottom-up from linguistic input. We present three studies exploring whether the presence of a second (non-semantic) error, such as a missing word or a typo, impacts detection of a semantic error. Our results suggest that the presence of a second error distracts participants from the semantic illusion. Furthermore, our results do not provide clear evidence for the claim that noticing a non-semantic error would trigger a shift to deeper processing, suggesting that shallow processing may be a cognitive default.

Experimentally extracting implicit instruments

Models of events represent the interactions of the entities involved. In the event "The chef chopped an onion," a chef and an onion are explicitly involved, and the event results in a chopped onion. However, it is also implied that an instrument, e.g., a knife, must interact with the chef and the onion. In this study, we investigate the extent to which people model different implicit instruments in event representations. We find that people's representations of events reliably include instruments that are implied to be involved even when they are not explicitly stated in the event description. These findings are consistent across different sentence constructions of events, suggesting that implicit instrument representation is robust in comprehension of events. We also show that implicit instrument representation persists despite lexical priming of other items, and that the representations provide evidence for the disambiguation of the Instrument semantic role from other semantic role categories.

Can paradigmatic associations be implicitly formed through parallel contexts?

The ability to form paradigmatic associations plays a crucial role in language comprehension and generalization. Previous studies have demonstrated that paradigmatic associations can be implicitly formed through sequential contexts, even in nonlinguistic environments. In this study, we additionally examined whether paradigmatic associations can be implicitly formed when the contexts are presented in parallel with the target items. The results did not provide evidence for the formation of paradigmatic associations. We discuss several possible factors that may have contributed to these results, which could help generate future experimental designs that are more sensitive in capturing the formation of paradigmatic associations.

Mamba-CCA: An Efficient Framework for EEG Emotion Recognition

Emotion recognition from electroencephalogram (EEG) signals is critical for applications in mental health, human-computer interaction, and adaptive systems. However, existing methods struggle with modeling long-term dependencies and addressing the ambiguity between emotion classes. To address these challenges, we propose Mamba-CCA, a novel framework that combines Selective State Space Modeling (SSM) with a Class Confusion-Aware Attention (CCA) mechanism. Mamba-CCA leverages the efficiency of Mamba's linear-time modeling to capture both local and global temporal features in EEG signals while significantly reducing computational costs. The CCA mechanism further enhances classification by dynamically resolving ambiguities between emotional classes. Experimental results on the SEED and SEED-V datasets demonstrate that Mamba-CCA achieves state-of-the-art classification accuracies of 96.02% and 83.54%, respectively, surpassing the previous best model, CSET-CCA, by 0.84% and 1.48%. Additionally, Mamba-CCA reduces inference time by 20.12% and computational cost by 21%, making it highly suitable for real-time applications.

Gaze-Guided Learning: Avoiding Shortcut Bias in Visual Classification

Inspired by human visual attention, deep neural networks have widely adopted attention mechanisms to learn locally discriminative attributes for challenging visual classification tasks. However, existing approaches primarily emphasize the representation of such features while neglecting precise localization, which often leads to misclassification caused by shortcut biases. This limitation becomes more pronounced when models are evaluated on transfer or out-of-distribution datasets. In contrast, humans leverage prior object knowledge to quickly localize and compare fine-grained attributes, a capability especially crucial in complex and high-variance classification scenarios. We introduce Gaze-CIFAR-10, a human gaze time-series dataset, along with a dual-sequence gaze encoder that models the precise sequential localization of human attention on distinct local attributes. In parallel, a Vision Transformer (ViT) is employed to learn the sequential representation of image content. Through cross-modal fusion, our framework integrates human gaze priors with machine-derived visual sequences, effectively correcting inaccurate localization in image feature representations.

Conventionalization of Graphic Representations of Abstract Concepts and Metaphors in an Experimental-Semiotic Communication Game

Abstract concepts are a hallmark of human cognition and culture. However, there is debate over how they are cognitively represented, and how they emerge and become conventionalized within a community. One influential approach is that abstract concepts are based on conceptual metaphors. This study investigates the emergence and conventionalization of abstract concepts and metaphors in an experimental-semiotic referential communication game in which participants communicate abstract concepts and metaphors via drawing. The study sheds light on the different strategies participants use to evoke abstract concepts and shows that participants decrease the number of strategies they use over subsequent rounds of interaction, converging on more successful strategies and thereby initiating a joint process of conventionalization. Acknowledgements: This research is part of the project No. 2021/43/P/HS2/02729 co-funded by the National Science Centre and the European Union Framework Programme for Research and Innovation Horizon 2020 under the Marie Skłodowska-Curie grant agreement No. 945339.

Evaluating Planning Through Play: Exploring the Use of Mini Games to Assess Planning Abilities

Planning, or the ability to simulate and execute a sequence of steps toward a goal, is crucial for success in many activities. However, common tasks used to measure planning often fail to correlate with one another, suggesting they may not assess the same underlying skill. To explore a novel measurement of planning, this study examined performance on four planning mini games, a non-planning control game, and three standard planning tasks. Results revealed that the planning mini games exhibited stronger intercorrelations than traditional tasks, suggesting they may capture a more consistent and unified planning construct. Notably, two of the selected mini games emerged as particularly promising paradigms for assessing planning skills with reduced confounds from processing speed. These findings provide initial evidence that mini games such as those explored here could complement or replace traditional cognitive planning tasks, offering an appropriately complex evaluation of the multifaceted skill of planning.

IMPROVISER: Multi-persona Co-creation System Enhances Story Creativity

Storytelling, a cornerstone of human culture, thrives on creativity—a domain where modern AI systems often falter. While these systems excel at maintaining narrative coherence, they frequently produce technically sound yet predictable narratives. To address this gap, we introduce IMPROVISER, a collaborative system grounded in psychological theories of creativity, specifically the Blind Variation and Selective Retention (BVSR) framework. IMPROVISER leverages multiple AI personas, each embodying distinct narrative styles, to generate diverse story variations. Users iteratively refine these ideas through selective retention, balancing novelty, coherence, and personalization. In controlled experiments, IMPROVISER outperformed single-persona and chatbot-based systems on creativity and diversity, producing longer, more spatially rich stories without compromising readability. These results empirically validate BVSR's role in computational creativity and establish a framework for human-AI co-creation, demonstrating how AI can amplify—rather than replace—human creative potential.

Evaluating the Linguistic Competence of Large Language Models: Experimental Evidence from Center-embedding Structures

This study investigates whether large language models (LLMs) possess human-like syntactic competence by examining their handling of English center-embedded structures. Two experiments were conducted: one collected acceptability judgments from native speakers, confirming sensitivity to syntactic constraints; the other measured surprisal values from GPT-2 and Gemma 2 on grammatical versus ungrammatical center-embedded sentences. While both models distinguished between the two types probabilistically, they failed to replicate categorical human judgments. The results suggest that LLMs conflate low-frequency constructions with ungrammaticality, reflecting limitations in hierarchical syntactic understanding. We argue that genuine linguistic competence requires more than statistical pattern recognition and advocate for integrating formal syntactic theory into model development. This work contributes to the ongoing dialogue between generative linguistics and AI, highlighting key distinctions between human cognition and current LLMs.
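
The surprisal measurements follow a standard recipe; a sketch using the Hugging Face transformers library and the public gpt2 checkpoint is shown below, approximating surprisal as the mean negative log-probability per token. The example sentences are our own illustrations, not the study's stimuli, and the paper additionally evaluates Gemma 2.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def mean_surprisal(sentence):
    """Mean negative log-probability per predicted token (in nats)."""
    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss   # cross-entropy averaged over tokens
    return loss.item()

# Illustrative doubly center-embedded pair (our own examples):
grammatical = "The report that the intern the manager hired wrote was late."
ungrammatical = "The report that the intern the manager hired was late."
print(mean_surprisal(grammatical), mean_surprisal(ungrammatical))
```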

The Impact of Taxonomic Levels on Classifier Choice in Mandarin Chinese

Classifier choice has been widely studied, with previous research highlighting the influence of semantic features such as shape and animacy. This study, however, demonstrates that classifier choice—specifically the selection between general and specific classifiers—is also influenced by taxonomic categorization, where nouns are divided into three levels based on specificity: basic (e.g., "apple"), superordinate (e.g., "fruit"), and subordinate (e.g., "golden apple"). A picture naming task was conducted, and our findings reveal a tendency for individuals to favor specific classifiers when nouns are at the basic level rather than at the subordinate level. This challenges the prevalent assumption that general classifiers are predominantly chosen. We attribute this tendency to a cognitive economy principle and propose a novel explanation for classifier choice based on the theory of Uniform Information Density, a perspective rejected in previous studies. Overall, this research suggests new directions for investigating the cognitive and linguistic factors influencing classifier choice.

Connectives and clause order modulate Chinese pronoun resolution

While Chinese null pronouns consistently demonstrate a subject preference, overt pronouns exhibit a subject preference in forward reference but not in backward reference. Previous studies have attributed this pattern to syntactically imposed constraints. However, Su (2020) proposed that this pattern arises from language-specific discourse constraints prohibiting subject referents for overt backward pronouns in temporal clauses but allowing them in non-temporal clauses. To examine Su's account, we administered an offline sentence judgment experiment with 100 Chinese-speaking adults. The experiment employed contextually ambiguous sentences, involving manipulations of pronoun form (null vs. overt), anaphora type (forward vs. backward) and clause/connective (deshihou ‘when', zhiqian/zhihou ‘before/after' and ruguo ‘if'). Our results partly confirmed Su's account: Chinese overt backward pronouns showed a subject preference in non-temporal ruguo ‘if' clauses and temporal zhiqian/zhihou ‘before/after' clauses, but not in temporal deshihou ‘when' clauses. These findings suggest potential influences of event structure on Chinese pronoun resolution.

Effects of Science Fiction on Creativity: A Meta-Analysis

Creativity is essential for success in both social and professional settings, and science fiction has emerged as a potential tool for enhancing this ability. The present meta-analysis aimed to examine the effect of science fiction on creativity. A systematic literature search was conducted across diverse electronic platforms and databases, resulting in a final sample of five studies with robust methodological quality. The meta-analytic estimate of the overall effect size indicated a medium effect that was not statistically significant. However, the substantial heterogeneity observed among studies suggests that the influence of science fiction on creativity may vary depending on the context or study characteristics. These findings indicate that while science fiction shows promise for enhancing creativity, further research is needed to clarify the conditions and mechanisms that optimize its impact.

Learning Cognitive Agent-driven for Graph-aware Communication and Double Reward Path Finding Models

Cognitive agent-driven path planning is an interesting task. Inspired by deep learning and stochastic strategies, we propose a Stochastic Policy and Locally Observable model for cognitive-agent path finding (SPLO). First, we propose a method that uses feature dimensions to resolve the over-smoothing problem by reducing the correlation between feature dimensions. Then, we introduce a novel sparse reward function to encourage agents to explore promising paths. This function provides intensive rewards without requiring the agent to strictly adhere to the global plan at every step. Finally, we design a flexible stochastic strategy to update and train the model. The stochastic strategy alternates training between the value function and the policy function to speed up learning and prevent the strategy from prematurely converging to a local optimum.

Who Detects Better? A Comparative Study on Misinformation Detection by Humans and Large Language Models

Large language models (LLMs) have demonstrated remarkable capabilities in natural language understanding and generation. However, their ability to detect and react to misinformation remains an open question, particularly in comparison to human cognitive mechanisms. This study investigates how LLMs and humans react to misinformation by analyzing their performance across five categories of errors: intellectual, common sense, reasoning, misleading, and logical errors. We construct the ErrorQuestionDataset, comprising 346 misinformation-related questions, and conduct an empirical study involving five state-of-the-art LLMs (ChatGPT-4o, Gemini-1.5flash, DeepSeek-v3, Hunyuan-Large, GLM-v4Flash) and 251 human participants. Our findings reveal distinct response patterns: while LLMs rely on statistical correlations and pattern recognition, humans leverage contextual reasoning and domain-specific knowledge. The results indicate that LLMs generally achieve higher accuracy than humans in error detection tasks, but their performance lacks depth in reasoning-based assessments. Additionally, we identify five primary performance types—affirmation, negation, hesitation, questioning, and off-topic reactions—providing insights into the cognitive differences between LLMs and human cognition. Our study contributes to the broader understanding of misinformation detection and offers implications for enhancing the robustness and reliability of LLMs in real-world applications. Our code and dataset are available at https://github.com/anonymous-submission8888.

Effects of AI Explanation Length on User Trust and Acceptance

This study investigated how the length of AI-generated text explanations in Japanese influences trust and acceptance in a route selection task, a subjective decision-making task. Experiment 1 demonstrated that AI explanations increased trust and acceptance compared with no explanation; however, there was no significant difference between the effects of 100- and 300-character explanations. Experiment 2 further explored the threshold of explanation length by including 50- and 1,000-character explanations. The results showed that 300- and 1,000-character explanations increased acceptance compared to 50-character explanations; however, no differences emerged among the 100-, 300-, and 1,000-character explanations. Additionally, trust ratings were unaffected by explanation length. These results suggest that AI explanations have a threshold; in this study, AI explanations exceeding 100 characters in Japanese (approximately 50 words in English) did not lead to further changes in users' decisions to accept or reject AI recommendations.

Where Responsibility Lies in Human-AI decision making: The Role of Knowledge and Importance

As AI and robots are used in society, the relationship between people and AI/robots is becoming increasingly important; when AI/robots assist people, there is a need to focus on where responsibility for the results of the assistance goes. This study focused on prior knowledge about AI/robots and the importance of the topic to investigate who is held accountable when AI/robot assistance fails. The experiment was conducted in a three-factor mixed design, and the ANOVA analysis was based on data from 588 participants. Results showed that prior knowledge about AI/robots increased participants' attribution of responsibility to the AI/robots and to their developer/provider. Additionally, responsibility shared with the AI/robot developer or provider increased when the topic was very important to the person being assisted. This study helps to reduce the risk for AI/robot developers or providers when their systems assist people by clarifying who is responsible for failures that occur when AI/robots assist people in different situations.

Four Gifts from AI's Founders

Marvin Minsky, Claude Shannon, Allen Newell, and Herbert Simon were four of eleven participants at the 1956 Dartmouth Conference, where the field of Artificial Intelligence was named. Each of these pioneering researchers helped lay the foundation for future AI research. Four of their seminal ideas are: 1) Society of Mind (Minsky), 2) Information Theory (Shannon), 3) Problem-Solving Theory (Newell & Simon), and 4) Bounded Rationality (Simon). Society of Mind contains a hidden blueprint for developing safe, SuperIntelligent AGI. Information Theory helps us identify the datasets that can best catalyze the growth of intelligent systems. Problem-Solving Theory explains how AI agents can communicate effectively and how we can increase the safety of AGI. Finally, Bounded Rationality illuminates the type of SuperIntelligence we might expect in the future and how humans might remain relevant when SuperIntelligent AI becomes vastly more intelligent than us.

An Investigation of [l]-[n] Merger among Hubei Dialect Speakers: Cross-Linguistic Evidence from Perception and Production

This study investigates Hubei dialect speakers' ability to distinguish and identify Mandarin and English words with [l]-[n] onsets through AX discrimination and identification tasks, along with their production in Hubei dialect, Mandarin, and English. Results showed lower accuracy in discriminating [l] and [n] in English and Mandarin, especially with stimuli containing a high vowel, nasal coda, or diphthong beginning with a high or front vowel. Participants with more dialect use and later Mandarin acquisition were less sensitive to the contrast. Directionality of merger varied across participants. No significant effects of linguistic or social factors were found in identification, except a positive effect of initial high vowels in Mandarin diphthongs. Acoustic measures showed no significant differences, suggesting an [l]-[n] merger. These findings provide insights into non-native speech perception and production and suggest implications for pronunciation training.

Bright shiny garbage: Video content shown to low-income children is characterized by higher flicker

We use a computer vision model to examine inequities in the quality of videos shown to children from different socioeconomic backgrounds. This work is foundational to understanding the origins of divergence in children's cognitive development. We use our model to quantify visual salience across three categories of children's media: ad-supported, paid, and educational public television. We find that ad-supported media contains significantly higher levels of flicker, a feature of visual salience linked to disrupted processing and worse learning outcomes (Essex et al., 2022; Shepherd & Kidd, 2024). These results are the first to quantitatively demonstrate a difference in the quality of media shown to low- versus high-income children. The findings confirm that children from low-income families are watching more visually salient content and are thus at greater risk of the potential harms this type of content poses.

BDARec: Balancing Diversity and Accuracy of Recommendation Model with Graph Neural Networks

Based on research in cognitive psychology, humans typically seek a balance between their preference for familiar things and the exploration of new ones during decision-making. Therefore, studying the relationship between accuracy and diversity in recommendation systems is particularly meaningful. In recent years, recommendation systems based on Graph Neural Networks (GNNs) have garnered significant attention for enhancing recommendation accuracy or diversity. However, existing works often improve accuracy or diversity at the expense of the other, which is inconsistent with the complex needs of users. In this paper, we propose a novel Recommendation model that Balances Diversity and Accuracy with GNNs, called BDARec. First, BDARec uses a balanced neighborhood aggregation strategy to select diverse and accurate neighbor nodes for updating node embeddings in the user-item bipartite heterogeneous graph. Second, to accelerate the convergence of BDARec, an enhanced category-boosted negative sampling strategy is proposed that, with a certain probability, selects negative samples from the same category as the positive samples. Third, we introduce a dynamic feature for each item to measure its importance during the training phase. Finally, we conduct extensive experiments on three real-world datasets. Experimental results show that our model improves recall by up to 22.04%, hit_ratio by 16.46%, and coverage by 10.27% compared with the state-of-the-art baseline, verifying that the proposed model achieves the best balance between diversity and accuracy.
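
The aggregation step is only described at a high level in the abstract. As a purely illustrative sketch (all function names and parameters below are our own, not BDARec's), a "balanced" neighbor selection might mix the most similar neighbors (for accuracy) with a few dissimilar ones (for diversity) before averaging them into the updated embedding:

```python
import numpy as np

# Illustrative toy only: pick some neighbors for accuracy (most similar embeddings)
# and some for diversity (least similar), then average them into the node update.
def balanced_neighbors(node_emb, neighbor_embs, k=4, diversity_ratio=0.25):
    sims = neighbor_embs @ node_emb / (
        np.linalg.norm(neighbor_embs, axis=1) * np.linalg.norm(node_emb) + 1e-9)
    order = np.argsort(-sims)                      # most similar first
    n_div = int(k * diversity_ratio)               # how many "diverse" picks
    chosen = np.concatenate([order[: k - n_div], order[::-1][:n_div]])
    return neighbor_embs[chosen].mean(axis=0)

rng = np.random.default_rng(0)
print(balanced_neighbors(rng.normal(size=8), rng.normal(size=(20, 8))))
```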

Interactive Dynamics with Concepts Varying in Abstractness and Vagueness

This work explores the social dynamics triggered by abstract and concrete concepts (e.g., fantasy vs. bottle) integrating traditional research methods with an interactive paradigm. It introduces Vagueness, a semantic dimension that distinguishes between determinate and vague concepts (e.g., subtraction vs. freedom). A preliminary norming study reveals that while Vagueness is partially predicted by Abstractness, it represents a distinct and meaningful semantic component. The main study further demonstrates that Abstractness and Vagueness uniquely shape how dyads evaluate the quality of their collaborative interactions.

The uncanny valley phenomenon triggered by a task-irrelevant dimension of objects

We feel uncanniness toward human-looking robots (the uncanny valley). Regarding the mechanism, we hypothesized that the visual system automatically tries but fails to categorize a registered object and therefore evaluates it negatively. However, it has been unclear whether automatic categorization induces the negative evaluation, because all previous studies directed participants to evaluate the object in terms of a previously categorized dimension. The present study examined whether the uncanny valley is automatically triggered by categorization failure in a task-irrelevant dimension. Participants categorized a morphed figure, whose shape is known to trigger the uncanny valley, in terms of color, and then evaluated its likeability. The uncanny valley emerged based on the task-irrelevant shape dimension of the objects, even though the preceding color categorization was successful. These findings suggest that the visual system evaluates the likeability of registered objects in response to automatic categorization and its failure.

Synesthetic Metaphors May Involve Conceptual Mappings

This norming study examines the metaphoricity of 99 English crossmodal expressions called "synesthetic metaphors", such as sweet voice and vivid memory. American English speakers rated how metaphorical each of the synesthetic metaphors is and how easily the synesthetic meaning of the adjective can be separated from its original meaning (adjective ambiguity). The results revealed four key findings. (1) Some synesthetic metaphors were rated more metaphoric than others, and their metaphoricity was positively correlated with their adjective ambiguity. (2) Synesthetic metaphors with sensory nouns were rated as metaphorical as those with non-sensory nouns that are assumedly more metaphorical. (3) Synesthetic metaphors that combine taste and smell words were rated low in metaphoricity. (4) The emotional valence of the adjectives did not influence the metaphoricity of synesthetic metaphors. These findings suggest that, contrary to some previous proposals, some synesthetic metaphors do involve metaphorical mappings and are more than mere evaluative expressions.

Computation in Context: A Comparison of Fraction Story Problems and Symbolic Arithmetic

Fraction arithmetic is an essential skill for students to advance to higher-level math, and it is also important for daily life. The current study examines how sixth-grade students with math learning difficulties perform on symbolic arithmetic problems and analogous word problems, to determine what makes some fraction arithmetic problems easier for students to solve when given symbolic notation and others easier to solve when embedded in word-problem contexts. We found that students make similar errors on symbolic arithmetic problems and word problems, and that they also struggle to identify the correct operation for a word problem.

Exploring spatial and temporal dynamics of language comprehension in the brain with CCG

Computational psycho- and neurolinguistics has investigated how incremental human sentence processing is reflected in behavioral and neural data. Previous studies have used a metric called node count, the number of syntactic nodes in the represented trees, which predicts neural activity presumably related to structural complexity in sentence processing. However, node count does not dissociate the different operations that derive syntactic structures, and these distinct operations and other grammar-derived metrics have not been fully investigated against human neural data. In this respect, Combinatory Categorial Grammar (CCG), a linguistically well-motivated theory, was employed in this study. This work explores CCG-derived metrics to investigate whether they contribute to predicting neural data (EEG and fMRI). The results revealed that these metrics improved the fit for relevant ERP components and language-related regions.

Recommender Systems Issue Polarization in Social Networks

This paper explores how personalized recommendation algorithms may inadvertently create echo-chambers, leading to the emergence of political issue polarization in online networks. Using an Agent-Based Modelling (ABM) approach, we conducted two studies, simulating users in three distinct algorithm designs. In Study 1, we found that personalized recommendation algorithms had a small but significant effect on increasing polarization, even among rational Bayesians. In Study 2, we introduced users with novelty-seeking preferences. Contrary to previous literature, our findings suggested interventions targeting personalization bubbles are ineffective, as introducing novelty-seeking preferences had no significant effect on reducing polarization levels. Together, our findings highlight the importance of algorithmic influence in creating online polarization, offer implications for social media network design, and urge caution regarding existing interventions aimed at minimizing polarization.

PiPMRE: A Pipeline Based on Language Model for Medical Relation Extraction

Medical relation extraction (MRE) aims to jointly extract entities and their relations from medical text and has attracted much attention in recent years. Previous studies treat MRE as a sequence tagging task, which results either in a challenging design of the tagging schema or in a failure to extract multiple relations, due to the intricate relationships among medical entities. In this work, we review the task from a linguistic perspective and propose a novel pipeline framework, PiPMRE, built on language models to enhance MRE performance. Specifically, PiPMRE consists of a relation generator and a relation filter. Given a text, the generator first yields multiple relational triplets; the filter then scores each triplet and retains only those that pass a threshold as the final results. Implementing PiPMRE requires no tagging schema; instead, we use a simple template to reformulate the input text while ensuring entities and relations are generated in contextual order. Extensive experimental results on two public datasets demonstrate the advantages of PiPMRE: it surpasses the previous state of the art by an average of 5.6 recall points and 4.4 accuracy points. PiPMRE's superiority is also demonstrated in few-shot settings.

Efficient Multi-dimensional Optimization in Abstractive Summarization via Mixture-of-Learnable Prompts Tuning

Human readers prefer concise summaries that rephrase the exact ideas of a document in novel statements. Consequently, previous works have tried to coordinate the faithfulness and abstractiveness of automatic summarization, yet this leads to increased computation or data overhead. To address this problem, we propose a novel prompt tuning approach, MoLP, which allocates the optimization of parallel objectives in abstractive summarization to learnable prompts and effectively relieves the cost burden. Inspired by the neural mixture-of-experts model, MoLP learns input-specific expert prompts to optimize saliency awareness, faithfulness, and abstractiveness, respectively, and learns a task-specific router prompt to weigh and combine the experts' effects. More importantly, these lightweight prompts are learned from separate tasks, each built upon a heuristic summary of the same document, significantly saving computing costs and improving data utilization. In experiments, we plug MoLP into frozen language models following the classical prompt tuning setting. Extensive evaluations across four benchmark tasks show that the model-generated summaries achieve simultaneously improved faithfulness and abstractiveness scores. Few-shot learning tests also underscore the strong generalization of our method.

Psychological Flexibility and Coping Strategies Influence Well-Being: The Mediating Role of COVID-19 Related Stressors During the Second Wave Among Indian Students

The detrimental impact of the prolonged pandemic has been widely acknowledged. However, the varied responses to its unprecedented challenges, particularly among young adults in college, are not well known. This study explores the role of psychological immunity in COVID-related trauma among Indian students. A significant proportion of students reported psychological distress, depression, and stress during the second wave between May and June 2021, with COVID-19 infection and related worries correlating with poorer mental health. Mediation analysis indicated that psychological flexibility negatively predicted distress, depression, and stress, while avoidance coping strategies showed a positive association with these outcomes. These findings suggest that psychological flexibility serves as a protective buffer against the impact of the pandemic, fostering resilience, while avoidant coping exacerbates its adverse effects. Interventions such as Acceptance and Commitment Therapy (ACT) may enhance psychological flexibility and mitigate maladaptive coping, improving student mental well-being.

Adaptive Spaced Retrieval Practice in Algebra I: A Classroom-Based Study

Despite a robust body of research supporting the benefits of spaced retrieval practice, few studies have implemented this approach in mathematics education within real-world classroom settings while addressing individual differences, such as variations in students' prior knowledge and performance levels. The current study tested adaptive spaced retrieval practice in Algebra I classrooms using a computerized system powered by the SuperMemo 2 (SM2) algorithm. Algebra I students practiced 10 selected learning objectives (LOs), with 6 assigned to the adaptive spaced practice condition and 4 to a no-practice control condition. Results revealed that LOs with lower initial accuracy were practiced more frequently with narrower spacing intervals. In contrast, LOs with higher initial accuracy received fewer practices with wider spacing, leading to significant improvements in accuracy from pretest to posttest. Although no significant differences were observed between the spaced practice and no-practice conditions, the findings highlight the potential of adaptive spaced retrieval systems to address individual learning needs to optimize their effectiveness in mathematics education.
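
For context, the published SM2 scheduling rule that powers the system can be summarized in a few lines. The sketch below follows the generic SuperMemo 2 description (easiness-factor update plus expanding intervals); the classroom implementation described above may adapt these details:

```python
# Minimal sketch of the SuperMemo 2 (SM2) spacing rule.
# quality: recall grade from 0 (complete failure) to 5 (perfect recall).
def sm2_update(quality, repetition, interval_days, easiness):
    if quality < 3:                       # failed recall: restart the repetition cycle
        return 0, 1, easiness
    # Update the easiness factor (floored at 1.3, as in the original description)
    easiness = max(1.3, easiness + 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    repetition += 1
    if repetition == 1:
        interval_days = 1
    elif repetition == 2:
        interval_days = 6
    else:
        interval_days = round(interval_days * easiness)
    return repetition, interval_days, easiness

# Example: an item answered well (quality=4) on its third successful review
print(sm2_update(quality=4, repetition=2, interval_days=6, easiness=2.5))  # -> (3, 15, 2.5)
```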

Why Do Students Struggle with Percentage Problems? Examining Challenges in Answer and Task Formats

Although students' difficulties with percentages are well known and crucial for daily life, their underlying causes remain unclear. The present study aimed to address this gap by systematically analyzing basic characteristics of percentage problems, with a focus on answer formats (e.g., open-ended vs. multiple-choice) and problem formats (e.g., mere calculation vs. word problems). We first evaluated potential biases in the frequency distributions of specific answer and problem formats before analyzing their association with students' performance, leveraging a naturalistic large-scale data set (>18,000 students; 1.5 million problems). Students were most frequently confronted with the most difficult answer and problem formats: more than half of all percentage problems were formulated as word problems with an open-ended answer format. In contrast, problems including visualizations were least common even though they were solved best. We conclude that a higher frequency of visualization problems early in percentage learning may ease the transition to harder problems such as word problems.

Synthetic Data Generation with Large Language Models for Improved Depression Prediction

Automatic depression detection is gaining traction at the intersection of psychology and machine learning, but concerns over data privacy and scarcity persist. We propose a pipeline using Large Language Models (LLMs) to generate synthetic data that enhances depression prediction while addressing ethical concerns. Starting from recorded clinical transcripts, our chain-of-thought prompting involves two steps: generating a synopsis and sentiment analysis from the original transcript, and then generating synthetic data based on these summaries and a new depression score. The resulting synthetic data not only performs well in terms of utility and fidelity, but also balances the severity distribution in training datasets, improving prediction of depression intensity. Our method offers a practical solution for augmenting limited, sensitive data while preserving statistical integrity, and it provides a robust framework for advancing mental health research and applications without compromising patient privacy.
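
As a rough illustration of the two-step prompting described above (the `llm` helper and both prompts below are hypothetical placeholders, not the authors' actual client or wording):

```python
# Sketch of a two-step chain-of-thought synthesis pipeline.
def llm(prompt: str) -> str:
    """Hypothetical placeholder for any chat-completion client; swap in a real API call."""
    raise NotImplementedError("plug in your LLM client here")

def synthesize_transcript(original_transcript: str, target_depression_score: int) -> str:
    # Step 1: compress the real transcript into a synopsis plus sentiment analysis
    summary = llm(
        f"Summarize this clinical interview and describe its overall sentiment:\n{original_transcript}")
    # Step 2: generate a synthetic transcript conditioned on the summary and a target severity
    return llm(
        f"Using this summary:\n{summary}\n"
        f"Write a new, fictional interview transcript consistent with a depression score of "
        f"{target_depression_score}.")
```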

Situation-Dependent Emotion Regulation in Adolescence: Crying Alone or in the Presence of Others?

This study aimed to investigate how individuals regulate emotional expression in different crying situations. Using a questionnaire survey, we examined the everyday crying experiences of 173 Japanese university students, focusing on how empathy and personality traits influence the decision to cry alone or in the presence of others. Statistical analyses revealed that individuals with higher agreeableness scores tend to suppress crying in the presence of others when experiencing emotions triggered by a close person's sadness or by fictional and documentary characters. However, in situations involving achievements or defeats, they are more likely to share their emotions by crying with others. These findings suggest that highly agreeable individuals are particularly sensitive to social norms and adjust their emotional expressions to align with contextual appropriateness. This study highlights the role of personality in emotional regulation and provides insight into the social functions of crying in interpersonal contexts.

What Factors Influence Goal Setting? Insights from Text-Based Task Generation

Humans set a variety of goals and generate diverse tasks every day, but the psychological factors underlying this process remain underexplored. To identify the core characteristics influencing individual goal setting, we conducted a text-based task generation experiment in which participants were asked to freely report activities or games using different daily objects. Participants' values and traits were also measured to capture their personal characteristics. We then conducted a pre-registered evaluation experiment to establish the connections between these psychological factors and task attributes. Our results show that human goal-setting behaviors are shaped by personal values and cognitive styles: people who place greater value on openness to change tend to generate more novel and stimulating tasks, while people with a systematic thinking style perform worse in complex environments than in simpler ones. Despite the widespread use of large language models (LLMs), our further comparisons reveal that LLMs fail to capture the diversity of human responses, showing a systematic bias towards mental tasks even when instructed to simulate individual profiles. These results offer new insights into the psychological factors driving individual goal setting and highlight the limitations of current LLMs in simulating human behaviors.

Stochastic Metalevel Markov Decision Processes: Proposal and Validation through Ecologically Valid Experiments

In daily life, humans selectively search for information about options to make decisions. The metalevel MDP framework, proposed to understand this information search process, has so far been evaluated only for its predictive performance regarding group differences in summary measures under unrealistic scenarios. This study examined whether metalevel MDP can explain information search processes in more realistic decision-making situations. We proposed a chatbot-based experiment that enables diverse information search actions, as well as a novel analysis method that extends metalevel MDP as a probabilistic model, allowing for the estimation of latent variables and more rigorous evaluation of statistical model fit. The experimental results showed that the model partially explains the information search process, and we discussed potential directions for model improvement.

M2TQA: A Metacognitive Framework for Multi-Table Question Answering

Processing structured data is critical in finance, healthcare, and science. While single-table question answering has advanced, multi-table question answering (MTQA) remains challenging due to schema understanding, cross-table reasoning, and complex natural language queries. We propose M2TQA, a novel framework inspired by human cognitive and metacognitive mechanisms. M2TQA integrates metadata extraction, query decomposition, and a metacognitive module to enable interpretable, robust solutions for MTQA. It dynamically simulates human-like reasoning through feedback loops, bridging gaps between natural language understanding and structured data processing. Experiments on four benchmarks show that M2TQA outperforms baselines by 94.54% and 33.24% in F1 scores. This work advances MTQA and highlights metacognition's role in AI, fostering interdisciplinary connections between cognitive science and artificial intelligence.

A Computational Cognitive Model for Processing Repetitions of Hierarchical Relations

Patterns are fundamental to human cognition, enabling the recognition of structure and regularity across diverse domains. In this work, we focus on structural repeats, patterns that arise from the repetition of hierarchical relations within sequential data, and develop a candidate computational model of how humans detect and understand such structural repeats. Based on a weighted deduction system, our model infers the minimal generative process of a given sequence in the form of a Template program, a formalism that enriches context-free grammar with repetition combinators. Such representation efficiently encodes the repetition of sub-computations in a recursive manner. As a proof of concept, we demonstrate our model's capability on short sequences from music and action planning. The proposed model offers broader insights into the mental representations and cognitive mechanisms underlying human pattern recognition.

Bayesian Model of Goal Direction Inference in Animacy Perception from Moving Dots

The phenomenon of perceiving lifelikeness in the movements of non-living objects is referred to as animacy perception. This study hypothesized that when humans infer intentionality from motion information, they initially estimate the direction of the goal of the movement in a Bayesian manner. The magnitude of change in this estimated direction reflects the strength of intentionality and self-propelledness, which are correlated with the perceived strength of animacy. We tested this hypothesis through an experiment in which participants evaluated animacy, intentionality, and self-propelledness for dots moving on a screen. The results revealed that although the magnitude of motion direction changes did not directly influence intentionality and self-propelledness, both the magnitude of goal direction changes and variance had a significant impact. These findings suggest that animacy perception may be realized through hierarchical Bayesian estimation.
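
As a rough illustration of the kind of inference hypothesized above (our own toy sketch, not the authors' model), a Bayesian observer could maintain a posterior over candidate goal directions and update it with each observed heading of the moving dot; large shifts in the posterior's peak would then signal intentionality and self-propelledness:

```python
import numpy as np

angles = np.linspace(0, 2 * np.pi, 36, endpoint=False)   # candidate goal directions
posterior = np.ones_like(angles) / len(angles)            # uniform prior
kappa = 2.0                                               # concentration of the heading likelihood

def update(posterior, observed_heading):
    # von Mises likelihood: observed headings tend to point toward the goal direction
    likelihood = np.exp(kappa * np.cos(observed_heading - angles))
    posterior = posterior * likelihood
    return posterior / posterior.sum()

for heading in [0.1, 0.2, 1.5]:       # an abrupt heading change at the last step
    posterior = update(posterior, heading)
print("MAP goal direction (rad):", angles[np.argmax(posterior)])
```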

Control and Responsibility Role in Moral Judgment and Decision-Making

Based on Blame Theory and Integrative Moral Judgment and Decision-Making (MJDM), this paper examines the role of responsibility for the situation and control over the consequences of the situation in first-person, realistic and non-sacrificial dilemmas. In a 2x2 within-subjects design, 69 participants had to read and solve four dilemmas. Their choice, affect felt, perceived utility, and moral acceptability values were measured. Analyses showed that increased control led to more pro-social choices, and that responsibility increased pro-sociality only when control was low. Control and responsibility also influenced affect, utility, and moral acceptability judgments, corroborating the integrative MJDM model. These results show that control and responsibility play a key role in MJDM but differ from blame theory, where responsibility is necessary for blame and for control to matter. In contrast, in first-person MJDM, control suffices for pro-social action, and responsibility influences decisions only when control is absent.

The detection of configuration and identity changes of object arrays in infancy

Depictions are spatio-temporal arrangements of symbols carrying information about the entities the symbols stand for. We hypothesized that symbolic displays should specifically facilitate the encoding of spatial relations. We tested this hypothesis in 10-month-old infants by exposing them to configurations of unfamiliar objects in a communicative context (which should promote the interpretation of the objects as potential symbols) and in a non-communicative context. We assessed infants' memory for the object arrays by measuring looking times to arrays that matched the original display or altered the identity or the configuration of the objects. A Bayesian hierarchical model on log-transformed looking times revealed a main effect of identity change but no significant interaction between configuration change and communicative context. However, within-condition comparisons showed significantly longer looking to identity changes in both contexts, and to configuration changes only in the communicative condition, suggesting that symbolic displays specifically enhanced the encoding of spatial relations.

Accessing the meanings of sublexical forms during visual word recognition

How are complex words recognized during the early moments of visual word recognition? What roles do full word and constituent frequency play in semantic processing? The present study addressed these questions by employing a word-picture relatedness task with brief stimuli presentations designed to tap the early mapping of orthographic input onto semantic representations. The main manipulation involved first presenting a picture depicting the target word's constituent (200 ms), followed by the presentation of the target word (56 ms). We compared the rate of positive relatedness judgements elicited by picture-word pairs between suffixed (SKI-skier), pseudo-suffixed (MOTH-mother), and non-suffixed words (CAN-canoe). Results suggest that the "constituents" of all three word types are semantically accessed, although with a suffixed word advantage. Regression analyses did not corroborate behavioral findings as no full word and constituent frequency effects were obtained. We discuss the implications of these findings for models of the visual word recognition system.

The dimensionality of individual differences in perceptual decision making

Perceptual decision-making is the process of integrating perceptual evidence and prior experience into a decision. Yet even simple tasks show systematic deviations from optimality. To explore these suboptimalities and their latent structure, we analyzed behavioral data from 155 participants performing a Bernoulli clicks task, each completing 500 trials identifying the side with more clicks. The data were fit with a customized neural network incorporating a temporal kernel weighting individual clicks, a side bias, and a win–stay/lose–shift effect. Weights on these suboptimalities exhibited substantial variability across participants but were captured by a concise structure: two dimensions represented the temporal integration kernel, two dimensions reflected the win–stay/lose–shift kernel, and one dimension corresponded to side bias. This compact five-dimensional structure, together with random noise, explained the observed suboptimalities. Our results indicate that seemingly complex individual differences can be decomposed into a small set of dissociable cognitive processes, providing insight into the structure underlying decision-making variability.

Are Global Statistics Discarded Statistics? An Investigation of What Types of Co-Occurrence Statistics Could Support the Acquisition of Semantic Knowledge

Much research examining the development of semantic knowledge has focused on patterns of co-occurrence in language. An often-made distinction concerns ‘local' co-occurrences (word pairs that appear close together) and ‘global' co-occurrences (word pairs that appear in similar linguistic contexts), with some recent accounts arguing that only local co-occurrences are exploitable by young children. We tested the alternative view that this dichotomy may be misaligned with natural language input. First, we performed a descriptive analysis of the co-occurrence structure present in the input. This analysis suggests that global co-occurrences are frequent in the input to children. Second, we performed a descriptive analysis of the semantic information encoded by local and global statistics in the CHILDES corpus; this analysis suggests that both local and global co-occurrences encode patterns of similarity that could support the acquisition of structured semantic representations.

Research on Fundamental Issues of Intelligence Cognition

Intelligence cognition, as a pivotal element in intelligence activities, encompasses a series of psychological processes, cognitive operations, and behavioral manifestations involved in the acquisition, processing, analysis, and utilization of intelligence throughout the entire process. This study reviews the current status of intelligence cognition research, explores the theoretical foundations of intelligence cognition, elucidates the cognitive components embedded in the intelligence research workflow, and identifies key issues that need to be addressed in the future field of intelligence cognition.

Eyes on Her: A Pilot Eye-Tracking Study of the Attentional Advantage of Happy Female Faces

Our study investigated emotion-based modulation of attentional mechanisms towards male and female emotional faces presented simultaneously in extrafoveal vision. We presented pairs of female and male faces with the same emotion peripherally (≥5° from the fixation point) during a central letter discrimination task, to examine gender preference in attentional capture under conditions of mutual competition for processing resources. Eye movement measures were used to assess selective orienting and attentional engagement. Results revealed a strong attentional preference for happy and neutral female faces, indicating an inherent attentional advantage for female faces, particularly when expressing happiness. Happy female faces attracted significantly more initial fixations, supporting the hypothesis that positive emotions have a unique capacity to capture attention. Despite no significant differences in dwell time, female faces consistently received more fixations. These findings highlight the unique attention-capturing capacity of happy female faces and suggest that positive emotional expressions play a distinct role in guiding attentional processing.

How do LLMs Solve Multi-step Reasoning? An Algorithmic Evaluation

What algorithms do LLMs actually learn and use to solve problems? Studies addressing this question are sparse, as research priorities have focused on improving performance through scale. Here we introduce a framework for systematic research into the algorithms that LLMs learn and use (AlgEval). Toward this goal, we conducted a graph navigation study that typically requires multi-step search and evaluated whether Llama-3.1-8B uses classic search algorithms. We formed top-down hypotheses about candidate algorithms (e.g., breadth-first search, BFS, or depth-first search, DFS) and tested these hypotheses via circuit-level analysis of attention patterns and hidden states or representations. We found that 1) extracting possible sequences processed by the model's layer-by-layer representations did not support either BFS or DFS; 2) attention patterns showed a cascading shift toward the correct path as the prompt was processed; and 3) projecting node-token representations across layers onto a manifold revealed gradual separation of the goal from its competitor in representation space. Overall, our results do not support the idea that the model relies on forming or using an accurate map of the environment; instead of a step-by-step search, it appears to rely on more policy-dependent shifts. Future work can connect these findings to failure modes in multi-step reasoning. A rigorous, algorithmic evaluation of how LLMs solve tasks offers an alternative to resource-intensive scaling, potentially enabling more sample-efficient training, better performance, and novel architectures.

How Gesture Impacts Preschoolers' Recall and Inference for Narrative Stories

Gestures are a ubiquitous part of communication; however, their exact role in communication is not fully understood. Previous work has found that producing or viewing gestures with speech can benefit both the speaker and the listener in various learning and comprehension tasks, though it is unclear how gestures assist. This study uses a narrative comprehension task to examine the impact of viewing gestures on both recall and inference making in preschoolers, with the expectation that seeing gestures improves performance on both tasks. Results show that children who saw gestures remembered significantly more for the gesture-connected questions. There were no significant differences in inference scores; however, children's responses provide rich insight into their mental representations. Including gestures during narration can help the listener form mental representations that converge in form, but only when the gestures are relevant to what the children may be picturing.

Understanding Quantifier Scope with Large Language Models: How Many Children Climbed Trees?

Sentences with multiple quantifiers often lead to interpretive ambiguities, which can vary across languages. This study adopts a cross-linguistic approach to examine how large language models handle quantifier scope interpretation in English and Chinese, using probabilities to assess interpretive likelihood. Humanlikeness scores were used to quantify the extent to which LLMs emulate human performance across language groups. Results reveal that most LLMs prefer surface-scope (SS) interpretations, aligning with human tendencies, while only some differentiate between English and Chinese in inverse-scope (IS) preferences, reflecting human-like patterns. Humanlikeness scores highlight variability in LLMs' approximation of human behavior, but their overall potential is notable. Differences in model architecture, scale, and pretraining data, particularly the language background of models' pretraining data, significantly influence how closely LLMs approximate human quantifier scope interpretations. Deepseek-R1 was also explored for its potential in handling quantifier scope in English and Chinese.

How surprising is "1% for winning 1,000 yen": an information-theoretic analysis of the search for the definitive prediction principle

What is the value of probability? Keren and Teigen (2001) demonstrated that people prefer extreme probabilities ("10%" or "90%") to medium probabilities ("50%") and high probabilities ("90%") to low probabilities ("10%"), and proposed that people's perception of the value of probability phrases follows a principle of searching for definitive predictions. The present study proposes that this principle aligns with information theory and predicts that people's judgments of informativeness will vary according to their prior beliefs. Additionally, this study proposes that surprisingness judgments also obey predictions from information theory. To examine these propositions, participants were asked to estimate the valuableness and surprisingness of probability phrases expressing the winning probabilities of gambles. To manipulate prior beliefs about winning a gamble, the study created four conditions in which the winning amounts varied. Results indicated that participants' estimations of the informativeness of the probability phrases changed in accordance with predictions from the information-theoretic analysis.
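
To make the information-theoretic reading concrete (a toy illustration of our own, not the study's materials), Shannon surprisal, the negative log probability of an outcome, is what a "definitive prediction" account would track: a "1%" or "99%" phrase licenses a near-definitive prediction, so the unexpected outcome is far more surprising than any outcome under "50%":

```python
import math

def surprisal_bits(p: float) -> float:
    """Shannon surprisal of an outcome with probability p, in bits."""
    return -math.log2(p)

for phrase_prob in (0.01, 0.10, 0.50, 0.90, 0.99):
    win, lose = surprisal_bits(phrase_prob), surprisal_bits(1 - phrase_prob)
    print(f"p(win)={phrase_prob:>4}: surprisal(win)={win:5.2f} bits, surprisal(lose)={lose:5.2f} bits")
```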

What theoretical horizon for cognition? Towards a categorical cognitive science

Cognitive science was founded on the idea that an understanding of mind requires interdisciplinary approaches among specialist fields. However, after more than half a century since the cognitive revolution, the study of mind is as specialized as ever, which raises an existential question: What lies on the horizon for a theory of cognition—unification or dissolution? This question is approached by way of a parallel with the (meta-)mathematical field of category theory. A core principle is construction by universal (mapping) property: every instance of a structure is uniquely composed of a common constituent (map). This principle applies at the level of categories and at the meta-level of categories of other categories. The analogous view of cognitive science presented here is as a meta-theory employing this unifying principle: in the form of a slogan, Cognitive science is to mind as category theory is to mathematics. Funding: JSPS KAKENHI Grant Number JP23K11797

Using Erroneous Worked-out Examples for Supporting Collaborative Learning: An Investigation Based on the Cognitive Model of Link Errors using ACT-R

According to the account of collaborative learning offered by the Interactive-Constructive-Active-Passive (ICAP) theory, learners can deepen their understanding by engaging in interactive activities in which they elaborate their knowledge by integrating others' elaborated knowledge. However, it is unclear from comparisons between worked-out and erroneous worked-out examples whether the latter facilitate interactive activities and deepen understanding. Therefore, this study examines the effects of erroneous worked-out examples in collaborative learning. We employed a cognitive model based on Adaptive Control of Thought-Rational (ACT-R) to present concept map examples based on learners' relevant knowledge. Link errors were adopted as the errors because links are important for understanding knowledge. The results showed that the worked-out example improved learning performance but did not facilitate the collaborative learning process. Moreover, the erroneous worked-out example enhanced learning performance by facilitating interactive activities. These results provide insights into the effects of erroneous worked-out examples and a strategy for presenting them.

Testing the Zeigarnik effect in spontaneous memory recall during mind-wandering

Why do certain past events resurface more often in our thoughts? This study investigates the factors influencing retrospective mind wandering, particularly concerning incomplete or unresolved experiences. We used a custom-designed game in which in-game events were systematically varied, then assessed participants' spontaneous recall of those events one week later. Results revealed that offline participants and those familiar with the experimenter were likelier to experience game-related mind wandering. The strongest predictor of recall was the time spent engaged with the game, highlighting the importance of memory encoding strength. While individual rumination tendencies did not predict whether participants would recall the game, they did predict the frequency of such episodes among those who did. Thoughts centered on the game's protagonist over peripheral details, suggesting narrative salience. Based on these insights, we propose an initiation-maintenance model of retrospective mind wandering, which integrates bottom-up and top-down processes to generate and sustain spontaneous thoughts.

The Role of Plant Gestalt in Plant Embodied Cognition

I propose that plant cognition is better understood by focusing on the overall form and structure of a plant that continuously unfolds via growth—its gestalt—rather than on discrete movements of individual parts, such as roots, stems, or leaves. I conceptualize plant gestalt as a self-regulating, dynamic, fractal structure that modulates information flow and simultaneously channels resources across scales, thereby coordinating perception-action within the plant and with other organisms. By integrating concepts such as pink noise and self-organized criticality, I link the statistical signatures of optimal coupling to the physical architecture of living plants. This structural lens reframes plant cognition: rather than being merely distributed beyond a nervous system, form follows function and is materially enacted by the ever-changing topology of plant bodies. Recognizing gestalt growth as a cognitive substrate opens new research avenues into how plants sense, decide, and adapt, with the possibility of analyzing it as a record of past decisions.

What Do People Expect from Expected Value?

This study examines how instructional framing influences probability distortions in decision-making scenarios. With 136 participants, we explored four instruction conditions: direct calculation guidance, estimative averaging, and narrative framing from first- and third-person perspectives. Our findings indicate that skewness, rather than variance, significantly impacts estimation errors in expected value tasks. The estimative instruction condition notably reduced probability neglect, whereas direct calculation instructions unexpectedly introduced bias. Both narrative conditions amplified probability neglect, with no significant difference between perspectives. These outcomes challenge traditional assumptions in decision-making models, emphasizing the central role of skewness and the substantial effect of instructional framing on probability distortions. The results suggest that employing estimative instructions could effectively minimize biases in contexts where accurate evaluation of expected value is crucial. This research underscores the importance of instructional design in decision-making tasks and provides insights into minimizing probability neglect.
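
For concreteness (a toy illustration of our own, not the study's stimuli), expected value and skewness can be computed directly from a gamble's outcome distribution; two gambles can share an expected value while differing sharply in skewness, which is the property the results point to as driving estimation error:

```python
import numpy as np

def expected_value(outcomes, probs):
    return float(np.dot(outcomes, probs))

def skewness(outcomes, probs):
    mu = expected_value(outcomes, probs)
    dev = np.array(outcomes) - mu
    var = float(np.dot(probs, dev ** 2))
    return float(np.dot(probs, dev ** 3) / var ** 1.5)

# Same expected value (50), very different skewness
symmetric = ([40, 60], [0.5, 0.5])
skewed = ([0, 1000], [0.95, 0.05])
for name, (x, p) in [("symmetric", symmetric), ("skewed", skewed)]:
    print(name, "EV =", expected_value(x, p), "skew =", round(skewness(x, p), 2))
```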

What's going on? Surprising difficulties in complex relational rule discovery

In the present study, we create an analog of a math task for judging whether one integer is greater than another. Shapes (e.g., triangle, square) represent integers (3, 4), colors (green, red) denote sign (+/–), and spatial arrangement (above) depicts the comparison (greater than). Across two experiments, we find that this discovery task is surprisingly hard: after approximately 120 trials with feedback, average final performance is about 58%, not far above chance. Additionally, training on sub-rules using a variety of previously effective treatments, both with and without the support of examples, provides only short-term benefit to relational rule discovery. Our findings highlight the difficulty of learning complex relational structures purely from feedback, underscoring the possible need for more explicit guidance or extended practice to achieve robust transfer.

Learning in online chess increases with more time spent thinking and diversity of experience

What factors of our learning experiences enable us to best acquire complex skills? Recent ideas from artificial intelligence point to two such factors: (1) a balance of real experience with simulated experience acquired during planning itself, and (2) appropriate diversity in training examples. To test whether these factors influence the development of human expertise, we analyzed data from 1,873 chess players on the online platform Lichess, each of whom played hundreds to thousands of games over months to years. We found that both the time spent planning before moves and the diversity of opening positions encountered predict skill improvement over time. These findings suggest that principles shaping the development of expertise in artificial intelligence systems may also apply to human learning.

Explaining Necessary Truths

Knowing the truth is rarely enough; we also seek out reasons why the fact is true. While much is known about how we explain contingent truths, we understand less about how we explain facts, such as those in mathematics, that are true as a matter of logical necessity. We present a framework, based in computational complexity, in which explanations for deductive truths co-emerge with discoveries of simplifying steps during the search process. When such structures are missing, we revert, in turn, to error-based reasons, where a (corrected) mistake can serve as a fictitious but explanatory contingency-cause: not making the mistake serves as a reason why the truth takes the form it does. We simulate human subjects using GPT-4o, presented with SAT problems of varying complexity and reasonableness, validating our theory and showing how its predictions can be tested in future human studies.

Neural Speech Tracking and Accents: Are You Familiar with My Accent?

This study explores neural speech tracking of local and foreign accents. Studies have found neuro-cognitive differences in foreign accent processing at the level of lower-level acoustic extraction and higher-level predictive mechanisms. However, how these mechanisms are recruited in speech tracking for different accents remains unclear. We explored neural speech tracking while 24 native English speakers listened to local and foreign accents in an EEG experiment. We examined decoding accuracy by comparing speech envelopes predicted with the Temporal Response Function to the speech envelopes of our stimuli. Results showed stronger tracking for the local accent and for accents participants rated as more familiar. The findings suggest that participants used available cognitive resources to recruit predictive mechanisms during local accent processing, allowing them to attend to speech cues more efficiently. This top-down benefit was less available for foreign accents, as listeners could not effectively access pre-stored sound variations for predictions.

Examining Children's Applications of Privacy Norms in a Digital, Photo-Sharing Game

Children develop an understanding of privacy through experiences in both real and virtual contexts. As technology becomes central to their lives, it is crucial to explore how they navigate privacy in digital environments. This study examined children's application of privacy norms in a digital photo-sharing context and tested whether a privacy intervention could improve their understanding. In Experiment 1, 85 children (ages 5–8 years) and in Experiment 2, 35 children (ages 5–7 years) listened to a story about Sally, who appeared as a cartoon and a real person. Children decided whether Sally should allow a game to take her picture in different settings. Older children judged taking real Sally's picture as less permissible than cartoon Sally's. Experiment 2 introduced a privacy intervention, which influenced judgments equally across both versions of Sally. These findings suggest that children's privacy reasoning develops with age and can be shaped by targeted interventions.

Hierarchical Instance-Based Learning for Decision Making from Delayed Feedback

In real-world decision making, outcomes are often delayed, meaning individuals must make multiple decisions before receiving any feedback. Moreover, feedback can be presented in different ways: it may summarize the overall results of multiple decisions (aggregated feedback) or report the outcome of individual decisions after some delay (clustered feedback). Despite its importance, the timing and presentation of delayed feedback has received little attention in cognitive modeling of decision-making, which typically focuses on immediate feedback. To address this, we conducted an experiment to compare the effect of delayed vs. immediate feedback and aggregated vs. clustered feedback. We also propose a Hierarchical Instance-Based Learning (HIBL) model that captures how people make decisions in delayed feedback settings. HIBL uses a super-model that chooses between sub-models to perform the decision-making task until an outcome is observed. Simulations show that HIBL best predicts human behavior and specific patterns, demonstrating the flexibility of IBL models.
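
For readers unfamiliar with the IBL family, the sketch below shows the generic ingredients such models share: activation-weighted retrieval of stored instances and a blended value over their outcomes. It is a minimal illustration of standard Instance-Based Learning, not the hierarchical super-/sub-model structure of HIBL itself:

```python
import numpy as np

rng = np.random.default_rng(0)

def activation(lags, decay=0.5, noise_sd=0.25):
    """ACT-R style base-level activation of one stored instance (lags = trials since each use)."""
    return np.log(np.sum(np.array(lags, dtype=float) ** (-decay))) + rng.normal(0, noise_sd)

def blended_value(outcomes, activations, temperature=0.25):
    """Blend stored outcomes, weighted by retrieval probabilities (softmax over activations)."""
    w = np.exp(np.array(activations) / temperature)
    w /= w.sum()
    return float(np.dot(w, outcomes))

acts = [activation([1, 5]), activation([2]), activation([10, 12])]
print(blended_value(outcomes=[4.0, 0.0, 4.0], activations=acts))
```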

Knowledge of Examples Affects Conditional Reasoning About Math

Conditional reasoning, or reasoning with if-then statements, depends in part on knowledge. However, the mechanisms underlying this dependence are not fully understood. We propose that example knowledge—the ability to generate and categorize examples of logical possibilities—plays a central role, and therefore hypothesize that individual differences in example knowledge contribute to differences in conditional reasoning. Two studies tested this hypothesis in the domain of algebra. In Study 1, individual differences in example knowledge predicted differences in conditional reasoning about algebra when controlling for everyday conditional reasoning and general algebra knowledge. In Study 2, training designed to improve example knowledge improved conditional reasoning about algebra. We discuss implications of the findings regarding the mechanisms underlying the knowledge-dependence of conditional reasoning and the nature of individual differences in conditional reasoning.

Do bilinguals avoid ambiguity? An experimental study of lexical ambiguity in spoken Mandarin

Previous research has proposed that bilinguals would rather be redundant than ambiguous. To test this hypothesis, we conducted an experiment examining lexical ambiguity in spoken Mandarin at the tonal, segmental, and orthographical levels. Using a picture naming task, we explored how L1 Mandarin L2 English speakers in the UK and more-monolingual speakers in China resolve ambiguity, analysing their verbal responses when naming pictures and manipulating whether the context in which a picture is named makes the preferred label ambiguous (e.g., do speakers avoid saying "fen3 si1" when describing a picture of glass noodles when it appears alongside a picture of fans, which shares the same label? Do bilinguals avoid this ambiguity more than their more-monolingual peers, as claimed?). Our results do not support this hypothesis, as no reliable differences between groups were found. Despite the null results, we observed several interesting patterns worthy of further investigation.

Crossmodal Processing Effects Through an Eye Tracking Lens

The current experiments used an eye tracker to examine how congruent and incongruent stimuli in one modality affect processing in a second modality. Participants in Experiment 1 had to quickly determine if a stimulus was an animal or a vehicle, and participants in Experiment 2 had to determine if a stimulus had one or two circles (visual), or one or two beeps (auditory). Stimuli were congruent (dog/dog bark, or one circle/one beep) or incongruent (dog/car horn, or one circle/two beeps), and performance was compared to unimodal baseline conditions. Behavioral results in both tasks show that visual stimuli had a larger effect on auditory responding than vice versa – as congruent stimuli sped up responding while incongruent stimuli slowed down responding and decreased accuracy. Oculomotor data acquired via eye tracking mirrored behavioral results, with auditory conditions being more susceptible to interference effects observed through congruency manipulations.

Using Tools From Animal Psychology to Measure Metacognition in Artificial Intelligence

Metacognition is the ability to monitor one's cognitive processes, including one's own uncertainty when making a decision. Metacognition has been studied in humans and other animals for several decades, and it is of increasing interest to the Artificial Intelligence (AI) research community too. In this work, we implement a well-established experimental procedure to study whether two classes of AI system can monitor their own uncertainty. We ask deep reinforcement learning agents and vision language models to learn to discriminate two stimuli that vary in similarity, where they are rewarded for correct and punished for incorrect discriminations. By measuring the frequency at which they choose not to make a choice, and analysing how that varies with stimulus similarity, we produce a measure of the degree to which their uncertainty informs their decisions. We find some limited evidence that the AI systems we study monitor their own uncertainty when making risky decisions.

What makes a conversation interesting? Linguistic features predictive of interest in educational conversations between teachers and learners of English

Stimulating language learners' engagement is essential to successful second language acquisition, but it can be hard to translate this intuition into effective learning resources. In the first large-scale investigation into the linguistic and pragmatic features that make an educational conversation interesting, we collected interest ratings for 64 conversations between teachers and second language learners of English. We provide proof of concept that, despite the high degree of subjectivity involved in perceptions of interest, it is possible to extract features that make a conversation interesting for the average learner. Specifically, concreteness, comprehensibility, and uptake (i.e., the degree to which a teacher's and a student's turns build on one another) all had unique relations to interest in our data. These findings lay the foundations for future work on the optimization of AI tutors for more engaging language learning interactions.

LLMs have "mental" models: Latent world models in LLM network weights can be inferred from output layer tokens at inference

Do large language models (LLMs) construct and manipulate internal "mental models" of physical systems, or do they rely solely on statistical associations represented as output layer token probabilities learned from data? We adapt cognitive science methodologies from human mental models research, testing LLMs on pulley system problems using TiKZ-rendered stimuli. Study 1 examines whether LLMs can estimate mechanical advantage (MA) while distinguishing relevant from irrelevant system components, and disregarding distractor elements. We found that contemporaneous state-of-the-art models performed marginally but significantly above chance when exact estimate-label matches were required, and that their estimates correlate significantly with ground-truth MA. Crucially, tested models selectively attended to meaningful variables (e.g., number of ropes and pulleys) while ignoring system features that are irrelevant to MA (rope diameter, pulley diameter, ceiling height). Study 2 extends this by investigating the extent to which LLMs may internally represent gestalt system features, which are crucial to estimating MA: LLMs evaluated a functionally connected pulley system against a "fake" system comprising unconnected components. Without explicit cues that one system was non-functional, models correctly identified the connected system as having greater MA with an average accuracy of 84%. However, their explanations failed to acknowledge the fundamental distinction between connected and unconnected systems, instead relying on post hoc rationalizations over false premises (e.g., assuming both systems were connected and inferring MA from supporting ropes). This suggests that while LLMs manipulate internal "world models" analogous to human mental models, these may be conceptually uncoupled from explicit reasoning at the output layer. These findings provide evidence that LLMs may construct latent world models that inform token probabilities, challenging the notion that they are "only" next-token predictors.
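
For reference, the physical rule the models were probed on is simple, which is why the distractor features can be ignored: in an ideal pulley system, mechanical advantage equals the number of rope segments supporting the load. The sketch below (our own, not the study's code) states this directly:

```python
# Ideal pulley system: MA = number of supporting rope segments.
# Rope diameter, pulley diameter, and ceiling height do not affect MA,
# which is why they work as irrelevant distractor features in the study.
def mechanical_advantage(supporting_rope_segments: int, **irrelevant_features) -> int:
    return supporting_rope_segments

print(mechanical_advantage(4, rope_diameter_mm=12, ceiling_height_m=3.0))  # -> 4
```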

Can Abstract Categories Be Represented by Shared Features in Concept Bottleneck Models?

Learning abstract concepts is a core component of human cognition, yet it remains challenging for artificial intelligence. We present a computational model that investigates whether abstract categories can be acquired through shared perceptual features, using a Label-Free Concept Bottleneck Model (CBM) trained to induce basic-level concepts from shared features. Concepts are represented through an intermediate concept layer, enabling the model to form grounded representations of basic-level categories. To evaluate conceptual robustness beyond surface-level accuracy, we conduct a series of generalization and ablation experiments. These assess whether the model forms robust conceptual representations rather than merely mapping inputs to labels. Our results show that the CBM achieves high accuracy on a dataset comprising four basic-level classes and twelve subordinate ImageNet subclasses, while also yielding interpretable intermediate representations. This framework demonstrates that abstract categorization can emerge through feature-based induction and suggests a pathway for cognitive models of concept learning.
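
The generic bottleneck idea can be made concrete in a few lines. This is a minimal sketch of a standard concept bottleneck (assuming a PyTorch setting), not the authors' label-free architecture: class labels are predicted only from an intermediate layer of concept activations, which is what makes that layer inspectable:

```python
import torch
import torch.nn as nn

class TinyCBM(nn.Module):
    def __init__(self, feat_dim=512, n_concepts=32, n_classes=4):
        super().__init__()
        self.to_concepts = nn.Linear(feat_dim, n_concepts)   # interpretable bottleneck
        self.to_classes = nn.Linear(n_concepts, n_classes)   # label predicted only from concepts

    def forward(self, features):
        concepts = torch.sigmoid(self.to_concepts(features))
        return self.to_classes(concepts), concepts

logits, concepts = TinyCBM()(torch.randn(8, 512))
print(logits.shape, concepts.shape)   # torch.Size([8, 4]) torch.Size([8, 32])
```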

The Role of Gesture in Emotion Communication: Patterns Across Emotional Categories and Stimulus Types

Gestures are crucial cues in emotion communication, yet little is known about how specific emotions elicited through different stimuli are linked to gesture production. The present study investigated how gesture frequency and type (representational vs. nonrepresentational) varied across specific emotion categories (i.e., happiness, anger, sadness, and neutral) elicited by visual stimuli and written narratives. In a within-subject design, participants (n=38) retold emotionally charged movie clips and written narratives for each emotion. The results showed that participants overall produced fewer representational gestures when describing sadness compared to happiness and neutral, and anger compared to neutral. Interestingly, participants overall produced more nonrepresentational gestures in narrative descriptions than in movie clip descriptions. However, gesture frequency and type did not significantly differ across movie clips and narratives. These results underscore the importance of accounting for both stimulus type and specific emotion categories when examining the role of gestures in emotion communication.

Through Thick and Thin: People Think Family Will and Ought to Reconcile

In a preregistered experiment, adults living in the United States (N = 700) expected family (here, siblings) to be more likely to reconcile than friends after a conflict. Participants also reported, to a greater extent, that siblings (vs. friends) have to reconcile and that failing to do so would be less morally permissible. Further, participants expected love between siblings to be negatively affected to a lesser extent than love between friends who experienced the same conflict. We also explored potential generational differences, and found that Baby Boomers (people born in the years 1946–1964) reported that family members were significantly more obligated to reconcile than did Millennials (people born in the years 1981–1996). Our findings indicate that ties to family members are especially expected, and felt to be obligated, to persist through thick and thin.

Order Effects in Evidence Chains: Normative and Naïve Evaluations

This is a first exploration into a newly identified reasoning error. We explore normative (derived from Bayes' rule) and naïve (empirical data) evaluations of how the order of reliability within chains of evidence (e.g., hearsay testimony) impacts overall evidential value. In a novel paradigm, we swap the position of two witnesses within the chain to determine the effect of order when these witnesses differ in their reliability. First, a probabilistic (Bayesian) assessment is provided, including both quantitative and qualitative explanations. Second, lay reasoners' qualitative intuitions are measured, using Bayesian predictions as a benchmark for accuracy. Lay reasoners significantly deviate from Bayesian predictions. Three quarters of participants (75.41%) made an error when evaluating order effects in hearsay testimony, with 49.18% wrongly concluding that order has no impact. Only 24.59% correctly judged that the preferential order had the greatest evidential value. We illustrate how hearsay testimony is inherently complex and an optimal evaluation is nonintuitive.
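
To make the Bayesian benchmark concrete, here is a toy sketch (not the paper's actual model) in which each witness in a two-link hearsay chain is treated as an asymmetric noisy channel with its own hit and false-alarm rates; once those rates differ between witnesses, the posterior depends on which witness sits where in the chain. All numbers are illustrative.

```python
# Toy two-link hearsay chain: the event is relayed through witness A then
# witness B, each an asymmetric noisy channel with hit rate h (passes on a
# "yes" correctly) and false-alarm rate f (reports "yes" given a "no").
def chain_report_prob(event, witnesses):
    """P(final report = 'yes' | event) after relaying through the chain."""
    p_yes = 1.0 if event else 0.0
    for h, f in witnesses:
        p_yes = p_yes * h + (1 - p_yes) * f
    return p_yes

def posterior_event(prior, witnesses):
    """P(event | final report = 'yes') via Bayes' rule."""
    like_e = chain_report_prob(True, witnesses)
    like_not = chain_report_prob(False, witnesses)
    return prior * like_e / (prior * like_e + (1 - prior) * like_not)

reliable, sloppy = (0.9, 0.05), (0.7, 0.3)       # illustrative (h, f) values
print(posterior_event(0.5, [reliable, sloppy]))  # reliable witness first
print(posterior_event(0.5, [sloppy, reliable]))  # sloppy witness first: differs
```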

Seventeen Times Six Equals Eleven-Twelve: A Multi-Level Analysis of a Conversation with Donald Trump

In this work, we apply a variety of theoretical perspectives and tools to discourse from a popular talk show. We describe how participants in a 72-second excerpt from The Howard Stern Show marshal social, cognitive, and linguistic resources to negotiate the answer to a simple arithmetic problem. Our interdisciplinary analysis of the interaction identifies phenomena occurring at multiple levels as eight people are clamoring to influence the conversation in their own way, arguing, interrupting one another, aligning on certain phrases, and generating embodied mental imagery. While math errors are common human mistakes, the social dominance effects that allow a wrong answer to be accepted as a correct answer are concerning. We conclude that when multiple theoretical perspectives and tools are applied together on real-world linguistic phenomena, there is a multi-scale richness and massive interactivity that is sometimes neglected when language scientists focus solely on a single level of analysis.

Beyond East and West: Cognitive Preferences in English, Chinese and Japanese Event Description

This study challenges traditional East-West dichotomies in cross-linguistic cognition by examining event construal preferences across English, Chinese, and Japanese speakers. We investigated how 90 participants (30 per language group) described visual stimuli depicting agent-patient interactions with varying animacy types. Statistical analysis revealed that Chinese speakers' construal patterns aligned with English speakers (p>.05), contrasting sharply with Japanese speakers despite China's cultural proximity to Japan. Both English and Chinese groups demonstrated greater flexibility in perspective-taking across all agent types (human>animal>object), while Japanese speakers showed significantly stronger constraints with inanimate agents (p<.0001). These findings suggest that grammatical flexibility in encoding perspectives, rather than cultural grouping, shapes cognitive preferences in event description. Our results indicate that linguistic structures may influence cognition independently of cultural boundaries, revealing a more complex relationship among language structure, cognitive preferences, and traditional cultural categorizations than previously assumed.

How Habit Learning Guides Planning: A Normative View and Behavioral Evidence

Human behavior is determined by both learned habits and prospective planning. Because planning is computationally expensive, humans face two meta-control challenges: They must determine when to plan and, if so, which potential futures to consider. We propose that habit learning itself could solve these meta-control problems by prioritizing which futures to explore and to what extent. We show how this notion emerges from a normative Bayesian model and test one of the resulting predictions empirically. To do so, we developed a behavioral paradigm that operationalizes model-based planning as spatial navigation through a maze. Our findings suggest that humans indeed incorporate learned habitual information during planning in a manner closely aligned with the Bayesian model. This corroborates existing reinforcement learning accounts and contributes a normative and unifying perspective.
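
As a loose illustration of the proposal (not the authors' Bayesian model), the sketch below treats learned habit strengths as a prior over which branches of a planning problem to simulate, so planning effort concentrates on habitually favored futures; the actions, strengths, and temperature are made up for illustration.

```python
# Illustrative sketch: habit strengths act as a prior over which futures get
# simulated, focusing planning on options that habit learning marks as promising.
import numpy as np

rng = np.random.default_rng(0)
habit_strength = {"left": 6.0, "right": 1.0, "up": 3.0}   # learned action frequencies

def expansion_prior(habits, temperature=1.0):
    acts, s = list(habits), np.array(list(habits.values()))
    p = np.exp(s / temperature)
    return acts, p / p.sum()

def plan(n_rollouts=10):
    acts, prior = expansion_prior(habit_strength)
    # sample which futures to simulate in proportion to habitual preference
    return rng.choice(acts, size=n_rollouts, p=prior)

print(plan())   # most simulated futures start with the habitual action "left"
```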

Schema-Induced Emotional Arousal Enhances Task Performance: A Pupillometric Investigation of Top-Down Cognitive Influence

In some cases, cognitive function can change owing to external factors without physiological effects. Such changes are thought to reflect emotional arousal triggered by schema activation. This study analyzed the relationships between emotional arousal and task performance to determine whether schemas can induce emotional arousal and thereby enhance task performance. Emotional arousal was assessed based on pupil dilation using eye tracking, and task performance was measured based on reaction times to the stimuli. No main effect of condition on reaction time was observed. However, the experimental results showed that the larger the pupil dilation, the shorter the reaction time, but only when schema activation stimuli were present. These findings suggest that schema-induced emotional arousal may improve performance without the intake of substances with physiological effects. This may also offer insight into mechanisms underlying the placebo effect, though further investigation is required.

Studying Mathematical Reasoning through the Gadget Game

Mathematicians regularly come up with multi-step solutions to difficult problems, formulating intermediate statements and subgoals, deciding which ones to attempt to prove, and judging when to start, stop, or come back to a question. What drives these and similar cognitive processes? Studying mathematical reasoning is challenging, in part because of a lack of engaging yet controlled environments in which to do so. We introduce a new game, the Gadget Game, for this purpose. Each level in the Gadget Game can be an encoding of a provable mathematical statement, together with hypotheses and deduction rules, that obscures the semantic content of the original problem. The resulting puzzles are enjoyable to play. We conduct a series of preliminary experiments involving a web-based crowdsourced experiment and a "think aloud" deep-dive with two experienced mathematicians. We believe that the Gadget Game is a ripe domain for interesting cognitive science that engages deeply with mathematical thought.

Text Typography with Font Size Variations to Distinguish Information Importance Improves Chinese Readability

In today's era of information overload, efficiently extracting information from extensive textual content is a crucial reading challenge. This study investigates how text typography with font size variations based on information importance influences the readability of Chinese. A total of 236 undergraduate students from various universities in China were randomly selected to participate in the experiment. The experiment examines the influence of typography on reading along three dimensions: objective reading performance, subjective evaluation, and reading attention. The study has two main findings. First, larger Chinese characters are more effective at capturing readers' attention during Chinese reading. Second, typography that varies font size to distinguish information importance based on Chinese grammar can significantly enhance readability. Overall, these findings demonstrate that font size variation keyed to information importance can improve the readability of Chinese text.

SIESTA: A Spectral-Temporal Unified Framework for Robust Cross-Subject EEG Analysis

Electroencephalography (EEG) provides critical insights into brain activity, yet its inherent variability and nonstationary nature pose significant challenges for computational analysis, particularly in cross-subject generalization tasks. We present SIESTA (Spectral Invariant EEG-based Semi-causal Transform Architecture), a novel EEG foundation model that addresses these challenges through three key innovations: (1) VQGAN-based spectral tokenization capturing wavelet representations of EEG; (2) a dual-stream Transformer architecture pre-trained using a semi-causal generative modeling approach; and (3) Contrastive Invariant Fine-Tuning (CIFT), a label-free domain adaptation strategy that aligns feature distributions across subjects by integrating spectral-temporal dynamics. Pre-trained on over 32,900 hours of diverse EEG data, SIESTA achieves state-of-the-art performance in epilepsy monitoring, improving F1-score by 12.4% on scalp EEG and 8.7% on intracranial EEG. Beyond epilepsy, SIESTA demonstrates strong generalizability to non-epilepsy tasks, including motor imagery and sleep stage classification. These results validate that spectrotemporal integration and domain-invariant learning are fundamental for modeling cross-subject EEG variability, establishing new benchmarks for robust brain-computing systems.
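
For intuition only, the snippet below sketches one common way to implement a label-free, contrastive alignment objective across subjects (an InfoNCE-style loss); the actual CIFT objective, feature dimensions, and temperature used in SIESTA may differ.

```python
# Rough sketch of a label-free, InfoNCE-style alignment loss across subjects
# (an illustration of the idea behind contrastive invariant fine-tuning; the
# actual CIFT objective may differ).
import torch
import torch.nn.functional as F

def contrastive_alignment(feats_a, feats_b, temperature=0.1):
    """feats_a, feats_b: [n, d] features for the same EEG segments encoded from
    two subjects/views; matching rows are treated as positive pairs."""
    a = F.normalize(feats_a, dim=1)
    b = F.normalize(feats_b, dim=1)
    logits = a @ b.t() / temperature               # similarity of every pair
    targets = torch.arange(a.size(0))              # i-th row matches i-th column
    return F.cross_entropy(logits, targets)

loss = contrastive_alignment(torch.randn(16, 128), torch.randn(16, 128))
print(loss.item())
```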

On the Role of Nonsymbolic and Symbolic Numeracy Skills in Number-Line Estimation Processes

Number-line estimation tasks (NLETs) have been used to assess symbolic numerical skills (SNS; Booth & Siegler, 2008; Lyons & Ansari, 2015) and have also been associated with the approximate number system (ANS; Khanum et al., 2016; Wong et al., 2016). A recent study with 6–7-year-old children in Sweden (Morell-Ruiz et al., 2025) provided evidence that training NLE abilities can help bridge these two numerical systems, suggesting that the ANS may actively scaffold the development of the SNS. Building on this, we designed a novel two-choice NLET compatible with Drift Diffusion Model (DDM) fitting, allowing us to decompose children's estimation processes into interpretable parameters. Our results show that DDM parameters significantly correlate with performance in both symbolic and nonsymbolic tasks, and that performance on the two-choice and standard NLETs is strongly correlated. These findings validate our paradigm, offering new insights into the cognitive mechanisms linking numerical representations via number-line estimation.
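
As background on the modeling step, the sketch below simulates a basic two-boundary drift-diffusion process with made-up parameters; the study fits such a model to children's two-choice responses, but the parameterization and fitting procedure here are illustrative assumptions only.

```python
# Minimal drift-diffusion simulation (illustrative; parameter values are made up).
import numpy as np

def simulate_ddm(drift, boundary, ndt, n_trials=500, dt=0.001, noise=1.0, seed=0):
    rng = np.random.default_rng(seed)
    rts, choices = [], []
    for _ in range(n_trials):
        x, t = 0.0, 0.0
        while abs(x) < boundary:
            x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
            t += dt
        rts.append(t + ndt)                 # add non-decision time
        choices.append(1 if x > 0 else 0)   # upper vs. lower boundary
    return np.array(rts), np.array(choices)

rts, choices = simulate_ddm(drift=1.5, boundary=1.0, ndt=0.3)
print(rts.mean(), choices.mean())   # mean RT and proportion of "upper" responses
```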

Gesture Restriction's Effect on Silent Pauses in Emotional Narratives: An Embodied Emotion Perspective

Gestures and silent pauses are integral to the pragmatic, semantic, and temporal organization of speech. However, how gesture inhibition affects silent pauses in emotional contexts remains unexplored. This study examined how narrative type and gesture conditions affect (a) the distribution of silent pauses [short (250–500 ms), medium (500–1000 ms), and long (>1000 ms)], measured by frequency, average duration, and time ratio, while controlling for speech rate, and (b) self-reported emotional intensity. Thirty participants (Mage=20.61) narrated negative (sadness, fear, anger) and neutral (daily routine) experiences in Hindi-English under gesture-restricted (N=15) and gesture-free (N=15) conditions. A significant main effect of narrative type was found: short pauses increased in neutral narratives under the gesture-free condition, and long pauses increased in negative narratives under the gesture-restricted condition. No significant main effect of gesture condition or interaction effect was observed. Gesture restriction also appeared to increase self-reported emotional intensity during negative narratives.

Examining the Relationship Between Joint Attention and Word Recall in Preschool-Aged Children

Joint attention (JA) is a crucial form of early communication strongly associated with word learning and vocabulary development. However, limited research has examined JA in children older than 36 months, despite its potential role in classroom learning. This study employed head-mounted eye-tracking in a social interactive context to investigate the relationship between JA and word recall. During the study, JA was measured as parent-child pairs engaged in play with unfamiliar objects, with parents actively naming the objects. Subsequently, children were tested on their ability to recall the names of these objects. The results suggest that JA significantly predicted the recall of unfamiliar words but was not related to vocabulary development. These results contribute to the growing body of research on JA by providing insights into the potential mechanisms supporting word learning and vocabulary development.

Understanding the Impact of Metacognitive Ability on Decision-Making with Causal Diagrams

People use their knowledge to evaluate information and make decisions. Yet this knowledge may be faulty, like many lay beliefs on health remedies. This can lead to incorrect decisions and choosing ineffective interventions. While people often overestimate their knowledge, as shown in prior research, less is known about how metacognitive factors such as perceived versus actual knowledge interact with new information we receive during decision-making. Prior work has found that the simplest causal models are most helpful for everyday decisions, but did not examine the role of people's existing knowledge. To address this gap, we conducted an online experiment to examine how metacognitive abilities influence decision-making with causal diagrams for Type 2 diabetes management. Actual knowledge positively predicted decision-making accuracy, while perceived knowledge had a negative effect, and simpler diagrams led to higher accuracy regardless of prior knowledge. We discuss the implications of our findings for designing decision support interventions.

Computational Implementation of a Model of Category-Theoretic Metaphor Comprehension

In this study, we developed a computational implementation of a model of metaphor comprehension based on the theory of indeterminate natural transformation (TINT) proposed by Fuyama et al. We simplified the algorithms implementing the model to be closer to the original theory and verified them through data fitting and simulations. Specifically, we proposed a method that replaces the deterministic operation of the existing model with a probabilistic (softmax) operation. The outputs of the algorithms are evaluated with three measures: fit to experimental data, the size of the correspondence in the metaphor comprehension result, and the novelty of the comprehension (i.e., the correspondence of the associative structure of the source and target of the metaphor). The improved algorithm outperformed the existing one on all three measures. We suggest that the metaphor comprehension process in humans is based on a more probabilistic procedure.
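
The core change, replacing an arg-max selection with softmax sampling, can be sketched as follows; the association strengths and temperature are illustrative values, not parameters from the model.

```python
# Sketch of replacing a deterministic arg-max choice with a probabilistic
# softmax choice over association strengths (illustrative values only).
import numpy as np

def softmax(strengths, temperature=1.0):
    z = np.asarray(strengths, dtype=float) / temperature
    z -= z.max()                      # numerical stability
    p = np.exp(z)
    return p / p.sum()

strengths = [2.0, 1.5, 0.2]           # association strengths for candidate mappings
print(np.argmax(strengths))           # deterministic: always picks mapping 0
rng = np.random.default_rng(0)
print(rng.choice(3, p=softmax(strengths)))  # probabilistic: mapping 1 sometimes wins
```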

The Role of Insight and Analytic Learning During Concept Acquisition

We report an experiment that examines the relationship between insight and analytic learning and relational concept acquisition. This paper introduces a novel paradigm in which warmth judgments were included throughout a relational category learning task. Specifically, some subjects completed this category learning task through classification, whereas others completed it through inference. Additionally, some subjects made warmth judgments throughout this task, whereas others did not (control). The results revealed no evidence that warmth judgments impacted concept learning. More importantly, we find that subjects who engaged in classification reported greater and more rapid increases in warmth judgments than subjects who engaged in inference. These findings directly paralleled subjects' learning patterns, wherein classification subjects showed evidence of more rapid learning, whereas inference subjects displayed more gradual, stepwise learning. Taken together, the present results suggest that classification involves more insight-based learning, whereas inference seems to involve more analytic learning.

Recognizing Voices: Do Listeners Rely on Specific Exemplars or Summary Statistics?

Listeners recognize talker identity by storing exemplars or by forming abstract representations. However, it is unclear which strategy they use for nonnative-accented talkers, whose speech requires more cognitive effort to process. We examined whether native English listeners rely on abstraction or exemplars when learning to recognize Mandarin-accented and American-accented talkers. We trained listeners to identify voices that varied in glottal pulse rate and vocal tract length. They then made recognition judgments for two types of stimuli: ring-shaped tokens, located at the perimeter of a talker's voice space and heard during training, and center tokens, located at the average of the trained distribution but not heard during training. Results showed higher accuracy for center tokens and, crucially, this pattern emerged for both native and nonnative talkers, suggesting that talker recognition relies on an abstraction-based encoding strategy regardless of listeners' prior experience with specific talker types.

Dynamic Inter-brain Synchrony in Real-life Creative Problem Solving in Teams: an fNIRS-based Hyperscanning Study

The ability to solve problems creatively is a pivotal characteristic of the human brain, yet its underlying neural mechanisms remain largely unknown. Previous hyperscanning studies mainly analyzed the entire time series of brain signals to reveal an overall pattern of inter-brain synchrony (IBS) during social interaction. However, we argue that this approach may fail to capture the dynamic properties of inter-brain interaction. In this study, we proposed a novel approach based on sliding windows and k-means clustering to identify the dynamic modulation of IBS patterns during an interactive creative problem-solving task. Results showed that inter-personal communication can be characterized as a series of dynamic and modular IBS states across the task. Moreover, the transitions between dynamic IBS states were highly correlated with the dyad's creative ability. In sum, the proposed approach holds great promise for advancing our current understanding of the dynamic neurocognitive processes underlying social interaction.
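
A minimal sketch of the sliding-window-plus-clustering idea is shown below, using per-window channel-wise correlation as a stand-in synchrony measure; the window length, synchrony metric (the study uses fNIRS-based IBS), and number of states are assumptions for illustration.

```python
# Sketch of the sliding-window + k-means idea (illustrative values only).
import numpy as np
from sklearn.cluster import KMeans

def windowed_ibs(sig_a, sig_b, win=50, step=10):
    """Per-window, per-channel correlation between two participants' signals.
    sig_a, sig_b: arrays of shape [time, channels]."""
    feats = []
    for start in range(0, sig_a.shape[0] - win + 1, step):
        a, b = sig_a[start:start + win], sig_b[start:start + win]
        r = [np.corrcoef(a[:, c], b[:, c])[0, 1] for c in range(a.shape[1])]
        feats.append(r)
    return np.array(feats)            # [n_windows, channels]

rng = np.random.default_rng(0)
a, b = rng.standard_normal((1000, 8)), rng.standard_normal((1000, 8))
states = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(windowed_ibs(a, b))
print(states[:20])                    # sequence of dynamic IBS states over time
```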

Non-linear relational composition in large language models

The longstanding question of how neural networks could implement relational composition has been buoyed by recent success showing relational abstraction in transformer-based large language models (LLMs). We address recent findings showing some, but imperfect, generalizability of linear composition during knowledge retrieval of attributive triplets (Hernandez et al., 2024, Linearity of relation decoding in transformer language models, arXiv:2308.09124). We report that limitations to relational generalization are explainable by two systematic factors. First, relational combinations that are more accurately retrieved generalize better than uncertain or inaccurate ones. Second, relational generalization scales with the semantic similarity of the entities being bound across triplets, showing that it is in fact non-linearly dependent on component meanings rather than being purely invariant. This aligns with longstanding findings that human judgments of adjectival combinations are likewise non-linearly interactive.

Mental Sampling in Social Judgment: Examining Variability in Judgments for the Self, Close, and Distant Others

A growing number of theories explain various aspects of cognition through processes of mental "sampling." Under these theories, judgments (e.g., predicting whether a friend will be late) are accomplished by generating and aggregating samples, through simulation or memory retrieval. Here, we examined a key prediction of these theories: that the variability of judgments will be lower when more samples can be drawn. We test this with a novel intervention in a simple social inference task, examining people's ability to judge the probability of various everyday behaviors and comparing judgments made for themselves versus others. Responses were more consistent when participants predicted their own behavior than when they predicted that of an acquaintance, suggesting that a greater number of samples could be drawn. Surprisingly, we found only a weak relationship between the time spent with a target and the variability of estimates, suggesting that sampling processes may not rely only on retrieval from memory.

Universals in Visual Word Recognition: Investigating the Optimal Viewing Position for Visual Words in Hindi

The Optimal Viewing Position (OVP; Nazir, Heller & Sussman, 1992; Brysbaert & Nazir, 2005) for reading visual words is well studied and documented in most alphabetic languages (Grainger, 2022). The current study investigates the optimal viewing position in Hindi, written in the Devanagari script, in an attempt to understand reading universals across scripts as suggested by Frost (2012). We carried out two experiments using the lexical decision task, one each on words with maatraa (3, 4, and 5 varna) and without maatraa (2, 3, and 4 varna). Maatraas are diacritics used to mark vowel pronunciations with aksharas in Hindi and can be placed all around the base akshara, adding to the graphemic complexity of a Hindi word (for a detailed review, see Share et al., 2015; Rimzhim et al., 2021; Verma et al., 2021). The results demonstrate robust optimal viewing position effects on reaction times, with a U-shaped curve in relation to initial fixation position, and the optimal viewing position was found to be slightly to the left of the centre of the word. These results are similar to the OVP findings reported for European languages (Brysbaert & Nazir, 2005) and may add information about reading universals across scripts and writing systems. Keywords: Optimal Viewing Position, Hindi, Lexical Decision Task

Gaze and Gluttony: How BMI Affects Our First Fixations

We investigated selective attention towards high- versus low-calorie foods in individuals with normal (BMI 18–25) and high BMI (>25) using eye-tracking. Participants performed a central letter discrimination task while irrelevant food images appeared peripherally (5° from fixation). We measured the probability and latency of first fixations. High-BMI participants were more likely to initially fixate on high-calorie processed foods. In contrast, the normal-BMI group showed a bias towards low-calorie foods, although this observation needs further investigation. These attentional biases occurred despite controlling for perceptual differences between images. The high-BMI group showed a bias in the left visual field, consistent with literature on right-hemisphere dominance in processing rewarding stimuli. This early, automatic attentional bias may contribute to unhealthy eating patterns and has implications for understanding cognitive mechanisms involved in obesity, particularly the role of bottom-up attentional capture by rewarding food stimuli.

Negation as a tool for conveying mental models

Negations (e.g. "the ball isn't red") are thought to contain less information than their positive counterparts (e.g. "the ball is green"), which poses a pragmatic puzzle: why ever use them? We contend that negations convey additional information about a speaker's mental model of the world, revealing preferences and expectations. For example, "the ball isn't red" implies the speaker's expectation that the ball could or should have been red — that it being red was worth considering. Here, we demonstrate that speakers take advantage of the dual world and world-model information conveyed by negation when faced with the need to efficiently share information about their beliefs across many contexts. Across four experiments, we demonstrate that speakers use significantly more negations when differentiating between possible causal models of a situation, explaining their political beliefs to a member of the opposite party, discussing racial differences, and sharing genre-specific sources of narrative conflict.

Scope Interpretation: Evidence from Human and Large Language Models

This study investigates real-time processing and interpretation of quantified sentences in English and Chinese, focusing on cases where an existential quantifier precedes a universal quantifier. Using a self-paced reading (SPR) task and comprehension questions, we found that surface scope was processed faster than inverse scope, with differences emerging in later processing regions, aligning with the Processing Scope Economy principle. Cross-linguistic differences revealed that inverse scope was less accessible in Chinese than in English, confirming scope rigidity in Chinese. Working memory influenced offline interpretation but not online processing. Additionally, large language models only partially resembled human performance, with BERT-based models aligning with human data in English but not in Chinese, likely due to training biases. These findings contribute to understanding scope processing mechanisms, cross-linguistic variation, and cognitive constraints, while also informing the limitations of LLMs in modeling human sentence comprehension.

How Infant-Like, Embodied Visual Experiences Can Support Generalization to In-The-Wild Images: Insights from Domain Adaptation

Infants' object play is closely tied to language learning and 3D object understanding. How does this kind of embodied visual experience support infants' abilities to categorize objects "in the wild"? We address this question using domain adaptation, a machine learning framework for studying distribution shifts, i.e., how to transfer knowledge learned from one data distribution to a related but different distribution. We formalize a specific distribution shift problem inspired by infant visual learning, the VI-Shift problem, which mimics the tradeoff between object instances and viewpoints in these two regimes of visual experience. We study the VI-Shift problem through the lens of domain adaptation in deep learning architectures, in particular using novel metrics to demonstrate how the clusterability of learned features contributes to robust generalization. We show that two classic domain adaptation methods do not perform well on the VI-Shift problem, and we demonstrate a novel loss function that improves performance by leveraging some of the distinctive visual characteristics of embodied object play experiences. Our results illustrate one potential learning route through which the distinctive visual properties of embodied object experience can boost robust generalization.
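
For readers unfamiliar with clusterability metrics, the sketch below shows one common way to quantify how well learned features cluster by category, using the silhouette score; the feature matrix and labels are random placeholders, and the paper's specific metrics may differ.

```python
# Sketch of a clusterability check on learned features (silhouette score as one
# possible metric; the paper's metrics may differ).
import numpy as np
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
features = rng.standard_normal((300, 64))   # stand-in for penultimate-layer features
labels = rng.integers(0, 5, size=300)       # stand-in for object categories

print(silhouette_score(features, labels))   # higher = more clusterable by category
```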

Amortizing Structure Discovery with Generative Flow Networks

An open problem in cognitive science is how the human mind discovers and represents the structures underlying human knowledge. To address this problem computationally, Kemp and Tenenbaum (2008) introduced a posterior distribution that assesses the goodness of fit of a structure to some data; however, their method for finding structures with high probability under this posterior relied on hand-crafted search heuristics, was intractable at inference time, and lacked convergence guarantees to the true posterior distribution over structures. To address this, we amortize this process with a generative flow network (GFlowNet), a novel framework for developing probabilistic models. When trained, our GFlowNet samples structures in proportion to Kemp and Tenenbaum's posterior. We show preliminary results on synthetic datasets that highlight the benefits of our approach in both scalability and simplicity.
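
To give a flavor of how a GFlowNet can be trained to sample in proportion to a reward, the sketch below writes out one standard GFlowNet training objective (trajectory balance) for a single trajectory; the per-step probabilities and the log-reward are placeholders rather than anything from the structure-discovery setting.

```python
# Sketch of the trajectory-balance objective used to train GFlowNets, for one
# trajectory; step probabilities and reward are placeholders, and the policy
# networks of a real model are omitted.
import torch

log_Z = torch.zeros((), requires_grad=True)      # learnable log partition function

def trajectory_balance_loss(log_pf, log_pb, log_reward):
    """log_pf / log_pb: forward / backward log-probabilities of each step along
    one sampled trajectory; log_reward: log unnormalized posterior of the final
    structure. Minimizing this pushes sampling toward reward-proportional."""
    return (log_Z + log_pf.sum() - log_reward - log_pb.sum()) ** 2

loss = trajectory_balance_loss(
    log_pf=torch.log(torch.tensor([0.5, 0.25, 0.5])),
    log_pb=torch.log(torch.tensor([1.0, 0.5, 1.0])),
    log_reward=torch.tensor(-2.0),
)
loss.backward()                                   # in a full model, gradients also
print(loss.item(), log_Z.grad)                    # update the forward/backward policies
```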

Cognitive Distillation with Parameter-Efficient LLMs: Chain-of-Thought Calibration for Personality Prediction

Large language models (LLMs) excel at personality prediction but are often impractical for deployment due to high computational demands. This work introduces cognitive distillation with Chain-of-Thought calibration, a novel framework for transferring structured reasoning from large LLMs to smaller, efficient models. Inspired by cognitive architectures like ACT-R, our method aligns intermediate inference steps using exemplars from single or multiple LLMs. A 1.5B-parameter model distilled through this process surpasses Qwen1.5-110B in predictive accuracy, achieving a 28% improvement in Pearson correlation while using just 1.36% of its parameters. Ablation studies reveal that moderate sampling diversity and multi-model ensembles enhance cognitive skill transfer and construct validity. These findings demonstrate that high-level reasoning can be efficiently and faithfully transferred, enabling psychometrically robust personality assessment in resource-constrained settings. This approach bridges AI and cognitive science, offering a scalable path toward plausible, cognitively grounded language models.

Hypocrisy, Emotion, and Belief Alignment: Betrayal May Drive Moral Judgement

The present research is an exploratory study investigating how hypocrisy impacts emotions and how pre-existing moral beliefs influence emotional and moral reactions to hypocrisy, factors that have received little attention in previous research. We gave participants scenarios in which targets were either hypocritical (actions and beliefs did not match) or non-hypocritical (actions and beliefs matched). Results showed that hypocrisy elicited anger and disgust but had no significant effect on general affect. Hypocritical actions were perceived as less morally acceptable, but the effect was relatively weak. Analysis of the alignment between participants' beliefs and the scenario character's beliefs suggests that such alignment may attenuate the overall impact of hypocrisy. Judgments of hypocrisy may be driven more by feelings of moral betrayal than by hypocrisy itself, but follow-up studies are required to establish this conclusion.

Developing a Mentoring System Based on Behavior Logging and Personalized Cognitive Modeling

This research introduces an intelligent mentoring system that utilizes behavioral logs and cognitive modeling to provide personalized learning interventions. The system analyzes semantic patterns during web browsing to estimate cognitive parameters and predict task completion times. By employing CLIP, the system evaluates semantic alignment between learning objectives and accessed resources. Our experimental study with eight postgraduate students serves as a starting point for ACT-R cognitive parameter estimation. The system tracks both content relevance and user interest as internal human dynamics to maintain engagement with study materials. By integrating ACT-R architecture to estimate working memory capacity, attention span, and anxiety levels, the system maps these parameters to observed browsing behaviors. This dual-stage approach enables real-time cognitive estimation and personalized schedule generation, establishing a foundation for adaptive learning technologies that provide tailored interventions based on semantic dynamics analysis and cognitive parameter estimation.

Linguistic and visual cues in counting and measuring

In deciding "Which has more?" of some items, we rely on cues in the environment to decide how to answer: both in the form of linguistic cues and object form. Linguistic cues may appear as a classifier or "quantizer" like "ounces" as in "Which has more in ounces?" On the other hand, object cues have to do with the form of the entity and how object-like it is. Both linguistic and object cues can present ambiguities and leave the desired dimension up for interpretation. In the two experiments presented here, we examine how people decide whether to count or measure when presented with two item sets that vary independently in both numerosity and total size. Our findings suggest that dimension selection in quantifying is a process that relies not only on one dimension, but on effects of both linguistic indications and object factors.

Emergence of Communication: A Comparative Study of Instance-Based Learning in ACT-R and PyIBL

This study investigates the role of memory mechanisms in the emergence of communication by comparing two instance-based learning models: one implemented in the ACT-R cognitive architecture and the other in PyIBL, a lightweight framework based on Instance-Based Learning Theory. Both models were tested on a simulated communication task requiring agents to coordinate actions through message exchange using abstract symbols. The ACT-R model, featuring an explicit goal-representation module and a precise memory structure, led to faster formation of a communication system and more successful task performance. In contrast, the PyIBL model showed delayed emergence of a communication system, attributed to its simplified memory representation and difficulty in imitation during the task. These results suggest that detailed goal representation and mechanisms for self-other distinction play a critical role in communication development. The study also demonstrates the potential of cognitive modeling for connecting individual-level processes with large-scale simulations of social behavior.
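
Both modeling frameworks build on the same instance-based learning idea, in which stored instances are retrieved according to an ACT-R-style activation that favors recent and frequent use. The sketch below writes out that activation and a softmax retrieval rule with illustrative numbers; it is not the ACT-R or PyIBL API, and the decay and temperature values are assumptions.

```python
# Sketch of instance-based learning retrieval (ACT-R-style base-level
# activation; numbers are illustrative, not any framework's API).
import numpy as np

def base_level_activation(retrieval_time, use_times, decay=0.5):
    lags = retrieval_time - np.asarray(use_times, dtype=float)
    return np.log(np.sum(lags ** -decay))

def retrieval_prob(activations, temperature=0.25):
    a = np.asarray(activations) / temperature
    a -= a.max()
    p = np.exp(a)
    return p / p.sum()

# two stored instances of a symbol, used at different past times
acts = [base_level_activation(10.0, [1.0, 4.0, 9.0]),
        base_level_activation(10.0, [2.0])]
print(acts, retrieval_prob(acts))   # the more frequent/recent instance is favored
```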

Processing non-maximal readings of sentences with plural definites

Sentences with plural definites like ‘the circles' are usually interpreted akin to universally quantified sentences (maximal reading). In some circumstances, however, they also allow for non-maximal readings, i.e., imprecise readings that allow for exceptions. We report two mouse-tracking experiments that investigate the online derivation of these non-maximal interpretations of plural definites. Our results show that non-maximal readings are harder to process than maximal interpretations. This difficulty seems to be associated with hesitation in the decision process rather than with a truly two-step derivation. Interestingly, we also find that the experimental setup plays an important role in the availability of, and difficulty linked to, these readings.

Member Abstracts with Poster Presentation

Spontaneous small ratio production in perceptual comparison task

How does the perceptual system compare quantities in the environment? Adult participants (n=47) were asked to judge the relative similarity or dissimilarity of two line lengths or brightnesses by making a mouse click along an unmarked horizontal bar. Despite receiving no instruction or feedback regarding how the stimuli should be compared, responding was remarkably consistent across observers and between modalities. A linear model based on the ratio of the smaller to the larger magnitude accounted for 92% and 93% of variance in average responses to line lengths and brightnesses (respectively) across 28 repeated stimulus pairs. A replication using 336 randomly generated pairs showed similar results (with 90% and 91% of variance accounted for). Decades of psychophysical research have delivered mixed results with respect to the relative importance of ratios – as opposed to differences – in perceptual comparison. The current data suggest ‘small ratios' (Morton et al., 2024) are the predominant comparative function.
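
The reported model has a simple closed form: the comparative response is a linear function of the "small ratio", min(a, b)/max(a, b). The sketch below fits that one-predictor model to made-up data to show the computation; the stimulus pairs, responses, and resulting coefficients are illustrative, not the study's data.

```python
# Sketch of the 'small ratio' predictor (illustrative data only).
import numpy as np

def small_ratio(a, b):
    return min(a, b) / max(a, b)

pairs = [(2.0, 8.0), (3.0, 4.0), (5.0, 5.0)]
ratios = np.array([small_ratio(a, b) for a, b in pairs])
responses = np.array([0.22, 0.71, 0.95])              # made-up mean similarity judgments

slope, intercept = np.polyfit(ratios, responses, 1)   # linear model on the ratio
print(slope, intercept)
```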

FutureMind: How Human Cognition Shapes the Way We Think About Our Futures

"FutureMind" is the composite human mental ability to imagine distant, large, complicated futures, often wildly unlike any human memory. It is an astonishing, ubiquitous, powerful, distributed ability, mostly absent from other species, yet mostly unstudied in Cognitive Science. It is indispensable for human activities from political systems to personal ambition. Our understanding of human history consists largely of notions of how previous generations used their own FutureMind. FutureMind relies on both deeply entrenched and highly creative mental operations. In this presentation, we outline a research program for a systematic science of FutureMind, with an initial theoretical framework.We present the cognitive mechanisms underlying FutureMind, which include framing, blending, compression, analogy, selective projection, viewpoint blending, and the construction of networks of mental spaces.

Learning imposes a bottleneck beyond anatomical constraints: a computational investigation into the nature of WM capacity limits

Human working memory (WM) is central to our complex cognitive capacities. Famously, it is limited, and much debate surrounds the nature of this limitation. Anatomical evidence reveals strongly-connected neural populations in PFC allowing robust maintenance. Past work implicates a basal ganglia circuit in WM management. There remain open questions about whether, aside from anatomical ‘hard' limits, there are computational ‘soft' limits that arise from learning/management bottlenecks. Here, we use computational modeling to tease apart these factors by considering them in isolation: we allow a transformer model trained to do a symbolic WM management task full access to past context, manipulating only the number of concurrent symbols it needs to learn to maintain, controlling for surface complexity. We find that despite having no ‘hard' limits, the model shows difficulty in learning that scales with the computational demand, suggesting WM limits in humans may have arisen due to a learning bottleneck.

Metaphor, Polysemy and Semantic Extension in an Artificial Language Learning Experiment

Polysemy is pervasive in language use and plays a crucial role in enabling the boundless expressive capacity of human language. Semantic extension based on metaphorical associations has been argued to be a key process by which words acquire novel, additional meanings (Anderson, 2017). In this poster, we report the results of an artificial language learning study in which participants had to extend the meaning of previously learned items to refer to new referents. We hypothesised that participants would choose semantic extensions based on the metaphoric associations proposed by Conceptual Metaphor Theory (CMT) (Lakoff & Johnson, 1980; Kövecses, 2010). The results indicate that participants seem to make systematic use of salient semantic and metaphoric associations and mappings when extending the meanings of learned form-meaning pairings from concrete items to more complex and abstract referents. However, only in some cases did participants perform semantic extensions according to our CMT-based predictions.

How shared interactive experiences reshape the semantic representations of abstract concepts

How do people's understandings of abstract concepts evolve through interacting with others? While prior research has focused on individual cognitive processes, how people reflect on and adapt knowledge in social contexts remains underexplored. This study examines how shared interactive experiences during a word-guessing game influence semantic representations of abstract words. Participants completed a spatial arrangement task (SpAM) before, immediately after, and two weeks after the game. Abstract words used as game targets underwent significant positional changes, indicating semantic reorganization. Semantic alignment between game partners was stronger than between non-partners, as measured by a property listing task (PLT), highlighting the role of shared interaction in driving semantic changes. Additionally, in-game semantic alignment measures predicted post-game performance in SpAM and PLT, suggesting that dyadic interaction quality influenced the magnitude of semantic change. These findings provide empirical evidence for the socially driven and dynamic nature of abstract concept representations in collaborative contexts.

People Consistently Overweight Extreme Outcomes in Risky Choices, Even after Long Delays

When making decisions from experience, people rely on memories of past outcomes, which can be influenced by extreme outcomes (best and worst). These memories, however, can be forgotten. In a pre-registered experiment, we evaluated risk preference with delays of up to 7 days between initial learning and later consequential decisions. Participants (N=277 total) learned to choose between safe and risky options (e.g., 25 points vs 50/50 chance of 5 or 45) both in gain and loss domains. After a delay, they made decisions without feedback. All groups were more risk seeking for gains than losses by the end of learning, contrary to the typical pattern with explicit descriptions. This effect was long-lasting and persisted across the delays. Additionally, risk aversion slightly increased for both gains and losses with any delay. These results provide novel evidence that the influence of extreme outcomes on risky decisions persists over long-term delays.

From hearing to feeling: Quantifying music-emotion and examining the different processing patterns in children with special educational needs (SEN)

Music-emotion recognition, the ability to perceive emotions in music, has emerged as a means of understanding emotion beyond verbal language, specifically for individuals with special educational needs (SEN). However, there has been little focus on delineating emotion through quantified musical features for a systematic comparison between different SEN groups. This study identified specific musical features and examined the different music-emotion processing patterns in 3- to 10-year-old Chinese children with and without SEN. Participants completed a forced-choice task by identifying four emotions (happiness, sadness, anger, and fear) in Western classical music. By integrating a biologically inspired filterbank into music information retrieval analysis, the results revealed that musical features, such as spectral density, contributed to emotion recognition. In addition, children with SEN exhibited distinct confusion patterns for some emotion pairs compared to their typically developing counterparts. These findings demonstrate a novel approach to investigating music-emotion recognition across the developmental span.

The emergence of flexible perspective reasoning in large language models

Work on human reference processing has shown that, in sentences like "Mary asked her daughter Sally if she understood the assignment", readers overwhelmingly interpret "she" as co-referring with "Sally". This reflects perspective inference, or reasoning about who possesses at-issue information, and is inconsistent with a statistically-learned bias toward subject antecedent selections. The flexibility of inferencing is evident from the effect of manipulating the object character description ("Mary asked her tutor…"), where readers now prefer Mary as the antecedent. Until recently, these patterns have been largely unaccounted for by large language models (LLMs). Leveraging advancements in LLM interpretability techniques, the present study systematically examines how LLMs fare in relation to human judgments. We determine which layer activations impact these inferences and perturb them to causally link activations to model performance. Finally, we examine performance across training iterations, analyzing the point where subjecthood biases become evident and when more nuanced inferencing emerges.

Units of representation: Children's perception of number in the "connectedness illusion"

The developmental and evolutionary origins of abstract number reasoning have long been debated. Central to this debate is the underlying unit: whether the quantitative reasoning observed in infants and animals necessitates truly numeric object-level representation or can instead be inferred from covarying low-level spatial frequency. Recent studies with adults rely on the "connectedness illusion" to dissociate cardinality from spatial frequency, suggesting object-level representation is fundamental. However, whether these representations exist early in development remains underexplored. We use the connectedness illusion to test whether 3–6-year-old children enumerate objects or spatial frequency. Children complete a non-symbolic comparison task modeled after He et al. (2009). On 50% of trials, two dots are connected by a line, forming a "barbell." Results show that, like adults, children underestimate connected displays despite instructions to ignore the connections. These findings suggest that object-level representations, rather than low-level spatial frequency, underlie children's quantitative reasoning.

A Computational Model of Human Vocal Imitation

Humans have the remarkable intuitive ability to use their voices to imitate any sound they hear. For example, imagine conveying the sound of an engine to a mechanic, a birdcall to a friend, or a strange new synthesizer to a musician. Vocal imitation is fundamental to human communication, and may even form the basis of early language learning. But it is also mysterious: how do we use the limited affordances of a vocal tract to communicate novel sounds far beyond its reach? We propose that the answer lies in recursive social reasoning: we present a computational model of vocal imitation that extends the Rational Speech Acts framework with a simulated vocal tract for the speaker, and a feature-based model of auditory perception for the listener. Without fitting any parameters, our model accurately predicts the types of phones human speakers choose when imitating a variety of real-world sounds (R^2 = 0.809).
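
The recursive reasoning at the heart of the model follows the Rational Speech Acts pattern: a speaker chooses the phone whose production a simulated listener is most likely to map back to the intended sound. The toy sketch below shows that pattern with a small, made-up phone-by-target similarity matrix standing in for the model's simulated vocal tract and feature-based auditory perception.

```python
# Toy Rational Speech Acts sketch of the imitation idea (the real model uses a
# simulated vocal tract and auditory features; this matrix is a placeholder).
import numpy as np

# perceptual similarity between each producible phone (rows) and each target
# sound (columns), standing in for the feature-based listener model
similarity = np.array([[0.9, 0.1, 0.2],
                       [0.2, 0.8, 0.3],
                       [0.1, 0.3, 0.7]])

def literal_listener(sim):
    # P(target | phone): normalize each phone's row over candidate targets
    return sim / sim.sum(axis=1, keepdims=True)

def pragmatic_speaker(sim, alpha=3.0):
    L0 = literal_listener(sim)
    util = L0 ** alpha                               # exp(alpha * log L0)
    return util / util.sum(axis=0, keepdims=True)    # P(phone | intended target)

print(pragmatic_speaker(similarity))                 # each column: phone choice per target
```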

Cognitive coherence and resource rationality: rethinking resistance to belief change

Human resistance to conceptual change is not a cognitive anomaly but a rational process rooted in the complexity of belief systems. This review argues that such resistance reflects an optimization process balancing coherence, accuracy, and computational cost. Beliefs exist within interconnected networks, making revision challenging, as changes to one belief necessitate broader adjustments. The Duhem-Quine thesis highlights how auxiliary hypotheses shield theories from refutation, a mechanism underexplored in cognitive science. Recent evidence suggests that individuals employ ad hoc explanations to preserve beliefs when faced with contradictory evidence, mirroring Neurath's ship analogy of gradual belief revision. Bayesian models suggest that belief updates occur incrementally, with adjustments to peripheral beliefs before structural changes occur. This review also reassesses findings on cognitive dissonance and confirmation bias, arguing for a more nuanced, adaptive perspective on belief revision. By framing resistance as a rational strategy, this work contributes to ongoing debates on the dynamics of conceptual change.

Mental models of fluids in solid body rotation: students' reasoning about fluid processes modeled during oceanography and atmospheric science instruction

A pedagogical tool in undergraduate oceanography and atmospheric science education is a water-filled, rotating tank used to model geophysical fluid flow on Earth. We used a typical classroom rotating tank to investigate students' mental models of fluids in rotation. In two experiments, we conducted semi-structured interviews with participants (N=59) who predicted the behavior of water in rotation, explained their predictions, and attempted to make sense of observed demonstrations. We found that participants had accessible mental models of fluid behavior based on analogies; however, these mental models were incorrect. Our results suggest that the behavior of rotating fluids is highly unintuitive and that, without tangible opportunities for mental model formation, the mind adopts mental models representative of prior experiences with fluids to fill this gap. We discuss implications for education and suggest that a better understanding of how humans reason about fluids can inform cognitive science while improving oceanography and atmospheric science instruction.

The Effects of Cognitive Load on Full-Body Gaze Control During 3D Visual Search

Increased cognitive load is linked to decreased fixation counts and longer dwell times during visual search. However, past work studied eye movements in 2D tasks, whereas in 3D environments the head and body help the eyes gather information. Thus, we examined how cognitive load affects the eyes, head, and body in 3D visual search. Cognitive load may create difficulty planning the eye, head, and body movements needed to search a 3D space. We used an eye-tracker to record gaze and inertial sensors to measure the motion of the head and body during a dual task paradigm: Participants searched a set of 27 images distributed in a 270° space for stimuli meeting specified criteria. Cognitive load was manipulated by having participants count backwards by set intervals (1, 3, 5, 7, and no counting). Results showed a decrease in total fixations and an increase in stimulus dwell times when under load.

Evaluating sensorimotor knowledge in large language models

Large Language Models (LLMs) process language but lack direct sensorimotor experience. This study assesses their ability to estimate human perceptual ratings using the Lancaster Sensorimotor Norms. We prompt LLMs with the original rating task instructions and analyze their correlations with human norms. Results suggest LLMs struggle with embodied cognition, highlighting limitations in computational models of sensory meaning. Future research should explore fine-tuning LLMs on sensorimotor data to enhance embodied representations.
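
The evaluation logic is straightforward: elicit a rating from the model for each word on each sensorimotor dimension, then correlate those ratings with the human norms. The sketch below illustrates only that correlation step; query_llm is a hypothetical stand-in for whatever model interface is used, and the rating values shown are fabricated placeholders, not data from the study or the Lancaster norms.

```python
# Sketch of the evaluation logic (query_llm is a hypothetical stand-in; the
# ratings below are fabricated only to show the correlation step).
from scipy.stats import spearmanr

def query_llm(word, dimension):
    """Hypothetical helper: return the model's 0-5 rating of one word on one
    Lancaster sensorimotor dimension (e.g., 'haptic')."""
    raise NotImplementedError

human_ratings = {"velvet": 4.4, "idea": 0.6, "hammer": 4.1}   # illustrative values
model_ratings = {"velvet": 3.9, "idea": 1.2, "hammer": 4.5}   # e.g., from query_llm

words = list(human_ratings)
rho, p = spearmanr([human_ratings[w] for w in words],
                   [model_ratings[w] for w in words])
print(rho, p)
```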

Choosing to choose: Neural Mechanisms Underpinning Levels of Volition

Voluntary action entails decision-making guided by endogenous intentions, yet the neural correlates of varying levels of volitional freedom remain underexplored. This fMRI study examined 39 participants performing a task with three conditions: (1) cued Go/NoGo, (2) 2-choice Go (left/right), and (3) 3-choice Go/NoGo (left, right, or NoGo). Behaviorally, reaction times increased with greater volitional freedom, indicating higher cognitive demands. BOLD analyses showed dlPFC activation for choices (2-choice Go > cued Go), supporting its role in higher-order cognition. Greater volitional freedom (3-choice Go > 2-choice or cued) further engaged the right insula and SMA, the latter likely reflecting the sequencing of decisions about whether to act and, if so, which action to select. The left insula was activated across all choice conditions relative to cued. These results advance our understanding of voluntary decision-making by showing distinct neural activation patterns underlying varying levels of volitional freedom.

Collaborative Encoding of Visual Working Memory

Collaboration helps humans surmount individual cognitive limitations by distributing information over many minds. However, figuring out when and how to collaborate is not trivial. This study examines whether dyads split up information in a collaborative visual working memory task when doing so improves performance. Participants (N=356) memorized grids of 4, 16, or 36 images both alone and with a partner. We used a visual working memory model to estimate how much dyads would benefit from splitting up a grid of images, rather than each memorizing the grid independently. Our model predicts that participants should split up grids that are neither too easy nor too difficult to benefit from collaboration. Indeed, participants tacitly adopted conventions to split up medium and large grids (and were more accurate in these conditions when they worked together than when they acted alone), but not small grids, where individual performance was already at ceiling. Our work provides a first step toward understanding how decisions about when and how to collaborate are shaped by the adaptive use of cognitive resources.

Overcoming sparse and uneven evidence with natural language communication

Humans rely on social learning to go beyond their personal experience. This requires identifying 'experts' and resolving information duplication. Several models have been proposed to examine how networks may interact with difficult information conditions, without accounting for natural language. We examine the medium through which information is exchanged in varied epistemic contexts. We report an experiment in which N=1236 participants from Prolific were asked to make inferences about a probability distribution. We compared two communication modalities: a constrained slider and an interactive chat. The games varied in difficulty along three dimensions: information distribution, accuracy of individual samples, and network sample size. All groups converged toward more accurate inferences, with rates varying across modality. Natural language reduced error in challenging epistemic conditions. Harder representation conditions also decreased error over time as a main effect, supporting the idea that one well-informed player in a connected network can significantly influence the game outcome.

Temporal dynamics of numerosity and symmetry processing in the human brain

The number of items in a visual image is underestimated when these are symmetrically arranged compared to when they are randomly scattered in space. The neural mechanisms underlying this perceptual illusion remain unclear. In this study, adult participants viewed arrays of dots varying in numerosity and spatial arrangement while undergoing EEG recording. Visual evoked potentials (VEPs) associated with numerosity were distinguishable over occipito-parietal electrodes in two distinct time windows: an early response ~80 ms post-stimulus and a later response from ~150 ms. Sensitivity to spatial arrangement (symmetrical vs random) emerged at approximately the timing of the second numerosity-related window. Representational similarity analysis confirmed that numerosity was processed independently of low-level visual features (dot size and convex hull). These findings suggest that the underestimation of numerosity in symmetrical displays may stem from later-stage perceptual grouping processes, whereby the visual system derives numerosity from segmented objects.

The Remote Infant Studies of Early Learning (RISE) Battery - A scalable assessment of cognitive development in infancy

Capitalizing on advances in remote developmental testing and automated gaze detection, we established a battery of tasks for comprehensive evaluation of cognitive development in infancy. The Remote Infant Studies of Early Learning (RISE) Battery allows for large-scale assessment of skills hypothesized as building blocks of cognitive development. RISE assesses attention, memory, prediction, multimodal processing, word comprehension, social evaluation, and numeracy, all with established predictive value for developmental outcomes. Using childrenhelpingscience.com, we recruited 111 infants for participation from home, at their convenience. Results were consistent with preregistered predictions for attention, memory, prediction and word comprehension tasks, but not for multimodal processing, numeracy, and social evaluation tasks. Results support the use of this battery to investigate mechanisms of infant cognition in relation to early developmental trajectories, with implications for early identification of developmental delays, evaluation of interventions to enhance early development, and testing of computational models of infant cognition and learning.

The (side) effects of medicalization: How viewing mental disorders as brain disorders shapes perceptions of onset, recovery, severity, and treatment efficacy in the general public

The question of whether mental disorders are brain disorders has sparked curiosity in cognitive science for years. But does framing mental disorders as brain disorders actually help the public better understand and engage with mental health? What do people understand when we call something a brain disorder, and why does it matter if mental illnesses are described as brain-based? To explore these questions, we conducted three quantitative vignette studies with a UK-based general public sample, focusing on perceptions of seven mental disorders: ADHD, ASD, OCD, major depressive disorder, anorexia nervosa, schizophrenia, and bipolar disorder (as defined in the DSM-5-TR). Our findings show that seeing mental disorders as brain disorders is linked to beliefs about greater severity, earlier onset, longer duration, lower chances of recovery, and higher effectiveness of medication. These results highlight how public perceptions might impact reasoning and decision-making about mental health.

Eye Movement Patterns Influence Investment Decision Making

Goals play an essential role in how individuals process visual information. Prior work shows that different explicit visual task goals, such as instructing participants to identify specific graph features, lead to distinct visual search strategies and eye movement patterns (Hosseinpour et al., 2024). However, little is known about how such goals influence attention and decisions when not explicitly guided. This study explores how explicit and implicit goals influence attention and decisions in stock market graphs. In three eye-tracking experiments, we investigate the impact of auditory instructions and implicit visual cues that guide participants to focus on either the highest or lowest dips in stock market trend lines. The auditory instructions are provided either explicitly, directing participants on where to look, or implicitly, by subtly guiding their gaze through descriptive scenarios about each company. Our findings suggest that attentional goals significantly influenced participants' gaze fixation patterns and investment decisions.

Copilots for Cognitive Linguists 2025

This presentation explores the use of Large Language Models (LLMs) as "copilots" for linguistic research in Frame Semantics and Construction Grammar. We examine the potential of conversational AI, like ChatGPT and OpenAssistant, to augment FrameNet, create new frames, and analyze constructions. We review ways of using locally run models (DeepSeek-like) and local RAG for the same purposes. We review prompt engineering, qualitative and quantitative analysis, and controlled experiments. Results show LLMs can generate examples and analyses of constructions, expand FrameNet, and work in multiple languages. Careful prompting and critical evaluation are indispensable. Despite limitations, LLMs offer a novel approach to linguistic inquiry, and ongoing work focuses on fine-tuning and integrating them with other tools. We also discuss the development of pedagogical AI for Cognitive Linguistics, including the FrameChat tool, which uses open-source LLMs and FrameNet data to facilitate research and education.

Visual groundedness as an organizing principle for word class: Evidence from Japanese

How do languages structure word classes, and is this organization arbitrary? This study explores groundedness--the degree of association between a word in an utterance and the meaning which the utterance describes--as a potential organizing factor. In particular, we look at visual groundedness, where meaning is approximated with an image, allowing tractable estimation with neural vision-and-language models. Prior work showed that nouns, adjectives, and verbs differ in groundedness cross-linguistically. We test whether groundedness describes the atypical word class structure in Japanese, where adjectives are split into i-adjectives (formally verb-like) and na-adjectives (noun-like). Analyzing the Crossmodal-3600 dataset with the PaliGemma model, we find that na-adjectives exhibit significantly (p=0.029) higher visual groundedness, suggesting that the formal similarities of these classes to nouns and verbs reflect their semantics. This challenges the idea that linguistic categories are purely conventional or innate and arbitrary, and supports a view where language structure emerges from human perception.

Perceptual Strategies in Speech Quality Discrimination: A Comparison of Blind and Sighted Listeners

This study examines perceptual differences in speech quality discrimination using a novel ternary AX task, testing how blind and sighted participants perceive distortions in a synthesized speech signal. Two groups were tested across two difficulty conditions, involving differing levels of distortions. While sighted participants showed decreased performance as task difficulty increased, blind participants demonstrated greater stability and perceptual capacity. Additionally, groups struggled with different stimuli, suggesting that item-specific perceptual differences drive difficulty, rather than a uniform increase in task complexity. Acoustic analyses were conducted to explore potential factors influencing perception. These findings indicate that blind and sighted participants may rely on different auditory processing strategies, leading to group-specific difficulty patterns. Our results highlight the importance of individual perceptual mechanisms in speech discrimination, with implications for models of auditory perception and accessibility in speech technology.

Anxiety Impacts Reinforcement Learning Subprocesses In a Choose-Your-Own-Adventure Text-Based Game

Reinforcement learning (RL)—using past choices and outcomes to learn best policies—can model the effects of anxiety on human learning. RL mechanisms are commonly studied using game-like computer tasks, but real-life learning often happens in naturalistic settings involving day-to-day situations and objects rather than artificial stimuli. We used an online "choose your own adventure" text game to test how modality of interaction—narrative versus gamified—impacts the effect of anxiety on RL. 211 participants completed five "chapters" (blocks) of seven choices, followed by informative feedback about the underlying story structure. Higher anxiety scores were linked to lower model-free accuracy and longer reaction times. An RL model with memory-decay and attention components showed comparable learning rates, but more decision noise and lower attention in anxious participants. These data show that anxiety impacts RL in naturalistic processing contexts and hold potential insights into the use of narrative, RL-based gamification in educational settings.

Redefining External Memory in the AI Era

External memory, traditionally conceptualized within the extended cognition framework (Clark & Chalmers, 1998), has evolved from debates about its status versus internal memory (Michaelian, 2012) to more pragmatic views emphasizing its role in intention offloading and cognitive scaffolding (Heersmink, 2020; Gilbert et al., 2023). While traditional approaches require external memory to be intentionally recorded and subsequently accessible, the emergence of personalized AI tools fundamentally transforms this human-tool relationship. Drawing on Chalmers' (2025) propositional interpretability framework, we propose that as AI systems become increasingly personalized through access to personal data, the criterion for external memory shifts from human-initiated recording to agent-driven construction of user propositional attitudes. To empirically validate how this reconceptualization captures the emerging nature of external memory, we compiled a comprehensive multimodal dataset including 5-year continuous audio recordings and life-logging data, revealing AI systems' capacity to reconstruct personal propositional attitudes from previously unconsidered forms of external memory.

Eyetracking measures of performance on the Traveling Salesperson Problem

Human solutions to the Traveling Salesperson Problem (TSP) have been proposed to employ heuristics integrating global and local spatial information (Pizlo et al., 2006). Because different neuroanatomical regions may be involved in local vs. global processing, as well as attentional shift between levels, performance on the TSP may provide useful insight into changes that occur in the brain as a result of age or of neurodegenerative disorders (Slavin, 2002). In a previous study, we compared the performance of healthy adults in conditions that varied the availability of global cues. Surprisingly, our results indicated excellent performance on configurations requiring global information, even in conditions that masked these cues. The current study uses eyetracking to examine target fixations during the TSP. The question was whether participants compensate for the presence of distractor cues by constructing a mental outline of the configuration before selecting a route.

Investigating the Impact of Emotional Modulation on Attentional Numerical Representations in Childhood

Emotion dysregulation can heighten attentional biases toward threat, divert attention from goal-directed tasks, and deplete working memory, hindering learning. While research has explored its impact on numerical processing in adults, its developmental trajectory remains unclear. This study examines whether children's emotional modulation, indexed by high-frequency heart rate variability (HF-HRV), relates to neurophysiological markers of numerical attention in ERPs. Eighty-one children (ages 4, 6, and 8) completed a pirate-themed paradigm, including a novel symbol-learning task and a numeral comparison task. EEG and ECG data were recorded alongside self-reported emotions and parent-reported behavioural profiles. While self-reports showed positive emotions, HF-HRV varied across children and strongly predicted task performance (r = .41, p < .001). HF-HRV also correlated with ERP markers of attentional numerical representation, emphasising the role of emotion regulation in symbolic number learning. These findings offer insights into early precursors of mathematics anxiety and the reciprocal link between emotional regulation and numerical cognition.

Do you like a robot that makes mistakes? A preliminary study on changing evaluations of a robot that makes mistakes in a collaborative task

This study explores how humans evaluate a robot's behavior when it makes a mistake during a collaborative task, particularly in relation to the participant's own success or failure. We conducted an experiment in which an interactive robot collaborated with participants to estimate and report the number of balls and rods in 2D/3D models built from magnetic ball-and-rod toys. The participants were also required to provide their own estimates, and occasionally both made mistakes because of time constraints and parts that were invisible from certain views. The experimental results showed that participants evaluated the robot's failures more favorably when they themselves had failed than when they had succeeded. This suggests that the participants' own outcomes may have influenced their perception of the robot's behavior. These findings contribute to the development of robots that foster better relationships with humans and deepen our understanding of the psychological effects involved in evaluations within social interactions.

Asymmetries in the Production of Regular and Irregular Verbs

Humans, despite limited cognitive capacity, can generate an infinite variety of sentences thanks to regularities in language, though irregularities also exist (e.g., English past tense). We explore the mechanisms underlying the production of regular and irregular verbs and the role of domain-general processes, offering novel insights from a non-Indo-European language and both L1 and L2. Forty-seven adults conjugated verbs in L1 (Turkish) or L2 (English), and reaction time (RT) was analyzed to assess (1) the cost of switching between regulars and irregulars, and (2) whether individual differences in inhibitory control and working memory predict performance. RT for regular production differed depending on the preceding (N-2) trials (irregular-irregular-regular > regular-irregular-regular), indicating the difficulty in retrieving a regular verb after repeated suppression. This pattern was absent for L1 irregulars or L2, suggesting regulars and irregulars rely on asymmetrical, distinct mechanisms, but only in L1. Neither inhibition nor working memory predicted RT.

Flexible Physical Problem Solving with Strategy Acquisition and Composition

Humans exhibit a remarkable ability to acquire, generalize, and compose strategies for object manipulation, yet the underlying mechanisms of this flexible strategy learning and reuse remain poorly understood. In this paper, we extend the Virtual Tool Game (Allen, Smith, & Tenenbaum, 2020), where humans solve complex physical puzzles in just a few attempts. Through two behavioral experiments, we show that humans acquire abstract strategy representations and can flexibly chain multiple strategies for novel tasks in both forward and backward directions. To formalize this process computationally, we introduce a probabilistic framework that models physical events, actions, and high-level manipulation strategies. Our approach represents strategies as amortized sequences of physical events and integrates them into a bi-level search mechanism that combines simulation and planning. These findings advance our understanding of human physical reasoning and contribute to the development of AI systems with human-like physical problem-solving capabilities.

Estimating Lexical-Semantic Networks from Verbal Fluency Data in 5-8 Year Olds

As children learn language, they organize their knowledge in lexical-semantic networks. Comparing pre-existing methods of accessing underlying networks, we examine developmental verbal fluency in 5-8-year-olds (N=37, mean age=80.27 mos) across two different prompt types — taxonomic (animals and foods) vs. (location-based) thematic prompts (zoo and grocery store) — using two different graph-theoretic estimation strategies: random-walk modeling (e.g., U-INVITE) and GloVe word embeddings. We observed several consistencies: taxonomic prompts elicited more words than thematic prompts (linear mixed-effects model: t(35) = 3.16, p<0.05); networks expanded with age (U-INVITE, t(35) = 4.26, p<0.05; ESN, t(35) = 4.53, p<0.05); and structures spread out (versus clustering densely). However, key differences emerged. Random-walk networks uncovered different highest-degree (most densely clustered) words depending on the prompt type (e.g., "dog" for animals, "monkey" for zoo). By contrast, networks based on word embeddings identified networks with very similar highest-degree words for animals and zoo. Hence, alternative assumptions informing method choices may result in distinct network estimates, with consequences for how we map the growth of lexical knowledge.
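
For illustration only (not the authors' pipeline): a minimal sketch of the embedding-based estimation step, assuming word vectors are available as a dictionary and that an edge is drawn whenever cosine similarity exceeds a threshold; the actual U-INVITE and GloVe analyses differ in detail.

```python
# Minimal sketch: estimate a semantic network from fluency responses using
# word embeddings. The stand-in vectors and the similarity threshold are
# illustrative, not the data or parameters used in the study.
import itertools
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def embedding_network(words, vectors, threshold=0.5):
    """Return an adjacency dict linking words whose embeddings are similar."""
    edges = {w: set() for w in words}
    for w1, w2 in itertools.combinations(words, 2):
        if cosine(vectors[w1], vectors[w2]) >= threshold:
            edges[w1].add(w2)
            edges[w2].add(w1)
    return edges

# Toy usage with random stand-in vectors (a real analysis would load GloVe).
rng = np.random.default_rng(0)
fluency_list = ["dog", "cat", "monkey", "zebra"]
toy_vectors = {w: rng.normal(size=50) for w in fluency_list}
net = embedding_network(fluency_list, toy_vectors, threshold=0.0)
hub = max(net, key=lambda w: len(net[w]))   # highest-degree word
print(hub, len(net[hub]))
```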

Preschoolers' Use of Quantitative Math Language in Picture Descriptions Predicts Early Numeracy Skills

Early numeracy is foundational for children's future mathematics achievement (Duncan et al., 2007), and research shows that math-related language plays a role in its development (Purpura et al., 2017). However, the specific ways in which different types of math-related language contribute to numeracy remain unclear. This study examined if preschoolers' spontaneous use of math language during picture description predicts numeracy skills. A sample of 321 preschoolers (ages 3-5) completed picture description, print awareness, and numeracy tasks, including a measure of cardinality. Children's picture descriptions were coded for cardinal labels (e.g., "three birds"), spatial prepositions (e.g., "in", "on"), and other math-related language (e.g., "more", "some"). Results showed general numeracy and cardinality were significantly predicted by math language use (p=.002) and print awareness (p<.001). Spatial prepositions and cardinal labels were not consistent predictors. These findings provide further evidence for the link between numeracy and language.

Childhood Experiences and Parental Bonding modulate the late positive potential neural index of emotional reactivity

Young adulthood is a high-risk period for Major Depressive Disorder (MDD), often linked to reduced neural responses to positive stimuli, as measured by Late Positive Potentials (LPP). This study examines the connection between childhood experiences, parental bonding, and emotional sensitivity in adulthood. Participants (n=65), without a current MDD, completed assessments on depressive symptoms (BDI-II), adverse childhood experiences (ACE), and parental bonding (PBI). Participants viewed positive, negative, and neutral images while EEG data were collected to measure LPP. Key findings showed that higher depressive symptoms (BDI-II) were associated with increased LPP to negative and decreased LPP to positive images. Higher ACE scores correlated with lower LPP to positive images. Additionally, greater parental care (PBI-Care subscale) was linked to increased LPP to positive and decreased LPP to negative images. The PBI-Overprotection subscale was not a significant factor. The study highlights how childhood experiences and parental bonding shape emotional processing in adulthood.

Testing the role of working memory and domain-general skills in fraction comparisons

We tested the hypothesis that working memory may help overcome interference during fraction comparison tasks: Individuals with a higher working memory capacity should outperform those with lower capacity, especially in trials where fraction magnitudes are incongruent to their numerator and denominator components (i.e., 3/4 is greater than 5/8 although 5/8 has the larger components). Third graders (N = 84) completed a fraction comparison task with congruent trials, where the greater fraction had the larger components, and incongruent trials. We found that trial type influenced performance in fraction comparison, with higher accuracy on incongruent than congruent trials, and that this effect was not moderated by working memory, other executive functions, or relational reasoning. However, working memory was related to a fraction estimation task, suggesting a more general association with fraction competency. These findings have implications for the role of domain-general skills in understanding and learning fractions.

Modeling human learning and exploration in a temporal combinatorial bandit task

Life often presents choices that are not mutually exclusive, yet there has been insufficient research on human learning and directed exploration in combinatorial settings. We investigated human behavior in a four-armed combinatorial bandit (CB) task (N=107) in which participants combined "nutrients" affecting the required nurture time of virtual plants. Participants demonstrated effective learning but converged to suboptimal strategies, preferring combinations of one or two options. To model learning, two computational models were proposed and compared: a naïve extension of the upper confidence bound algorithm (NaiveUCB) and a linear UCB model (LinUCB), both incorporating heuristic components. The NaiveUCB model with a penalty for multiple selections, value decay, stickiness, and recency-based credit assignment best explained behavior, outperforming both LinUCB and simplified variants, suggesting that humans may navigate uncertainty through simple heuristics rather than sophisticated estimation. These findings extend our understanding of exploration and credit assignment in CB tasks and provide insight into daily decision making.
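
For illustration only: a minimal sketch of a naive UCB learner for a four-armed combinatorial bandit. The toy environment, exploration constant, subset rule, and shared credit assignment are assumptions for the sketch, not the task or models reported in the abstract.

```python
# Minimal sketch of a naive UCB rule for a 4-armed combinatorial bandit.
# The reward function and all parameters below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
n_arms, n_trials, c = 4, 200, 1.0
counts = np.ones(n_arms)      # selections per arm (start at 1 to avoid div by 0)
values = np.zeros(n_arms)     # running mean reward credited to each arm

def reward(subset):
    """Toy environment: arm 2 is useful, adding extra arms is penalized."""
    base = 1.0 if 2 in subset else 0.2
    return base - 0.1 * (len(subset) - 1) + rng.normal(0, 0.05)

for t in range(1, n_trials + 1):
    ucb = values + c * np.sqrt(np.log(t) / counts)
    # Naive subset rule: pick every arm whose UCB is within 0.05 of the best,
    # which guarantees at least one arm is selected.
    subset = [a for a in range(n_arms) if ucb[a] >= ucb.max() - 0.05]
    r = reward(subset)
    for a in subset:          # naive credit assignment: all chosen arms share r
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]

print(np.round(values, 2))
```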

Striking the Right Chord Between Reuse and Improvisation: Melody Learning as Resource-Rational Program Induction

How do people balance the reuse of learned routines with the need to invent new solutions under cognitive limitations? Recent computational frameworks have begun to develop resource-rational approaches to program induction, with a common theme being the benefits of building a "library" of past solutions for creative reuse. Here, we study these mechanisms in an online experiment where participants learned real-world musical melodies. Our results reveal systematic error patterns during reconstruction and improvisation tasks, with participants repeating local patterns and displaying a behavioral bias consistent with simpler programs. To explain these findings, we developed a non-parametric Bayesian model using a hierarchical Pitman–Yor process to learn both a global library encoding domain-general primitives, and a local library capturing melody-specific motifs—both helping to constrain the hypothesis space. Our model makes testable predictions about human error distributions and adaptive behaviors that balance the trade-off between efficiency and creativity when resources are scarce.
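
For illustration only: a minimal sketch of Pitman-Yor sampling via the Chinese-restaurant representation, with illustrative discount and concentration values; the hierarchical global/local library structure described in the abstract (two stacked processes) is not reproduced here.

```python
# Minimal sketch: table assignments drawn from a Pitman-Yor process via its
# Chinese-restaurant representation. Parameters d and alpha are illustrative.
import numpy as np

def pitman_yor_partition(n, d=0.5, alpha=1.0, seed=0):
    rng = np.random.default_rng(seed)
    table_counts = []                 # customers per table ("reused motif")
    assignments = []
    for _ in range(n):
        k = len(table_counts)
        # Reusing table j has weight (n_j - d); a new table has weight (alpha + d*k).
        weights = np.array([c - d for c in table_counts] + [alpha + d * k])
        choice = rng.choice(k + 1, p=weights / weights.sum())
        if choice == k:
            table_counts.append(1)
        else:
            table_counts[choice] += 1
        assignments.append(choice)
    return assignments, table_counts

_, tables = pitman_yor_partition(100)
print(len(tables), "distinct motifs reused across 100 notes")
```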

Cross-Linguistic Influence and Developmental Patterns in Reflexive Pronoun Use Among Monolingual, Bilingual, and Trilingual Children

Chinese employs compound reflexives (e.g., ta-ziji), adhering to local binding principles, and bare reflexives (ziji), exhibiting syntactic and semantic flexibility (e.g., logophoric, intensifier, and generic-pronoun uses; Wang & Pan, 2021), whereas English reflexives primarily function as locally bound anaphors but permit extended logophoric/intensifier uses in long-distance contexts (Charnavel, 2019). Previous studies show that monolingual Mandarin children achieve adult-like local and long-distance interpretations by age four (Li, 2024), while English monolinguals master reflexives by five (Chien & Wexler, 1990). This study examines 5,299 reflexive utterances from 476 monolingual, bilingual, and trilingual children (1;0–6;9) and caregivers across 19 CHILDES corpora. Results show Chinese bare reflexives emerge earliest (1;7), followed by Chinese compound reflexives (1;11) and English reflexives (2;4). Chinese reflexives are used as local anaphors rather late (2;6), possibly attributable to their low frequency in the input (< .005%). Multilingual children acquiring English alongside Chinese exhibit English-like local-binding preferences in their Chinese bare reflexives.

Smartphone and Cognitive Processes: an experimental study on the Brain Drain effect

The mere presence of a smartphone seems to impair cognitive performance and cause the "Brain Drain" effect. However, attempts to replicate this effect have mostly yielded null results. The present study is a conceptual replication of the experiment by Ward and colleagues (2017), using a smartphone abstinence paradigm to increase smartphone salience and consequently enhance the Brain Drain effect. Participants abstained from their smartphones for 5 hours. At the end of the abstinence period, they performed cognitive tasks after being randomly assigned to one of two conditions: one group performed the tasks while their smartphone was turned off next to them; the second group performed the tasks while their smartphone was in another room. Although smartphone salience was found to increase over the abstinence period, the results showed no Brain Drain effect. This effect therefore does not appear to occur even when the salience of the device is emphasized through abstinence.

Understanding the cognitive mechanisms behind groupitizing in early education

Groupitizing, the ability to use perceptual grouping to facilitate enumeration, appears with schooling and predicts math achievement (Guillaume et al., 2023). Our study (Spring 2025) investigates the cognitive mechanisms underlying its development and enhancement during early education. We propose that groupitizing among young children involves two core processes: (i) subitizing to identify the numerosity of small subgroups and (ii) using perceptual grouping to perform mental arithmetic, such as addition or multiplication. To test this hypothesis, children from kindergarten through second grade will perform an enumeration task involving dot sets organized into various grouping structures (subgroups with equal or differing numerosity) and various grouping patterns (subgroups with consistent or varied shapes). Response times and error patterns will be analyzed, alongside children's counting knowledge, arithmetic skills and math achievement. Our results aim to elucidate the cognitive mechanisms underlying groupitizing across development, and the links between foundational perceptual skills and higher-order mathematical reasoning.

Gaze Patterns during Map Reading as Predictors of Route Learning

Participants (N = 74) learned a predefined route from a digital map, during which gaze patterns were recorded. Subsequently, participants navigated the route from memory in a virtual environment, and navigation errors were measured. This task was performed twice, with half of the participants receiving specific map reading instructions between the pretest and posttest. Based on sequences of distances between fixations and the indicated destination of the route, gaze patterns were categorized as systematic (following the route with their gaze) or unsystematic. In the pretest, navigation performance of systematic readers was significantly better, and navigation performance was positively associated with repetitions of systematic reading following the route. Moreover, more systematic reading was found in the posttest, and navigation errors decreased from pretest to posttest. No effect was found for instruction. Results show that effective map reading can be predicted by gaze patterns.

Effects of meditation on emotions and depression: a longitudinal study using Geneva Emotion Wheel

Beneficial effects of meditation on mental well-being are not unequivocally confirmed in research (Goyal et al., 2014). In our study (n=19), we hypothesize that meditation (practiced over a longer period of time) impacts emotions and their intensity. We used the Geneva Emotion Wheel in meditation sessions (n=12) performed regularly over a 12-week period, from April to July 2024. We controlled for depression and personality type. Emotions were measured before and after each meditation session. Depression was measured before the start of the meditation course and at its end with the Patient Health Questionnaire (Kroenke et al., 2001). Personality traits were tested once, before the start of the meditation course, with the Big Five Inventory-10 (Rammstedt & John, 2007). The results showed that meditation had a positive effect on several emotions and their intensity, but no effect on depression was found.

Evaluating Vision Language Models Through Concept Hacking

Evaluating the cognitive abilities of Vision-Language Models (VLMs) is challenging due to their reliance on spurious correlations. To distinguish shortcut-taking from genuine reasoning, we introduce Concept Hacking, a paradigm that manipulates concept-relevant information to flip the ground truth while preserving concept-irrelevant confounds. For instance, in a perceptual constancy test, models must recognize that a uniformly wide bridge does not narrow in the distance; the manipulated condition using concept hacking altered the bridge to actually taper. We assessed 209 models across 45 experiment pairs spanning nine low-level cognitive abilities, encompassing all five core knowledge domains. Comparing performance on manipulated versus standard conditions revealed that models fell into shortcut-reliant or illusory-understanding types, with none approaching human-level performance. Models of varying sizes appear in each category, indicating that scaling neither imparts core knowledge nor reduces shortcut reliance. These findings highlight fundamental limitations in current VLMs, reinforcing concerns about their ability to achieve genuine understanding.

Exploring the Impact of Regularity, Frequency and Phonological Complexity on Morphological Production in Children with DLD and Phonological Disorders

This study examined the effects of regularity, frequency, and phonological complexity on morphological production in Turkish-speaking children with developmental language disorder (DLD) and phonological disorder (PD) compared to typically developing (TD) peers. Thirty children (ages 4–6) completed elicited production tasks using real and nonce words with regular and irregular noun and verb suffixes. DLD children showed lower accuracy in tasks involving consonant voicing and epenthesis and performed significantly worse on irregular suffixation, often substituting irregular forms with familiar ones. Nonce word production confirmed these challenges. Random Forest analyses indicated that phonotactic probability best predicted TD performance, while lemma frequency and phonological neighborhood density were more influential for DLD and PD groups, respectively. These findings suggest that DLD children rely on familiar, regular forms to manage morphological complexity, reflecting distinct processing strategies compared to PD and TD peers.

Eye tracking in the real world: a graph-theoretical analysis and comparison to virtual reality

Is virtual reality (VR) necessary to conduct complex immersive eye-tracking studies, or can similar data be recorded and analyzed in the real world? We adapted a spatial navigation paradigm from VR to the real-world city of Limassol. Specifically, we combined a 130-minute free exploration with two pointing tasks while recording eye-tracking, head-tracking, and GPS data. We labeled the eye-tracking data with a new classification pipeline and found a similar gaze distribution over object categories in both the real world and VR. Furthermore, we hand-labeled fixations on buildings to apply a graph-theoretical analysis. When comparing our results with VR, we found some differences (e.g., graph density, diameter), but also many similarities in viewing behavior (e.g., hierarchy index, gaze-graph-defined landmarks). Overall, our work showcases the feasibility of complex eye-tracking experiments in the real world and highlights the similarity of viewing behavior in the real world and virtual reality.

Whole is Greater than the Sum of its Parts: Bilingualism Shapes Tolerance for Simultaneous Identities

In a set of three cross-cultural studies, we investigated how culture, linguistic background, and the way a bilingual status is acquired affect tolerance for simultaneous identity (the belief that people can be simultaneously part of two social groups). Adults (N = 1412) and 5-7-year-old children (N = 166) read stories about three bilingual children who each acquired a second language by different means (through learning, immigration, or parents), and we measured participants' tolerance of the characters' simultaneous identity. In Studies 1 and 2 we found that US and Indian bilinguals were more likely to tolerate simultaneous linguistic identity than monolingual groups. In Study 3 we found that bilingual 5-7-year-old children from the US and India exhibit a pattern similar to what was previously found in adults. Results suggest that both culture and the experience of bilingualism serve as important mechanisms in shaping our social group cognition.

Evaluating Individual Differences in Multimodal Measurement of Inhibitory Control Using Drift Diffusion Modeling

Computational modeling of behavioral data allows for a precise characterization of distinct aspects of decision making related to the neurocognitive process of inhibitory control (IC). Two parameters of the Diffusion Model for Conflict (DMC; Ulrich et al., 2015)—drift rate (the amount of information absorbed per time unit) and boundary separation (the amount of information accumulation required for action)—have been found to be differentiable processes implicated in IC. In the current study, we evaluated these DMC parameters in independent community/student samples (Ns = 150, 199) completing a flanker task and undergoing electroencephalogram recording during a novelty-oddball task to elicit a P300 brain response shown to index IC (Brennan & Baskin-Sommers, 2018). Our results showed that only boundary separation significantly correlated (rs = -.20 and -.28) with the amplitude of the P300 brain response, and this effect replicated across both samples. These findings suggest that computational modeling of behavior is better able to bind together measurement of IC across different measurement modalities.
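
For illustration only: a minimal sketch of a standard drift-diffusion trial showing how drift rate and boundary separation enter the accumulation process. The DMC additionally superimposes a time-varying automatic activation for congruent versus incongruent flankers, which this simplified sketch omits; all parameter values are assumptions.

```python
# Minimal sketch of a single drift-diffusion trial: evidence accumulates at
# `drift` per second plus Gaussian noise until it crosses +boundary/2 or
# -boundary/2 (boundary separation). Parameters are illustrative only.
import numpy as np

def simulate_trial(drift=1.0, boundary=1.0, noise=1.0, dt=0.001, seed=None):
    rng = np.random.default_rng(seed)
    x, t = 0.0, 0.0
    while abs(x) < boundary / 2:
        x += drift * dt + noise * np.sqrt(dt) * rng.normal()
        t += dt
    return t, x > 0           # decision time in seconds, upper-boundary response

rts = [simulate_trial(drift=1.5, boundary=1.2, seed=i)[0] for i in range(500)]
print(f"mean decision time: {np.mean(rts):.3f} s")
```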

An EEG study of forming new phonemic categories by naive listeners of Mandarin

The current study investigates the formation of new phonemic categories by examining neurophysiological changes before and after training. Native English speakers tend to perceive Mandarin retroflexes as English fricatives ([ʂ] → [ʃ]) (Rasmussen & Bohn, 2015). In our experiment, native English speakers unfamiliar with Mandarin underwent a pre-training EEG recording in a passive auditory oddball paradigm. Then, on two consecutive days, they learned Mandarin words containing retroflexes before undergoing a post-training EEG. The current results show that the difference in the amplitude of the P300 – an indicator of stimulus differentiation (Calcus et al., 2015) – between the retroflex [ʂ] (rare) and non-retroflex [ʃ] (frequent) is larger post-training than pre-training, suggesting that listeners are learning to distinguish the two fricatives. Our findings suggest that the formation of a new phonemic category can be observed in a processing stage that overlaps with the P300.

Learning to communicate a shared wavelength

Some social interactions connect us deeply, while others just don't "click." Yet it has proven difficult to pinpoint what aspects of a social interaction account for this variation. To test the hypothesis that connection arises from effectively coordinating on a shared perspective, we introduce a novel experimental paradigm based on the game Wavelength, wherein players provide each other clues to help locate a target on a spectrum between opposing concepts (e.g., Bad/Good; Painful/Pleasant). Each trial involved three clues, with Guessers selecting a position after each one and Clue Givers independently predicting their choices. Players rated their sense of connection and their likelihood of generating the same clue. Results show that Guessers feel more connected to players who accurately predict their clue rating and whose clues are in line with what they would have generated, highlighting the role of shared reasoning and predictive accuracy in fostering social bonds.

How descriptions moderate memory biases in experience-based risky choice

Individuals receive information about risk in two main ways: description, which provides explicit outcomes and associated probabilities of available choice options; and experience, where individuals interact with the choice options and receive feedback from their choices. In the current work, we investigate how the presence of descriptions in a risky experience-based task influences choice behaviour and memory of past outcomes. Participants made repeated choices in either an experience-only condition or a description-plus-experience condition, where descriptions were presented alongside feedback. They were more risk seeking in the description-plus-experience condition than in the experience-only condition, particularly in the domain of losses. This suggests that descriptions have an asymmetric effect, exerting a stronger influence in loss contexts. While the presence of descriptions did not eliminate memory biases (i.e., overweighting the best and worst experienced outcomes), their impact on choice was reduced. Future research will explore the underlying mechanisms of this effect.

Testing the Aspect Hypothesis: Relating child and caregiver verb inflection across development

The Aspect Hypothesis suggests children's early choices to produce perfective or progressive verb forms (-ed vs. -ing, respectively) can depend on event semantics (Shirai & Andersen, 1995). Children tend to use perfective constructions with verbs denoting completion, while progressive constructions mostly occur with verbs denoting ongoing actions. Li and Shirai (2000) suggest that children may stray from this pattern as caregiver input changes throughout development. However, these changes are underexplored, and previous findings supporting the Aspect Hypothesis emerge from limited corpora. Our study used NLP on all English corpora in CHILDES to extract the main verbs from the utterances of children and caregivers. We confirmed children's and caregivers' general adherence to the predicted pattern. We also found preliminary support for caregivers shifting their inflection of multiple verb types. Our findings support the Aspect Hypothesis and provide insight into how children come to broaden their inflections.

Explicit Cooperation Shapes Human-Like Multi-Agent LLM Negotiation

Humans develop cooperation heuristics in social decision-making, either intuitively or deliberatively. Large language models (LLMs), which exhibit human-like biases across cognitive domains, may acquire prosocial tendencies through instruction tuning, enabling cooperative behavior in strategic reasoning games. However, most studies of this kind either focus on cooperative language generation or explicitly instruct LLMs to cooperate, deviating from the inherent cooperation heuristics of humans. Using negotiation role-play simulations with BATNA (Best Alternative to a Negotiated Agreement), we found that LLMs struggle with cooperation in the absence of explicit instructions, leading to a 50–80% lower success rate than in instructed scenarios and 50–60% lower than human performance reported in past studies. Implicitly inducing cooperation through personality traits had inconsistent effects, with agreeableness showing marginal influence and other traits exhibiting no systematic impact. These findings suggest that personality-based cooperation cues are subtle, and explicit instructions remain essential for multi-agent LLMs to approximate human-like negotiation.

Bounded hypothesis testing underlying human learning of probabilistic rules

We investigated human learning of probabilistic rules in experiments using a 2-by-2 feature space. Despite the seemingly minimal complexity, participants struggled to learn nonlinear XOR rules (where outcomes depend on cue matchings) but rapidly mastered linear rules. This difficulty persisted even when explicit probes revealed the possible rules, indicating constraints in hypothesis testing. To explain these behavioral patterns, we propose a hypothesis diffusion model where learning arises from evidence-driven transitions between hypotheses in a sparsely connected network. The model outperformed reinforcement learning alternatives and generalized across different rules. To further understand the origin of the learning difficulty, we trained low-rank recurrent neural networks and found that networks with limited capacity (rank 3) failed to learn XOR rules when trained in biased environments, mirroring human performance. In conclusion, human rule learning may rely on structured hypothesis exploration, with learning biases potentially emerging from adaptations to environmental demands under computational constraints.
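
For illustration only: a toy check, under illustrative assumptions, of why an XOR rule over a 2-by-2 feature space resists linear learning while an added cue-match (interaction) feature makes it trivially learnable; this is not the hypothesis diffusion model or the low-rank RNN analysis from the abstract.

```python
# Minimal sketch: in a 2-by-2 feature space, an XOR rule (outcome depends on
# whether the two cues match) cannot be captured by a linear rule, whereas a
# model given an interaction feature separates it easily. Purely illustrative.
import numpy as np
from itertools import product

X = np.array(list(product([0, 1], repeat=2)), dtype=float)   # the 4 cue patterns
y = (X[:, 0] != X[:, 1]).astype(float)                        # XOR: mismatch -> 1

def linearly_separable(X, y):
    """Search a small grid of integer weight vectors and thresholds."""
    for w in product([-2, -1, 0, 1, 2], repeat=X.shape[1]):
        scores = X @ np.array(w, dtype=float)
        for b in np.unique(scores):
            if np.all((scores > b - 0.5) == y.astype(bool)):
                return True
    return False

X_int = np.column_stack([X, X[:, 0] * X[:, 1]])               # add interaction cue
print("linear rule found:   ", linearly_separable(X, y))      # False
print("with interaction cue:", linearly_separable(X_int, y))  # True
```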

The Continuity of Geometric Intuition between Monkeys and Humans

Recent research portrays humans as the only species with a unique ability to detect geometric shapes with Euclidean features (e.g., parallel sides), a phenomenon known as the regularity effect. Studies show humans exhibit this effect across ages, cultures, and educational levels, while non-human primates do not. One interpretation credits this human advantage to symbolic representations, whereas non-human primates rely on low-level visual features. Our findings challenge this theory: with sufficient experience, non-human primates also show the regularity effect, and their accuracy is predicted by "symbolic" geometry models, as in human children. Notably, performance in both monkeys and children is accounted for by a mixture of symbolic and non-symbolic models of geometry. These findings question the claim that symbolic models are language-based or exclusively human, and strongly imply that humans and monkeys share abstract mechanisms to represent geometry. Ultimately, this overlap underscores the continuity between species.

Order Matters: Learning semantic information before seeing a face improves face memory

Prior research suggests that learning novel faces with identity-relevant semantic information is beneficial for face encoding, increasing face discriminability on recognition tasks even when tested using a different image (Mattarozzi et al., 2019; Ünal, Akan & Benjamin, 2024). However, no study has yet examined whether the timing of semantic information presentation is important for this effect by comparing conditions where semantic information is presented either before or after seeing the face. Sixty-two young adults learned a series of 36 faces of which 24 were paired with semantic information (12 face-first, 12 semantic-first) and 12 were not (control). We found that participants showed significantly better discriminability (d') for identities learned in the semantic-first condition compared to the face-first condition. These findings suggest that learning identity-relevant semantic information before seeing a face can optimize face memory, likely by increasing the salience of the face and providing a semantic scaffolding to bolster identity encoding.

Understanding working memory as a facilitator of math learning: Offloading as a potential strategy

High working memory capacity (WMC) is linked with stronger mathematical abilities; however, the underlying mechanisms remain unclear. One theory suggests that problem solvers may offload information from their working memory in order to reduce cognitive load and solve problems more effectively. We investigated whether the use of offloading improved problem-solving skills. Ninety-three undergraduate students were administered a pre-test and WMC tasks. Participants were then split into two conditions, offloading or no-offloading, and were administered a post-test. ANOVA results indicated that while both groups improved, the offloading group showed greater improvement. Participants with lower WMC performed better when offloading, but there was no significant interaction between WMC and condition. Additionally, pre-test scores moderated the effect of offloading, suggesting that students with greater prior knowledge might benefit more from offloading. These findings have theoretical implications for mechanisms underlying the relationship between working memory and mathematics, and for how to support students in classrooms.

A study on playing cards to disentangle order and magnitude in the SNARC effect

The SNARC effect is thought to reflect a left-to-right mental number line (MNL). The aim of this study was to disentangle the roles of numbers' magnitude and order in the SNARC effect by using playing cards as stimuli. While most people hold cards in ascending order (AO), consistent with the MNL, a subset of individuals stably hold cards in descending order (DO), opposite to the MNL, with low-value cards (e.g., 2) on the right and high-value cards (e.g., 6) on the left. In a magnitude classification task (Experiment 1) and in a parity judgement task (Experiment 2) involving both digits and playing cards as stimuli, AO participants showed regular SNARC effects for both digits and cards, whereas DO participants showed regular SNARC effects for digits but not for cards. These results suggest that the order of cards in DO participants prevents the SNARC effect from occurring, but does not reverse the effect.

Fantasy Play and the Language of Emplotment in Greek L1 Children

Fantasy/symbolic play is central to theories of child cognitive development (Piaget 1962; Pellegrini 1985; Leslie 1987; Francis & Gibson 2022). Most studies suggest that children distinguish pretense from reality by their second year, though the cognitive mechanisms involved remain debated (Leslie 1987). Fantasy play is also linked to language development, including early literacy and metalinguistic awareness (Pellegrini & Galda 1982, 1991; Pellegrini 1984; Orr & Geva 2015). Garvey & Kramer (1989) identify two communicative levels in symbolic play: (i) enactment and (ii) emplotment. This study examines the grammar used by L1 Greek children while setting up scenes and giving instructions. Based on novel naturalistic data from 55 recorded sessions with 14 children (aged 2;7–6;4), we show that by 2;7, children produce counterfactual scenarios with a light verb meaning 'pretend'. By 5;0, they employ counterfactual morphological marking in symbolic play before using it in other contexts (Amsel & Smalley 2000).

Rethinking Rumination: A Decision-Theoretic Approach Without Negativity Bias

Prior work by Bedder et al. (2023, 2024) modeled rumination as a negativity-biased decision process under uncertainty in a POMDP framework, where excessive sampling results from pessimistic priors and uncertainty about negative experiences. While these models provide valuable insight, they assume that excessive information-seeking is driven primarily by negative affect. We explore an alternative hypothesis: excessive information-seeking may result from optimal inference due to uncertainty and planning depth, independent of negativity bias. Using a POMDP solver with recursive value iteration, we find horizon length and uncertainty influence the persistence of sampling actions without a negative reward bias. This shifts rumination from an affect-driven process to a more generalized preoccupation mechanism consistent with the ICD-11's transdiagnostic conceptualization of preoccupation (Eberle & Maercker, 2022). Unlike prior models with fixed stopping thresholds, our approach may allow preoccupation to emerge dynamically from decision parameters without explicit negativity bias.
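
For illustration only: a minimal sketch of finite-horizon recursive value iteration on a toy belief-state problem, showing that longer planning horizons alone can widen the range of beliefs for which continued information sampling is optimal, with no negativity bias built in. This is not the solver or task from the abstract, and all parameters are assumptions.

```python
# Toy belief-state problem: the world is "good" or "bad"; belief b = P(good).
# ACT yields max(2b - 1, 0) (approach or avoid) and ends the episode;
# SAMPLE pays a small cost for a noisy observation that updates b.
P_CORRECT = 0.8       # observation matches the true state with this probability
SAMPLE_COST = 0.05

def update(b, obs_good):
    """Bayesian belief update after evidence for ('good') or against it."""
    like_good = P_CORRECT if obs_good else 1 - P_CORRECT
    like_bad = 1 - P_CORRECT if obs_good else P_CORRECT
    return like_good * b / (like_good * b + like_bad * (1 - b))

def value(b, horizon):
    """Finite-horizon value and best action via recursive value iteration."""
    act = max(2 * b - 1, 0.0)
    if horizon == 0:
        return act, "ACT"
    p_good_obs = P_CORRECT * b + (1 - P_CORRECT) * (1 - b)
    sample = -SAMPLE_COST + (
        p_good_obs * value(update(b, True), horizon - 1)[0]
        + (1 - p_good_obs) * value(update(b, False), horizon - 1)[0]
    )
    return max((act, "ACT"), (sample, "SAMPLE"))

for h in (1, 2, 5):
    region = [round(i / 20, 2) for i in range(21) if value(i / 20, h)[1] == "SAMPLE"]
    print(f"horizon {h}: keeps sampling for beliefs {region}")
```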

Equal > Equals: Labels Affect How Math Symbols Are Interpreted

How do labels affect people's interpretation of symbols? The preferred label for "=" is debated amongst mathematics educators. In Study 1, U.S. adults defined the "=" symbol and rated the "smartness" of definitions when the label "equal" or "equals" sign was used. As predicted, the "equal" label activated a relational understanding of the equal(s) sign more than the "equals" label. Study 2 utilized a 2x2 factorial design and incorporated an activation phase before the definition tasks. Participants were randomly assigned to operational activation, where they solved problems with operations to the left of the equal(s) sign (8 + 5 = __), or to relational activation, where they solved problems with operations to the right of the equal(s) sign (8 = 5 + ___). Results suggest the terms "equal" and "equals" are not equivalent. The term "equal" may enhance students' conceptual understanding of equations.

Ready, Set, LEGO®: Examining the effects of construction on undergraduates' 3D spatial learning using virtual and physical models

One frequent task necessary to progress within STEM disciplines is understanding and reasoning about three-dimensional (3D) spatial information. Markedly, undergraduates find understanding and reasoning about 3D spatial content to be quite challenging. Today, educators use virtual models, physical models, or both to better support their students' learning of abstract concepts. Prior research has provided insights into whether virtual or physical models provide better support when learning 3D STEM concepts (e.g., Casselman et al., 2021; Justo et al., 2022). However, whether and how physical versus virtual models support student learning of 3D spatial information devoid of domain-specific content is not well understood. This study will examine the effects of virtual versus physical block construction on postsecondary students' 3D spatial learning, and preliminary results will be presented. The findings from this study may have implications for facilitating 3D spatial learning and integrating digital tools in the classroom.

Orthographic positional biases in pre- and early readers

English-speaking adults demonstrate a strong bias toward processing the beginning of a word when learning to read novel words. This study explored when in literacy development this onset bias emerges. English-speaking pre- and early-readers (4-6-year-olds) were tasked with matching known spoken words (boat) to text presented with three competitors: a nonword that shared the target's onset (baot), an English word that shared its offset (goat), and a nonword that shared an onset with the real-word foil (gaot). We then analyzed participants' accuracy, and false alarm rates to the different competitors. Both groups were above chance in matching known spoken words to their written counterparts. Like adults, children in both groups false alarmed to onset foils significantly more than any other foil. Onset biases were stronger in early-readers than pre-readers, and in children with better word matching skills. The onset bias emerges early in reading development, as children start mapping text-to-speech.

The representational space of symbolic numbers: from integers to fractions

Mathematics is a central tool for understanding the universe, yet how numbers are cognitively and neurally represented remains unclear. This is especially true when focusing on higher-level concepts, such as symbolic integers, including zero, and fractions. Twenty participants were scanned using high-resolution 7T fMRI while performing a task in which they judged whether the current number was larger or smaller than the previous one. Behaviorally, we found that participants were slower and less accurate when two numbers were closer in distance on the number line. Neurally, our findings suggest that brain areas involved in mathematical processing, such as the intraparietal sulcus, inferior temporal gyrus, and prefrontal cortex, showed a graded BOLD response as a function of numerical distance on the number line. However, both behavioral and neural representations clearly distinguish between fractions and integers.

Listening-Related Fatigue and Cognitive Effort in Deaf and Hard-of-Hearing Bilinguals

Listening-related fatigue is a well-documented challenge for deaf/hard-of-hearing (DHH) people who rely on amplification devices and speechreading to access spoken language (Holman & Hornsby, 2020). While previous research has attributed fatigue to effortful auditory processing, it has largely overlooked the cognitive demands of bilingual DHH individuals who navigate both spoken and signed languages. This study examines the role of bilingual language experience in mitigating cognitive fatigue. Using survey data from 200 DHH adults, we found that greater reliance on English and speechreading correlated with increased fatigue, while higher use of and proficiency in American Sign Language (ASL) were associated with reduced fatigue and improved communication well-being. Principal Component Analysis revealed distinct cognitive and social fatigue factors, highlighting the role of modality flexibility in cognitive load management. These findings suggest that sign language use may serve as a protective factor against cognitive exhaustion, informing models of multimodal bilingualism and cognitive effort.

Individual Differences in the Functional Role of Arousal Synchrony on Empathy

Previous work has shown that people's arousal states become synchronized during natural communication. The present study examined the functional role of such synchrony. We recorded 10 personal stories, half happy, half sad, from a set of storytellers while continuously measuring electrodermal activity (EDA). A separate set of participants listened to the recordings while EDA was measured, and they completed a state-empathy questionnaire following each story. We predicted listeners would empathize more with storytellers when EDA synchrony was higher. Results revealed that synchrony did modulate empathy, but this depended on the valence of the story and the trait empathy level of the listener. Among people with low trait empathy, as synchrony increased, so did state empathy. Among people with high trait empathy, this correlation was negative. These relationships obtained only for sad stories. The findings point to intriguing individual differences in the functional role of arousal synchrony on empathy.

Abstract Over Item-Specific Information: Statistical Learning Optimizes Memory Representations

Statistical learning optimizes working memory by abstracting relations among specific items. However, the mechanisms underlying the representations of abstract and item-specific information remain unclear. This study developed a learning-memory representation paradigm in which three groups of participants, i.e., control, item-specific, and abstract encoding, were presented with picture-artificial character pairs containing abstract semantic categories at high (100%), moderate (66.7%), and low (33.3%) probabilities and item-specific information. Participants performed a visual search task that assessed memory representations through the search speed for artificial characters among abstract or item-specific distractors. Participants spent more time searching among abstract than item-specific distractors in the control but not the item-specific condition, indicating that by default working memory prioritizes abstract information. However, this prioritization was enhanced for moderate- and low-probability items in the abstract encoding condition. These findings suggest that statistical learning is central to abstraction, forming flexible memory representations particularly for uncertain inputs to optimize learning processes.

Leveraging Machine Learning and Wearable Cameras to Analyze Children's Social Interactions

Direct insights into children's daily experiences are limited despite their importance for development (Rogoff et al., 2018). Traditional methods, like laboratory play sessions, fail to capture naturalistic interactions. Wearable recording devices provide richer data, but their sheer volume challenges traditional coding. We introduce a machine-learning approach to analyze children's everyday interactions. Sixty-four children (ages 3–5) in Leipzig, Germany, wore vests with small cameras, recording 224 hours of video since 03/2020. Our analysis focuses on social interaction cues: person presence, face, gaze, and voice. Using YOLO11, we achieved 80% accuracy in person detection and a 0.9 F1 score for face detection. Preliminary results indicate children often spend time alone or with one person, with face presence in only 17.75% of frames. We will next integrate gaze and voice detection to assess child-directed speech. Our machine-learning approach provides novel insights into children's natural social environments, advancing research on early development.
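
For illustration only: a minimal sketch of frame-level person detection with a pretrained model from the Ultralytics YOLO11 family, assuming the `ultralytics` package and the `yolo11n.pt` checkpoint; the study's actual pipeline, weights, and face/gaze/voice components are not specified in the abstract, and the frame path below is hypothetical.

```python
# Minimal sketch of frame-level person detection with a pretrained YOLO11
# model (assumes the `ultralytics` package is installed).
from ultralytics import YOLO

model = YOLO("yolo11n.pt")            # small pretrained checkpoint

def person_present(frame_path, conf=0.5):
    """Return True if at least one 'person' box is detected in the frame."""
    results = model(frame_path, conf=conf, verbose=False)
    for r in results:
        for cls_id in r.boxes.cls.tolist():
            if model.names[int(cls_id)] == "person":
                return True
    return False

# Example call on a sampled frame from one recording (hypothetical path).
print(person_present("frames/child01_000123.jpg"))
```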

Scaffolding the Understanding of Scientific Analogies

Analogy is a mainstay in STEM education. It benefits learning by highlighting important commonalities between concepts and promoting transfer of knowledge. However, students often fail to process analogies deeply, and thus miss the potential benefits. This research aims to equip students with a domain-general strategy for understanding analogies. We created an Analogy Template that guides students through an explicit analysis of the relational matches and object correspondences. To test its effects, we gave undergraduates a series of science analogies. The Training group used the template to analyze the analogies. The Control group explained the same analogies without the template. Then both groups were asked to explain four novel science analogies. Raters blind to condition judged the explanations. Students who successfully completed the template training showed better understanding of the analogies than those in the control group. These results provide initial evidence that analogical training can contribute to science understanding.

Context-dependency in marmoset gaze-following

Gaze-following, the ability to direct one's attention to the target of another individual's gaze, plays a key role in social interactions. Gaze-following has been considered reflexive, implying it might operate invariant to context. We tracked the eye position of head-fixed marmosets as they viewed naturalistic video stimuli featuring a marmoset gazing toward (i.e., cueing) one of two transparent boxes. Three trial types were presented interleaved: in the first, a second marmoset entered the cued box; in the second, no marmoset entered either box; and in the third, both boxes were occluded, creating a non-informative context. In all three conditions, a significantly larger number of initial saccades landed in the cued region than in the un-cued region (p < 0.001) confirming gaze-following effects. However, preliminary analysis of other eye movements revealed different gazing patterns across conditions, indicating the potential context-dependent nature of gaze-following influenced by the availability of relevant information.

Predicting the Time and Place of Critical Transitions in Socio-Cognitive Systems

Collective behavior can change rapidly. Individuals can align their behavior suddenly, opting to cooperate or coordinate, such as in revolutions or riots. Can we predict when and where such collective shifts are about to occur? In many physical and biological systems, critical transitions from one regime to another are preceded by a variety of "early warning signals," including increased relaxation time, variance, and autocorrelation. We investigate whether these early warning signals also prefigure sudden shifts in large-scale socio-cognitive systems. Using agent-based models, we demonstrate the existence of early warning signals of both the onset (when) and origin (where) of critical transitions in social interaction. These results were robust across social networks that varied in size and structure (i.e., random vs. small-world networks). We speculate that these signals may occur for many collective social-cognitive phenomena, including transitions in teamwork, norms, and communication systems.
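
For illustration only: a minimal sketch of two standard early-warning indicators, rolling variance and lag-1 autocorrelation, computed on a synthetic time series whose persistence rises toward a transition. The window size and signal are assumptions; the abstract's agent-based models are not reproduced here.

```python
# Minimal sketch: rolling variance and lag-1 autocorrelation as early-warning
# indicators on a toy time series. All parameters are illustrative.
import numpy as np

def rolling_indicators(x, window=50):
    """Return arrays of windowed variance and lag-1 autocorrelation."""
    variances, autocorrs = [], []
    for start in range(len(x) - window + 1):
        seg = x[start:start + window]
        variances.append(np.var(seg))
        autocorrs.append(np.corrcoef(seg[:-1], seg[1:])[0, 1])
    return np.array(variances), np.array(autocorrs)

# Toy signal: noise whose persistence grows toward a "transition" at t = 400.
rng = np.random.default_rng(0)
x = np.zeros(400)
for t in range(1, 400):
    phi = 0.1 + 0.8 * (t / 400)          # rising autocorrelation over time
    x[t] = phi * x[t - 1] + rng.normal()

var, ac1 = rolling_indicators(x)
print(f"early window: var={var[0]:.2f}, ac1={ac1[0]:.2f}; "
      f"late window: var={var[-1]:.2f}, ac1={ac1[-1]:.2f}")
```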

Children express emotions multimodally before expressing them in speech

Children express some concepts in their gestures before expressing them in speech (Goldin-Meadow, 2015). This phenomenon has been shown in several domains that are rich in visuospatial information. One other domain that can benefit from gestures is emotion expression (Kelly & Tran, 2023). In this study, we examined emotion recall in monolingual Turkish-speaking children (N=23, Mage=8.6) and adults (N=19, Mage=35.6) after they watched a silent video demonstrating a range of emotions. We coded participants' emotion expression either in speech-alone or multimodally (speech plus head/body/hand gestures and/or facial expressions). Overall, children (M=13.8, SD=6.4) recalled significantly more emotions than adults (M=9.5, SD=3.8) (p=.014). They also recalled emotions significantly more multimodally (M=8, SD=7) compared to adults (M=4.9, SD=4) (p=.03). These results corroborate previous research on children's reliance on gestures, extending it to the domain of emotions and incorporating facial expressions as an alternative expression channel.

Counterfactual error-monitoring in human planning

Human decision-making involves evaluating choices and learning from outcomes, yet how individuals process information about options that are no longer available remains unclear. While traditional reinforcement learning models suggest that evaluating unchosen options optimizes future decisions, online planning algorithms typically ignore unavailable paths that cannot inform current choices. This raises a critical question: do humans actively monitor unchosen alternatives during online planning? Using a two-stage decision task with eye-tracking and reaction-time analyses, we show that participants systematically monitor alternative paths, especially when their selected path proves suboptimal. Attention to unavailable options scales with their potential value, revealing an adaptive metacognitive process absent in online planning algorithms. These findings indicate that humans maintain and update representations of unavailable alternative choices rather than discarding them after selection. This work provides novel insights into the interplay between planning, metacognition, and adaptive behavior by demonstrating counterfactual evaluation during decision-making.

How Communicative Pressure Shapes Social Networks: An Agent-Based Model of Language-Network Co-Evolution using the Naming Game

In cultural evolution, empirical data and models highlight the key role of demographic factors like population size and network structure in cumulative cultural evolution (Derex & Mesoudi, 2020). However, few studies have explored the co-evolution of cultural and demographic structures (Smolla & Akçay, 2019). Therefore, we extend Falandays and Smaldino's (2022) agent-based model with dynamic directed networks. Agents played the Color Naming Game (Baronchelli et al., 2010), learning linguistic and color categories and the mappings between them. Based on communicative success and Bayesian learning, agents adjust network connections, driving the co-evolution of cognitive, linguistic, and demographic structures. We varied initial network structure, population size, and constraints on cognition, communication, and life cycle. Our findings indicate that pressures for shared language shape the emergence of networks that facilitate the learnability and transmissibility of shared language, and that the equilibrium network structure depends on initial conditions and the balance of constraints on the system.
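For readers unfamiliar with the underlying game, the sketch below implements one interaction round of the classic Naming Game (Baronchelli et al.); it is a deliberately stripped-down illustration and omits the color categories, Bayesian learning, and network rewiring described above.

```python
# Minimal Naming Game sketch: agents hold sets of candidate names; successful
# interactions collapse both inventories to the agreed name, failures spread it.
import random

def naming_game_round(speaker, hearer):
    if not speaker:
        speaker.add(f"w{random.randint(0, 10**6)}")  # speaker invents a new name
    word = random.choice(sorted(speaker))
    if word in hearer:
        speaker.clear(); speaker.add(word)   # success: both agents align
        hearer.clear(); hearer.add(word)
        return True
    hearer.add(word)                         # failure: hearer learns the name
    return False

agents = [set() for _ in range(20)]
for _ in range(5000):
    s, h = random.sample(range(20), 2)
    naming_game_round(agents[s], agents[h])
print("distinct names remaining:", len(set().union(*agents)))
```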

Coupled echo state networks as a model of task-oriented alignment

Coordination studies reveal that groups can achieve performance exceeding the sum of individual contributions (Bahrami et al., 2010). Further evidence suggests that weak coupling maximizes the benefits of coordinated problem-solving (Abney et al., 2015; Schloesser et al., 2021). This work develops a computational framework to study coordination in coupled systems. We trained two echo state networks (ESNs) to classify cepstrum-coded speech signals from nine native Japanese speakers (Kudo et al., 1999). Coupling ESN feedback during testing reveals a nonlinear relationship between joint performance and coupling: moderate coupling (feedback integrates readout states from both networks) enhances performance, whereas full coupling (feedback is swapped between networks) returns performance to that of independent networks. These results suggest that while interaction between networks can enhance performance, excessive integration may diminish the benefits of independent contributions (cf. Fusaroli et al., 2012). Our model provides a novel, formal framework for explaining interaction dynamics in collective intelligences.

A Normative Account of Specialization: How Task and Environment Shape Role Differentiation in Collaboration

In collaborative groups, both humans and artificial intelligence (AI) agents frequently adopt specialized roles, yet the conditions that govern the optimal degree of specialization remain poorly understood. In this work, we propose that specialist teams outperform generalist ones when environmental constraints limit task parallelizability (the potential to execute task components concurrently). Drawing inspiration from distributed systems, we introduce a heuristic to predict the relative efficiency of generalist versus specialist teams and validate it through three multi-agent reinforcement learning (MARL) experiments in Overcooked-AI, demonstrating that key factors limiting task parallelizability influence specialization. Notably, as the task space expands, agents reliably converge on specialist strategies, even when generalist ones are theoretically more efficient, suggesting that specialization may help mitigate costly learning demands. Our findings provide a normative framework for understanding when and why specialization emerges as the optimal strategy in collaborative settings.

Capturing Students' Spontaneous Knowledge Transfer Between Block- and Text-Based Programming Languages

Transferring knowledge to new situations is essential for learning (Gentner, 2003) but notoriously difficult (Gick & Holyoak, 1983). In computer science education, students are expected to transfer knowledge of earlier-taught block-based programming languages (e.g., Scratch) when they transition to more challenging text-based languages (e.g., Python). However, little is known about whether and how students engage in such transfer. To explore these ideas, we developed an assessment for late-elementary and middle-school students proficient in Scratch that provides brief instruction on similarities with a novel text-based language (Python) for various concepts (conditionals, iteration, etc.). Students were then assessed as to whether they could transfer their knowledge to conceptually related problems in Python. Results indicate students struggle in transferring most concepts, particularly those with syntactic differences. These findings are consistent with ACT-R theory (Anderson & Schunn, 2000) and suggest students may benefit from targeted transfer support when learning new programming languages.

Towards a computational account of egodystonia

Egodystonia refers to thoughts and behaviors that conflict with one's values or beliefs, a phenomenon often observed in psychiatric conditions such as obsessive-compulsive disorder (OCD). While prior work has demonstrated dissociations between beliefs and actions (Vaghi et al., 2017, 2019), we lack a computational framework to explain the mechanisms underlying this mismatch. In a novel experiment combining behavior and subjective report, we induced egodystonic feelings in a healthy population with a range of obsessive-compulsive traits. Individuals scoring higher on the Obsessive-Compulsive Inventory (OCI-R) reported greater egodystonic experiences. Egodystonicity was not influenced by reward availability or action rate, but was driven by perceived consequences of inaction, as captured by a computational model of the task. This study provides the first experimental evidence of induced egodystonia and offers a foundation for theoretical advances in understanding this phenomenon.

Through the Eyes of Expertise: Decoding Mathematical Cognition with Eye-Tracking and Entropy

Understanding how experts and novices allocate visual attention during mathematical problem-solving can reveal novel insights into cognitive processing. This study investigates eye-movement patterns in linear algebra tasks using entropy and recurrence quantification analysis (RQA). Eye-tracking data were collected from participants of varying expertise levels, analyzing fixation duration, scan paths, and pupillometry between key Areas of Interest (AOIs). Results indicate that experts exhibit lower entropy, suggesting a more systematic, targeted approach, whereas novices display higher entropy, reflecting exploratory and less efficient search strategies. A Welch Two-Sample t-test confirmed significant differences in entropy scores, with experts showing greater attentional focus in key areas. These findings highlight the role of visual attention in mathematical cognition. Our research underscores the potential of entropy-based metrics for assessing problem-solving strategies, with implications for content sequencing and designing instructional tools that scaffold visual attention in STEM education.
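To make the entropy measure concrete, the sketch below computes Shannon entropy over the distribution of fixations across Areas of Interest; the AOI labels and fixation sequences are invented for illustration and do not come from the study.

```python
# Sketch: Shannon entropy of a fixation distribution over Areas of Interest.
# Lower entropy = concentrated, systematic viewing; higher = exploratory search.
import numpy as np

def aoi_entropy(fixation_aois):
    _, counts = np.unique(fixation_aois, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

expert_fixations = ["matrix", "matrix", "rhs", "matrix", "matrix", "rhs"]
novice_fixations = ["matrix", "rhs", "prompt", "matrix", "scratch", "rhs"]
print(aoi_entropy(expert_fixations), aoi_entropy(novice_fixations))
```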

A Neurosymbolic Model of Human Reasoning on the Abstraction and Reasoning Corpus

The Abstraction and Reasoning Corpus (ARC) is a visual program synthesis benchmark designed to test out-of-distribution generalization in machines. Although recent advancements have led to human-level performance on ARC, it is still unclear how people solve ARC tasks. Other work has demonstrated that people often make errors when reasoning about these problems, and sometimes fail to infer the true underlying program. In this work, we explore the hypothesis that the cognitive mechanisms supporting reasoning are more approximate, graded and resource-limited compared to those suggested by purely discrete, program-induction models. Taking inspiration from previous work on program sketches and partial programs, we model human reasoning in the ARC domain using a neurosymbolic, meta-learned model that interleaves symbolic operations with approximate, statistical pattern completion. We then evaluate the model against human errors from H-ARC, a recent, comprehensive behavioral dataset on the training and evaluation sets of ARC.

The Temporal Evolution of Implicit Bias in Perceptual Decision-Making

Priors shape decision-making, but how bias emerges remains unclear. Using a drift-diffusion model, we analyzed data from 40 participants completing a forced-choice task in which they judged the direction of apparent motion of dot stimuli of varying coherence. For one stimulus color, unbeknownst to participants, one direction occurred more frequently (positive prior), whereas for the other color, the frequencies were balanced. Results show drift rate increased throughout learning for both conditions. However, a greater drift rate emerged early for the positive prior condition, indicating a rapid increase in evidence accumulation consistent with the prior. Starting point increased gradually with practice only in the positive prior condition, suggesting that participants acquired a bias toward the prior-consistent outcome. These findings suggest that perceptual decision-making bias emerges through both a rapid allocation of attention to information consistent with the prior and a gradual development of response bias toward the globally more frequent outcome.
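As a simple illustration of how the two mechanisms differ within a drift-diffusion model (a sketch with invented parameter values, not estimates from the study), raising the drift rate speeds evidence accumulation toward the prior-consistent boundary, whereas shifting the starting point biases responses before any evidence arrives.

```python
# Sketch of a drift-diffusion simulation: drift rate v controls the speed of
# evidence accumulation; starting point z biases responses toward one boundary.
import numpy as np

def simulate_ddm(v, a=1.0, z=0.5, dt=0.001, rng=None, max_t=3.0):
    rng = rng or np.random.default_rng()
    x, t = z * a, 0.0
    while 0 < x < a and t < max_t:
        x += v * dt + np.sqrt(dt) * rng.normal()
        t += dt
    return int(x >= a), t  # (choice toward upper boundary, response time)

rng = np.random.default_rng(1)
# Positive-prior condition: larger drift and a starting point shifted upward.
biased = [simulate_ddm(v=1.2, z=0.6, rng=rng) for _ in range(2000)]
neutral = [simulate_ddm(v=0.8, z=0.5, rng=rng) for _ in range(2000)]
print("P(prior-consistent choice):",
      np.mean([c for c, _ in biased]), "vs", np.mean([c for c, _ in neutral]))
```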

Is there a strategy switch cost when switching strategies within one task?

Switching from one task to another typically induces performance costs, but it remains unclear whether switching strategies within the same task also induces similar switching costs. In this study, the Building Sticks task, a simple problem-solving task, was used to investigate whether strategy switching produces costs comparable to task switching. By using a task-switching paradigm, participants completed pure blocks (using a single strategy) and mixed blocks (switching between two strategies), allowing direct comparison of switch and nonswitch trials. We measured both reaction time (RT) and a Linear Integrated Speed-Accuracy Score (LISAS). Results showed that switch trials yielded significantly higher RTs and LISAS values than nonswitch trials. Moreover, splitting participants at the median accuracy revealed that high-accuracy individuals consistently showed smaller switch costs (RT and LISAS) than low-accuracy individuals. We conclude that strategy switching within a single task triggers robust performance costs, partially mitigated by stronger baseline accuracy.
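For readers unfamiliar with LISAS, the sketch below shows the score as it is commonly defined (Vandierendonck, 2017): the condition mean of correct RTs plus the condition error proportion scaled by the ratio of the participant's RT and error-proportion standard deviations. The numbers in the example are invented.

```python
# Sketch of the Linear Integrated Speed-Accuracy Score (LISAS) as commonly
# defined: LISAS = mean correct RT + (SD_RT / SD_PE) * proportion of errors.
def lisas(mean_rt_correct, error_proportion, sd_rt, sd_pe):
    return mean_rt_correct + (sd_rt / sd_pe) * error_proportion

# Illustrative values (ms and proportions), not data from the study.
print(lisas(mean_rt_correct=620.0, error_proportion=0.08, sd_rt=150.0, sd_pe=0.05))
```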

Video game experience mediates sex differences in spatial and navigation abilities

Past research reveals sex differences in spatial ability measures. This study further examines this variation across a comprehensive range of spatial tasks and explores potential mediators. 259 participants completed psychometric measures, questionnaires, and navigation tasks in immersive and desktop virtual reality. Men performed significantly better on both small- and large-scale spatial tasks. Video game experience (greater in men) mediated sex differences on most spatial tasks. However, spatial anxiety and motion sickness (greater in women), and exploration tendencies and risk-behavior (greater in men), generally did not account for this variation. This study reveals significant sex differences across spatial navigation tasks and provides preliminary evidence that experiential factors contribute to this variation. This research has implications for everyday navigation and the effects of technology use (video game experience) on spatial performance.

Comparing Navigation in Immersive and Desktop VR Environments

The use of virtual reality (VR) has become a standard procedure for studying spatial navigation, as it allows researchers to create controlled environments and paradigms that can be used across multiple research sites. These studies are primarily conducted in either desktop VR (DVR) or ambulatory immersive VR (IVR), yet little work has directly investigated whether navigation in these modalities reflects the same abilities when using identical environmental layouts. In 2 studies we examined participants' abilities to learn the layout of a maze-type environment in DVR and IVR. Our findings generally show that while people exhibit better navigation performance in IVR, performance in the two modalities is highly correlated. We discuss the implications of these findings, including possible reasons for the performance difference between IVR and DVR, such as body-based cues and cybersickness, and make recommendations for future research examining navigation in VR.

Prosodic and other paralinguistic features of speech differ across social contexts and roles

Prosodic (e.g., pitch, rhythm) and other paralinguistic features (e.g., laughter) shape speech dynamics, but the ways in which different communicative demands, such as social context and role, influence these features remain unclear. Using a podcast corpus, we analyzed speech across two contexts (monologue and dialogue) and roles (host and guest). We extracted 18 prosodic features (across pitch, rhythm, loudness, timbre, and voice quality) and annotated several paralinguistic features: proportion of laughter, proportion of interjections, and response offsets (inter-speaker gaps). Prosody differed reliably across contexts, such that dialogue exhibited more variable pitch, faster rhythm, narrower loudness range, more stable timbre, and rougher voice quality than monologue. Speakers also laughed more during dialogue than monologue. Additionally, hosts tended to laugh more and interject less than guests, but both groups had similar response offsets. Ongoing analyses will investigate role-based prosodic differences and the continuous relationship between these features and semantic meaning.

Young children use offers of help to infer relative task difficulty and agent competence

To learn effectively, young children must reason about how difficult a task is and the level of competence required to complete it. Prior research finds toddlers prefer helpers who assist individuals who are less competent or facing more difficult tasks, suggesting that children use these factors to evaluate who needs help. Here, we ask whether children make the inverse inference: Can children use offers of help as a cue to infer either an agent's relative competence or task difficulty? Study 1 finds that 4- to 6-year-olds (N = 36) infer a student who is offered help is less competent than one who is not (binomial test; p < .001). Study 2 suggests that children rate tasks as relatively more difficult if a student was helped on it (N = 45, p < .05). These results reveal how children use help as a salient cue to learn about others' competence and tasks in their environment.

How Well Do Adults and Children Remember Agents and Actions in Dynamic Events?

Understanding how adults and children remember dynamic events is important for cognition. In S1, adults (N = 177) heard no words, verbs, or nouns while a Tobii x30 eye tracker recorded looking to 34 familiar events. The test included 40 events: Old, New Action (old agent, new action), New Agent (new agent, old action), and Conjunction (new combination). Tracking showed more attention to hands (50%) than the head (25%); looking affected memory. Results show better memory for actions than agents (e.g., Trial type: F(3, 522) = 660.28, p < .001). In an adapted procedure, 19 3-year-olds and 19 4-year-olds were shown 2 events while hearing verbs and were tested on memory with the same types of trials. Again, actions were remembered better than agents (e.g., Trial type: F(4, 72) = 25.66, p < .001). Conclusions can be drawn regarding the development of event memory, with implications for multiple areas (eyewitness testimony, verb learning).

Early Engineering Identity: Examining Competence, Interest, and Affect

Early childhood beliefs play a crucial role in shaping engineering engagement. This study examines how engineering competence and interest vary across gender, race/ethnicity, and age among children aged 5–12 (n = 16; data collection ongoing), exploring the relationship between identity-related beliefs and task performance at a hands-on science museum. Participants were assessed on engineering-related perceived competence and interests, and parents' beliefs. Measures were correlated with persistence and effectiveness in an engineering activity (building a tinfoil boat) linked to problem-solving/creativity. Preliminary results suggest that boys and girls did not differ significantly in engineering perceived competence or interest. Girls' boats supported twice as many marbles (25) as boys' boats (12). Age was weakly positively correlated with task performance. Our findings show no major gender differences in interest or perceived competence at these young ages, and could inform strategies to enhance STEM engagement among diverse learners.

SVM neural decoding of EEG for words and non-words across speakers, dialects, and genders

Decades of behavioral research have shown that word recognition is an incremental process involving competition among multiple lexical candidates (Huettig et al., 2011). Recent work by McMurray et al. (2022) demonstrated that SVM-based machine learning can decode the neural spatiotemporal encoding of phonetically similar words and non-words from EEG signals, and that decoding response patterns closely mirror prototypical lexical competition effects. Here we describe two studies that (i) replicate McMurray et al. (2022) and (ii) extend this paradigm one step further, by decoding EEG responses to words and non-words across different speakers, dialects, and sexes. Additionally, we assess the decoder's sensitivity to individual differences by correlating its performance with behavioral task data. We conclude that this algorithmically simple decoder can be a powerful tool for uncovering neural psycholinguistic dynamics, but that it requires an amount of data that currently limits applications to developmental or clinical populations.
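As a schematic of what such time-resolved SVM decoding involves (a sketch under assumed data shapes, not the authors' or McMurray et al.'s pipeline), the snippet below trains cross-validated linear SVMs on the spatial pattern across channels at each time point.

```python
# Sketch: time-resolved decoding of item identity from EEG epochs with linear
# SVMs. Data here are random placeholders shaped (trials, channels, time points).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_trials, n_channels, n_times = 200, 64, 150
X = rng.normal(size=(n_trials, n_channels, n_times))  # placeholder EEG data
y = rng.integers(0, 2, size=n_trials)                  # placeholder item labels

clf = make_pipeline(StandardScaler(), LinearSVC(max_iter=5000))
accuracy = np.array([
    cross_val_score(clf, X[:, :, t], y, cv=5).mean() for t in range(n_times)
])
print("peak cross-validated accuracy:", accuracy.max())
```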

Phonological Overlap Connects Semantically-Unrelated Concepts: Evidence from Neural Correlates of Language Co-activation

Language co-activation can strengthen associations between unrelated concepts. We tested whether cross-linguistic phonological overlap impacts semantic processing of non-overlapping written inputs. English monolinguals and Korean-English bilinguals were presented with an interlingual homophone (e.g., "moon") and a word that is either semantically related (e.g., "lock" – "moon," the sound /mu:n/ means "door" in Korean, which is semantically related with "lock") or unrelated across languages (e.g., "fork" – "moon"). While their EEG was recorded, participants had to judge whether the word pairs were semantically related. A smaller N400 effect (difference in ERP amplitude between related and unrelated word pairs) was found in bilinguals than monolinguals, especially for word pairs related in meaning across languages. We conclude that phonological links across languages can connect unrelated concepts, reshaping the lexico-semantic network.

Experiencing Positive and Negative Emotions in L1 vs. L2

Bilinguals perceive events differently in their native (L1) and second languages (L2; e.g., Dylman & Bjärtå, 2019). A possible explanation for this "foreign language effect" is that L2 makes the described event feel less personal or emotional, leading to more analytical thinking. Prior research has (1) focused predominantly on negative emotions, and (2) not considered individual differences in how relatable an event feels, potentially modulating emotional intensities. This study tests both positive and negative events and how relatable each event feels. Participants, randomly assigned to the L1 or L2 condition, read 8 happy and 8 sad stories, and rated the emotion they felt for each story and its similarity to their personal experience. Preliminary descriptive results (N=17) suggest that emotion is felt more intensely in L1, regardless of valence. With the final dataset, we will examine whether Language, Valence, and Relatability independently and interactively predict the felt emotion.

Children's Skills, Interests, and Play in Object and Spatial Visual Domains: Maternal Evaluations & Field Observations

This study explored individual differences in visual-object and visual-spatial play preferences and performance in 4–8-year-old children. First, mothers completed surveys about their own and their children's abilities and traits. Subsequently, children's play behavior was observed in a field study organized as an edutainment festival. Mothers' self-reported abilities correlated with their evaluations of the corresponding abilities in their children but showed weak or no links to children's learning interests and play. Parenting practices were more strongly associated with children's abilities and interests than maternal traits. Specifically, maternal control was linked to children's visual-spatial play, while warmth and structure correlated with various skills and interests. Children's play preferences predicted by mothers aligned with observed play choices, but actual play behavior was more related to children's own traits than their mothers' characteristics. These findings highlight the role of parenting in shaping children's visual skills, learning interests, and play behaviors.

Interaction of generalised aversive beliefs and avoidance behaviour

Beliefs about potential threats motivate avoidance behaviour. However, avoidance may also provide new learning experiences that can impact threat beliefs. Here, we use a novel task to explore this bidirectional relationship between aversive beliefs and avoidance. Specifically, we evaluate changes in outcome expectancies for locations on a grid-like sea-world before and after participants navigate a boat and avoid threatening locations. Using computational modelling we assess how participants generalise threat information to nearby locations and how these beliefs change as a result of avoidance. Initial piloting indicates that participants differ in the degree to which they generalise beliefs to nearby locations. Simulations further show how increased generalization influences avoidance and highlight the role of avoidance costs in shaping behavior. This work explores individual differences in belief-avoidance interplay and contributes to our understanding of how this interaction can lead to maladaptive patterns typical of anxiety.

Who drew this? Children appreciate visual style differently than adults

When viewing a painting, we can discern not only its *content* — what is being depicted, e.g., a mountain — but also its *form* — the manner in which it is depicted, e.g., an impressionist sketch. What are the origins and developmental trajectory of our capacity to distinguish content and form? In 3 experiments, we introduced participants to artists who produced scenes with distinct contents and styles. Then, participants saw a critical third scene whose content matched one artist's drawing but whose style matched the other. Participants were asked which artist produced this critical scene. Whereas adults attributed the critical scene to an artist based on style (E1), children aged 4-7 years attributed based on content (E2; replicated on Children Helping Science in Experiment 3). This work supports two conclusions: (1) The capacity to distinguish content from form arises early; but (2) the way this capacity is applied shifts throughout development.

Pushing people: the neural basis of social interaction perception

We commonly use the language of physics to describe social interactions, even those that do not involve physical contact at all (e.g., pushing, pressuring, blocking). Is there a common conceptual basis for perceiving causal interactions? This study asks whether there is a shared neural code for physical and social interactions. For instance, when a gust of wind knocks a sailboat off its trajectory, is that interaction processed similarly to when an employee is kept from going home by their boss? To investigate domain-invariant representations of causal interactions involving enabling vs. preventing, we presented fMRI participants with vignettes depicting physical interactions between objects or purely social interactions (agents interacting socially but not physically). Multivoxel pattern analyses revealed that brain regions classically associated with physical reasoning and social reasoning contained domain-invariant representations of enabling vs. preventing. These results suggest a shared conceptual structure for making sense of physical and social interactions.

Human and Nonhuman Learning of Hierarchical Structures in a Lindenmayer Grammar

One proposed explanation for humans' unique cognitive capacity is our ability to extract hierarchical, recursive structures from ambiguous input. However, little work has successfully tested when humans represent hierarchical structures, and whether nonhuman animals do so. Using a serial reaction time task, we test if human adults, children, and rhesus macaques predict upcoming items in a Lindenmayer grammar containing self-similar recursive constituents. Recursively merging constituents makes the sequence more predictable. Constituents of different levels vary in predictive power, allowing measurement of depth of embedding in these representations. We test the human and nonhuman capacity to represent recursive structures, measured by reaction times, and its developmental and evolutionary origins. Preliminary results indicate human subjects recursively merge chunks to build multiple levels of embedded structures spontaneously. With similar training, macaques use simpler, linear strategies to predict items. Follow-up experiments will test whether macaques can learn to extract hierarchical structures for better prediction.

Are "sweet talks" literally sweet?: A study of taste imagery evoked by Japanese emotional words

In many languages, speakers use taste-related words to express emotions. This preregistered study extends preceding survey findings by applying an attribute conditioning paradigm to examine whether emotional words influence the gustatory imagery of nonsense words. Native Japanese-speaking participants underwent an attribute conditioning procedure consisting of five phases: the first evaluative phase, a conditioning phase, the second evaluative phase, a contingency awareness questionnaire, and a contingency memory task. In the evaluative phases, participants rated nonsense words on emotional (i.e., valence, arousal, and familiarity) and taste-related (i.e., sweetness, saltiness, sourness, bitterness, umami, and spiciness) impressions. Emotional words with positive (e.g., happy, vacation) or negative meanings (e.g., sad, thief) were used as unconditioned stimuli. Results demonstrated that nonsense words paired with positive-meaning words were associated with sweetness and umami, while those paired with negative-meaning words were associated with saltiness, sourness, bitterness, and spiciness. These findings provide psycholinguistic insights into metaphorical expression using taste-related words.

Evidence for Distinct Factive and Non-Factive Mentalization Systems in Adults and Infants

Despite extensive research on mentalization, few studies target the representations and the cognitive systems that underlie different mental state attributions. In two eye-tracking experiments with adults (n=32) and 19-month-old infants (n=24), we examined whether factive (knowledge, ignorance) and non-factive (false belief, true belief) mental state attributions belong to separate representational systems, relying on the assumption that within-system transfer should occur faster than between-system transfer. Participants watched animated videos of an agent tracking a hidden ball that could hide in two locations, requiring mental state attribution updates from non-factive to either another non-factive or to a factive mental state. Saccadic reaction times (SRTs) to the ball's reappearance were measured. Results showed that both adults and infants had faster SRTs when updates occurred between two non-factive mental states compared to updates between a non-factive and a factive mental state. This supports the existence of distinct systems for factive and non-factive mental state attribution.

Temporal-Difference Learning in Uncertain Choice: A Reinforcement Learning-Diffusion Decision Model of Two-Stage Decision-Making

Behavioral adaptation in probabilistic environments requires learning through trial and error. While reinforcement learning (RL) models can describe the temporal development of preferences through error-driven learning, the diffusion decision model (DDM) allows for the mapping of state preferences onto single response times. We present a Bayesian hierarchical RL-DDM integrating temporal-difference (TD) learning. Our implementation incorporates variants of TD learning, including SARSA, Q-Learning, and Actor-Critic models. We tested the model with data from N = 59 participants in a two-stage decision-making task. Participants exhibited learning over time, becoming both more accurate and faster. They also showed a difficulty effect, with faster and more accurate responses for easier choices, as reflected by greater subjective value differences between available options. Model comparison demonstrated that the RL-DDM provided a better fit compared to standalone RL or DDM models. Notably, the RL-DDM captured both the temporal dynamics of learning and the difficulty effect in decision-making.
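To show how the two model families combine (a heavily simplified sketch: a two-armed bandit stand-in for the two-stage task, a plain TD(0) update, and fixed invented parameters rather than the hierarchical Bayesian estimation described above), the snippet below lets trial-wise value differences from Q-learning set the drift rate of a diffusion process.

```python
# Minimal RL-DDM linkage sketch: a TD(0)/Q-learning update produces value
# differences that scale the drift rate of a simulated diffusion process.
import numpy as np

rng = np.random.default_rng(0)
alpha, scale = 0.3, 2.0          # learning rate; drift-rate scaling parameter
q = np.zeros(2)                  # learned values of the two options
p_reward = np.array([0.7, 0.3])  # invented reward probabilities

def ddm_trial(v, a=1.0, z=0.5, dt=0.001, max_t=5.0):
    x, t = z * a, 0.0
    while 0 < x < a and t < max_t:
        x += v * dt + np.sqrt(dt) * rng.normal()
        t += dt
    return (0 if x >= a else 1), t   # upper boundary = option 0

for trial in range(200):
    drift = scale * (q[0] - q[1])              # value difference drives drift
    choice, rt = ddm_trial(drift)
    reward = float(rng.random() < p_reward[choice])
    q[choice] += alpha * (reward - q[choice])  # TD(0) prediction-error update

print("learned values:", np.round(q, 2), "last RT (s):", round(rt, 3))
```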

The superiority of graphics over text in long-term memory retention

Graphical representations of data are pervasive in modern communication and are often used to convey socio-economic, scientific and medical information. Despite their popularity, it is still unknown whether they can enhance the long-term retention of their content. We conducted an incidental delayed recall task with psychology undergraduates (N=92), in which participants read about the evolution of a socio-economic phenomenon, with a few data points presented either as a graphic, a text, or a table. We found that graphics facilitated memory for the general trends of the data after a two-hour interval. No advantage was found on immediate recall of numerical values in another sample of participants (N=80). Thus, even for equal initial encoding of numerical information, and even for very concise materials, graphics facilitate long-term retention. Overall, the study reveals the potential of graphics as effective tools for enhancing memory retention and therefore highlights their valuable role in educational settings.

The neural bases of graph perception: a novel instance of cultural recycling

Graphs abound in our culture, but the brain mechanisms of graphicacy are unknown. Here, using scatterplots, we tested two hypotheses about the brain areas underlying graphicacy. First, at the perceptual level, we hypothesized that the visual processing of scatterplots recycles cortical regions devoted to the perception of the principal axis of objects. Second, at the semantic level, we speculated that the math-responsive network active during mathematical truth judgments should also be involved in graph perception. Using fMRI, we found that graph trend judgement recruits a right lateral occipital area involved in detecting objects' orientation, as well as a right anterior intraparietal region also recruited during mathematical tasks. Both behavior and brain activity were driven by the t-value, which indexes the graph's statistical correlation. We suggest that, like literacy and numeracy, graphicacy relies on the recycling of brain areas previously attuned to a similar problem, here the perception of object orientation.

Origins of numbers: A shared language-of-thought for arithmetic and geometry?

Number concepts are often thought to originate from counting and the successor function, or from a refinement of the approximate number sense. Here we argue for a third origin: a shared language-of-thought for geometry and arithmetic, with primitives of repetition, concatenation, and recursive embedding. Applied to sets, starting from 1, those primitives engender concepts of exact integers through recursive applications of additions and multiplications. Links between geometry and arithmetic also explain the emergence of higher-level notions (squares, primes, etc.). Under our hypothesis, understanding a number means possessing one or several mental expressions for it, and their minimal description length determines how easily they can be mentally manipulated. Several historical, developmental, linguistic, and brain-imaging phenomena provide preliminary support for our proposal.

Heritage Language vs. Dominant Language: When Bilinguals Excel in Unexpected Ways

Heritage bilinguals—who learned their L1 in their childhood home and L2 at school—vary greatly in their language and literacy skills. Although much is known about heritage bilingual children's language and literacy development, less understood are psycholinguistic processes underlying literacy skills in heritage bilingual adults, which we examined here. Spanish(L1)-English(L2) heritage bilingual adults completed psycholinguistic and reading tasks in English and Spanish. Although participants' English and Spanish verbal fluency did not significantly differ (t=1.750, p=.118), they read sight words more efficiently in English than Spanish (t=5.371, p<.001), suggesting asymmetries in language versus literacy skills. However, participants had better recollection for and familiarity with passages read in Spanish than English (ts>2.014, ps<.04), despite lower d' scores in Spanish than English reflecting greater uncertainty (t=1.681, p=.066). These findings suggest that memory is richer for passages read in Spanish than English, despite reading being more efficient in English than Spanish.

Does using LLMs in daily life help or hinder learning a second language?

As AI tools become integral to daily life, evaluating their impact on human cognition and learning is essential. This study examines how AI-assisted writing (AIW) tools influence language development over six months. Participants were randomly assigned to either a control group, using basic auto-correction (e.g., Grammarly), or an experimental group, using dialog-based large language model (LLM) tools (e.g., ChatGPT). Each month, all participants write an English essay and report on tool usage frequency and strategies. Preliminary findings after 1 month suggest diverse usage behaviors with AIW tools, which may lead to varied outcomes in linguistic performance. These results provide insights into AI's role in fostering language growth and inform strategies for effective AI-enhanced writing practices. In July, we will be able to report 6 months of data.

People generate naïve theories to explain probabilistic outcomes even if their theory's predictive power is near zero

In this exploratory investigation, the basic paradigm manipulated diversity of contrasting evidence on inductive inferences drawn from a multi-item target. In prior research, it was shown that increasing the diversity of a contrast set led to lower generalization of a novel property that was probabilistically associated with the target (Bosch, 2020). In the present work, a significant majority of participants generated naïve theories or rules, e.g., "the bananas that cost points were fresher," to explain probabilistic outcomes, e.g., four out of six exemplars of bananas cost ten points (while the other two cost zero points) in a mock online game. This effect did not depend on accuracy or predictive power of the proffered explanation. Further, preliminary evidence suggests that greater diversity of the contrast set is associated with a greater likelihood to generate explanations. Implications for inductive reasoning, naïve theories, and causal inference are discussed.

The Impact of Direct and Implied Claims in Advertising

This study investigates the way in which participants can distinguish between direct assertions and implied claims, with particular emphasis on accuracy and response times. Building on Gardner's (1975) claim-belief interaction theory, which holds that advertisements may indirectly influence consumer beliefs through subtle suggestions, the study examines whether participants can accurately identify both types of claims. Using PsychoPy software, participants were asked to judge the truthfulness of the claims. Results showed that subjects could identify direct claims more accurately than they could implied claims (t(95) = -2.197, p < .05). They also took more time to respond to implied claims (t(95) = 6.705, p < .001) than direct claims. These findings highlight the cognitive difficulties associated with processing implied claims: direct claims are easier to identify and are responded to more quickly. The study provides valuable insights into how advertisement claims influence consumer understanding and memory.

Relations between toddlers' core metacognition and parents' metacognitive talk: an eye-tracking paradigm

Recent research has challenged the belief that metacognition develops only at school age, providing evidence of basic metacognitive skills as early as 12 months (Goupil & Kouider, 2016). This emerging metacognition, however, raises the question of the variables that can influence its development, with the involvement of very specific parent-child interactions being postulated (Gardier et al., 2024). Using a novel eye-tracking paradigm, we assessed metacognition in 55 18-month-old children through a forced-choice recognition task in which eye movements towards a cue were used as an indicator of metacognitive uncertainty, while the metacognitive richness of the parent's talk was assessed during a parent-child play session. In addition to providing further evidence of early metacognitive abilities, our results indicated that parents' utterances encouraging children to monitor their mental operations were positively associated with toddlers' metacognitive accuracy (OR=1.3). These findings contribute to a better understanding of the role of parent-child interactions in early metacognitive development.

Sentential Context is Insufficient for Perceptual Learning of Speech

Listeners use sentential context to improve spoken word recognition. What is less clear is whether sentential context can aid in perceptual learning of speech. We employ a perceptual recalibration paradigm to investigate whether sentential context occurring before or after an acoustically manipulated target word can aid in learning a new talker's accent. We found that while sentential context improved spoken word recognition, it did not induce perceptual recalibration effects (regardless of its location in the sentence). This suggests that sentential context alone may not be a sufficient stimulus for perceptual learning. We consider two potential explanations for these results: first, information may need to be more closely tied to the target of learning to facilitate recalibration; second, sentential context may draw listeners' attention away from the target of learning.

The Relationship between Musical Experience and Cue Reweighting in Speech Perception

Musical experience may contribute to individual differences in perceptual cue weighting during speech categorization: musicians are shown to emphasize a single cue more than nonmusicians (Symons & Tierney, 2024). The present study examines whether such difference remains even in background noise where listeners usually downweight the primary cue to minimize the interference from noise. Twenty musicians and seventeen nonmusicians categorized resynthesized speech varying in two acoustic cues. The same stimuli were presented in quiet and noise, which severely compromised the primary cue. Logistic regression modeling of listeners' responses suggested that the difference between cue weights was larger in musicians than nonmusicians, regardless of listening condition. Although both groups downweighted the primary cue in noise, the shift was larger in nonmusicians. These findings suggest that musical training enhances selective attention to specific acoustic dimensions, but it may also limit cue reweighting to cope with diverse listening conditions.

The Effects of Hormone Contraceptives on Spatial Task Performance in Young Women

The impact of hormone contraceptives on cognitive function in young women remains a topic of ongoing debate, with inconsistent findings across studies. While some research suggests a beneficial effect on cognitive performance, others report no effect. The variability in these reports may be due to the different types of hormone contraceptives available, each potentially influencing memory differently. The present study examined the effects of progesterone-only contraceptives and combined hormone contraceptives (consisting of estrogen and progesterone) on spatial memory performance using the landmark memory task in young women (18-25 years old). Results indicate no significant differences in landmark memory task scores between naturally cycling women and those using hormone contraceptives. There were also no differences in spatial task scores between women using progesterone-only contraceptives and those using combined hormone contraceptives. These findings suggest that the use of hormone contraceptives does not significantly impact spatial memory performance in young women.

Impact of impulsivity on brain structures associated with academic achievement

The influence of impulsivity on the relationship between academic achievement (Grade Point Average, GPA) and brain structure remains underexplored. To address this question, we measured GPA, impulsivity, and T1-weighted anatomical images in 153 college students. We investigated which brain areas are related to college students' GPA, whether the identified regions are also associated with impulsivity, and whether impulsivity plays a mediating role in the relationship between the identified regions and GPA. The analyses revealed that the gray matter volume (GMV) of the right caudate nucleus (CN) was negatively associated with GPA and positively correlated with impulsivity. Impulsivity negatively mediated the relationship between the GMV of the right CN and GPA. Our results indicate that the caudate nucleus plays a crucial role in students' academic performance and associated impulsivity. Various interventions targeting impulsivity could improve educational outcomes by addressing the underlying neurobiological factors.

Probing articulatory representation learning for phonological distinctions

While a growing body of work has aimed to extract spatio-temporal units directly from speech articulatory data, there have been few attempts to probe whether such representations capture phonological contrasts employed in language and to model the mapping between motor plans and phonological representations. This study applies a joint factor analysis and neural convolutive matrix factorization framework to a multi-speaker real-time MRI dataset of vocal tract contours. The framework generates both gestures, the spatio-temporal units that form a given utterance, and gestural scores, which detail the activation of individual gestures in time. Probing of the gestural scores shows some ability to capture phonological distinctions, suggesting that such information is encoded by the model. The gestures, however, show poor discriminability along crucial phonological dimensions, likely limited by cross-speaker spatial variability. The results highlight the difficulties in cross-speaker articulatory modeling, but also show promise in using deep learning to model articulatory representations.

Label Entrenchment Heuristic in Political Communities

Hemmatian and Sloman (2018) showed that the degree to which a label is perceived to be entrenched in society impacts the judged quality of the categorical explanation that invokes it, irrespective of how informative the explanation actually is ("label entrenchment heuristic"). Across four experiments, we show that US partisans rate the informativeness of a circular categorical explanation as higher when the label it invokes is perceived to be entrenched in a particular political community than when the label is not entrenched in any community, irrespective of whether the label belongs to the political in-group or the political out-group. Furthermore, we demonstrate that one's relationship to the community that entrenches the label mediates the effect: Democrats find the categorical explanations to be more persuasive when the label they invoke is entrenched in the Democratic political community than when it is entrenched in the Republican political community, and vice versa for Republicans.

Mathematics as visual skill: Evidence from eye movements during algebraic reasoning

Algebra is powerful but difficult. It requires reasoning about abstract relations among symbolic variables. How do we do it? On one account, algebraic expertise is a kind of visual expertise: Experts learn to deploy their attention in ways that reflect the equation's hierarchical structure. Here, we tested this account by tracking participants' eye movements while they viewed algebraic expressions. On Algebra trials, participants judged the algebraic equivalence of two expressions. On Search trials, participants viewed the same expressions but had to verify the location of letters, a non-algebraic task. Despite viewing identical visual displays on both tasks, participants shifted their gaze in systematically different ways. When interacting with the expressions algebraically, participants' eye movements reflected the expression's algebraic structure. Despite algebra's abstractness, its practice may depend on the embodied skill of shifting one's gaze in strategic ways. This perspective can inform mathematics education and theories of abstract reasoning.

Children's beliefs about parents drive their learning about the world

How do young children conceptualize parents' roles? And do these beliefs inform how children learn about the world? Across one study and a preregistered replication (n = 136), we examined whether 5- to 8-year-old children expect parents to protect children from harm and leverage this expectation to learn about unknown objects. Participants watched two vignettes of a child finding a novel object. Either the child's parent or friend ran and took the object away. Participants were more likely to say that the object was bad (vs. good) when the parent (vs. friend) took it away (E1: b = -1.29, p < 0.001; E2: b = -1.44, p < 0.001). These results suggest that children are not passive recipients of care: Rather, children hold rich theories about the care they receive from their parents (i.e., protection) that allow them to build sophisticated representations about their parents and, in turn, the world.

Mutual Exclusivity in Noun and Verb Learning in Adults

Mutual exclusivity (ME), the tendency to map novel words to unfamiliar referents, can support children's word learning (e.g., Markman & Wachtel, 1988). A recent study (in prep) demonstrated that 4-year-old English-speaking children show a weaker ME effect for novel verbs than nouns, consistent with evidence that verbs are harder to learn (e.g., Gentner, 1982). Here, we replicated this study in adults. Adults viewed videos (verb trials) or static images (noun trials), one familiar and one unfamiliar, and selected the best match for a novel verb or noun. Adults applied ME for both verbs and nouns, but significantly less for verbs (z = 4.073, p = 0.0003). Adults also had longer reaction times (β = 471.25, p < 0.0001) and lower confidence ratings (β = -7.035, p < 0.0001) for verbs than nouns. Thus, less use of ME for verbs stems from something about event conceptualization rather than child development.

Maternal Input Quality and Its Impact on Late Talkers' Syntax and Lexical Development

Late talkers are children with fewer than 50 words and no two-word combinations by age 2. While some late talkers catch up with typically developing peers, others remain susceptible to developmental language disorders and language-related academic challenges throughout school years. Although maternal input plays a crucial role in language development, its impact on late talkers remains underexplored. This study examines how maternal input quality affects late talkers' lexical diversity and productive syntax, utilizing conversational data from Ellis Weismer's (2007) corpus in CHILDES (MacWhinney, 2000). We analyzed 76 mother-child samples from 38 late talkers (ages 2;6–3;6) in CLAN, assessing lexical diversity with lexical D, productive syntax with IPSyn, and maternal input quality with MLU, lexical D, and IPSyn. Linear regression models indicate that maternal input quality contributes to late talkers' syntactic development but negatively affects their lexical diversity. These findings underscore the complex nature of maternal input in late talkers' language development.

Who am I playing with? Exploring a New Model of Social Categorisation in Mentalisation

When interpreting others' actions, humans often rely on beliefs about personality traits or "personality types" to predict mental states and behavior (mentalization). However, individual differences in these intuitive inferences are often overlooked and unmodeled. To address this, we developed a controlled paradigm using Minecraft to investigate how participants' beliefs about player types and personality dimensions influence their judgments of others' player types. A multinomial regression revealed that the interaction between participants' ratings of targets on two personality dimensions significantly predicted player type classifications. We tested three competing Finite Mixture Models, each incorporating participants' elicited beliefs as probability distributions. The models were evaluated using standard metrics, including Leave-One-Out Cross-Validation, BIC, Cross-Entropy, Adjusted Rand Index, RMSE, and correlation, based on their fit and predictive accuracy on unseen data. This novel approach provides a structured way to assess the extent to which participants' reported beliefs explain variability in mentalization performance.

Text as a source of perceptual signal

Proponents of the Symbol Grounding Problem have claimed that unimodal text-based AI systems can never develop meaningful representations of the world since they lack the capacity to perceive it. Perception is a relation between an agent and their environment which is grounded in perceptual processing. The earliest stages of perceptual processing involve receptivity to sources of perceptual signal in the environment: light waves, pressure waves, and volatile airborne chemicals are all sources of perceptual signal, insofar as agents appropriately receptive to their properties can (with further processing) perceive the world through them. I argue that (1) human-generated text carries sufficient information about the world to be a possible source of perceptual signal for appropriately receptive agents, and that (2) recent generations of Large Language Models (LLMs) are such agents. Although (1) and (2) do not entail that LLMs are perceivers, they do entail that symbol grounding is achievable without multimodality.

Do People Value Plants Over Non-Living Entities? Moral Considerations in Adults and Young Children

Is it more wrong to harm a plant than a rock? Little is known about the development of our moral consideration for plants—alive but not typically seen as having human-like minds. This study examined whether adults (N=153) and young children (pilot N=17) tend to value plants over non-living things. Participants watched a video of a plant restorer caring for a plant but knocking down a bucket and a plant harmer caring for a bucket but knocking down a plant. The proportion of adults who disliked or distrusted the plant harmer, and identified them as the bad guy—compared to the plant restorer—was significantly greater than chance (ps<.001). Additionally, participants judged harming the plant more severely than harming the bucket (p<.001). Although children judged harming the plant and bucket as similarly wrong, 65% of them liked the plant restorer more. After completing data collection, we will examine developmental differences.

Learning how to learn: Evidence for an explore-exploit tradeoff in information search

To learn about the world, people must gather information from external sources (e.g., books, the internet, other people). People prefer sources that have been reliable in the past—but how do we seek information when uncertain about a source's quality? We test the hypothesis that these decisions are governed by an explore-exploit tradeoff. Prior work has researched this in reward search. Here, we adapt the tradeoff to information search. We propose that people weigh exploring an information source that is more uncertain with exploiting an information source that is more informationally promising. In one experiment (N = 171), we found evidence for both random and directed exploration in people's choices between information sources. However, people explored uncertain information sources even without the possibility of future exploitation (differing from typical reward search). People may be broadly interested in learning about sources' quality, and they explore to gain this information.

Bayes-Adaptive Information Gathering

Problem-solving often requires both acting on and gathering information from the environment. Prior work in psychology proposes that people are intrinsically motivated to seek information (Ryan and Deci, 2000; Ruggeri et al., 2021). However, in realistic settings most available information has no utility, so optimal performance requires estimating its value. We introduce a model that applies the Bayesian framework of Lidayan et al. (2024), which formalizes the value of information as the resulting increase in expected rewards. Our model calculates the utility of information-gathering actions and treats humans as noisy utility-maximizers. We design a novel task in which information sources vary in their likelihood of enabling downstream rewards. Preliminary results indicate that our model predicts human behavior more accurately than intrinsic motivation models, suggesting that humans learn to estimate the value of information from experience and use it to make better decisions.
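As a toy illustration of valuing information as the expected gain in downstream reward (a sketch with invented numbers, not the Lidayan et al. (2024) model or the authors' task), the snippet below computes the myopic value of consulting one noisy binary information source before acting.

```python
# Sketch: myopic value of information = expected best reward after observing a
# noisy binary signal, minus the best expected reward from acting immediately.
import numpy as np

def value_of_information(prior, rewards, accuracy):
    """prior: P(state = 1); rewards[a][s]: reward of action a in state s;
    accuracy: P(the source reports the true state)."""
    rewards = np.asarray(rewards, dtype=float)
    p = np.array([1.0 - prior, prior])
    act_now = max(rewards @ p)                 # best expected reward without observing
    ev_after_observing = 0.0
    for signal in (0, 1):
        likelihood = np.where(np.arange(2) == signal, accuracy, 1 - accuracy)
        p_signal = likelihood @ p
        posterior = likelihood * p / p_signal
        ev_after_observing += p_signal * max(rewards @ posterior)
    return ev_after_observing - act_now

# With a 90%-accurate source and state-matching rewards, observing is worth 0.4.
print(value_of_information(prior=0.5, rewards=[[1, 0], [0, 1]], accuracy=0.9))
```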

Investigating representations of time in light verb constructions during verb comprehension

Light verb constructions (LVCs) are grammatical structures in which a verb with little semantic content of its own combines with another element (a noun phrase) that supplies the semantic value (e.g., take a photo). Previous cross-linguistic research shows that LVCs are distinct from other verb types (Butt, 2010), and that distinct representations are activated during verb comprehension (Richardson et al., 2003), but it remains unclear what representations, particularly of time, are activated during comprehension of LVCs. The present study manipulates the specific verb comprising the LVC (e.g., 'turn attention to' versus 'dedicate attention to') to probe whether one's perception of the duration of that action is affected. We test different LVCs cross-linguistically, and in both bilingual second-language speakers and monolingual speakers of English. Preliminary findings indicate that the specific verb used in the LVC influences one's perception of the duration of that action, with some verbs generating interpretations of shorter durations and others of much longer ones.

How Small Belief Shifts and Trusted Voices Alter Polarized Minds

The rise of right-wing populism and increasing political polarization have amplified extremist ideologies, eroding democratic norms and fostering social conflict. Traditional interventions targeting rigid beliefs often backfire, as direct challenges provoke defensiveness and strengthen existing convictions. This study proposes an alternative approach: leveraging parasocial relationships (PSRs) with beloved public figures, such as Donald Trump, to induce cognitive dissonance by presenting contradictory information targeting peripheral beliefs. Drawing on the BENDING model, which conceptualizes belief systems as interconnected networks, the research explores how changes in peripheral beliefs can disrupt ideological rigidity. The study will use experimental methods to measure belief networks and PSRs, exposing participants to Trump statements that contradict peripheral beliefs. Analyses, including ANOVA and network modeling, will assess belief network disruptions and shifts in PSRs. This research aims to deepen understanding of belief dynamics and inform strategies to reduce extremism, mitigate polarization, and strengthen democratic resilience through subtle, non-confrontational interventions.

Event Segmentation and Memories of Daily Life After a Traumatic Brain Injury (TBI)

Event segmentation agreement refers to the degree to which individuals divide the continuous flow of information into meaningful units of experience in the same way as others. In TBI patients, low levels of agreement have been reported and associated with poorer memory performance. However, this effect has been observed with laboratory tasks, and its impact on everyday memories remains unexplored. This study therefore investigated whether the ability to segment a short video predicted personal event memory in fifteen TBI patients and their matched controls. Memory was assessed using recall of daily events collected through the experience sampling method. With this ecological task, TBI patients exhibited deficits in both the richness and accuracy of their personal memories, the latter being significantly predicted by their level of segmentation agreement. These preliminary findings highlight the role of event segmentation in everyday memory functioning, which could offer new avenues for memory rehabilitation programs.

Zipfian distributions facilitate word segmentation in infants

The word-frequency distributions infants hear during language learning are highly skewed (Zipfian). Previous work suggests that such skewed distributions facilitate speech segmentation, a crucial milestone in language acquisition. However, the experimental studies supporting this have only examined individuals aged 10 years or older, and it is not clear whether this effect arises from accumulated linguistic experience or is already present in the early stages of learning. To address this, we ran a word-segmentation study with 8-month-old infants (N=60) by exposing them to a continuous speech stream with a skewed or uniform frequency distribution of artificial words. At test, infants in the skewed condition exhibited a larger looking time difference between heard and unheard words than infants in the uniform condition. These findings suggest that Zipfian distributions can facilitate word segmentation during early linguistic development, and highlight the importance of the distributional characteristics of linguistic input in natural language learning.
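
To illustrate the kind of stimulus manipulation contrasted here (not the authors' actual materials), the sketch below builds a continuous stream of artificial words whose token frequencies are either Zipfian or uniform; the word forms, stream length, and Zipf exponent are hypothetical choices.

    import random

    def make_stream(words, n_tokens=300, skewed=True, seed=0):
        # Token frequencies follow 1/rank (Zipfian) or are uniform; immediate
        # repetitions are avoided so word boundaries are cued only by statistics.
        rng = random.Random(seed)
        weights = [1.0 / r for r in range(1, len(words) + 1)] if skewed else [1.0] * len(words)
        stream = [rng.choices(words, weights=weights, k=1)[0]]
        while len(stream) < n_tokens:
            w = rng.choices(words, weights=weights, k=1)[0]
            if w != stream[-1]:
                stream.append(w)
        return "".join(stream)

    words = ["bidaku", "padoti", "golabu", "tupiro"]   # hypothetical artificial words
    print(make_stream(words, skewed=True)[:48])
    print(make_stream(words, skewed=False)[:48])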

A comparison of transposed-character effect between Chinese and Japanese speakers: Are they equally strict in processing position information of Hanzi/Kanji characters?

The transposed-letter (TL) effect refers to the finding that TL pseudowords (e.g., mnokey) are more likely to be recognized as their base words (monkey) than substituted-letter nonwords (markey). Compared to TL effect studies, transposed-character (TC) effect studies on logograms, such as Chinese Hanzi and Japanese Kanji, remain limited. This study compared the TC effect between Chinese and Japanese speakers using a lexical decision task. Twenty Chinese and 21 Japanese speakers were presented with the same two-character stimuli, including TC pseudowords (界世-世界), control nonwords with a substituted character (界座), and real filler words. Chinese participants exhibited a TC effect: they took longer to reject TC pseudowords and misidentified them more frequently than control nonwords. Conversely, Japanese participants showed no TC effect: their reaction times and error rates did not differ significantly across the two conditions. These findings suggest that Chinese speakers process Hanzi/Kanji character position more flexibly than do Japanese speakers.

Perceived legitimacy of authority influences rule endorsement and intent to comply

Although rules are codified in explicit language, the goals and normative force underlying them remain ambiguous. We argue that people interpret rules in the broader social context in which the rule is set. We hypothesize that an authority's perceived legitimacy—impartiality, competence, and benevolence—affects rule endorsement and compliance by influencing interpretation of the rule's intent. In an online vignette study with 50 realistic rules ranging from very positive to very negative, we found that participants were more likely to endorse and comply with the same rule if it was set by a legitimate rather than an illegitimate authority, independent of the rule's a priori valence (obtained from independent participants in a norming study). In ongoing work, we are testing the robustness of this effect, probing whether legitimacy is represented distinctly from other positive leadership dimensions, and investigating the cognitive mechanisms by which legitimacy shapes norm internalization and voluntary compliance.

Man or Machine: Evaluations of Human and Machine-Generated Movie Reviews

Recent advances in generative language models, such as ChatGPT, have demonstrated an uncanny ability to produce texts that appear comparable to those produced by humans. Nevertheless, machine-generated texts differ from human-produced ones in important respects, such as routinely including references to nonexistent sources. In this paper, we use both psycholinguistic measurements and participant responses to compare texts generated by machines with equivalent texts generated by humans. Our analysis demonstrates some of the ways in which machine-generated texts differ from human-generated ones in both style (e.g., increased use of positive connectives) and content (e.g., increased confidence). We also note multiple ways in which texts generated by these models are similar to those generated by humans (e.g., their use of emotion words). We believe this research provides insights useful for understanding how language is generated by both humans and machines.

Investigating the parallels of Negation and Conflict using a Mouse Tracking Paradigm

Negation is a linguistic universal, and its processing is often assumed to require extra cognitive steps: representing an idea and then suppressing it (Kaup et al., 2006). Recently, it has been shown that linguistic negation processing resembles basic conflict processing in both behavioral and electrophysiological data (Dudschig & Kaup, 2018, 2020), in line with standard conflict tasks (e.g., the Stroop task; Botvinick et al., 2001). The present study uses mouse tracking to allow analysis of fine-grained changes in responses during negation processing (e.g., trajectories, deviations from the ideal path, and partial errors). Participants responded to affirmative and negated phrases. The key dependent measures were influenced by the polarity (affirmative vs. negated) of the phrases on current trials (indicating activation of the to-be-negated information) and by the polarity of preceding trials (indicating that negation processing is context dependent). Theoretical implications in light of negation and conflict processing accounts will be discussed.

The Role of Compositionality in Children's Creating Representations of Large Exact Numbers: A Case Study of the Number Five

Compositional capacity (i.e., chunking four objects as two sets of two) can extend 12- to 14-month-old infants' working memory capacity from three to four objects [1,2,3]. Here we ask whether numerical composition supports 3- to 4.5-year-olds' creation of representations of large exact numbers such as five. In a non-verbal object-tracking task (see Fig. 1a), subset-knowers and most young CP-knowers failed to track exactly five objects. In two experimental manipulations, we provided children with spatiotemporal, linguistic, and/or color chunking cues. If tracking sets of five as a composition of two and three is within children's compositional capacities, they should perform better than children in the baseline condition. We found no evidence that 3- to 4.5-year-olds can represent exactly five by composing representations of two and three (Exp. 1: F(4, 133) = 5.69, β = -0.03, p = .509; Exp. 2: F(4, 143) = 4.94, β = -0.03, p = .587).

How does children's trust evolve in a repeated trust game?

The ability to detect a partner's trustworthiness and adjust one's own trust decisions accordingly is crucial and adaptive for maintaining cooperative relationships. Trusting a trustworthy partner maximizes mutual benefits; withholding trust from an untrustworthy partner minimizes the chances of being exploited. We sought to understand how young children learn through experience and adjust their trust behaviors when interacting with trustworthy and untrustworthy individuals. In this study, N = 96 children aged 6 to 11 years played 40 trials of a repeated Trust Game with a trustworthy and an untrustworthy partner (20 trials each, order randomized). Results showed that although children across all age groups correctly identified the trustworthiness of their partners post-game, surprisingly, they did not trust the trustworthy partner above chance level, nor did they show increasing differentiation between the two partners across trials. These findings suggest a knowledge-behavior gap in children's trust interactions with novel partners.

Once Upon a Goodbye: Exploring How Animated Films Spark Child-Caregiver Conversations About Death

Animated films can provide a context for caregivers and children to discuss death, potentially furthering children's understanding of this concept (Bridgewater et al., 2021). However, white caregivers in the United States tend to shield their children from this topic (Rosengren et al., 2014). The purpose of the study was to delve into the dynamics of child-caregiver discussions about death. We recruited 29 children (ages 4-6) and their caregivers and observed their discussion about a death depicted in an animated film. We found that most families (92.6%) discussed death, and often mentioned affective topics (e.g., sadness; 81.5%), but some mentioned biological (33.3%) or spiritual topics (11.1%). Children's age or caregivers' reports of shielding were not linked to the content or frequency of these discussions. This study highlights how media can serve as a context for the development of spiritual and biological concepts.

Three Levels for Large Language Model Cognition

Marr's three-level hypothesis is widely applied to information-processing systems, including large language models (LLMs). Despite its usefulness, applying it to LLMs shows it to be a leaky abstraction: demarcating between levels tends to be a choice that must be argued for. This paper explores the three levels separately and offers paradigm examples of explanations at each level. It closes with a pragmatist proposal for studying LLM cognition, inspired by the philosophy of cognitive neuroscience.

Children's division of cognitive labor: Evidence from Kenya and China

No matter how brilliant, one person cannot achieve major technological innovations alone. Human progress relies on our ability to think together, building beyond an existing foundation of cumulative cultural knowledge (Henrich & Muthukrishna, 2024). From five years old, children show cooperative capacities fundamental to this collective success (Warneken et al., 2014; Fletcher et al., 2012). Yet little is known about children's capacity to pool mental resources with cooperative partners – whether they can think together as interconnected nodes to surpass individual computational limits (Velez et al., 2022). Prior developmental research also does not fully address cross-cultural diversity in children's cooperative strategies (Rogoff, 2014). Here, we investigate how pairs of children (N = 96 dyads) cooperate on a memory task across two cultural contexts – Nanyuki, Kenya and Beijing, China. We find that children flexibly employ different strategies based on the level of cognitive demand, pointing to an early capacity for strategic cognitive collaboration.

Impact of Social Feedback Stimulus on Information-Integration Category and Rule-based Category Learning

Social stimuli often trigger a stress response, influencing learning performance depending on the nature of the task. Previous research shows that social pressure before learning impairs performance in rule-based (RB) learning tasks, which rely on hypothesis-testing mechanisms. Conversely, it may enhance performance in information-integration (II) learning tasks, which depend on procedural memory systems. This study examined the effects of social feedback stimuli (e.g., smiling or angry faces), compared to non-social feedback stimuli (e.g., green circles or red crosses), on performance in RB and II tasks. Participants were assigned to either an RB or an II task and received either social or non-social feedback during the task. Results showed that social feedback significantly improved performance in the II task compared to non-social feedback, but did not enhance RB task performance. These results shed light on the distinct cognitive mechanisms underlying category learning and emphasize the importance of feedback design in educational and training environments.

Cultural Variability in Baby-Schema Perception: Insights on Face Perception from Two Small-Scale Malaysian Communities and an Urban German Community

Baby-schema is a social perception phenomenon describing how infantile facial features, and their abstract representation in animals and objects, automatically engage human caregiving responses. While often considered universal, its cross-cultural variability remains untested outside large-scale industrialized communities. Do cultural background and visual experience influence perceptual and behavioral biases toward baby-schema? We examined responses to manipulated infant faces in two Indigenous Malaysian communities (Batek, Temiar) and an urban German community. In a pre-registered eye-tracking task, participants viewed human and cat face pairs differing only in baby-schema level (high vs. low) and completed a pre-registered binary forced-choice task. Germans selected high baby-schema faces as more likable, cuter, younger, and healthier, while the small-scale communities showed attenuated or absent effects and were less sensitive to baby-schema manipulations in a control discrimination task. Eye tracking showed cultural differences in gaze patterns that were not meaningfully influenced by species or baby-schema level. A follow-up study will investigate effects of group membership.

Causal Stacks: A Theoretical Framework for Recurrent and Hierarchical Counterfactual Reasoning

Counterfactual (CF) reasoning – the process of considering alternative events and their outcomes – plays a vital role in understanding causation in fields like cognitive psychology and philosophy of science. In this paper, I develop a theoretical framework of Structural Causal Stacks (SCS) that provides a conceptual structure for describing the relationships between related causal and counterfactual analyses. I then explore its usability for observing human reasoning by running 500 pilot simulations of causal stack agents. My simulation modelled the experimental design of Gerstenberg et al. (2013), which measured whether people's judgements about the consequences of a counterfactual state change depended on the order in which they considered the events. According to my preliminary results, the stack model replicated the asymmetry between backward and forward counterfactual reasoning, aligning with the established consensus in the cognitive psychology literature while extending a consistent explanation to successive analyses.

Serendipity and Scientific Discovery: a new way to see Science and Knowledge?

At the heart of my paper is the concept of serendipity: it has recently been defined as a «discovery at the intersection of chance and wisdom», but it has been described in many other ways since Horace Walpole coined the term to indicate discoveries, made «by accidents and sagacity», of things the observers were not in quest of. Finding an unambiguous definition is therefore complex. However, defining the concept of serendipity is not the only way to shed light on its occurrence: by analyzing and understanding its theoretical-conceptual characteristics, I believe we can clarify how it acts within the process of scientific discovery. In this way, chance, theory-ladenness, scientific method, and situated cognition acquire a new light and significance. In particular, what we learn through serendipity is not only new facts about the world, but also other ways of seeing science and knowledge.

From Degraded Inputs to Robust Sensory Cognition: A Computational Perspective on Early Perceptual Development

Human sensory development unfolds in a consistent temporal sequence, with early visual inputs initially degraded. Rather than mere biological constraints, we propose these developmental "limitations" may act as inductive biases that foster more global and robust sensory cognition. Evidence derives from children born blind who later gained sight, effectively bypassing this early degraded period. Despite many otherwise intact visual abilities, they exhibit specific deficits in generalization and extended spatial integration. Simulations with deep neural networks confirm that these deficits can arise from a lack of early degraded inputs. Conversely, training with developmentally-inspired input trajectories yields more robust representations and superior generalization. These findings help illuminate the development of typical and atypical sensory cognition, inform clinical interventions, and inspire more robust computational training procedures. Comparable results from auditory development suggest a broader phenomenon, demonstrating how what may appear to be "limitations" can adaptively shape perception and cognition over time.

Influences of catastrophic events on ethical judgments: A case from the Great East Japan Earthquake

We conducted a survey of over 2,000 people before and after the Great East Japan Earthquake and obtained valid responses from 840 participants who completed both surveys. Participants responded to two types of trolley problems and answered a series of sociodemographic questions. As in previous studies, more than 80% of participants responded that they would flip the switch (the utilitarian judgment), while fewer than 45% chose the utilitarian option in the modified version (i.e., pushing the person down to stop the trolley). We found that people were slightly less likely to flip the switch after the earthquake than before. In addition, people who believed the world was trustworthy and who had a greater sense of self-control were more likely to flip the switch. The results are discussed in terms of how external environmental factors may flexibly and adaptively affect moral judgments.

Does Hand Constraint Affect Visual Roughness Perception?

When hand movement is restricted, memory for words referring to hand-manipulable objects (e.g., cup or pencil) declines (Dutriaux & Gyselinck, 2016), activity in the intraparietal sulcus decreases, and reaction times to judge the size of the objects represented by those words increase (Onishi, Tobita, & Makioka, 2022). These findings suggest that body immobility influences higher-order cognitive processes. However, whether hand immobility also affects the perception of lower-level features, such as texture, remains unclear. The aim of our study was to investigate whether changes in somatosensation caused by hand constraint affect visual judgments of roughness. We found that hand constraint did not significantly influence either the accuracy or the reaction times of visual roughness judgments. This suggests that somatosensory information is not recruited automatically during visual roughness judgments. However, we cannot exclude the possibility that participants made judgments based solely on visual information, potentially because the task was too easy.

Systemic Barriers to Indigenous Higher Education: A Comparative Analysis of the United States and Australia

Access to higher education (HE) is heralded as a pathway to social mobility and equity but remains elusive for Indigenous populations in high-income countries like the United States and Australia. Systemic racial inequities, deeply rooted in colonial histories, perpetuate barriers to HE access and attainment for Native American and Aboriginal and Torres Strait Islander communities. This essay employs a comparative analysis based on statistics and Indigenous policy frameworks, using Critical Race Theory (CRT) and marketisation as analytical lenses to interrogate these challenges. It examines how "Whiteness" shapes educational discourses and institutional practices, reinforcing exclusion and inequality. Key disparities are analyzed, including lower enrollment, geographic isolation, socio-economic disadvantage, and financial barriers. Contrasting outcomes—declining Native American enrollment in the U.S. versus rising Indigenous completion rates in Australia—underscore the importance of community-led, equity-focused policies. The essay advocates for transformative reforms that prioritize Indigenous voices, dismantle systemic barriers, and address colonial and neoliberal legacies.

Validating Predictive Models of Extreme Expertise in Complex Cognitive-Motor Skills

Modeling masterful performance is an essential component of skill acquisition research. Several models of high-level or extreme expertise exist for a variety of tasks, from surgical performance to the video game Tetris, yet assessing whether these computational models accurately represent real-world expertise remains challenging. Empirical validation is uniquely difficult because of limitations intrinsic to models of expertise, namely the inherently small sample sizes of experts, which increase the cost of partitioning data into test and training sets, and the detailed domain knowledge often required to interpret model results. This paper presents multiple novel approaches to validating models of extreme expertise, using strategies such as generative pseudo-interventions and retroactive longitudinal case studies. The results of these validation methods align closely with interpretations given by domain experts, demonstrating promise both for the validation methods themselves and for our proposed models of predicting real-world performance outcomes in complex skills.

Young children can use counterfactual simulation to reason about task performance

Young children often need to decide which tasks to pursue and how long to persist. What guides these decisions? In this work, we investigate whether children can use counterfactual simulations to evaluate their performance. Preschool-aged children played a game in which they could launch a ball into a goal; on their final attempt, the ball headed either straight for the goal (Almost condition) or veered for a miss (Miss condition) before the game "froze" such that the final outcome was not observed. More children wanted to keep playing the same game in the Almost condition (N = 12/22) than in the Miss condition (N = 3/22, BF10 = 11.36), suggesting that counterfactual simulations may support evaluation of past outcomes and inferences about expected future performance. Ongoing work examines whether children can go beyond observed outcomes (make vs. miss) to use counterfactual simulations in order to reason about their task performance.

Reverse Law of Effect in Sequential Parlay Gambling

This study reports a counterintuitive reversal of Thorndike's Law of Effect in human sequential decision-making. Participants in a parlay gambling task chose between banking their current wager or betting it on a risky gamble, where winning would increase the next wager while losing or banking would reset it. We observed improving payoffs across runs, indicating learning. However, contrary to the Law of Effect, participants were more likely to choose betting after losses rather than wins. We further developed computational models incorporating prospect theory and reinforcement learning. Consistent with our model-free analyses, models incorporating reverse updating of subjective probabilities (a negative learning rate) not only significantly outperformed traditional learning models in fitting the human data, but also earned higher payoffs than models with a positive learning rate. These findings highlight the complexity and adaptability of human learning, despite not fitting within the framework of the Law of Effect.
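
A minimal sketch of what a negative learning rate does in a delta-rule learner, offered purely as an illustration of the mechanism named above; the update rule, parameter values, and outcome sequence are our assumptions, not the authors' fitted model.

    def update_belief(p_win, outcome, alpha):
        # Delta-rule update of the subjective win probability. With alpha > 0 the
        # belief moves toward the observed outcome (standard learning); with
        # alpha < 0 it moves away from it, so a loss makes the next gamble look
        # subjectively more promising.
        p_new = p_win + alpha * (outcome - p_win)
        return min(max(p_new, 0.0), 1.0)

    p_std, p_rev = 0.5, 0.5
    for outcome in (0, 0, 1):                  # two losses then a win (illustrative)
        p_std = update_belief(p_std, outcome, alpha=0.3)
        p_rev = update_belief(p_rev, outcome, alpha=-0.3)
        print(outcome, round(p_std, 3), round(p_rev, 3))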

Filtering out Empathy: The Impact of Other-Alteration through AI-Filters on Prosocial Behavior

Facial expressions are essential to human interaction, providing emotional cues that influence how we perceive and respond to others. Advances in artificial intelligence (AI) now enable real-time manipulation of these expressions, raising ethical concerns about AI-mediated communication (AI-MC). This project investigates the effects of AI-based facial expression alterations on prosocial behavior and empathy using a choice-exposure counterfactual paradigm. It addresses two key questions: (1) Do facial AI filters influence empathy and prosocial behavior? (2) Do individuals strategically use AI filters to justify self-serving actions? Participants chose between authentic and AI-filtered videos and then made donation decisions after viewing them. In an initial study, neutralizing AI filters showed no significant effects on empathy or donations. Building on these findings, we will explore stronger filters and more emotionally expressive portrayals to better understand their impact. This research aims to advance knowledge of the psychological and ethical implications of AI-MC.

Motivational Effects of Instructor Images in Educational Materials on Early Reading Stages

This study examined the motivational effects of incorporating instructor images into educational materials in the early stages of reading. Participants viewed a page of disaster prevention materials for two seconds and rated it on a five-point scale for motivation to read ("Did the page motivate you to read?") and understandability ("Did the page look easy to understand?"). The materials were of three types: those without any instructor photographs or illustrations, those with real instructor photographs, and those with instructor illustrations generated from the photographs. Results showed that mean scores were highest for the condition with real photographs, followed by the condition with illustrations, and lowest for the condition without images. These findings suggest that instructor images enhance motivation to read, with real photographs having the stronger effect.

The learnability of sign language morphology: an experimental study

Linguistic features are adapted to their sociolinguistic ecologies (Lupyan & Dale, 2010). In this way, we posit that the morphological features of sign languages have evolved to be learnable and iconic, as many sign languages have a large proportion of second language learners and delayed first language learners. To test the learnability of sign language morphology, we conceptually replicate Smith (2024), teaching participants (hearing non-signers) British Sign Language (BSL) descriptions (lexical item plus classifier construction depicting the referent and movement) of scenes where animals perform movements. We test how accurately and how quickly participants perform in two conditions: a BSL condition pairing scenes with the BSL descriptions and a counter-iconic condition randomly pairing BSL descriptions to scenes. Following our pre-registered analysis, our pilot study suggests that BSL condition participants perform faster and more accurately than those in the counter-iconic condition. Data collection for our online learning study is underway.

Elicitation Strategies for Capturing Information Visualization Affordances

Understanding how data visualizations shape reader takeaways is critical for designing effective displays, but measuring these affordances remains a challenge. While free-response studies provide a rich source of human interpretations, they are costly to analyze and often contain ambiguities. We investigate alternative elicitation methods, including ranking charts, ranking conclusions, and rating salience, to determine their effectiveness in capturing visualization affordances. Alternative approaches varied in their sensitivity to chart familiarity and specific affordance factors. Salience ratings aligned well with gold-standard affordances collected from free-responses but failed to capture chart-specific insights, while ranking methods overemphasized familiar chart types. Additionally, we compared human responses across all elicitation methods to outputs from GPT-4o to evaluate the extent to which large language models (LLMs) could replicate human-derived affordances. These findings underscore the importance of evaluating multiple elicitation methods and clarify the potential and limitations of LLMs as proxies for human interpretation.

A Computational Account of how Anxiety and Impulsivity Impact Uncertainty Computations

Anxiety and impulsivity can impact human decision-making by changing our underlying uncertainty computations. In an online study of 109 participants, we tested how anxiety and impulsivity impact perceptual ("what am I seeing?") and conceptual ("how does this fit into my knowledge of the world?") uncertainty when both types of uncertainty are present together. Participants were asked to classify and rate their confidence on 35 naturalistic hybrid animal images with varying degrees of perceptual and conceptual uncertainty (e.g., 50% deer–50% goat vs. 20% moose–80% penguin). Higher impulsivity scores correlated with more uncertainty in ratings under most uncertainty conditions. A hybrid Bayes-drift-diffusion model (DDM), which predicts people's task ratings and reaction times from the different types of uncertainty, showed how impulsivity and anxiety impact uncertainty computations in mixed perceptual-and-conceptual settings: conceptual beliefs are updated and then leveraged to estimate the reliability of the perceptual category response in the DDM.
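
As a rough illustration of how a drift-diffusion process can be made sensitive to both kinds of uncertainty (not the authors' fitted Bayes-DDM), the sketch below simulates single trials in which the drift rate is attenuated as perceptual and conceptual uncertainty grow; the combination rule and all parameter values are our assumptions.

    import numpy as np

    def simulate_ddm_trial(perceptual_u, conceptual_u, v_max=1.5, bound=1.0,
                           dt=0.001, noise=1.0, seed=0):
        # Euler-Maruyama simulation of one drift-diffusion trial. Drift shrinks as
        # perceptual and conceptual uncertainty (both in [0, 1]) increase, so more
        # ambiguous hybrids produce slower, less consistent responses.
        rng = np.random.default_rng(seed)
        drift = v_max * (1 - perceptual_u) * (1 - conceptual_u)   # illustrative rule
        x, t = 0.0, 0.0
        while abs(x) < bound:
            x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
            t += dt
        return ("upper" if x > 0 else "lower", round(t, 3))

    print(simulate_ddm_trial(perceptual_u=0.1, conceptual_u=0.2, seed=1))
    print(simulate_ddm_trial(perceptual_u=0.5, conceptual_u=0.8, seed=1))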

One Thing at a Time: Investigating the Impact of Increased Cognitive Demand on Semantic Prediction in Older Adults with and without Hearing Impairment

Prediction is essential in language processing and is often studied as an isolated task. However, in real life, listening often occurs alongside other cognitively demanding activities. We investigated how cognitive demand affects prediction in older adults and whether hearing impairment impacts this process. Eighty-four older adults with hearing impairment (HI) or normal hearing (NH), matched on working memory scores, participated in a visual world paradigm study with semantically predictable and unpredictable sentences. The experiment featured two conditions: listening without a memory task, and listening while concurrently retaining a three-item visuo-spatial memory load to simulate competing cognitive demands. Divergence point analyses showed that memory load delayed prediction timing in NH participants, and an interaction indicated that this pattern differed with hearing loss. As prediction timing was affected both by load and by hearing loss, our work suggests that prediction requires executive resources.

Same memory, different words: The Role of Language, Proficiency and Cognitive Load in Traumatic Autobiographical Memory Recall

Literature indicates that lingua franca use in clinical settings impacts bilinguals communicating in their second language (L2)—but what drives these changes? This study examined how language, proficiency, and cognitive load influence emotionality during traumatic autobiographical memory (AM) retrieval. Building on evidence that L2 use diminishes emotional resonance, we hypothesized that proficiency would moderate this effect. Additionally, we expected lower emotionality in first-language (L1) recall under cognitively demanding conditions, mirroring second-language effects. Greek native speakers (N=107) recalled a traumatic AM in L1, in L2 (English), or in L1 under cognitive load (a visuospatial working memory task) while undergoing physiological EDA assessment. Self-reported arousal, distress, and traumatic symptoms were measured. Notably, L2 use alone did not lower emotional intensity—only lower L2 proficiency did, highlighting its key moderating role. Cognitive load during L1 retrieval significantly decreased emotionality, supporting resource-dependent models of emotional processing. These findings reveal the interplay of proficiency and cognitive demands in shaping bilingual emotional processing.

Conceptual Analysis of Analogical Transfer in Common Programming Languages

Analogical transfer is well studied, but much less is known about how students transfer within specific domains. Computer science presents an opportunity to study such transfer, as students often transition from block-based (e.g., Scratch) to text-based (e.g., Python) programming languages. As an early step in understanding programming transfer, we present a conceptual analysis that predicts when students may be aided by analogical supports when transferring from Scratch to Python. We are specifically guided by Structure Mapping theory, which states analogy is a process of aligning objects and relations based on their common structure (Gentner, 1983, 2010). Much research has found that surface similarity influences transfer; thus, we categorized various programming concepts (iteration, booleans, etc.) based on their perceptual similarity. Further, we make predictions about where and how progressive alignment (Kotovsky & Gentner, 1996) can be used to facilitate relational understanding. This analysis sets the stage for future experimental work.

Exploring 3- and 11-month-olds' understanding of social versus nonsocial goals

Research in the cognitive sciences has demonstrated infants' remarkable ability to mentalize, such as their capacity for goal attribution (Woodward, 1998). However, failed replications of seminal findings have called the strength of such capacities into question. One possibility is that infants are more likely to mentalize in socially relevant/evaluative contexts (Woo et al., 2023). In two pre-registered studies, the present work investigated this possibility by testing whether 3-month-olds (via VoE; Woodward, 1998) and 11-month-olds (via anticipatory looking; Cannon & Woodward, 2012) show stronger evidence of goal attribution for social versus nonsocial goals. Preliminary analyses revealed surprising results: 3-month-olds (N=52) showed evidence of attributing location goals only in the social condition. By contrast, 11-month-olds (N=36) showed the reverse: their anticipatory eye movements showed evidence of attributing location goals in the nonsocial condition. The poster will present data from the full target sample (N=64) and further interpret these novel findings.

Can you imagine the spiciness of a /hi/ sound?

Previous studies have found that character type, voicing, and vowel type affect the sensorimotor and affective information imagined from individual characters or pseudowords in Japanese. The purpose of this study was to examine whether character type, voicing, or vowel type influences the taste information imagined from individual characters. In Study 1, fifty-seven participants rated five taste qualities (sweet, salty, sour, bitter, and spicy) imagined from five vowels (a, i, u, e, or o) written in hiragana or katakana. In Study 2, 40 participants rated six taste qualities (sweet, salty, sour, umami, bitter, and spicy) imagined from pairs of three consonant types (voiced: b; semi-voiced: p; voiceless: h or f) and the five vowels written in hiragana. The results of these studies suggest that character type, voicing, and vowel type affect the taste information associated with individual Japanese characters.

Facilitating effect of finger movements on artificially generated front vowels

Previous studies have found that front vowels facilitate finger movements more than back vowels. Those experiments were conducted with human voices. The aim of this preregistered experiment was to test whether the same results could be obtained with artificially generated vowels. In the experiment, the task was to press the key on the keyboard that corresponded to the vowel sound heard. The results showed no difference in the percentage of correct responses, but response times were relatively short for the front vowels (especially /i/) compared with the back vowels (i.e., /u/ and /o/). When we conducted another experiment using long vowels to make the pronunciation clearer, we obtained similar results. These results suggest that artificially generated voices can facilitate finger movements in the same way as human voices.

Do children use action frequency to infer social closeness?

Children and adults alike use nonverbal cues to infer closeness, but how do they determine which actions signal close relationships? This study examines whether children use the frequency of an action to infer social closeness, and whether they prefer to befriend those with fewer close relationships. Participants watched a central character greet four people using one novel action (e.g., toe tapping) and one person with a different novel action (e.g., hip bumping). Children ages 6-8, but not 4-6, identified the infrequent action as an indication of "best friends", suggesting they associate rarer actions with closer relationships. Additionally, children preferred to befriend those with fewer close relationships. These findings suggest that older children use action frequency to gauge closeness and believe that closer relationships occur less frequently. We plan to conduct a follow-up study exploring whether the same effect is found with sound-based greetings.

Early Neurophysiological Signatures of Multi-digit Number Length Processing

The Hindu-Arabic numeral system associates number length and value. This study investigated number length encoding in multi-digit number processing while controlling for overall visual size. Using scribbled line patterns to equalize visual extent, participants (N=27) compared tie numbers to a fixed standard ("555"), creating congruent (e.g., "6666" vs. "555") and incongruent (e.g., "77" vs. "555") conditions where number/string length and numerical value matched or conflicted. Targets were compared based on numerical value or string length. Findings revealed three distinct processing stages: First, enhanced N1 negativity (120-150 ms) at parieto-occipital sites reflected early length encoding, with greater amplitudes for larger string-length distances. Second, decreased P2p amplitudes (150-190 ms) at parietal sites varied with both numerical and string-length distances, indicating refined magnitude processing. Finally, decreased P3 amplitudes (300-360 ms) at central sites for incongruent trials reflected cognitive conflict resolution. These findings provide novel evidence consistent with an early number length encoding mechanism.

How and when do generic language and structural explanations shape essentialist beliefs about social categories?

Generic language about social categories risks amplifying essentialist beliefs, but its influence can be reduced by framing category features as products of extrinsic structures. We investigated which elements of essentialism are fostered by generics and whether structural framing dampens essentialist assumptions even for categories negatively stereotyped in society. In Experiment 1, participants read generic, quantified, or specific statements about novel categories and then rated the categories on ten dimensions of essentialism. Compared to other statements, generics led participants to essentialize the referenced categories broadly, viewing them as natural kinds with high inductive potential. In Experiment 2, participants read generics about novel or real-world categories (e.g., Mexican immigrants), sometimes accompanied by a structural explanation. Such explanations reduced essentialist interpretations of the generics for novel categories, but slightly increased them for real-world categories. Generics appear to induce broad essentialist beliefs, and structural framing may be insufficient for mitigating their problematic social consequences.

Linguistic shifts and emotion regulation across languages: The case of Spanish "estar"

When using cognitive reappraisal to regulate their emotions, English speakers shift their language to be more psychologically distant (e.g., using "I" less often) and more abstract (e.g., using more words for categories). For Spanish speakers, successful emotion regulation has been linked to increased use of "estar," the temporary form of "to be." Does "estar" track psychological distance, abstractness, or some other perspective shift? We reanalyzed published data from Spanish-English bilinguals who transcribed their thoughts while responding naturally to negative images or reappraising them. We found that (a) using "estar" was negatively correlated with using both distant and abstract language, and (b) increased use of "estar" when reappraising predicted reappraisal success, even when controlling for shifts in distancing and abstractness. These results suggest that "estar" may be a unique linguistic marker of successful regulation. Perhaps by framing a situation as transient, this Spanish-specific regulation tool may make negative emotions more bearable.

Supporting Knowledge Transfer in Programming: Insights from K-12 Computer Science Teachers

Many states now provide access to computer science (CS) courses across multiple grade bands. Younger students are often introduced to block-based programming languages before transitioning to more conventional text-based languages (Kao et al., 2022). However, students often struggle to see connections between previously learned programming languages and novel ones (e.g., transitioning from block-based to text-based programming). To better understand how knowledge transfer occurs in CS classrooms, we interviewed teachers regarding the examples of knowledge transfer they observe in their classrooms, including whether and how their CS curriculum supports such knowledge transfer between programming languages. Qualitative analyses revealed emerging themes related to syntactical affordances and challenges of programming languages, whether and how teachers make direct or indirect comparisons between languages in their instruction, and the types of transfer teachers observe most frequently in their classrooms. Our findings will guide the development of curricular materials that support programming language knowledge transfer.

Listener Trade-Offs between Acoustics and Semantics in Noisy Speech

Understanding speech demands integration of various kinds of information over time. Previous research has shown that listeners can use both semantic and acoustic cues to understand spoken words (Bushong, 2024) and that listeners can dynamically reweight different acoustic cues relative to each other (Kapatsinski, 2024); but it is unclear whether listeners can readjust the relative contributions of semantic and acoustic cues depending on the current context (e.g., adverse listening conditions). The current study manipulates the acoustics of a target word, a semantic cue (sentential context), and the location of the semantic cue relative to the target word. Additionally, we manipulate the level of noise in the utterance and the location of that noise (full utterance vs. target word). Noise reduces listeners' ability to use acoustic cues for spoken word recognition: as local noise increases, semantic context is up-weighted, whereas as global noise increases, semantic context is down-weighted.
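
One simple way to picture the kind of reweighting described above, though not the authors' model: combine an acoustic cue and a semantic cue in log-odds space, with the acoustic weight shrinking as local noise on the target word increases. The weighting function, cue values, and combination rule below are all illustrative assumptions.

    import math

    def p_target_word(acoustic_evidence, semantic_evidence, local_noise):
        # Weighted log-odds cue combination; the acoustic weight shrinks with local
        # noise, which effectively up-weights the semantic context.
        w_acoustic = 1.0 / (1.0 + local_noise)
        w_semantic = 1.0
        logit = w_acoustic * acoustic_evidence + w_semantic * semantic_evidence
        return 1.0 / (1.0 + math.exp(-logit))

    print(round(p_target_word(acoustic_evidence=2.0, semantic_evidence=0.5, local_noise=0.0), 2))
    print(round(p_target_word(acoustic_evidence=2.0, semantic_evidence=0.5, local_noise=3.0), 2))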

Roundedness and symmetry in the perception of similarity to the circle or roundness

Research shows that shape perception is sensitive both to roundedness versus angularity (e.g., Bar & Neta, 2006) and to symmetry (e.g., Dehaene et al., 2006), and that these features also affect the perception of similarity (Tversky, 1977). Roundness is a measure of how closely the shape of an object approaches that of a circle. In an online quasi-experiment (n=74), we tested pairs drawn from 19 geometric figures (constructed according to symmetry properties and roundedness) to ask whether symmetry contributes more than the roundedness of the figure's corners to the perception of roundness. Participants performed a forced-choice task on figure pairs presented in random order. The results show that for regular polygons, roundedness determines the similarity assessment even when the symmetry is up to three orders higher. For pairs containing regular figures, choices were made faster than for pairs with asymmetrical and non-regular figures.
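
The notion of roundness mentioned above is often operationalized as the isoperimetric quotient, 4πA/P², which equals 1 for a circle and decreases for more angular shapes. The sketch below applies this standard measure to regular polygons as an illustration; it is not the authors' stimulus-construction code.

    import math

    def roundness(area, perimeter):
        # Isoperimetric quotient: 1 for a circle, smaller for more angular shapes.
        return 4 * math.pi * area / perimeter ** 2

    def regular_polygon_roundness(n_sides, circumradius=1.0):
        area = 0.5 * n_sides * circumradius ** 2 * math.sin(2 * math.pi / n_sides)
        perimeter = 2 * n_sides * circumradius * math.sin(math.pi / n_sides)
        return roundness(area, perimeter)

    for n in (3, 4, 6, 12):
        print(n, "sides:", round(regular_polygon_roundness(n), 3))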

The paradox of trait impressions in naturalistic contexts: rich information, sparse predictive cues

People spontaneously infer traits that shape critical decisions. Prior research has identified important impression cues across channels (face, body, clothing, environment), but has often studied them in isolation. These findings may not generalize to naturalistic contexts, where people rapidly form impressions from rich, multi-channel cues. We addressed this by quantifying comprehensive cues (Study 1) and manipulating individual cues (Study 2) in everyday images using computational tools. Across two large-scale, pre-registered studies (N = 3,004, U.S. representative), we found that despite abundant information, only a sparse set of cues predicted impressions. These cues carried either unique information beyond other cues, information shared with other cues, or both. Many cues previously theorized as important did not explain trait impressions directly but shaped how other cues influenced judgments. We confirmed for a subset of predictions that manipulating the predictive cues causally changed impressions. Our findings advance understanding of how trait impressions form in complex, naturalistic environments.

Reconceptualizing Knowledge: Evaluating the Ontological Complexity and Emergence of Knowledge in Cognitive Systems

This study examines knowledge as an emergent, adaptive system shaped by the interaction between internal cognitive processes and external factors. Contrary to traditional views of knowledge as a static repository, this framework emphasizes its evolution through self-organizing processes and conceptual reorganization. The research addresses the limitations of computational models in capturing the complexity of cognition, questioning whether they can adequately represent the depth and emergent qualities of cognitive processes. Additionally, it explores how computational approaches might be reconciled with the embodied, experiential nature of knowledge. The study adopts an interdisciplinary approach, combining systems theory and cognitive modeling, and focuses on functional variances and state transitions within cognitive systems and on how these systems adapt and organize themselves. The goal is to uncover the ontological complexity of cognition and provide a nuanced understanding of how knowledge emerges, adapts, and reorganizes within cognitive systems and environmental contexts.

Distinguishing Human vs. AI-Generated Texts: How Humor and Emotional Expression Shape Perceived Authorship

This study explores the cognitive processes behind authorship perception in text-based communication, examining how humor and emotional expressions influence the ability to distinguish human- from AI-generated texts. Drawing on theories of language processing and social cognition, we investigate whether humor and emotional tone serve as cues for attributing human-like qualities. Through an experiment, 212 participants evaluated text messages varying across humor (present/absent), emotional expression (positive, negative, neutral), and authorship (human/AI). Using Likert scales and open-ended responses, participants assessed human authorship likelihood and described the cognitive strategies guiding their judgments. Findings reveal how emotional and humorous content shapes authenticity judgments, offering insights into cognitive mechanisms underlying human-AI interaction. This research bridges classical cognitive theories with future challenges, highlighting the role of emotional and pragmatic cues in evolving digital communication contexts. Implications for cognitive science and the study of language processing are discussed.

Cognitive and neural mechanisms of spatial learning and transfer in adults

Several cognitive theories explain successful learning of spatial visualization skills such as mental rotation. Spatial learning may occur through domain-specific changes to spatial transformation ability (in parietal cortex), embodied sensory-motor changes (in premotor cortex), or domain-general changes to executive functions (in prefrontal cortex). To evaluate these hypotheses, we analyzed brain activity during mental rotation using fMRI in 60 adults who completed 9 weeks of spatial visualization training. Mental rotation robustly activated bilateral parietal, premotor, and prefrontal areas both before and after training, with increased activity in the intraparietal sulcus, premotor cortex, and dorsolateral prefrontal cortex after training. However, improvements in spatial visualization and transfer to geometric reasoning were coupled with parietal and premotor changes, and not prefrontal changes. These results support the hypothesis that spatial learning is driven primarily by refinement of spatial transformation and sensory-motor imagery, although other brain regions may adapt secondarily as a byproduct of learning.

Adaptive Attention and Memory in Multi-Option Multi-Feature Decision Making

Deliberation over a set of options is sometimes needed to choose the most preferred. An example is comparison shopping, where options are products, each with a variety of features. There is evidence that both dorsal and ventral prefrontal cortex (PFC) contribute to these deliberations. We report the results of mouse-tracking experiments involving deliberative choice, constraining a computational cognitive neuroscience model of the role of PFC, and associated brain areas, in shifting attention between option features, as well as in tracking preferences over time. Options are visually displayed, but option features remain hidden until the mouse is moved to their locations. In this way, mouse movements provide a measure of attention allocation. Results reveal strategic patterns of attention shifting which adapt over the course of deliberation and over the course of practice. Observed adaptation is consistent with the reinforcement learning mechanism hypothesized to update the cognitive control states of PFC.

The Impact of Personality Types and Cognitive Learning Styles on ESL Learners in Art and Science Programmes

This study aimed to explore the relationship between ESL learners' cognitive styles and personality types, focusing on how personality influences perceptual learning preferences and language learning strategies. Participants included 20 English majors and 58 Nursing majors in a Year 1 ESL course (62 females and 16 males). Data were collected using a structured questionnaire, including the Myers-Briggs Type Indicators (MBTI) test, and 40 items on learning styles and strategies. Results indicated that the most common personality types among English majors were INFP and ISFP, while Nursing majors frequently exhibited ISFJ and ENFP types. The least common types were INTJ and ENTP. Learning style preferences showed 37% favored reading/writing, 33% visual, 26% auditory, and 4% kinesthetic. Significant relationships were found between language learning strategies and introverted/extroverted personality types, though no significant gender differences were observed. The findings suggest that ESL instructors should consider these differences to enhance classroom dynamics and effectiveness.

Individual differences in animacy cognition are reliable and externally valid

Animacy is a fundamental yet difficult-to-define notion in cognitive science. For example, people judging whether words refer to entities that are alive are faster and more accurate for animals (e.g., tiger) than plants (e.g., petunia), and slower and less accurate for natural abiotic entities than artifacts (e.g., cave and ocean vs. slipper and bicycle). The current study demonstrates the reliability and validity of individuals' aliveness judgments. A total of 169 English-speaking Americans completed the aliveness study twice. Individual d-scores representing the difference in aliveness judgments between animals and plants at Session 1 predicted d-scores at Session 2 (r = .87, p < .001), as did d-scores representing the difference between natural abiotic entities and artifacts (r = .84, p < .001). These measures also predicted attitudes such as whether humans have the right to extract natural resources. Future research must address how differences in environment and culture contribute to differences in animacy cognition.
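
A minimal sketch of the test-retest logic described above, under the assumption that a d-score is a per-participant difference in mean response (here, reaction time) between two stimulus categories; the simulated data, sample sizes, and parameter values are hypothetical, not the study's data.

    import numpy as np

    rng = np.random.default_rng(0)
    n_participants, n_trials = 20, 30

    def session_d_scores(true_effect):
        # Per-participant d-score: mean RT for plants minus mean RT for animals (ms),
        # with trial-level noise added on top of a stable individual effect.
        scores = []
        for effect in true_effect:
            animal_rts = rng.normal(700, 60, n_trials)
            plant_rts = rng.normal(700 + effect, 60, n_trials)
            scores.append(plant_rts.mean() - animal_rts.mean())
        return np.array(scores)

    true_effect = rng.normal(80, 40, n_participants)    # hypothetical stable differences
    d_session1 = session_d_scores(true_effect)
    d_session2 = session_d_scores(true_effect)
    print("test-retest r =", round(np.corrcoef(d_session1, d_session2)[0, 1], 2))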

Language Shapes Blame and Victimhood in Bilinguals' Autobiographical Memory

To what extent does language shape what we remember about ourselves? The present study tests whether Spanish-English bilinguals recall different types of information when using each language to reflect on accidents in their past. Participants wrote paragraphs describing two accidental events in which they were involved, one from childhood and one from adulthood. Afterward, they assessed their share of blame and victimhood for each of the accidents they described. All participants were balanced bilinguals and were randomly assigned to complete the task either in English or Spanish. When thinking in English, people recalled events in which they were equally likely to be the cause or the victim. Conversely, when thinking in Spanish, people were more likely to recall events in which they were the victim than events for which they were to blame. These findings inform our understanding of the role of language in shaping autobiographical memory for accidental events.

Sentence cues or semantics? Using eye tracking to study sentence processing in heritage Spanish-English bilinguals.

This study investigates how heritage Spanish-English bilinguals process sentences with canonical and non-canonical word orders, focusing on inanimate (IA) subjects and objects in subject-verb-object (SVO) and object-verb-subject (OVS) structures. By examining whether participants rely on sentential cues or semantic processing, we aim to test predictions from the Competition Model, which emphasizes cue reliability and validity, and the Good-Enough Processing Model, which suggests reliance on heuristics in challenging syntactic contexts. Using the Tobii Pro Fusion eye tracker, we are collecting eye movement data from 50 bilingual participants (Spanish AoA: 0-3 years; English AoA: 0-8 years) as they read 80 sentences (40 per language), balanced for verb agreement and randomized to control for order effects. Participants will identify the subject after each sentence and complete tasks assessing language dominance (BLP), vocabulary (LexTALE, LexTALE-ESP), and literacy skills in English and Spanish. Results will advance our understanding of current theories of sentence processing.

Space-time perception of Mandarin Speakers: age, temporal-focus and contextual priming

Research on the temporal focus hypothesis has contributed to understanding space-time cognition. However, the relative influence of temporal focus, contextual priming, and age on implicit space-time mapping requires further investigation. Seventy-one Mandarin-speaking participants in Taiwan responded to either past- or future-related questions, followed by the temporal diagram task and the temporal-focus questionnaire. Data were analyzed through chi-square tests and a three-way ANOVA. Results indicated a significant difference in front-back mapping between younger and older participants, but not between the past- and future-primed conditions. Notably, most participants tended to locate future events in the front box. Furthermore, the ANOVA results revealed that past- or future-focused statements and age were significantly related to participants' temporal attitudes. The current study partially supports the temporal focus hypothesis: age is significantly related to Mandarin speakers' implicit space-time mapping and temporal focus, whereas the priming condition had no significant effect.

Overfitting of Explicit Strategies during Sensorimotor Learning

Multiple learning processes contribute to successful goal-directed actions in response to changes in physiological states and environments. Among them, explicit strategies play a crucial role, enabling rapid and flexible sensorimotor adaptation. Yet, how the training target distribution impacts strategy discovery remains poorly understood. To address this, we conducted a visuomotor adaptation reaching task that isolated explicit strategy. We manipulated two training distribution features in a 2×2 between-participants design (N = 50/group): the spatial arrangement (dense vs. distributed) and the target number (2 vs. 8). To pinpoint the strategies participants adopted, all groups periodically reached to a shared generalization target without feedback. Learning was faster with fewer and denser targets. Strikingly, the training target number had no effect on generalization, but those trained with densely arranged targets adopted simpler yet flawed strategies, leading to poor generalization. These findings suggest that training distribution influences strategy discovery—potentially steering participants toward overfitting, non-generalizable solutions.

Learning from errors: Effects of feedback timing and warnings on cue–target and error–target recall

Erroneous guessing with corrective feedback enhances recall, with immediate feedback often outperforming delayed feedback. According to the errors-as-mediators hypothesis, immediate feedback strengthens error–target associations, improving recall. Alternatively, the semantic encoding and episodic discrimination hypothesis posits that immediate feedback facilitates strategic inhibition of errors, enhances target encoding, or both. This study examines how feedback timing (Immediate vs. Delayed) and post-error warnings (Warning vs. No Warning) impact recall. Participants studied weakly related word pairs (e.g., "swim–float") by guessing the target and receiving feedback either immediately, after a 5-minute delay, or with an immediate warning and delayed feedback. In Experiment 1, delayed feedback reduced cue–target recall, while warnings modestly mitigated this effect without reaching statistical significance. Experiment 2 revealed that delayed feedback improved error–target recall, while warnings nonsignificantly impaired recall. These findings challenge the errors-as-mediators hypothesis, suggesting that immediate feedback optimizes learning by enhancing encoding and episodic salience of the correct target.

The Language of Doubt Impacts Consensus Judgements

Scepticism of science is growing, with notable gaps between public acceptance and scientific consensus. This divide is especially concerning for collective action problems such as climate change. Highlighting consensus on a topic is an effective strategy to spur belief change and increase trust in science, but the deliberate manufacture of doubt has obscured this information. In our experiment, participants read factual or moral claims, each presented with public agreement statistics from sources like Pew Research Center. Participants indicated their prior belief, then judged either a presence or lack of consensus on each claim. We found that consensus judgements were largely based on the existing percentage of agreement and one's prior beliefs about the claim. However, prior belief had minimal impact when judging a lack of consensus, suggesting that framing information to introduce doubt may influence how people interpret scientific agreement.

Gestures on Memory: How Speech-Gesture Congruency Influences Memory and Metamemory Across Test Types

The current study investigates how (dis)fluency, manipulated by the (in)congruence between speech and gesture, can impact metamemory and memory. In two experiments (N=36 each), we employed a mismatch paradigm in which participants were presented with short videos of an actor simultaneously verbalizing a verb-object pair (e.g., distributing cards) while performing the congruent (e.g., distributing cards) or incongruent iconic hand gesture (e.g., drawing cards). After each video, participants evaluated speech-gesture compatibility on a 5-point Likert scale and provided a judgment of learning (JOL) rating (0-100). In Experiment 1, they were administered a free recall test, writing only the speech, while Experiment 2 utilized a recognition test requiring identification of the exact video from the encoding phase. In Experiment 1, results indicate higher memory performance and higher JOL ratings for congruent videos than incongruent videos. In contrast, no significant differences in memory or metamemory were observed in Experiment 2.

EEG Delta-Beta Coupling in 2-year-old Offspring of Pregnant Persons Receiving a Diet-and-Exercise Intervention: A Randomized Controlled Trial Follow-up

Background: Delta-beta coupling (DBC) is a neural marker of emotion regulation (ER), with elevated DBC linked to cortical over-processing of emotional stimuli. This study investigates the effects of the Be Healthy in Pregnancy (BHIP) intervention, combining a high-protein, energy-controlled diet, nutrition counseling, and physical activity, on offspring DBC. Methods: Pregnant individuals received either the BHIP intervention or usual care. Twenty-four offspring at follow-up completed resting-state EEG at age two using a 128-channel system. DBC was quantified as the correlation coefficient between delta (2–4 Hz) and beta (13–30 Hz) power across epochs. Group differences were analyzed using Fisher's Z-tests. Results: BHIP offspring exhibited significantly lower DBC in frontal (p=.017), central (p=.014), and parietal (p=.009) regions compared to controls. Conclusion: Reduced DBC reflects a neural profile linked to efficient ER, enabling context-appropriate cognitive resource allocation. These findings suggest prenatal diet and exercise potentially modulate neurodevelopment, warranting validation in larger, more diverse cohorts.
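
As a concrete illustration of the two quantitative steps named in the Methods (coupling per recording, then a group comparison), a minimal sketch follows. It assumes per-epoch band-power arrays are already computed and is not the study's pipeline.

```python
# Minimal sketch (assumptions, not the study's pipeline): delta-beta coupling as
# the Pearson correlation of per-epoch delta and beta power, and a Fisher z-test
# comparing two independent correlations (e.g., group-level coupling values).
import numpy as np
from scipy.stats import pearsonr, norm

def coupling(delta_power, beta_power):
    """delta_power, beta_power: per-epoch band power arrays for one recording."""
    r, _ = pearsonr(delta_power, beta_power)
    return r

def fisher_z_test(r1, n1, r2, n2):
    """Compare two independent correlations based on n1 and n2 observations."""
    z1, z2 = np.arctanh(r1), np.arctanh(r2)
    se = np.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    return z, 2 * norm.sf(abs(z))  # two-tailed p-value
```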

Disposition or Disruption: How do young learners explain inconsistent causal evidence?

The causal world is highly inconsistent. While past research demonstrates even young learners' ability to reason from probabilistic evidence, there has been little examination of how learners reason about the nature of causal inconsistencies. For instance, what do we think about why variations occur? Four- to six-year-olds (N=90) watched either a person or machine repeatedly act in one of two possible ways. At test, the opposite behavior occurred, and children were asked whether the change was due to an internal disposition/capacity for variation or constraints in the external context. Results reveal a significant bias towards internal explanations of agents' inconsistencies (70%, p=0.01, two-tailed binomial) and for external explanations of machines in older (63%, p=0.09, two-tailed binomial), but not younger (50%) children – suggesting young learners consider both inherent and contextual sources of variation in causal relationships and gradually develop complex expectations about the different likelihoods of these sources depending on domain.

From pixels to physics: an image-computable model of physical predictions

Having reasonable expectations of how scenes will unfold is crucial in everyday life. One prominent hypothesis is that we perform probabilistic physics simulation in our minds. However, the question of how people infer underlying scenes from observations, and how this affects downstream predictions about physical interactions, remains underexplored. Current models usually make the simplifying assumption that the 3D geometric states of objects are already given. To better understand the role of perceptual uncertainty in people's physical predictions, we explore the idea of vision as inverse graphics and design a model that can infer a posterior distribution over object states given raw visual inputs. This perceptual uncertainty is then propagated to a probabilistic physics simulation model to derive physical predictions. We compared the model's predictions and generalizations to human judgments on a wide range of physical scenarios from the Physion dataset and found that it captured both participants' successes and error patterns.

Stereotyping as Bayesian Inference among Black Adults in the U.S.

Do people's stereotype judgements align with what Bayes' Theorem dictates their judgements should be? Although prior work suggests they do, such research has generally been carried out with White undergraduates and minority social group stimuli (e.g., McCauley & Stitt, 1978; Solanki & Cesario, 2024). To determine whether these findings hold across diverse populations and stimuli, 870 Black American adults participated in a conceptual replication study that examined the congruence between stereotype judgements and a Bayesian criterion. Using trait and social group stimuli from prior research along with novel stimuli reflecting stereotypes about social groups such as White people and police officers, correlations between stereotype judgements and the Bayesian criterion were nearly half the size of those found in prior work (r = .34, vs. r = .64, Solanki & Cesario, 2024). An ongoing follow-up experiment designed to probe potential explanations is also discussed.
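
For readers unfamiliar with the benchmark, the Bayesian criterion in this literature is obtained by applying Bayes' theorem to a participant's other probability estimates and correlating the result with their directly elicited conditional judgments. The numbers below are invented purely for illustration.

```python
# Illustration only (hypothetical numbers, generic form of the criterion):
# the Bayesian benchmark for P(trait | group) derived from a participant's
# estimates of P(group | trait), P(trait), and P(group).
p_trait = 0.30              # estimated base rate of the trait
p_group_given_trait = 0.50  # estimated P(group | trait)
p_group = 0.25              # estimated base rate of the group

bayesian_criterion = p_group_given_trait * p_trait / p_group  # P(trait | group)
print(bayesian_criterion)   # 0.6; congruence is the correlation between such
                            # derived values and directly judged P(trait | group)
```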

Predictability effects of Spanish-English code-switching: A directionality and part of speech analysis

Previous code-switching research (Carter et al., 2010) demonstrates Spanish's tendency to be the matrix language and the involvement of the determiner-noun part of speech (PoS) combination in Spanish-English code-switching. This research, however, primarily uses the spoken Miami Bangor Corpus (MBC), limiting generalizability across speech communities/modalities. We examined the MBC (N=261,711), the spoken Spanish in Texas Corpus, STC (N=416,784), and the written LinCe Corpus, LC (N=278,093) to analyze language directionality and PoS effects across speech communities and modalities. Bootstrap analyses indicate that Spanish was the matrix language at a higher proportion than English for MBC and LC, but English was for STC. Logistic regression analyses show the particle-coordinating conjunction combination was the strongest PoS predictor of a code-switch. These results suggest that corpus modality and speech community both affect matrix language proportions and that previously unconsidered PoS combinations may be involved in code-switching.
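
The bootstrap step reported above can be illustrated schematically: resample code-switched units with replacement and recompute the proportion whose matrix language is Spanish. This is a sketch under assumptions about the data format, not the authors' analysis code.

```python
# Sketch under assumptions (hypothetical label format): bootstrap confidence
# interval for the proportion of code-switched units with a Spanish matrix language.
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_spanish_matrix(matrix_lang_labels, n_boot=10_000):
    """matrix_lang_labels: array-like of labels such as "spa" or "eng" per unit."""
    labels = np.asarray(matrix_lang_labels)
    props = [
        np.mean(rng.choice(labels, size=labels.size, replace=True) == "spa")
        for _ in range(n_boot)
    ]
    return np.mean(props), np.percentile(props, [2.5, 97.5])  # estimate and 95% CI
```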

Influence of reward on visuomotor adaptation in complex tasks

Explicit strategies play an important role in visuomotor adaptation, but are subject to substantial capacity limits, calling into question their efficacy in complex settings. For instance, recent work has shown that when task complexity exceeds working memory capacity, observers are no longer able to fully adapt. Here, using a visuomotor rotation task in which participants were tasked with simultaneously adapting to eight target-rotation pairs, we examined the extent to which such capacity constraints may be ameliorated by reward-based feedback. We found that the mere presence of explicit reward did not change participants' behavior – a uniform reward distribution led to a similar pattern of behavior as has been previously reported. However, when a subset of target-rotation pairs was associated with a greater reward magnitude, participants demonstrated enhanced adaptation to them, which improved overall performance. These findings highlight the utility of reinforcement learning for enabling motor learning and adaptation in complex tasks.

Iconic Meanings Are Learned Earlier: Homophones Provide Insight on Iconicity's Role in the Acquisition of Words

Iconic words are those whose sounds share properties with their referents, such as "clatter" or "hiccup." Research shows that children learn iconic words earlier than arbitrary words and that iconicity may help children form these connections. However, another factor to consider is that iconic words have forms that are easier to produce. To gain further insight into the link between iconicity and acquisition, we studied homophones. This allowed us to hold the form of each word constant and examine whether iconic meanings are acquired earlier. Participants provided iconicity ratings on 1,668 total meanings for 390 word forms. We ran a mixed effects linear regression and found an effect of iconicity on test-based age-of-acquisition, controlling for word form, length, frequency, phonological neighbourhood, and meaning-specific familiarity. These findings suggest that children learn iconic meanings earlier than arbitrary ones and support iconicity as an important factor in word learning.
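
The regression described above has a simple general form: age of acquisition per meaning, with iconicity as the predictor of interest, lexical covariates as controls, and a random effect over homophone forms. A minimal sketch with hypothetical file and column names is shown below; the published model may differ in its exact specification.

```python
# Minimal sketch (hypothetical file and column names, not the authors' model):
# mixed-effects regression of age of acquisition on iconicity with a random
# intercept for each homophone word form.
import pandas as pd
import statsmodels.formula.api as smf

ratings = pd.read_csv("iconicity_ratings.csv")  # hypothetical: one row per meaning

model = smf.mixedlm(
    "aoa ~ iconicity + length + log_frequency + neighborhood + familiarity",
    data=ratings,
    groups=ratings["word_form"],  # homophone form as the grouping factor
)
print(model.fit().summary())
```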

Does it matter that "bed" looks like a bed? Orthographic Iconicity in English

Iconicity is a resemblance between form and meaning. In spoken language, research has focused on the resemblance between the sounds of words and their meanings. Here I report the collection of orthographic iconicity ratings: the extent to which a word's printed form resembles its meaning. This was an exploratory study to determine whether participants could provide meaningful ratings on this dimension, and whether it had any effects on language processing. Words high in orthographic iconicity tended to be more concrete, imageable, and earlier acquired, to contain less frequent letters, and to have more consistent spelling-to-sound mappings. Importantly, after controlling for a variety of lexical and semantic variables, orthographic iconicity predicted a significant amount of variance in lexical decision time and accuracy across three different megastudy datasets. I consider whether orthographic iconicity is a meaningful construct, and what alternative explanations exist for these results.

The Effectiveness of Iconic Cues in Word Learning Using the Human Simulation Paradigm

Iconicity refers to a resemblance between the form of a signal and its meaning. Examples include spoken words that imitate sound-based meanings (i.e., onomatopoeia; e.g., "splash") or representational gestures (e.g., holding the hands far apart to indicate "large"). Caregivers use iconic cues when talking to their children; however, the potential of these cues for successful word-referent mapping across communicative contexts remains unclear. This study examined the effectiveness of iconic cues when referents were physically present or absent. Using the Human Simulation Paradigm, 320 adult participants watched naturalistic videos of a parent and child discussing objects, with all utterances of a target referent "beeped". Participants' task was to guess the referent. We found that iconic cues improved accuracy when the target was absent. This supports their function of bringing referents "to the mind's eye" in displaced contexts.

Exploring the mechanisms that enable multimodal reasoning about data visualizations in vision-language models

Humans can readily integrate visual, linguistic, and numerical information to extract meaning from symbolic displays of information. For instance, answering even a simple question about a data visualization requires connecting tokens of language to visual features in the plot to support quantitative inferences. What are the core computational mechanisms that enable integration across modalities to support such reasoning? Open-source vision-language models (VLMs) might provide a useful testbed for investigating these mechanisms, but doing so requires a high degree of experimental control. To achieve this control, we procedurally generated a large dataset containing pairs of questions and data visualizations that varied along several independent and ecologically important dimensions, including the number of observations and how they were distributed. We identified several open VLMs whose performance was sensitive to this variation, establishing their viability for further exploration of the mechanisms underlying multimodal reasoning.

Abstraction and Optimization in Statistical Learning: A Randomized Controlled Trial of Implicit and Explicit Reading Intervention for Students with Dyslexia

Statistical learning (SL), or the ability to unintentionally extract patterns underlying sensory information, allows the human mind to acquire various regularities in written languages. However, it remains unclear whether SL can support the learning of multiple probabilistic regularities in a complex language, such as the sub-lexical mappings between orthography (form), phonology (sound), and semantics (meaning) in Chinese characters. We tested (implicit) SL and its explicit form in acquiring the Chinese sub-lexical mappings in a randomized controlled trial. Ninety-five 1st-4th graders with or at risk for dyslexia were randomly assigned to an implicit-SL group that was exposed to a set of characters with the sub-lexical form-sound-meaning mappings, an explicit-SL group that was exposed to the same set of characters with explicit instruction on the form-sound mappings, or a no-SL control group. Only the explicit-SL group showed abstraction of form-sound mappings, while only the implicit-SL group showed optimized reading processes across phonology and semantics.

Recovering belief structures using a language model on a naturalistic dataset of attitude change

On the Reddit forum ChangeMyView, users post beliefs and invite others to challenge them. In this study, we aimed to determine whether a GPT-4-based analytical pipeline could accurately recover belief structures from a subset of posts on predefined topics, identified through covariation statistics from a lab sample. This approach would enable us, in a second stage, to extract novel insights from naturalistic data on belief structures that have not been directly elicited in lab studies, providing a bottom-up examination at scale. Our findings suggest that the pipeline captures meaningful belief patterns, aligning moderately with human responses in structured surveys. Analyzing 3082 posts from 346 users revealed distinct ideological clusters and belief patterns that mirrored well-established political divisions. This method offers a scalable way to study belief networks, shedding light on their role in shaping societal attitudes.

When unpredictable does not mean difficult to process

During language comprehension, words that are less expected tend to take more effort to process. This phenomenon has been described by the hypothesis that cognitive cost scales with surprisal (negative log probability; Hale, 2001; Levy, 2008), with a core justification being that surprisal quantifies the amount by which a rational comprehender's beliefs about meaning change upon encountering a word. However, this focus on next-word prediction may be too narrow. In this work we advocate measuring processing cost directly with the size of the change in beliefs about meaning, a reframing which implies a novel class of potential situations where surprisal may systematically overestimate cost. We identify typographical errors as a test case, and implement estimators of surprisal and belief-update in a noisy-channel model of comprehension as inference about intended strings. In a self-paced reading time study, we present evidence that human reading time behaves as predicted by belief-update size, rather than surprisal.
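
A minimal formal statement of the two quantities being contrasted, in generic notation (the abstract's noisy-channel estimators are more elaborate than this):

$$
\mathrm{surprisal}(w_t) = -\log P(w_t \mid w_{1:t-1}), \qquad
\mathrm{update}(w_t) = D_{\mathrm{KL}}\!\left( P(m \mid w_{1:t}) \,\middle\|\, P(m \mid w_{1:t-1}) \right),
$$

where $m$ ranges over candidate interpretations (here, intended strings under the noisy channel). The abstract's argument is that these measures can diverge for inputs such as typographical errors, where a word may be improbable yet barely change beliefs about the intended message.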

Learning from thought experiments in early childhood

Thought experiments have been credited with generating new knowledge in the history of science. Although many parallels have been drawn between the thinking of scientists and children, it is not clear whether children can generate new knowledge via thought experiments. We tested whether an extreme case thought experiment can help 6- to 9-year-olds overcome the misconception that heavier, rather than larger, objects displace more water. A total of 70 children (MAge = 88.94 months) were assigned to either a Control condition or an Extreme Case condition designed to elicit children's existing understanding of solidity, namely that two material objects cannot occupy the same space at the same time. Children received no feedback in either condition. We found that children in the Extreme Case condition performed better on both the Learning and Far Transfer trials, suggesting that thought experiments can serve as a learning tool in childhood.

Triadic comparisons reveal representational motifs in human color perception

While individual differences have been found in many aspects of color perception, it is unclear whether mental representations of color vary between individuals in a structured way. To test for structured variability, we collected perceptual similarity judgments for 58 colors in a triadic-comparisons procedure, and from each participant's judgments, embedded the colors into a personalized three-dimensional space. The personalized embedding predicted a participant's similarity judgments on held-out items significantly better than did (a) embeddings from other participants, suggesting reliable individual differences in perceived color similarity (p < 0.0001), and (b) color coordinates in a standardized perceptual color space (CIELAB; p < 0.0005). Across individuals, embedding structure did not vary randomly but fell into two clusters, one encompassing more distinct color categories and the other a more continuous perceptual space. The results suggest the existence of previously unrecognized "motifs" in how people represent colors.

A Graph Theory Approach to the Bidialectical Nature of English

The English lexicon is a creole consisting of early-learned Anglo-Saxon (AS) words and late-learned Latinate (LA) words, along with words of other origins at lower frequencies. The creole nature of the lexicon is most evident in pronunciation rules. We pair etymological and age of acquisition data for ~20,000 English words with phonological and semantic networks to study word organization over time. Analyses reveal dramatic sound and meaning shifts over time in the most densely connected cores of the graphs. Phonologically, there is a shift in middle school from AS "air" words to LA "ale" words. Semantically, the dense core shifts from an AS/LA balanced set of food and number words to medicinal and chemical terms between junior high and high school. These shifts are evident in monolinguals. This AS-to-LA shift also affects individuals from diverse language backgrounds, especially L2 English learners from L1 Latinate (e.g., Spanish) and non-Latinate languages. The implications of our findings will be discussed.
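
One way to operationalize the "most densely connected cores" mentioned above is k-core decomposition of the phonological or semantic graph. The sketch below is illustrative only, with a toy edge list; it is not the authors' pipeline.

```python
# Illustrative sketch (toy data, not the authors' pipeline): build a word graph
# and extract its maximal k-core as one operationalization of the dense core.
import networkx as nx

# Hypothetical edges: word pairs treated as phonological or semantic neighbors.
edges = [("cat", "bat"), ("cat", "cot"), ("bat", "bad"), ("cot", "bad"), ("ale", "air")]

G = nx.Graph(edges)
core = nx.k_core(G)          # main core: maximal subgraph of minimum degree k
print(sorted(core.nodes()))
print(nx.core_number(G))     # coreness of every word in the graph
```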

Shared control impairs cognitive control: Human response inhibition slows when machines fail to inhibit

In order to fulfill goals, humans make use of cognitive control, a suite of processes for planning and managing thoughts and actions. One such process is response inhibition, which entails stopping a response when an action becomes inappropriate. Traditionally, response inhibition is measured in experimental settings in which humans have unilateral responsibility for inhibiting the action. However, in the real world, humans are increasingly sharing control with artificial intelligence (AI), the paradigmatic case being partially automated vehicles. We designed an experiment that includes some aspects of partially automated vehicles and found that when humans share control with an AI that often, but not always, stops, human response inhibition is significantly slowed even when the AI does not intervene. This reveals a cost of shared control to human cognitive control, suggesting that the benefits of partial automation should be weighed against the costs of impaired human control.

How does social learning affect trapped learners?

Learning traps are stable sub-optimal decision rules that discourage necessary exploration for learning optimal decision rules. We investigated how trapped learners respond to observational learning opportunities. We predicted participants would behave according to a selective value-shaping model, thus escaping the learning trap and learning the optimal rule via observational learning. We did find that trapped learners were significantly more likely to escape their trap in the observational learning condition compared to the asocial control, though most still remained trapped. We found that decision rule inference, not trust, was the key limiting factor in successful observational learning. When trapped learners correctly inferred a partner's optimal rule, they almost invariably adopted it. These findings suggest social interventions for learning traps should support understanding of others' decision strategies rather than merely exposing learners to alternative choices, and theoretical models must extend beyond simple value-shaping to account for decision rule inference processes.

Care Beyond Empathy: Towards a More Accessible Theory of Prosocial Clinician-Patient Interaction

Empathy is a widely used and poorly understood concept, in spite of the significant intellectual thought that has been devoted to elucidating its meaning. Rather than focusing on the semantics of empathy, we explore its telos and utility—particularly in clinical care—through analyses of the "double empathy problem" and Theory of Mind, along with thought experiments drawn from healthcare scenarios. We find that although cognitive and affective empathy are often considered essential components of treating others with care and respect, they are neither sufficient nor clearly necessary for achieving these aims in clinical contexts. In fact, both varieties of empathy can easily lead to detrimental social outcomes. With this in mind, we suggest de-emphasizing the role of empathy in clinical culture. A clearer focus on actionable strategies and behaviors which facilitate positive patient-clinician interactions may help demystify "soft skills," thus making competence in social aspects of healthcare more accessible to all.

From Observation to Attribution: How Event Features Shape Responsibility Judgments in Real-World Traffic Videos

When people witness a harmful event, they can judge who is responsible and to what degree. Such judgments depend on multiple factors, including what happened, whether the agent could have prevented the outcome, and whether they were aware that their behavior could cause harm. This information is either observed or inferred from perceptual inputs, often based on details that are present in naturalistic videos but are difficult to convey in word descriptions (e.g. if a pedestrian crossing the road was hit by a car, how suddenly did the pedestrian start crossing?). This study investigates how humans generate rich representations of responsibility from visual input using a controlled set of naturalistic traffic videos sampled to achieve wide variation along relevant dimensions. From these videos, we extract a variety of event features describing entities and their relations, studying the extent to which different features contribute to responsibility judgments in realistic settings.

The effects of real-world novelty exposure on episodic memory specificity across development

Episodic memory specificity improves as we age and allows us to recall experiences in detail. In controlled experiments, novelty modulates memory specificity. Here we examine how real-world novelty exposure affects memory specificity over varying delays in a developmental sample (aged 12-25) using geolocation tracking, experience sampling, and periodic memory assessments. Data collection is ongoing, but preliminary results (N=44 of 120 target) suggest that participants' subjective ratings of experienced daily novelty are correlated with GPS metrics of daily location variability, and that the effect of encoding-to-retrieval delay on general memory is greater in older participants. High location variability interacts with subjective novelty on the encoding day, facilitating general memory on less novel days and specific memory on more novel days. Finally, for both general and specific memory, greater location variability on the retrieval day facilitates memory for more recently encoded items while hindering memory for more remote items.

A Shared Spatial Mental Representation System for Navigation and Reasoning

Recent studies revealed that the brain's system for representing physical space is also recruited to represent relations that are not inherently spatial. We investigated the relationship between the ability to form these abstract spatial representations and other spatial abilities. In four experiments, participants created spatial representations of a series of premises relating objects on two dimensions (e.g., A is faster than B, B is louder than C) and answered inference questions based on these relations. The type of information, i.e., spatial information (A is above B) versus abstract, non-spatial information (A is smarter than B), did not affect task performance. Individual differences in how 'grid-like' the created representations were, as well as reasoning consistency, correlated with some spatial abilities, including path integration. These findings are consistent with the view that a common spatial mental representation system underlies our ability to form mental representations of physical and abstract spaces.

Spot the ball: Inferring Hidden Information from Human Behavioral Cues

Humans often infer the state of the world by observing how others interact with it—when crossing a street, for instance, we may follow the movement of others without directly seeing the traffic. This ability to extract hidden information from human interactions with the environment is crucial for adaptive behavior. In this study, we explore how people make such inferences in Spot the Ball, a task where participants predict the location of a masked soccer ball in single-frame images. We created a large dataset by scraping YouTube videos, identifying compelling images using CLIP, and masking the soccer ball through inpainting. Our findings show that human participants rely heavily on pose and gaze cues to infer the ball's location. While providing this information improves GPT-4o's performance, it remains significantly below human accuracy. These results highlight the significance of intention inference, with potential applications in self-driving cars, assistive AI, and humanoid robotics.

Conceptual and affective alignment in sensory metaphor

What is conveyed by sensory metaphor? Zhu et al. (2024) found that concepts metaphorized by the same sensory words were more closely aligned in dual categorization (IAT) tasks than was predicted by their alignments in word embedding models. The present study utilized 132 distinct IATs to measure conceptual alignment of words metaphorized by 99 sensory metaphors to test generalizability. This time, affective alignments were additionally measured: A semantic differential method and principal components analysis yielded a three-dimensional affective vector space, which was used to compute a measure of affective alignment. Linear models showed that affective alignments and conceptual alignments (from word embedding models) each predicted independent portions of the variance in the IAT data. Even with both of these sources of variation taken into account, concepts sharing a common sensory metaphor were more aligned than those paired at random. Sensory metaphors may simultaneously convey affective, semantic, and perceptual meaning.
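
The affective-space construction described above (semantic differential ratings reduced to three dimensions) can be sketched as follows; the data and dimensionality here are placeholders, not the study's materials.

```python
# Sketch under assumptions (random placeholder data): derive a 3-D affective
# space from word-by-scale semantic differential ratings via PCA, then score
# the affective alignment of two words as the cosine of their vectors.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
ratings = rng.random((200, 12))                 # hypothetical: 200 words x 12 scales
affective = PCA(n_components=3).fit_transform(ratings)

def affective_alignment(i, j):
    a, b = affective[i], affective[j]
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(affective_alignment(0, 1))
```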

Some Innate Characteristics of Neural Models of Morphological Inflection

Neural Network Models of Morphological Inflection (NNMIs) have deep relevance to cognitive science stemming from the central role that they played in the Past Tense Debate of the 1980s and 1990s. Critics of the connectionist approach to the mind frequently pointed to NNMIs' shortcomings in the area of developmental realism: they argued that regardless of their ultimate accuracy, they failed to capture patterns of child language acquisition including developmental regressions and a propensity for over-regularization rather than irregularization. However, NNMIs have seen impressive improvement in the deep learning era of the 2010s and 2020s. Have modern NNMIs solved the old problems of developmental realism? We find that they have not. The persistence of these shortcomings suggests that they reflect "innate" characteristics of NNMIs as a class of learner, and that even substantial advancement in neural architectures and subsequent performance increases do not necessarily entail increased cognitive plausibility.

Tracing interactions between crosslinguistic differences and flavour perception

This study examined whether crosslinguistic labelling differences affect categorization and similarity ratings of food samples. In Experiment 1, we tested whether linguistic encoding warps gustatory perception, manifesting as language-induced biases in food similarity ratings. To illustrate, in Mandarin, 'cherry' (樱桃, literally 'baby peach') and 'peach' (桃) share the character 桃, whereas 'cherry' and 'peach' are lexically distinct in English. We predicted that Mandarin speakers would judge flavours whose labels share a character as relatively more similar. In Experiment 2, Mandarin and English native speakers (N=20/group) were given 16 triplets of food samples (e.g., apple/pear/pineapple) to perform ABX categorizations. If label overlaps affect perceptual categorization, we expected more frequent apple-pineapple-like categories in English than in Mandarin speakers. Contrary to predictions, language-specific overlap types did not significantly affect either similarity ratings or categorization patterns. We interpret these findings as indicating an absence of top-down linguistic modulation of chemosensory signal processing.

Social position as a constraint on linguistic alignment

During conversation, individual speakers become part of a larger conversational system. This is observable in interpersonal coordination, variously described as alignment, synchrony, convergence, and complexity matching. While this coordination is ubiquitous, the degree of coordination varies greatly between conversations, interlocutors, and group-level goals. There is growing interest in understanding the constraints that operate at the level of the conversational system. We explore how social distance, the status and demographic differences of conversation partners, constrains linguistic alignment. To do this, we leverage the CANDOR dataset (Reece et al., 2022), which contains 1,656 approximately half-hour Zoom conversations between strangers and a large battery of follow-up questions regarding their conversation, personality, and demographic information, including perceptions of status. In this exploratory study, we use recurrence analysis to examine how various coordinative behaviors, particularly lexical and syntactic alignment, are influenced by the social distance between conversation partners. Experimental follow-ups are discussed.
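
As background on the method, categorical recurrence analysis of lexical alignment starts from a cross-recurrence matrix marking where the two speakers produce the same word. The sketch below is a generic illustration, not the study's analysis code.

```python
# Generic illustration (not the study's code): categorical cross-recurrence over
# two speakers' word sequences, with recurrence rate as a simple alignment index.
import numpy as np

def cross_recurrence(words_a, words_b):
    a = np.array(words_a, dtype=object)
    b = np.array(words_b, dtype=object)
    rec = (a[:, None] == b[None, :]).astype(int)  # 1 where the same word occurs
    return rec, rec.mean()                        # matrix and recurrence rate

rec_matrix, recurrence_rate = cross_recurrence(["i", "think", "so"], ["so", "do", "i"])
print(recurrence_rate)
```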

Reading skill affects reading saccades well into late childhood

Skilled readers show faster and longer saccades as they move efficiently through text. While studies of adolescents are quite sparse, teens are assumed to show adult-like behavior following a rapid developmental trajectory of oculomotor parameters (Blythe, 2014; Rayner, 1998; 2009). Age and reading skill effects on saccadic behavior were examined in young adults and adolescents reading naturalistic multi-line texts. Eye movements were recorded from 113 college students and 52 adolescents, who read the publicly available English-language PROVO corpus. Participants' reading expertise was measured by vocabulary and reading comprehension tests. Linear mixed-effects regression models revealed that age interacted with reading expertise in how fast readers move through text (the velocity of forward, regressive, and return-sweep saccades). Age and vocabulary affected only teens. Individual differences emerged in a more heterogeneous population that is earlier on the developmental trajectory. The study found a tighter than previously assumed coupling between oculomotor saccadic parameters and reading expertise.

Are people aware of biases in their judgment and decision-making? A rigorous large-scale test

People exhibit systematic biases in their judgment and decision-making, and these biases are often presumed to operate outside of awareness. Nevertheless, there are surprisingly few direct empirical examinations of this question. Here, in two studies (total N = 727), we test participants' awareness of 11 classic biases. Participants completed a series of tasks, each inducing one bias (e.g., the anchoring effect, decoy effect, halo effect, etc.), and then reported whether and how they believed they were influenced by each bias. We found that, aggregating across tasks, participants' reports tracked how much they were actually influenced by each bias (with correlations between .30 and .45), indicating significant awareness. There were also substantial individual differences, with many participants exhibiting near-perfect awareness. This research argues against the notion that people are inherently unaware of their decision-making biases, and instead supports views that place conscious processing closer to the center of human decision-making.

Probing for experience-driven critical period effects in a large language model

Critical period effects (CPEs), wherein learning during childhood is easier, are often considered to be maturationally-driven. However, recent findings of CPEs in artificial, non-maturing models suggest they can be driven by experience alone. We investigate to what extent experience can drive CPEs in second language (L2) grammar acquisition by probing for CPEs in a large language model (LLM). Constantinescu et al. (2024) report no evidence for experience-driven L2 CPEs in LLMs, but do not discuss the fine-grained grammatical rules used in prior behavioral studies of CPEs. We document that LLM learning patterns vary across these rules, but mostly find no evidence for human-like CPEs. Our results therefore support previous conclusions that CPEs in L2 grammar acquisition are not driven by experience alone.

Generative AI-Assisted Clinical Interviewing of Mental Health

The standard assessment of mental health typically involves clinical interviews conducted by highly trained clinicians. This approach faces significant challenges, including high costs, overburdened clinical workloads, variability in clinician expertise, and a lack of standardization. Recent progress in large language models presents an opportunity to address these limitations by simulating clinician-led interviews; however, the validation of such AI-driven clinical interviews remains sparse. We developed and evaluated an AI assistant designed to conduct clinical interviews (N = 303). A second AI assistant analyzed the interview transcripts to generate diagnostic insights based on DSM-5 criteria and to provide comprehensive justifications for its assessments. The results showed that the AI-powered clinical interview correlated with self-reported, clinician-diagnosed mental disorders at a level not significantly different from state-of-the-art rating scales, while showing significantly lower co-dependencies among measures. These findings suggest that AI-powered clinical interviews can offer an accurate, cost-effective, and standardized approach to diagnosing common mental disorders.

Prototypicality and frequency do not necessarily affect children's sentence comprehension

Prototypicality and frequency have been assumed to play a critical role in children's sentence comprehension. However, we examined Mandarin-speaking five-year-olds' comprehension of three structures: the SVO structure (subject verb object), the ba structure (subject ba object verb), and the bei structure (subject bei object verb). All sentences were plausible but contrasted in animacy, namely animate-animate (AA), animate-inanimate (AI), inanimate-animate (IA), and inanimate-inanimate (II). Results indicated that the prototypical AI combination did not surpass all the remaining patterns, and that the ba and bei structures, despite their low frequencies, did not necessarily result in worse performance. Animate noun phrases (NPs) did not always trump inanimate NPs as the agent/doer. Although five-year-olds' performance with the IA bei structure followed the predictions of the eADM and good-enough representation accounts, the better comprehension of the ba structure in the AI and II conditions relative to the other structures suggests that the arrangement of argument order may not be as prominent as those theories claimed.

Exploring Vector Representations for Phonological Similarity

Recent research has compared representation models of word meaning (Brown et al., 2023, Cognitive Science 47:e13291); however, less research has compared representation models of words' perceptual features. Thus, we compared vector-space word representation models that can be used to quantify words' phonological similarity. The main structure of the model was adapted from Cox et al.'s orthographic representation model (Behavior Research Methods 43:602-15, 2011). Variations of the model included the phonetic mapping scheme, the encoding scheme, the inclusion of lexical stress, and the combination of orthographic and phonological representations. We tested the model variants against human-rated phonological similarity and both phonological and orthographic Damerau-Levenshtein distance. Open n-gram encoding (1 ≤ n ≤ 2) performed better overall than terminal-relative encoding across all phonological similarity metrics. Concatenated orthographic-phonological vectors improved the prediction of human ratings with terminal-relative encoding only. Using more fine-grained phonetic mapping or including lexical stress had minimal effects.
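
To make the best-performing encoding concrete, the sketch below shows open n-gram (1 ≤ n ≤ 2) vectors over phoneme strings and a cosine similarity between them. It is a schematic reconstruction, not the evaluated models themselves, and the phoneme symbols are arbitrary.

```python
# Schematic reconstruction (not the evaluated models): open n-gram encoding with
# unigrams plus ordered, possibly non-adjacent bigrams over phoneme sequences,
# scored with cosine similarity.
from collections import Counter
from itertools import combinations
import math

def open_ngrams(phonemes):
    feats = Counter(phonemes)                 # unigrams (single phonemes)
    feats.update(combinations(phonemes, 2))   # ordered open bigrams, gaps allowed
    return feats

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

cat = open_ngrams(["K", "AE", "T"])   # "cat" (arbitrary phoneme labels)
cot = open_ngrams(["K", "AA", "T"])   # "cot"
print(cosine(cat, cot))
```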

Linguistic encoding and blue shade discrimination: Insights from 80 languages

Contrasting color terminology for different shades of blue in one's native language is often reported to modulate visual discrimination speed and similarity ratings. Representative studies to date exhibit serious limitations in that many rely on two-language comparisons and modest sample sizes. Here we address both limitations with a dataset of 3,912 participants from a sample of 80 native languages, of which 16 lexically distinguish blues (TwoBlues) based on brightness (incl. Azeri, Burmese, Greek, Thai, Turkish, and Ukrainian). We used a new 'color guesser' game. Participants saw grids of blue tiles in 3x3 arrangements, with the middle left blank. Two blue probes appeared under each grid, manipulating within/between/across blue combinations. The task was to quickly decide which probe completes the grid. Contrary to predictions, neither accuracy rates nor reaction times to distinct shades of blue differed significantly between language groups (TwoBlues vs OneBlue). We draw implications for models that posit top-down linguistic modulation of visual processing.

Learning and teaching are uncorrelated in an algorithm transmission task

Selective social learning is thought to be crucial for cumulative cultural evolution, as it helps preserve valuable but complex knowledge. Traditional models of cultural evolution have emphasized selection for performance. However, recent experiments demonstrate a potential paradox: learning from high performers can compromise transmission. Should learners select teachers based on their performance or their teaching ability? This study examines whether teaching ability correlates with performance in algorithmic concept learning. Thirty participants learned a sorting algorithm from examples and explained the concept to beginners. A second cohort rated the helpfulness of these explanations after learning the concept themselves. Using an algorithm we developed to assess how well participants had learned the concept, we found no significant relationship between teaching effectiveness and task performance. Our results extend findings from prior research to the setting of algorithmic concept learning, and highlight a fundamental dilemma in cultural transmission.

From the ears to the eyes: using pupil size to explore the perceived appeal of languages

Are some languages more appealing than others? And do specific linguistic features, such as phonetic or prosodic characteristics, contribute to this perceived beauty? Previous studies have been inconclusive, mostly because they primarily relied on subjective ratings prone to social desirability biases. To overcome this limitation, we explore a novel approach, using pupil size as a proxy for linguistic appeal. Pupil dilation has been linked to pleasurable stimuli, such as music and environmental sounds, suggesting similar reactions for appealing languages. In our experiment, participants listened to artificial languages modeled on the phonetic structures of real languages, while their pupil size was measured. By combining this eye-tracking data with traditional ratings, we demonstrate that (a) eye-tracking is a reliable method for investigating phonaesthetic appeal and (b) languages with different linguistic characteristics vary in their perceived attractiveness. This interdisciplinary approach sheds light on the cognitive processes underlying the aesthetic perception of languages.

The Developmental Role of Spatial Abilities in Predicting Science Achievement in Elementary and Middle School: A Cross-Sectional Study

Spatial abilities are relevant to scientific achievement, yet little is known regarding the development of spatial abilities during adolescence. This study thus examines the development of spatial abilities during adolescence, and how different spatial abilities differentially predict science achievement. A total of 1,006 students from grades 4, 6, and 8 were assessed with four different spatial abilities tasks, a science achievement test (TIMSS), and measures of control variables. Significant grade differences in spatial abilities were observed. Spatial abilities accounted for about 14%, 13%, and 13% of the variance in science achievement in Grades 4, 6, and 8, respectively. Extrinsic-dynamic abilities emerged as the strongest predictor of science achievement. The current findings showcase the development of different spatial abilities across Grades 4 to 8 and confirm the significance of spatial abilities, particularly extrinsic-dynamic spatial abilities, in science learning. Interventions that target spatial abilities may be a potential way to prepare students for the science curriculum.
Keywords: Spatial Abilities, Science Achievement, Cognitive Development, Spatial Cognition, STEM

The Decision Environment and Confidence in Experiential Risky Choice

We investigated how the decision environment influences choice and confidence in a binary decisions-from-experience (DfE) task. We manipulated four aspects of the decision environment: option composition (safe vs. risky; SR, or risky vs. risky; RR), the riskiness of the EV-maximizing choice, feedback type (partial or full), and problem order (SR before or after RR). We found that participants made more EV-maximizing choices and were more confident in SR environments and when the EV-maximizing choice was the safer option. Participants' behavior was influenced by their understanding of the payoff structure: confidence dropped sharply after discovering a previously unexpected outcome, and experiencing RR first led to over-sampling and reduced rewards in later SR environments. A computational model incorporating choice and reward history provided a unifying explanation for these patterns. Our results shed light on how decision environment features interact to shape sampling biases, subsequently altering learning and decision-making outcomes.
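
The phrase "a computational model incorporating choice and reward history" admits many instantiations; a common minimal form is a delta-rule value learner with a choice-stickiness term, sketched below purely as an illustration of the model class (not the authors' model or parameters).

```python
# Generic sketch of the model class (not the authors' model): delta-rule value
# learning with a choice-history (perseveration) bonus in a softmax choice rule.
import numpy as np

def choose(q, last_choice, beta=3.0, kappa=0.5, rng=np.random.default_rng()):
    bonus = np.zeros_like(q)
    if last_choice is not None:
        bonus[last_choice] = kappa           # stickiness toward the previous choice
    p = np.exp(beta * q + bonus)
    p /= p.sum()
    return rng.choice(len(q), p=p)

def simulate(reward_fn, alpha=0.2, n_trials=100):
    q, last = np.zeros(2), None
    for _ in range(n_trials):
        c = choose(q, last)
        r = reward_fn(c)                     # sampled reward for the chosen option
        q[c] += alpha * (r - q[c])           # delta-rule update from reward history
        last = c
    return q

# Example: a safe option paying 0.5 always vs. a risky option paying 1 with p = .4.
final_values = simulate(lambda c: 0.5 if c == 0 else float(np.random.random() < 0.4))
```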

The relationship between statistical learning and different facets of language ability: evidence from auditory and visual modalities

The relationship between language and general cognition is a key question in cognitive science. Statistical learning (SL)—the ability to extract environmental regularities without supervision—is considered a key contributor to language ability. Our study comprehensively assessed adults' language abilities (grammatical sensitivity, pragmatic comprehension, semantic prediction, violation processing), auditory and visual SL, and other cognitive functions (short-term memory, working memory, perceptual speed) to control for their effects on both language and SL. We hypothesized that auditory and visual SL would predict language abilities, with a stronger relationship for auditory SL. Surprisingly, visual SL was the best predictor of grammatical sensitivity and pragmatic comprehension, while semantic prediction and violation processing were not explained by general cognitive abilities. These results align with findings supporting a SL-language relationship and demonstrate that language is intertwined with general cognition, but also point out that facets of language ability differ in their reliance on general cognitive processes.

Metaphorical Triangulation

Metaphors are powerful tools for explaining abstract concepts, but a single explanatory metaphor may be ineffective if the target system is sufficiently complex or the metaphor is counterintuitive. Drawing on theoretical and empirical research on metaphor and analogy, I describe a more systematic explanatory strategy: metaphorical triangulation. This involves (1) describing an intuitive mental model of the target phenomenon in terms of a concrete—but flawed—metaphor, drawing attention to its weaknesses; (2) presenting an alternative metaphorical model designed to address these weaknesses; and (3) providing supplementary metaphors to further develop the preferred account of the target phenomenon and address shortcomings of the alternative metaphor. I show how philosopher Daniel Dennett used this strategy to illuminate a range of puzzling issues, from evolution to consciousness, and I present some preliminary empirical support for this approach. I encourage scientists and educators to consider how they might use metaphorical triangulation in their work.

Investigating Mental Simulation and Mental Imagery Using Sentence-Picture Verification

This study hypothesized an underlying similarity between mental simulation, the automatic activation of multimodal features during language processing, and mental imagery, the voluntary, effortful process of deliberate imagination. The mechanisms involved in both processes are underspecified and may be shared. To test this, participants performed a sentence-picture verification task with two blocks. In both blocks, participants read a sentence that implied the visual features of an item, then saw a (mis)matched picture of the item and were asked to verify whether the item in the sentence appeared in the image. In one block, participants were additionally instructed to form a mental image corresponding to the sentence. Common patterns of results across these conditions would indicate similar processing, with differences in time course suggesting shared mechanisms operating on different timescales. Distinct patterns would suggest these processes are entirely different. Results and implications for language processing will be discussed.

Self-other blurring: self-referential facial dynamics representation

People do not see themselves during real-life face-to-face interactions. Strikingly, after a 5-minute interaction with a stranger, the stranger is likely to be more familiar with the appearance of the interlocutor's facial expressions than the interlocutor is. However, people control and feel their facial movements. We examined whether an internal transformation into a visual representation exists, allowing people to assess the specific dynamics of their own facial expressions compared to those of others. Leveraging advanced video processing AI tools, we decoupled participants' facial features from their facial dynamics and tested whether observers engage preferentially with individuals who share their own facial expression dynamics more than those of others. We further examined whether there is a distinct brain activity when viewing one's own facial expressions compared to others' facial expressions. Altogether, we propose that there is a self-facial dynamics representation that influences the processing and perception of others through self-referenced comparisons.

Inferential Language is Limited and Unevenly Distributed in Popular Texts Used For Literacy Tutoring

Inferencing is essential for reading comprehension and higher-order thinking (Kendeou et al., 2019). Teachers are encouraged to teach students inferential language (Foorman et al., 2016), but the role of connected text itself in scaffolding inferencing remains unclear. In this study, we examined if texts used in a high-impact tutoring program for elementary students vary in inferential language. We identified four categories of inferential language: mental state terms, emotional state terms, metacognitive knowledge, and cognition and active processing. We used an AI-driven text analysis to tally these in the nine most popular text sources. Findings revealed minimal inferential language overall, with significant variation across sources, p < 0.001. As text difficulty increased, mental and emotional state terms (e.g., think, feel, happy) became more common, but academic and metacognitive terms (e.g., realize, reflect, analyze) remained scarce. Results highlight the need for further research on how text complexity influences student outcomes.

Intrinsic relations impair abstract rule learning

Across four experiments, we identified two distinct mechanisms for learning sequential rules: one for capturing arbitrary structures (e.g., a yellow dot followed by a blue dot signals "get ready" then "go"), and another for recognizing intrinsic relationships (e.g., a small dot followed by a big dot signals "get ready" then "go", perhaps represented by the rule "getting bigger means go"). Both mechanisms support abstract representations: in Experiments 1 and 2, adults generalized sequential rules to novel combinations. However, Experiments 3 and 4 revealed that when both types of rules were present during learning, intrinsic rules disrupted the abstraction of arbitrary rules. This interference led to poor generalization performance. Overall, these findings suggest that intrinsic and arbitrary systems compete during rule learning, with intrinsic relationships imposing constraints on how sequential patterns are represented. These results are relevant for applications such as syntax learning, human factors, and graphic design.

Cross-linguistic transfer of informativeness biases in the kinship domain

Different languages map words to concepts in distinct ways, and this can result in "semantic accents" in multilingual individuals. For example, Hindi kinship terms make several distinctions not made in English, e.g., "older sister" (didi) vs. "younger sister" (behen). This study expands on previous literature to assess whether informativeness biases present in one's native language impact word choice when speaking non-native languages, particularly in development when semantic networks are in flux. We expect that, when asked to refer in English to individuals who differ along dimensions such as seniority or lineage, Hindi native-speaking children will more frequently use modifiers such as "older" or "maternal" than English native-speaking children. Preliminary results from Hindi-speaking participants (N = 74, M(age) = 14;2) suggest that their use of modifiers decreases as their level of English proficiency increases, in line with recent research proposing distinct but mutually influential lexical networks in multilingual speakers' languages.

Language-specific event role mappings in multimodal possession-transfer event descriptions

Event descriptions require mapping event roles from an underlying conceptual representation to surface speech and gesture. Encodings in co-speech gesture tend to align with language-specific options that govern encodings in speech, but are relatively understudied for event roles that can be omitted in speech (e.g., argument-dropping languages like Turkish allow omission of core event roles, including agents and recipients). We examine the content of multimodal possession-transfer event descriptions across two typologically distinct languages (English, Turkish), differing in the grammaticality of argument-dropping. We find that language-specific encoding patterns heavily affect recipient and agent mentions in free event descriptions across modalities. Overall, Turkish speakers mentioned recipients and agents less frequently than English speakers. Although recipient and agent co-speech gestures were used more frequently in Turkish, they rarely contributed information beyond what was encoded in speech. This suggests that argument-dropping in Turkish occurs at a level of representation that is shared across modalities.

Serial Reproduction Reveals the Interaction of Tempo and Rhythm Perception in Music and Speech

Music and speech differ in their time scales: music is generally slower, reducing linguistic communication but facilitating discrete rhythm categories, a universal feature of music but not speech. However, the mechanisms underlying these differences remain unclear. Here, we examine the interaction between tempo and categorical perception of rhythm using large-scale iterated reproduction experiments. Participants (N=1,304) heard music or speech rhythms and reproduced them by tapping or speaking. Their productions were then passed to the next participant over 5 iterations. In music, complex rhythm categories emerged at preferred tempo, while simple categories dominated at extreme speeds. In speech, the pattern deviated from simple integer ratios, and its tempo dependency differed from that of music. Importantly, slowing speech rhythms during transmission induced music-like categories, suggesting that speech rhythm categories are tempo-dependent. Our findings highlight how communicative constraints shape distinct rhythmic structures in music and speech.

Preparing a learner for an independent future

Caregiving helps learners survive in the present and ultimately thrive independently without their caregiver in the future. While some caregiving provides immediate benefits, other actions focus on long-term development, even if they cause short-term discomfort or setbacks. For example, a parent might allow their child to fail in a game to learn a useful lesson about the value of perseverance. Here, we develop a probabilistic model of caregiving with a recursive theory of mind using the Memo programming language that captures these intuitions. The model considers learners as POMDP planners, and plans over such learners to intervene on their beliefs in a way that will be valuable in the future. As predicted by the model, participants favor improving learners' knowledge over immediate efficiency, but only when that knowledge has future value. Effective caregivers thus think several moves ahead, accepting short-term costs to prepare learners for long-term success.

Modeling Object Knowledge from Child Visual Experience

The distributional approach to language has been helpful in understanding and making predictions about children's semantic and linguistic development. In the current research, we apply similar techniques to study children's conceptual development using their first-person visual experiences. This study investigates the distributional properties of objects encountered in early visual experience and how they may contribute to the learning of concept organization. Frames were extracted from head-mounted camera videos at regular intervals, segmented into objects, and manually annotated with their superordinate and subordinate categories. We then built a distributional model of object–image co-occurrences and computed the similarity of different objects based on their distributional statistics. We show that objects' distributional patterns would allow children to make useful predictions about objects' high-level semantic categories, such as foods, appliances, and electronic devices. These results highlight that early distributional experiences may facilitate category formation, with implications for developmental theory and computational modeling.
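As a rough illustration of the kind of distributional model described above (not the authors' exact pipeline), one can treat each annotated frame as a context, record object-frame co-occurrences, and compare objects by the cosine similarity of their co-occurrence vectors. All names and data below are illustrative placeholders.

```python
# Illustrative sketch: object-by-frame co-occurrence matrix + cosine similarity.
import numpy as np

def object_similarity(frames, objects):
    """frames: list of sets of object labels present in each annotated frame."""
    index = {obj: i for i, obj in enumerate(objects)}
    counts = np.zeros((len(objects), len(frames)))
    for f, present in enumerate(frames):
        for obj in present:
            if obj in index:
                counts[index[obj], f] = 1.0
    # Normalize rows and compute all pairwise cosine similarities.
    norms = np.linalg.norm(counts, axis=1, keepdims=True) + 1e-12
    unit = counts / norms
    return unit @ unit.T

frames = [{"apple", "plate"}, {"apple", "cup"}, {"toaster", "cup"}]
sim = object_similarity(frames, ["apple", "plate", "cup", "toaster"])
```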

Examining Individual Differences in Within-Category Variability Reasoning

Cognitive developmental research suggests that people exhibit essentialist biases when reasoning about categories, leading them to underestimate within-category variability. However, prior accounts have been limited by small datasets per participant and a reliance on cohort-level analyses. We developed a Markov Chain Monte Carlo with People (MCMCp) task using ladybeetles as a model species. In the task, participants select the best version of a ladybeetle and complete surveys assessing their biology knowledge and essentialist reasoning. We conducted individual-level analyses, focusing on hue—the feature participants reported using most to guide their MCMCp decision-making. Preliminary findings reveal variation in adults' and children's category-variability reasoning, with some participants accepting greater diversity in ladybeetle hue, while others showed more constrained, essentialist-like responses. We discuss these findings in relation to biology knowledge and essentialist reasoning, highlighting the importance of individual-level analyses in revealing factors that shape complex category reasoning and perceptions of within-category variability.

Human and LLM performance on a linguistic test: Content effects and task demands

Large Language Models (LLMs) display an impressive set of capabilities in linguistic understanding. While advanced models outperform humans on certain tasks, LLM reasoning and linguistic competency differ from those of humans (Felin & Holweg, 2024; Mahowald et al., 2024; Niu et al., 2024). In this study, we evaluate humans and GPT-4o on the Winograd Schema Challenge, a pronoun resolution task. We focus on Japanese, a relatively understudied language in the emergent field of human-LLM evaluation. To assess human vs. LLM performance, we manipulate task demands and content. We report three findings: (i) Humans outperform LLMs in the baseline condition, i.e., the standard pronoun resolution task. (ii) As task demands increase, both human and LLM performance on the task declines (cf. Hu & Frank, 2024). (iii) We find evidence for content effects (cf. Lampinen et al., 2024): LLMs surpass humans as the content of the task is manipulated to favor LLMs.

Beyond Interpolation: Enhancing Large Language Models (LLMs) with Mental Models

Large Language Models (LLMs) demonstrate high performance across various tasks, yet they struggle with those requiring complex comprehension and reasoning. LLMs are not solely reliant on memorization: responses can be generated to novel prompts by interpolating between learned data points in a continuous vector space. However, they exhibit limitations in their inherent reasoning capabilities. Despite efforts to enhance their reasoning abilities, such as Chain-of-Thought prompting and test-time inference techniques, LLMs still face challenges in this domain. In contrast, humans utilize mental models—internal representations of situations and concepts—to adapt to and solve novel situations. Integrating external modules that emulate the construction and utilization of mental models could offer a promising avenue for enhancing the reasoning abilities of LLMs. This approach could bridge the gap between current LLM capabilities and human-like reasoning, potentially leading to more robust and reliable LLMs.

Measuring the Semantic Consistency of Ordinal Annotations via Text Embedding Spaces and Its Applications

We propose a method for measuring the consistency of ordinal annotations based on a pre-trained embedding vector space. Intuitively, our method finds a direction in the embedding space along which data points align as closely as possible to their annotated ranks. The proposed approach guarantees a globally optimal solution that is free from approximation errors. Thus, it yields a unique consistency measure given a dataset with human-provided ordinal annotations and a pre-trained embedding model. This feature facilitates a wide range of applications, including not only ordinal prediction but also the unsupervised detection of annotation errors within datasets, as well as consistency assessment of stage-based scales (e.g., whether the transitions "beginner to intermediate" and "intermediate to advanced" form linear progressions in the embedding space) during dataset construction. We evaluate our method using real-world datasets with ordinal annotations to demonstrate its effectiveness.
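The globally optimal formulation is the paper's own; as a loose sketch of the underlying idea, one could fit a direction to the embeddings by least squares and then score consistency by how well projections onto that direction preserve the annotated ranks (e.g., via a Spearman correlation). Everything below is an illustrative approximation, not the proposed method.

```python
# Illustrative approximation: fit a direction to ranks, score rank consistency.
import numpy as np
from scipy.stats import spearmanr

def consistency_score(embeddings, ranks):
    X = np.asarray(embeddings, dtype=float)     # (n_items, dim) pre-trained embeddings
    y = np.asarray(ranks, dtype=float)          # ordinal annotations
    w, *_ = np.linalg.lstsq(X, y, rcond=None)   # direction fitted to the ranks
    projections = X @ w
    rho, _ = spearmanr(projections, y)          # how well projections preserve ranks
    return rho

rng = np.random.default_rng(0)
emb = rng.normal(size=(20, 8))
score = consistency_score(emb, ranks=np.arange(20))
```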

Constructing Multilingual Readability Metrics for Cross-Language FKGL Comparisons

The Flesch-Kincaid Grade Level (FKGL) is a widely used readability measure for English. However, the formula involves English-specific word counts, making it difficult to establish a comparable formula in other languages. This study provides a theoretical analysis of the FKGL formula, uncovering its fundamental principles. Unlike prior work, we demonstrate that the formula can be reinterpreted as representing the "average syllable count per sentence." While vocabulary breadth expands with increasing grade levels, the range of syllables per word remains constant, regardless of age or grade. This invariance likely contributes to FKGL's enduring applicability. We validate our framework using empirical evaluations with the British National Corpus (BNC), confirming the theoretical soundness of our approach. Furthermore, we extend the FKGL formula to other languages, proposing a multilingual readability metric. This research not only deepens our understanding of existing readability measures but also provides a robust framework for cross-linguistic readability comparisons.
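For reference, the standard English FKGL formula (Kincaid et al., 1975) combines words per sentence and syllables per word; the reinterpretation and multilingual extension described above are not reproduced here.

```python
# Standard English FKGL formula, shown for reference only.
def fkgl(total_words: int, total_sentences: int, total_syllables: int) -> float:
    return (0.39 * (total_words / total_sentences)
            + 11.8 * (total_syllables / total_words)
            - 15.59)

# Example: 100 words, 5 sentences, 140 syllables -> grade level of about 8.7
grade = fkgl(100, 5, 140)
```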

A Meta-analysis of Age-related Effects on Loss Aversion

Prospect-theoretic loss aversion (λ) suggests that we overvalue losses relative to gains during decision making, but the influence of age on loss aversion is unclear. We conducted a random-effects meta-analysis on 7 studies that measured loss aversion and the age of participants. The pooled estimate (β = 0.0886, 95% CI [0.0087, 0.1686]) revealed an effect of age on loss aversion. Treating age as continuous (n = 1103, 5 studies; β = -0.005233, t = -2.511, p = 0.0122) or categorical (n = 1856, 8 studies; β = 0.00856, t = 0.111, p = 0.912) yielded substantially different results. Dividing participants broadly (< 46 and > 45 years) to explore differences in estimates of loss aversion (λ > 1, λ = 1, λ < 1) also revealed significant differences, χ²(2, N = 1856) = 7.5362, p = 0.0231. These findings underscore the value of meta-analysis for advancing and explicating previously underexplored aspects of decision theory.

Examining Item Difficulty in NLP: To What Extent Do Examinees Affect Item Difficulty?

Recent research in Natural Language Processing (NLP) has focused on estimating the difficulty of text content, culminating in a shared task conducted in 2025. However, since many researchers in NLP are not experts in educational psychology, the item difficulty in these shared task datasets is commonly defined by the proportion of examinees who answer an item correctly, and language model performance is evaluated accordingly. This definition is inherently sensitive to changes in the set of examinees who answer correctly, thereby altering item difficulty. To overcome this issue, educational psychology employs item response theory (IRT) to separate item difficulty from the examinee population. In this study, we investigate the extent to which language model performance evaluations differ when using IRT compared to the traditional method, based on the proportion of examinees who answered items correctly.
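As background, a minimal IRT sketch is given below using the two-parameter logistic (2PL) model, under which an item's difficulty is defined on the same latent scale as examinee ability and does not depend on which examinees happen to be sampled. The abstract does not specify the exact IRT model used, so this is for orientation only.

```python
# Minimal 2PL item response function (illustrative; the study's model may differ).
import math

def p_correct(theta: float, a: float, b: float) -> float:
    """Probability that an examinee with ability theta answers the item correctly.
    a: item discrimination; b: item difficulty (same latent scale as theta)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# An examinee one unit of ability above the item's difficulty answers correctly ~77% of the time.
p = p_correct(theta=1.0, a=1.2, b=0.0)
```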

MESS (Mimed Expressive Short Stories) Database: Showcasing the potential of pantomime for story transmission

Sometimes, stimulus preparation is the most resource-consuming stage of experiments on communication. We present an open ready-to-use database of mimed expressive short stories (150 .mp4 files, 03:34:56 of footage) funded by the National Science Centre of Poland under the agreement UMO 2021/43/D/HS2/01866. It can be used as a stimulus in experiments or for annotation or rating. We describe how the database was created: (1) preparation: we used ChatGPT 4.0 to create rich stories, with varying age, gender, number of characters, and themes; (2) recording: we worked with professional actors for maximum expressiveness; (3) editing: we synced front and side shots for maximum visibility; (4) annotating: we analysed representational strategies to argue that the database showcases the potential of pantomime for story-sharing and discuss its implications for the pantomimic scenarios of language origins (e.g., Arbib, 2018, 2024; Zlatev et al., 2020).

Be concrete and specific: how speakers introduce novel topics in naturalistic language

We asked whether the concreteness and specificity of the language used by conversation participants change depending upon the familiarity and the presence/absence of an object discussed. Additionally, we explored whether interlocutors engaged in distinct abstraction processes (analogical comparison; superordinate categorization) and whether they focused more on the objects' features or on their own experience. We used the ECOLANG corpus (Gu et al., 2025), a semi-naturalistic dataset of interactions in which 31 knowledgeable "speakers" describe novel/known objects to an "addressee" when the object is physically present or absent. We analyzed 22,581 sentences produced by the "speaker" and measured the concreteness and specificity of 1,612 nouns used. Results showed that more concrete and specific nouns were used for novel objects, suggesting a need for precise information. Additionally, abstraction processes were more likely when the object was present and novel. Finally, when the object was present and known, interlocutors focused more on personal experience.

A Study of Context Effect in Large Language Models

Large language models (LLMs) are increasingly used for decision-making, raising ethical questions about how their decisions compare to those of human decision makers. A key aspect of human decision-making is the context effect, particularly how decoy options alter choices among pre-existing options. This study examines whether LLMs exhibit context effects similar to those observed in human cognition. We test multiple GPT models using semantic choice probability prompts to assess four context effects: similarity, attraction, repulsion, and compromise. While LLMs exhibit some differences in their context effect maps compared to humans, they consistently exhibit all four context effects examined, at least qualitatively. Since users primarily seek suggestions rather than direct decisions, we also prompt LLMs to generate suggestions and pros-and-cons analyses, with results further confirming the context sensitivity of the LLMs. These findings suggest GPT models are not only effective in assisting context-dependent decision-making but also offer a scalable, cost-efficient tool for designing psychological studies of context effects.

Consequences of prior experience on visual problem solving

People rarely face the same problem twice, but many problems are similar. What strategies do people discover when solving similar problems, and what is the impact of that experience on how they approach new ones? Here we investigated how people's strategies changed over time while solving sets of related visual reasoning problems. Participants (N=42) were given a sequence of "tangram" puzzles to solve, which could be reconstructed either exclusively with small pieces or with a special large piece that completed the tangram in fewer moves. Later, participants attempted puzzles where a different large piece was helpful instead, so we could measure the impact of prior experience on how they approached puzzles favoring a different strategy. We found that participants reconstructed tangrams more quickly over time, and generally used the appropriate large piece for the problem at hand, reflecting their ability to flexibly adapt their strategies to new problems.

Beyond Word Meaning Mappings: The Role of Low-Informative Events in Conceptual Alignment

Word meanings are rarely transparent from their extralinguistic contexts. How children learn words from an input with "low-informative" (LI) events is of interest because even adults struggle to learn from LI events (Gleitman & Trueswell, 2020; Medina et al., 2011). This study revisited LI events' contribution to learning by probing what can be gleaned from LI events even if they don't yield exact meanings. Adults (N = 120) learned words (e.g., "modi") that had English meanings (e.g., "apple") from LI events. Participants then both guessed the word's exact meaning and rated several candidate meanings. Although LI events failed to yield accurate mappings of meanings, they led to representations (derived via the ratings) that were semantically aligned with those of the true meanings. These results highlight the potential for LI events to get learning off the ground and the implications of viewing word learning as more than a mapping problem.

Scaling Interleaved Practice: Preliminary Evidence and Lessons Learned from a Systematic Replication

We report on an in-progress replication study to test the efficacy of interleaving in U.S. middle schools. Interleaving combines two learning strategies, discrimination learning and spaced practice, by mixing different problems within an assignment and spacing similar problems across assignments. While the benefits of interleaving are well-documented in controlled experiments, only recently have there been attempts to widely scale this principle to education contexts. The study design is a randomized controlled trial with a sample of 13 classes, 239 students, and 6 schools. Preliminary results show effects in the predicted direction: students who received interleaved practice performed higher than those who received blocked practice (g=.11), but the difference is not significant. While our sample lacks the statistical power to detect effects, we are in the process of collecting data from two additional cohorts. This study contributes to how learning research can be translated to advance learning in authentic contexts.

Parents and Children Create Semantic Regularities During Naturalistic Toy Play

Children must learn individual word meanings and the semantic connections among words. Prior work has probed children's sensitivity to different semantic relations, but little is known about how this knowledge develops. This study examined whether parents and children jointly created semantic regularities in naturalistic everyday interactions. Forty-four parents and their toddlers (M = 20.4 months, range: 13.2-31 months) participated in a 10-minute free play session with 27 toys from three categories (food, animals, vehicles) while wearing head-mounted eye-trackers. Children's looking from one object to another was categorized as a same category (e.g., dog-cat) or different category (e.g., dog-car) transition, revealing temporal sequences of semantically related toy play: Children were more likely to transition between objects from the same category and heard more unique words from the same category during these toy transitions. Together, these findings demonstrate that parents and children create semantic regularities for learning through shared attention and action.

Strategy selection in complex tasks through adaptive integration of learned and online metareasoning

When facing tasks that are difficult to solve optimally, people can construct simplifying strategies that trade off utility with cost (Ho et al., 2022, Callaway et al., 2022). How we do so is an open question, especially in domains with large, structured strategy spaces where strategy evaluation itself is costly. One proposal is that people select strategies without much online computation, by a process of (reinforcement) learning through experience (Lieder & Griffiths, 2017). We present an alternative, resource-rational metareasoning framework that integrates strategy learning with adaptively bounded amounts of online strategy evaluation. We compare these proposals using a new video game task in which players traverse a grid of moving colored tiles while respecting complex rules about valid color sequences. Players quickly discover simplifying strategies, such as "only step on red tiles," and adapt when the environment changes to favor new strategies, in ways that are most consistent with adaptive metareasoning.

Additive Analogies Reveal Compositional Structure in Neural Network Weights

A central question in cognitive science is how to reconcile connectionist and symbolic models of the mind (e.g., Fodor & Pylyshyn 1988, Smolensky & Legendre 2006). Attempts have been made to bridge these competing schools of thought by showing how compositional structure can emerge in continuous vector representations (e.g., Manning et al. 2020). A key example is Mikolov et al. (2013), who demonstrated that word embeddings learned by a neural network encode semantic structure: subtracting the vector "man" from "king" and adding "woman" approximates "queen" (i.e., king - man + woman ≈ queen). Our work moves up one level of abstraction, from representations to functions. We analyze whether entire networks display emergent compositional structure by treating a trained network as a single vector (obtained by concatenating the network's parameters) encoding its function. We show that these parameter vectors can be recomposed through simple additive analogies to create networks with new functions.
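As a toy illustration of the additive-analogy operation (whether over word embeddings or, as proposed here, over flattened parameter vectors), the sketch below forms b - a + c and retrieves the nearest vector by cosine similarity. The vocabulary and vectors are placeholders constructed so the analogy holds, not the authors' data or trained networks.

```python
# Toy additive analogy over vectors: answer ~ b - a + c, nearest cosine neighbor.
import numpy as np

def analogy(vectors: dict, a: str, b: str, c: str) -> str:
    query = vectors[b] - vectors[a] + vectors[c]
    best, best_sim = None, -np.inf
    for name, v in vectors.items():
        if name in (a, b, c):
            continue
        sim = v @ query / (np.linalg.norm(v) * np.linalg.norm(query) + 1e-12)
        if sim > best_sim:
            best, best_sim = name, sim
    return best

rng = np.random.default_rng(1)
vecs = {w: rng.normal(size=16) for w in ["man", "woman", "king", "queen"]}
# Construct "queen" so the additive relation approximately holds in this toy space.
vecs["queen"] = vecs["king"] - vecs["man"] + vecs["woman"] + 0.01 * rng.normal(size=16)
print(analogy(vecs, "man", "king", "woman"))  # -> "queen"
```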

Evaluating the Efficacy of MathByExample: Preliminary Evidence

Rather than merely solving math problems, students who self-explain correct and incorrect worked examples increase their procedural knowledge and develop a deeper conceptual understanding of key concepts. The present study is a large-scale test of the efficacy of the MathByExample intervention, which targets key math concepts and common misconceptions for students in 5th grade through correct and incorrect worked examples. Our cluster-randomized controlled trial included 42 schools (n = 830 5th grade students) across two cohorts. Preliminary results show effects in the predicted direction: students who received MathByExample exercises outperformed students in the control condition, yet the difference is not significant. Our poster will discuss possible explanations for the findings, discuss exploratory moderators (e.g., dosage received of MathByExample exercises), and include data from a third cohort for which data collection is currently ongoing.

The Generalized Lotka-Volterra Interactive Activation Model of Word Recognition

Connectionist models like the Interactive Activation (IA, McClelland & Rumelhart, 1981) model serve an indispensable role in cognitive science by providing a concrete and testable framework for describing how percepts at different levels of abstraction might interact during cognitive processing. However, discontinuities in the governing equation for the IA model limit the set of analytical tools that can be used to understand the model's dynamics. We developed a novel model of word perception, gLoVIA (generalized Lotka-Volterra Interactive Activation model), which borrows the mathematical structure of a generalized Lotka-Volterra model. A robust method for initializing the community matrix yields a gLoVIA model with high word report accuracy, plausible lexical competition, and word superiority effects for vocabulary sizes up to 1000 words. Our results suggest that the gLoVIA model may be sufficient to explain empirically observed effects in word perception, while being more amenable to analytical methods for characterizing its dynamics.
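For reference, the generic generalized Lotka-Volterra dynamics that gLoVIA borrows take the form dx_i/dt = x_i (r_i + Σ_j A_ij x_j). The sketch below integrates these dynamics for a toy two-unit competition; it illustrates the mathematical structure only and is not the authors' trained community matrix or vocabulary-scale model.

```python
# Toy generalized Lotka-Volterra simulation (forward-Euler), illustrative only.
import numpy as np

def simulate_glv(x0, r, A, dt=0.01, steps=1000):
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x + dt * x * (r + A @ x)   # dx/dt = x * (r + A x)
        x = np.clip(x, 0.0, None)      # keep activations non-negative
    return x

r = np.array([1.0, 0.8])               # intrinsic growth (input) rates
A = np.array([[-1.0, -0.5],
              [-0.6, -1.0]])           # mutual inhibition ("community matrix")
final = simulate_glv([0.1, 0.1], r, A)
```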

Narrative Communication as a Learning Tool for Resolving Exploration-Exploitation Dilemmas

Narratives and storytelling are proposed to be essential means through which humans acquire, preserve and transmit information about their environment. The current project investigated narrative transmission in the context of a multi-armed bandit task, an experimental paradigm that simulates an uncertain exploration-exploitation environment. Following the task, participants taught the next generation of players how to find rewards in the task by writing the ending to a folktale about two foragers, one explorer and one exploiter, living in the same environment. Preliminary analyses indicate that whether participants chose to transmit a story that encouraged exploration, exploitation, both strategies, or neither, was best predicted by individual differences. Reported strategy or actual behaviour and performance in the task were not key predictors. Future study plans include investigating how performance, behaviour and narrative transmission preferences are affected by receiving a narrative as learning material before the task compared to individual trial-and-error or factual descriptions.

The contributions of explanation simplicity and source expertise to evaluations of disagreeing explanations

Learning about complex, scientific topics often involves reading competing explanations posited by multiple disagreeing sources. This necessitates comprehending both explanations, understanding the extent of their disagreement, and determining which is more likely. In a series of three experiments, we investigated the role of features of explanations and their sources in readers' evaluations of the explanations. Specifically, we presented participants with pairs of disagreeing explanations that varied in their simplicity, the expertise of their source, and the salience of each feature. We examined the extent to which these features individually and interactively affected readers' evaluation of explanations, the causes they attributed to the disagreement, and curiosity about the topic of disagreement. We also examined the role of individual differences between readers, namely their prior topic knowledge and trust in science, in these outcomes. The findings inform theory about how people evaluate explanations and learn about science.

Quantifying the context-level valence and arousal in children's written language

Analysis of adult language shows positive correlations between a word's affective features (valence, arousal) and those of its surrounding language context, but whether this extends to children's language remains unclear. To address this, we quantified the emotional context of words within a large corpus of children's written stories (N > 100,000, ages 7-13). Following Snefjella and Kuperman's (2016) procedures, we defined a word's context as the five content words immediately preceding and following it (10 in total). Context valence and arousal were computed for each occurrence and averaged across all occurrences in the corpus, yielding affective values for 24,383 words and their contexts. We found positive correlations between context-level and word-level affective variables (valence: r = 0.46, arousal: r = 0.32), consistent across the 7-13 age range. This study extends adult findings to children's written language and provides a resource for future research on emotional contexts in language development.
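A minimal sketch of the context measure described above, assuming a valence-norms lookup table (e.g., Warriner-style norms); the corpus, tokenization, and content-word filtering used in the study are simplified here to a plain token list and a dictionary check.

```python
# Sketch: average valence of up to five norm-covered words on either side of each
# occurrence of a target word, then average across occurrences.
def context_valence(tokens, target, norms, window=5):
    scores = []
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        left = [t for t in tokens[max(0, i - window):i] if t in norms]
        right = [t for t in tokens[i + 1:i + 1 + window] if t in norms]
        context = left + right
        if context:
            scores.append(sum(norms[t] for t in context) / len(context))
    return sum(scores) / len(scores) if scores else None

norms = {"happy": 8.0, "party": 7.5, "dark": 3.0}   # toy valence norms (1-9 scale)
tokens = "the happy dog ran to the party".split()
v = context_valence(tokens, "dog", norms)            # mean of 8.0 and 7.5
```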

Measuring sustained attention across timescales to predict learning in real-world environments

How well students learn depends on their ability to sustain attention. However, it is currently unclear how to measure sustained attention in the classroom and relate those underlying attentional dynamics to academic engagement and performance. Here we leverage a suite of sustained attention instruments to explore how individual differences in sustained attention account for differences in learning outcomes in a university STEM course (N=248). We found that a student's ability to sustain attention predicted their subsequent academic achievement in the course. Sustaining attention was also associated with STEM-related stress, anxiety, and students' confidence in their ability to learn the course material. We are additionally exploring interaction logs from the digital textbook students used to investigate the mechanisms linking sustained attention to subsequent achievement. Together, these findings highlight the promise of studying attention and learning across timescales to advance mechanistic understanding of human cognition in real-world environments.

When Empowerment Disempowers in Multi-Agent Assistance

People can assist others even without knowledge of their specific goals. However, this ability remains a challenge for artificial intelligence, limiting the development of assistive technologies. This issue is particularly important in caregiving, where rather than focusing on helping with a particular task, one may aim to improve the broad autonomy of the care recipient. One promising approach to care that does not require goal inference is to maximize the empowerment of another. Empowerment is a computational measure of one's ability to control their environment. Prior work on empowerment assumes one-to-one interactions, overlooking the fact that assistive technologies are often deployed in multi-agent environments that may include caregivers or family members. Here, we show that optimizing for one person's empowerment may inadvertently disempower others. Finally, we develop and test multi-agent extensions of empowerment that enable an assistant to empower one person without harming another.
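For reference, empowerment is standardly formalized as the channel capacity from an agent's n-step action sequence to its resulting state (Klyubin, Polani, & Nehaniv, 2005); the multi-agent extensions developed in this work are not reproduced here.

```latex
% Standard n-step empowerment: capacity of the action -> future-state channel,
% maximized over the distribution of action sequences, given the current state s_t.
\mathcal{E}(s_t) \;=\; \max_{p(a_t^{\,n})} \; I\!\big(A_t^{\,n} ;\, S_{t+n} \,\big|\, s_t\big)
```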

Children rely on gestures more when the set size of countable items increases

Research suggests that gestures are crucial in conveying numerical information, particularly during counting tasks involving larger set sizes (Gunderson et al., 2015; Gibson et al., 2018). In addition, children's understanding of numbers develops in stages, with significant milestones around four and five years old (e.g., cardinality). However, the use of gestures by preschoolers during counting tasks, especially with higher set sizes, remains poorly understood. Data were collected from 59 children (Mage = 4;6, 26 girls) who placed beads onto a stick matching dotted cards (1-9) in video-recorded sessions for later gesture coding. A generalized linear mixed-effects (glmer) analysis revealed that higher set sizes (4 to 9) corresponded to increased gesture use (p<.001). In contrast, set sizes related negatively to counting accuracy (p<.001). However, gesture use did not significantly relate to counting accuracy (p=0.74). Our findings indicate that gestures are essential to children's numerical understanding regardless of accuracy, particularly when tackling tasks beyond their comprehension level.

An Intervention Program: Helping Deaf Children with Hearing Parents in Acquiring Turkish Sign Language

Many deaf children first encounter a language in deaf schools, often experiencing adverse effects of late exposure on several aspects of language. Little is known about how late acquisition of sign language affects lexical sign acquisition and how the structural architecture of signs modulates this process. Here, we investigate the lexical development of late-signing children (N=11; MeanAge=84.7 months) through an 8-week intervention using a mobile-compatible web app teaching lexical signs in Turkish Sign Language, to test whether (1) children acquire basic signs during the intervention and (2) phonological complexity influences this process. Preliminary results revealed a significant improvement in sign learning (β=1.7545, p<0.001) regardless of phonological complexity (ps>0.05). Findings underline the importance of accessible platforms to support language development for this population and extend previous work on the effect of phonological complexity on articulation accuracy during learning in hearing adults by showing its lack of influence on deaf children's 2-choice recognition without articulation.

Scope Ambiguity Resolution of Negated Connectives in English Corpora

Prominent U.S. legal cases have turned on the resolution of linguistic ambiguities that arise from interactions between negation and coordination. Negative disjunction sentences (ND, "John didn't buy cake or cookies") are ambiguous between two readings: a neither-nor reading and a not-both reading. Negative conjunction (NC, "John didn't buy cake and cookies") is similarly ambiguous. Experimental findings on such scope ambiguities (Jasbi et al. 2023, Tobia et al. 2023) suggest that in general, listeners favor the neither-nor reading for ND but are less biased in their interpretations of NC. We present a corpus analysis with researcher annotations of several hundred tokens of these constructions in English treebank data. Initial results are consistent with previous conclusions based on controlled experimental data, suggesting that an attested tendency in linguistic comprehension is also reflected in naturalistic production. Our dataset can further be used to explore factors that modulate the disambiguation in context.

Adapting to the Unfamiliar: Communicative Adjustments in Human-Robot Interaction (HRI)

Audience design, the ability to tailor communication to an interlocutor's perceived competence and conversational behavior, is a fundamental cognitive process shaping communicative interactions. Robots provide a unique context in which speakers must establish communication strategies with minimal prior experience with their interlocutor, making HRI an effective tool for exploring audience design. This study examines whether people adapt their communication differently when addressing a robot versus a human and how prior conversational success influences these adaptations. In a word-guessing game, participants described word meanings to an audience, who maintained a 50% accuracy rate across randomly ordered correct and incorrect trials. They played one round with each audience separately. We analyzed adaptations in participants' description strategies (e.g., definitions, semantic categories) and lexical choices (e.g., specificity, word frequency). Findings indicate that adaptation unfolds dynamically over time, shaped by accumulating conversational experience rather than fixed patterns. This study refines theories of language adaptation across diverse audiences and highlights implications for social cognition.

The Uncanny Valley meets the Humorous Hill: Things are funny when they match a pattern but fall short on quality

We propose the "Humorous Hill" hypothesis: things are funny when they match an expected pattern, but fall short of being good. We suggest this form of humor underlies the amusement felt towards many of children's utterances, and much of the recent engagement with AI. We tested the Humorous Hill by using language models (LMs) to create novel examples of open-ended categories in two domains (paint colors and movie titles). The LMs varied in quality and architecture, including n-gram models with increasing windows (2-grams to 9-grams), and increasingly sophisticated transformer-based models (GPT babbage to GPT-4). Participants (N=300) rated items for category membership, goodness of fit, and humor. Across models and categories, we found an inverted U-shaped relationship between humor and accuracy. We propose that much of people's engagement with artificial agents is driven by finding their outputs humorous, rather than good – a form of humor that also applies to children.

Does perceptual chunking facilitate predictive processing in spontaneous speech?

Surprisal theory holds that the processing difficulty of a word is determined by its predictability in context (Hale, 2001; Levy, 2008). However, memory limitations hinder the integration of the full context, as evidenced by dependency locality effects (Gibson 1998). We propose that the local context for predictive processing may be established by cortical tracking of perceptual chunks, which are periodically occurring linguistic units (Giraud & Poeppel, 2012). We identified perceptual chunks in 97 extracts of natural speech using behavioural data. For each word, we derived surprisal values from GPT-2 based on three contexts: the full extract, the current perceptual chunk, and the previous three words. Surprisal conditioned on the chunk context was higher than on the full extract but lower than on the previous three words. This suggests that perceptual chunking may offer an optimal window for predictive processing within working memory capacity.
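A sketch of how per-word surprisal under a restricted left context can be computed from GPT-2, assuming the Hugging Face transformers package; the function and variable names are illustrative and the study's exact pipeline (e.g., how chunk boundaries are handled) may differ.

```python
# Sketch: surprisal (in bits) of a word given only a chosen left context, via GPT-2.
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def surprisal(context: str, word: str) -> float:
    """Surprisal of `word` given only the preceding `context` string."""
    ctx_ids = tokenizer.encode(context)
    word_ids = tokenizer.encode(" " + word)
    ids = torch.tensor([ctx_ids + word_ids])
    with torch.no_grad():
        log_probs = torch.log_softmax(model(ids).logits, dim=-1)
    # Sum negative log-probabilities of the word's sub-tokens, each predicted
    # from the position immediately before it.
    nats = 0.0
    for pos, tok in enumerate(word_ids, start=len(ctx_ids)):
        nats += -log_probs[0, pos - 1, tok].item()
    return nats / math.log(2)

# Surprisal of "door" conditioned only on the current (hypothetical) chunk:
s_chunk = surprisal("and then she opened", "door")
```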

Who's the actor? Performing and observing pantomimed actions

Social perception research demonstrates that people can infer the high-level goals driving many motor actions. But what about the rich visuomotor processes underlying such actions? Visually-guided behavior relies on a complex feedback loop between agents and environments, with subtle corrective adjustments made online. What do observers understand about this dynamic? Here, we explore these questions through "pantomimed actions". We created a stimulus set of videos where agents performed both genuine object-directed actions (e.g. stepping over a box), and pantomimes of those actions (e.g. stepping over an imagined box). Independent subjects then watched these videos and had to determine which videos were which. Collapsing across actions, observers successfully discriminated real actions from pantomimes. However, certain actions were more discriminable than others. This suggests that (1) observers understand how online visual information shapes human motor behavior, and (2) the ability to "fake" actions may be more robust than previously suggested.

Learning to remember and remembering to learn: memory distortions as semantic compression of episodes

Memory is not a faithful recording of sensory experience. Rather, a century of research has shown memories are prone to systematic distortions through interpretation, selective encoding and subsequent modifications. Recent applications of Rate Distortion Theory (RDT) offer a normative framework for understanding memory encoding as lossy compression, accounting for a range of phenomena such as gist-based distortions. However, RDT assumes the statistics of the environment are static and known—a stark contrast to the brain's continual need to update its internal (semantic) model of the world. We propose an extension of RDT where the compression model is itself learned from experience, creating a dynamic interplay between compression and learning, which in turn induces characteristic path-dependencies in learning. By reinterpreting previous empirical findings in the light of this proposal, our approach broadens the explanatory scope of RDT to a wider range of memory phenomena such as associative memory errors and post-event misinformation.
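For orientation, the standard rate-distortion objective that this proposal extends can be written as a Lagrangian trade-off over the encoder q(x̂|x); the learned, time-varying source model described above is not formalized here.

```latex
% Standard rate-distortion trade-off: encode experiences X into memory traces \hat{X}
% so as to balance information rate against expected distortion (beta sets the trade-off).
\min_{q(\hat{x} \mid x)} \; I(X; \hat{X}) \;+\; \beta\, \mathbb{E}_{q}\!\left[ d(X, \hat{X}) \right]
```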

Reconceptualizing Autonoetic Consciousness

Autonoetic consciousness (AuC), originally conceptualized as the phenomenological marker distinguishing episodic from semantic retrieval, faces significant empirical challenges. In particular, episodic retrieval has been shown to be orthogonal to phenomenal characters such as the sense of ownership. It has thus been argued that AuC is not a valid construct and should be abolished. While we agree with the empirical failure of AuC's original conceptualization, we highlight that the phenomenon the construct picks out has practical significance, particularly in supporting one's sense of self extended through time. Considering empirical cases of dissociations between specific properties pertaining to AuC, including mental time travel, sense of ownership, and introspection, we offer a scientifically valid reconceptualization of AuC as a specific type of reflective awareness in which explicit metacognition is leveraged to support the sense of personal identity. Under this reconceptualization, episodic retrieval appears as a contingent feature of autonoetic consciousness but not the other way around.

Describing and remembering complex motion events in ‘real-world' videos

Languages lexicalize motion events in different ways. Previous studies, using short clips of isolated motion events, have shown that these cross-linguistic differences can affect attention and memory. In this study, we use videos of ‘real world' complex motion events to test the effects of language patterns on perceived saliency of event elements as well as the granularity of speakers' descriptions. To increase the perceived importance of the task and its applicability to forensic contexts, participants (N=64) were informed that they were viewing CCTV footage of potential criminal activity, before completing a surprise or expected memory task. Participants watched a two-minute video depicting multiple interconnected motion events with several paths and manners of motion, placement events, as well as different event endings. Results unpack the extent of thinking for speaking effects on later ‘eye witness' description, as well as nuanced patterns of description change on iterated descriptions of the same video.

Spatial skills in grassroots athletes and athletes with functional impairments: A screening study in Latvia

Visuo-spatial abilities contribute to successful athletic performance (Millard et al., 2021), and the results of visuo-spatial tests have a predictive and diagnostic value (Moreau et al., 2012). In our study, we test participants from grassroots sports (n=186) and adapted sports (n=30) with tests covering allocentric and egocentric cognition (Mental rotation (Shepard & Metzler, 1988), Perspective taking (Kozhevnikov & Hegarty, 2001), Santa Barbara Sense of Direction Scale (SBSOD) (Hegarty et al., 2002)) using a web application. We compare differences in spatial skills in both groups depending on demography (age, gender) and sports training characteristics (type of sport, training regularity, experience). Preliminary results suggest that (1) mental rotation performance is similar across groups; (2) participants from the grassroots sports perform better in the perspective-taking test; (3) SBSOD measurements are slightly better in the adapted sports.

Prosodic Cues in Differentiating Request and Permission Directive Speech Acts

Directive Speech Acts (DSAs), encoded by the imperative mood, exhibit various interpretations across languages, including acquiescence, advice, requests, and commands (Searle 1979; Wilson & Sperber; Kaufmann 2012). This many-to-one relationship between form and meaning raises the key cognitive question: how do speakers disambiguate these interpretations? This study examines the role of intonation in Greek imperatives, focusing on Nuclear Pitch Accent placement and boundary tones within the autosegmental-metrical model of intonational phonology (Pierrehumbert 1980; Ladd 1996; Arvaniti & Baltazani 2005). Recordings were controlled to isolate linguistic cues provided by prosody while minimizing extralinguistic prosodic cues related to the speaker's emotional state. Preliminary findings indicate that a rising boundary tone consistently signals a request interpretation, whereas a combination of NPA and a falling boundary tone suggests weak permission. A follow-up study incorporating contextual stimuli, describing both the speaker's and addressee's situations, is underway to examine the influence of contextual cues on interpretation.

Mechanisms Of Working Memory Allocation In Reward Learning

Working memory (WM) is a core driver of cognition, supporting executive control, decision-making, and learning. In reward learning, WM works alongside slower reinforcement learning (RL) to establish associations between states, actions, and rewards. WM's capacity is highly limited, necessitating careful allocation of WM resources to optimize performance. How humans manage this WM constraint during reward learning, storing valuable information while discarding superseded data, remains an open question. In this study, we utilize a dynamic reward learning task to isolate rapid WM processes from slower RL mechanisms during reward learning. Through computational modeling we explore the operations humans use to allocate their limited WM resources efficiently. Our findings show strategies including (1) reward-dependent memory operations (write, forgetting, and over-write probabilities) and (2) strategic clearance policies (removing redundant and task-inconsistent data). This research clarifies WM's role in reward learning, highlighting the importance of WM operations in supporting complex behavior.

Wrong for the Right Reason? Using Successes and Failures of Large Language Models to Understand Human Thinking

If a person answers a question correctly, how can we tell if the answer reflects an underlying understanding of the phenomenon, or if it is based on merely surface-level associations? Cognitive science has developed multiple tests, such as Winograd Schemas, that ostensibly require a respondent to use some kind of world/situation model rather than just associations. What then are we to make of the successes of large language models (LLMs) on some of these tasks? We present a series of probes to LLMs and people about everyday situations, finding that models sometimes respond correctly for the wrong reason and in other cases make seemingly 'catastrophic' mistakes by applying the wrong model, often in human-like ways. Our results suggest that probing the basis of LLMs' successes and failures can help inform human problem solving and in some cases call into question our previous tests of human understanding.

Co-Emergence of Sensory Modalities: Exploring the Dynamic Interaction and Adaptive Integration of Sensory Inputs in Meaning-Making Processes

This study examines how sensory inputs collapse into meaningful experiences through semantic inferences. For instance, hearing a bark and seeing an animal are integrated into a unified experience. This process involves the active organization of sensory data, shaped by intentionality and sometimes past experiences. Intentionality, such as focusing on identifying animals, directs this process. A mathematical model formalizes how sensory inputs and inferential structures interact to generate coherent experiences. Challenging traditional views, the study proposes that sensory modalities do not simply combine into a static whole but instead co-emerge through dynamic interaction. Each modality contributes unique information, and their integration leads to a richer understanding. This ongoing process reflects the ontological nature of experience, arising from the interaction of sensory data and inferential structures. The goal is to emphasize the adaptive nature of perception and demonstrate how experiences continuously reshape as new sensory inputs and contextual information emerge.

Leveraging Machine Learning for Acoustic Feature Analysis in Neurodevelopmental Disorders: Insights into Emotional Profiles

Neurodevelopmental disorders (NDDs) in preschoolers often involve social and communication deficits, contributing to heightened anger, sadness, and anxiety. This study examined whether acoustic features of speech could detect emotional dysregulation in 65 French-speaking children (age 4) diagnosed with ADHD, developmental language disorder, psychosocial issues, or cognitive impairments. Using standardized assessments, participants were grouped by emotional/psychological, speech/language, or cognitive/motor difficulties. Audio recordings from structured and unstructured tasks were processed via openSMILE, generating 153 features capturing spectral, prosodic, and energy parameters linked to emotion. A random forest classifier compared these acoustic profiles to EmoDB samples labeled with negative emotions. Results showed that children with NDDs exhibited unique acoustic markers of negative emotions, though differences among subgroups were minimal. Attempts to pinpoint anxiety as a diagnostic feature were inconclusive. Overall, machine learning–based acoustic analysis holds promise for identifying emotional dysregulation, encouraging further multimodal approaches in clinical assessments, and more robust early interventions overall.

What Perceptrons Might Tell Us About Our Own Abilities

Minsky and Papert's (1969) book *Perceptrons* is often remembered as the book that (counter-productively) ended neural network research for nearly two decades. One of the authors' main results was that perceptrons (under reasonable limitations) cannot detect if a pattern is fully connected. Perhaps less known, to their initial surprise, the authors also showed that if guaranteed there are no holes in an image, perceptrons *can* detect if a pattern is fully connected. Given the simplicity of perceptrons, it seems reasonable to think that they might suggest a lower bound for what humans can visually detect without moving their eyes. If so, the results on connectedness suggest some counter-intuitive findings about human perception, namely that we should be able to learn to solve 2D mazes at a glance and detect how many objects are in an image at a glance (i.e., subitize) even when the number is large.

A Modular Framework for Analyzing Theory of Mind Learning in Competitive Tasks

A key challenge of theory of mind, or the ability to reason about others' mental states, is understanding the process by which others' perceptions influence their beliefs. While specific tasks, like competitive feeding, benchmark participants' ability to infer beliefs, it remains unclear how such capabilities can be learned. In this work, we introduce a modular framework that solves a computational, competitive-feeding-like game in which two agents compete. By systematically replacing modules of a successful rule-based framework with neural networks, we identify which capabilities can be learned from narrow sets of experiences, and which are critical for robust generalization. Using feature extraction techniques, we analyze how different architectures process task-relevant information. Finally, we describe and compare three novel approaches to improving generalization via first-person exposure to uncertainty: role reversal with the opponent, artificial observation masking, and synthesizing beliefs from conflicting information.

Detecting an illusion's perception change

There are binary optical illusions in which people typically perceive only one of the two possible images. These illusions involve a cognitive process. The process seems to be on the edge between System 1 and System 2 and doesn't seem to be a conscious choice. In one well-known example, people see either the old woman or the young lady and have difficulty perceiving the other image. We have found a way to manipulate an example of a binary-choice perceptual illusion that starts with one image and, at a particular point in the transition, promptly switches to the other image in a Gestalt sense. The process is repeatable, reversible, appears to be stable over many repetitions, and has several characteristics that can be manipulated, such as the number, density, and size of the object(s) and the rate of change. The poster shows how to do this with a simple protocol.

The Persistent Concrete Bias in Monkeys

Human children and adults in industrialized societies typically exhibit a relational bias—they readily attend to abstract relationships (e.g., comparing heights)—whereas young children, adults from minimally schooled cultures, and non-human primates often display a concrete bias, focusing on absolute features (e.g., total surface area). The underlying cause of this bias—whether innate, working memory limitations, or an artifact of limited learning experience—remains unresolved. We tested this by pitting concrete matches against relational matches in a match-to-sample task. Monkeys initially preferred concrete matches. However, monkeys learned to select relational options when reinforced, indicating that they were capable of overcoming their default bias. Crucially, once explicit feedback was removed, they reverted to concrete matching, indicating a persistent bias rather than a lack of learning opportunities. This suggests an inherent primate predisposition toward concrete processing that may be overridden by the cultural and educational influences of industrialized societies, which favor relational processing.

Social information eases discourse processing in human listeners

When using language, we talk often about other people. Given this skew in lived experience, we asked whether listeners process discourse with social topics more efficiently than non-social discourse. Thirty-nine participants listened to normed pairs of passages that differed only in three phrases to yield either a social or non-social topic. After listening, participants recalled all they could about the passage. While participants' recall accuracy across conditions was similar (p = .19), social discourse was recalled significantly faster (M = 103 s, SD = 52 s) than non-social discourse (M = 121 s, SD = 80 s, p = .01). Additionally, they relied less on verbatim memory for social than non-social discourse (p < .0001). This suggests that participants' experience with social language allows them to rely on existing social schema rather than verbatim memory. Our communicative system is sensitive to even the tiniest bit of gossip in the input.

Envisioning: The Cognitive Challenge of Prompt-based LLM Interactions

Large language models (LLMs) such as ChatGPT have replaced conventional interface designs with prompt-based natural language interactions. LLMs exhibit dynamic capabilities to fulfill a broad range of tasks and ad-hoc functionalities (e.g., "rewrite these appliance installation instructions for a five-year-old"). However, their open-ended interface replaces Norman's gulf of execution with a new cognitive challenge for end-users; namely, the gulf of envisioning clear intentions and task descriptions in prompts to obtain a desired LLM response. To address this gap, we propose a cognitive model of the Envisioning process based on protocols of generative AI prompt-based interactions. The model highlights three cognitive challenges people face when requesting help from LLMs: (1) what the task should be (intentionality gap), (2) how to give instructions to do the task (instruction gap), and (3) what to expect in the LLM's output (capability gap). We make recommendations to narrow the gulf of envisioning in human-LLM interactions.

Artificial Neural Networks Reveal a Cognitive Continuum Toward Human Abstraction

Do neural network models that fail to behave human-like reflect a fundamental divergence from human cognition, or do they mirror earlier developmental or evolutionary stages? We propose that such models may, in fact, offer insights into the origins of human abstraction. We evaluated over 200 pretrained neural networks alongside macaques, Tsimane natives, US adults, and children on three visual match-to-sample tasks targeting increasing levels of abstraction: visual-semantic similarity, shape regularity, and relational reasoning. As task demands grow more abstract, model decisions, like those of monkeys, increasingly diverge from adult human behavior. However, representational similarity analyses reveal shared internal structure with all human groups, suggesting overlapping cognitive strategies. We further show that model alignment depends on specific design choices—architecture, scale, training regime, and language supervision—highlighting which inductive biases support human-like abstraction.

From Meanings to Sounds: Development of Language Prediction in Toddlers

Grounded in predictive processing theory, this study explored the idea that young learners generate top-down expectations about upcoming words, both in meaning (semantic) and sound (phonological), to aid early language development. Researchers hypothesize that language acquisition is facilitated by children's growing ability to anticipate not only the content of an utterance but also the specific forms those utterances might take. By examining how toddlers transition from broad conceptual understanding to accurate phonological prediction, this work sheds light on the cognitive mechanisms that make rapid language growth possible in early childhood. The main goal was to determine when and how toddlers begin to form semantic and phonological expectations for upcoming words. To address this, three preferential looking experiments were conducted with Spanish-speaking toddlers at 18, 24, and 30 months. Highly constrained sentences were played aloud while toddlers viewed pairs of images: either a target or a competitor (semantic or phonological) and an unrelated distractor. As the toddlers listened, their gaze patterns revealed whether they anticipated the correct word or a related image before the target was fully pronounced. The rationale was that if toddlers can detect cues in the sentence and map them onto future words, they will show anticipatory looks toward images that match meaning or sound. Analyses revealed a progressive pattern: at 18 months, toddlers clearly predicted specific words in strongly constraining contexts, but showed no consistent anticipation of semantic alternatives. By 24 months, toddlers not only looked toward the correct referent but also demonstrated meaningful shifts toward pictures sharing semantic features with the target. This suggests they were extracting and forecasting aspects of meaning ahead of time. However, reliably predicting word forms based on phonological cues emerged more robustly at around 30 months, when children also shifted their gaze to phonologically similar items before the target was spoken. These findings highlight a developmental trajectory in which toddlers leverage broader conceptual knowledge first, refining phonological detail later as their linguistic system matures.

Emerging Morphosyntactic Prediction in Early Childhood: A Visual Tracking Study

Linguistic prediction, a key mechanism in language acquisition (Dell & Chang, 2014), enables anticipation of upcoming linguistic input based on contextual cues (Mani & Huettig, 2012; Angulo-Chavira et al., in review). This eye-tracking study investigated gender-based morphosyntactic prediction in Spanish-speaking toddlers aged 30 and 36 months. Participants heard highly constraining sentences (e.g., La gallina puso su... [The hen laid its…]) while viewing images of a gender-matching competitor (cuchillo [knife]), which shared only grammatical gender with the target (huevo [egg]), and a distractor (gorra [cap]). Results revealed that 36-month-olds, but not 30-month-olds, shifted their gaze toward the gender-matching competitor before hearing the target noun, indicating developmental differences in morphosyntactic processing. These findings suggest that the ability to integrate morphosyntactic cues into predictive language processing emerges by 36 months, providing empirical evidence on the developmental trajectory of linguistic prediction in early childhood.

Children's expectations of dominant and prestigious leaders

Humans are enmeshed in many hierarchical relationships, such as that between a parent and child. Leaders in dominance hierarchies are typically strong and intimidating, while leaders in prestige hierarchies are respected for their expertise. In an ongoing study (N=13), we ask whether preschoolers ages 4–5 have different expectations about how dominant and prestigious leaders divide resources. Children learned about two social groups, the dominant Glerks and prestigious Zonks. Children watched a leader and subordinate from each group pick five apples and divide them into two baskets in a 5-0 or 3-2 split. We then pointed to one basket and asked children whether the leader or subordinate took that share of apples. In preliminary results, we find that children expect the dominant (76.9%) but not prestigious leader (46.2%) to take all of the apples, which suggests that children expect dominant leaders to claim a larger share of resources.

Rapport Development in Human-AI Collaboration Tasks

Rapport is a central component in fostering effective and meaningful interactions between humans and AI agents. It forms the foundation of trust, mutual understanding, and collaboration, all of which are essential for achieving shared goals in cooperative tasks. However, a significant factor influencing rapport development is the diversity of individual differences among human users. This diversity complicates the process of rapport building, as AI agents must adapt dynamically to the unique needs and intentions of individual users. The ability of AI systems to detect and respond to such differences remains a pressing research challenge, as mismatched responses can lead to frustration and disengagement. Theory of mind capabilities enable AI agents to infer user intentions, preferences, and emotional states, while advanced emotion-processing systems allow for nuanced and context-sensitive responses. This study examines how integrating these capabilities into an LLM- and RAG-enhanced AI Tutor can help sustain engagement and alignment, fostering rapport in Human-AI collaborative learning environments.

"Apples and Oranges" - Evaluating Reaction Time Measures as a Paradigm to Contrast Expert vs. Novice Performance in Complex, Dynamic Task Environments

Previous research has effectively employed the fast-paced action puzzle video-game Tetris for understanding the acquisition of extreme expertise in complex, dynamic environments. A common approach when contrasting expert to novice performance has been the dissection of their interactions with the environment into disjoint sub-tasks, such as reaction time (RT), measured by the input latency to new events on screen. The crucial, underlying assumption of this paradigm is task consistency at all levels of expertise. Using data collected from participants of the Tetris World Championship 2019 and from novices in our lab, we show that this assumption does not hold. While for novices the RT task type remains the same across all conditions, for experts, depending on environmental parameters, the RT task type undergoes a shift and, under specific conditions, no longer represents an RT task at all. Thus, expert vs. novice sub-task comparison may not be a valid paradigm.

The Logic of Bias: Using Cognitive Architecture to Explore Interactions Between Cognitive Abilities and Decision Error

The traditional view of biases as cognitive imperfections has been challenged by several strains of research, such as the PSI cognitive architecture. Here, biases are considered to be engineered by evolution to prevent dissatisfaction and assist subsequent satisfaction of human needs. PSI's general assumption that higher skills and reasoning capacities alleviate biases has recently been called into question, as high numeracy was associated with an exacerbated effect of political bias. We conduct two studies, the results of which indicate that the basis for this effect 1) does not represent a general cognitive fallacy caused by modulations of perceptual and attentional processes, and 2) is not rooted in the long-term formation of habituated action patterns associated with prior beliefs. This strengthens the evidence that the effect is specific to group dynamics with strong affiliative bonds. Further, we propose a set of revisions to PSI necessary to model this expert bias phenomenon.

"He's bigger so he has to be older": Children's development of age concepts from 3 to 5 years old

The concept of age is difficult for children to understand as it requires coordinating knowledge across several domains of abstract concepts (e.g., time, number, biology). We tested 122 three- to five-year-old children on their identification of which of two figures is older, as well as on their knowledge of which of two numbers is greater and their ability to temporally order past memories. Consistent with prior research, we found that young children are influenced by size in making age judgments, demonstrating a bias to respond that someone who is bigger is older. However, we show that by age 4, children can incorporate numerical age cues to make accurate age judgments. Among other possible interpretations, these findings suggest that children may initially conflate age with size before identifying chronological time as the relevant domain for age, exhibiting a conceptual change for which acquiring numerical knowledge may play a key role.

Core knowledge influences explanatory reasoning in children and infants

Explanations help us make sense of the world. How does early knowledge shape explanatory reasoning? We first asked whether children's knowledge of object physics shapes their explanation preferences (Experiment 1). In previous work, children preferred teleological explanations (referencing purposes) to mechanistic explanations (referencing underlying processes) for natural events — a domain where children may lack robust knowledge. Here, preschoolers (N=26) saw natural events, non-surprising physical events, and surprising physical events. Then, they chose between teleological and mechanistic explanations. Children strongly preferred mechanistic explanations, suggesting that early physical knowledge influences explanatory reasoning in children. We then asked whether this extends to infants (Experiment 2). Infants (N=23) watched a ball bouncing off (non-surprising) or rolling through a wall (surprising event). Following the surprising event, infants showed systematically different looking patterns at explanatory vs non-explanatory information (e.g., wall with a hole vs. no hole). Thus, core object knowledge influences explanatory reasoning in early development.

Beyond Rewards: How Information Value and Time Horizon Shape Exploration

Prior research has shown that American children (ages 3 to 8) explore uncertain options at strikingly high rates — even when it comes at a cost to reward maximization, when explicitly instructed to seek rewards, and whether they are choosing for themselves or others. In contrast, adults explore significantly less, showing greater sensitivity to cost. What drives this developmental difference? In a series of studies, we test whether children's exploration is motivated by information value rather than reward value and whether adults also incorporate information value into their decisions. We also examine the role of time horizon in the explore-exploit tradeoff. Our findings reveal that both children and adults consider information value and time horizon when deciding whether to explore or exploit.

"Did I really do that?!" Unexpected outcomes of children's own actions as a driver of learning and exploration

Children hold rich expectations about the physical and social world. While observing objects, events, and agents that violate these expectations elicits surprise and enhanced exploration—reflecting prediction error (PE) about the world—it remains unknown whether children are also sensitive to prediction errors about the outcomes of their very own actions. Here we investigate whether four- to five-year-old children (N=96) are surprised when their self-guided choices produce unexpected outcomes. Upon achieving unexpected success in a chance-based card game, children who did so via their own unguided choices reported more surprise than children who produced the same success with the help of a reliable aid. Moreover, in choosing what to explore next, children chose to re-play the surprise-inducing game over a novel, alternative game. Despite prior literature suggesting overconfidence in young children, these results indicate that children are sensitive to their own abilities, readily detecting expectation-outcome discrepancies in their own action outcomes.

Deciphering human meta-cognition in creative problem-solving

Previous studies on human meta-cognition, represented by confidence in perceptual decisions, often focus on over-simplified environments that yield experiences with limited semantic dimensions. However, in real-life situations such as solving a new problem, people need to make sequential decisions in a complex environment, exploring vast combinations of actions that unfold over time. How do people make meta-cognitive evaluations out of the rich, high-dimensional cognitive experiences in such situations? Here we develop a computational method that models each individual's meta-cognitive ratings (e.g., difficulty) of the problem-solving experience in a visual puzzle game, using information-theoretic metrics derived from their own action sequences. Individuals are assumed to be Bayesian, updating their "thought-space distributions" with their own behavioral distributions over different semantic categories. Our results show that information discrepancies between beliefs at different moments can predict individual differences in self-reported difficulty.
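
As a concrete illustration of the kind of information discrepancy described above, the following minimal sketch updates a categorical belief over semantic action categories with observed action counts and measures the KL divergence between beliefs at two moments; the Dirichlet-style update, the three categories, and the counts are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def belief_update(prior_counts, action_counts):
    """Dirichlet-style update: add observed action counts to prior pseudo-counts."""
    return prior_counts + action_counts

def kl_divergence(p, q):
    """KL(p || q) between two categorical distributions (no zero entries here)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

# Hypothetical counts of actions falling into three semantic categories
prior = np.ones(3)                                   # uniform prior pseudo-counts
early = belief_update(prior, np.array([4, 1, 0]))    # actions early in the puzzle
late = belief_update(early, np.array([1, 6, 2]))     # actions later in the puzzle

p_early = early / early.sum()
p_late = late / late.sum()

# Information discrepancy between beliefs at two moments, a candidate
# predictor of self-reported difficulty
print(kl_divergence(p_late, p_early))
```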

Do speech models use phonological features?

Distributional and phonetic considerations lead linguists to posit that speakers represent phones as members of overlapping natural classes (e.g., labials {[p, b, m]}, nasals {[m, n, ŋ]}), which can be represented in a feature system ([+labial], or [+nasal]). Choices of values {-, +, 0} and features {labial, nasal} in this system make different predictions about what classes are accessible targets for generalization (Mayer 2020). Building on the hypothesis that LLMs and humans share the objective of resource-rational analysis of their environment (Lieder & Griffiths 2019), we assess the canonical correlation between different proposed feature systems and the representations of the sounds they describe in self-supervised deep learning models trained on speech, HuBERT and Wav2Vec2. We also examine differences in the representational similarity of phones implied by these alignments. We find that although differences in canonical correlations between feature systems and model representations are small, they have qualitatively distinct error patterns for novel data.
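
To make the analysis concrete, here is a minimal sketch of a canonical correlation analysis between a phonological feature matrix and phone representations, using scikit-learn; the ternary feature matrix, the embedding dimensionality, and the random data are illustrative assumptions rather than the authors' HuBERT or Wav2Vec2 pipeline.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
n_phones = 40

# Hypothetical feature system: phones x features, ternary values {-1, 0, +1}
features = rng.choice([-1.0, 0.0, 1.0], size=(n_phones, 12))
# Hypothetical phone representations, e.g., model-layer activations averaged per phone
embeddings = rng.normal(size=(n_phones, 16))

cca = CCA(n_components=4)
feat_c, emb_c = cca.fit_transform(features, embeddings)

# Canonical correlation for each component pair
corrs = [float(np.corrcoef(feat_c[:, k], emb_c[:, k])[0, 1]) for k in range(4)]
print(corrs)
```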

Lesion Network Mapping of Cotard's Delusion: Unique and Shared Neural Circuits in Nihilistic Delusions, Misidentification Delusions, and Altered Consciousness

Cotard's delusion (CD), featuring delusions of nonexistence, is rare and not well understood. We used lesion network mapping (LNM) to investigate how focal lesions disrupt large-scale circuits in CD and compared them with other neuropsychiatric conditions. Nineteen lesion-induced CD cases were identified from a systematic review; each lesion was mapped to a standard template, and resting-state fMRI from 1,000 healthy subjects provided functional connectivity. A group-level CD map was defined by stringent sensitivity and specificity thresholds, which then served as a seed for connectivity analysis. We assessed spatial correlations with 31 published lesion-network datasets. A CD-specific network emerged in the right inferior frontal cortex, anterior insula, anterior temporal pole, and temporoparietal junction. The strongest overlaps were observed with Capgras delusion, akinetic mutism, mania, and loss of consciousness, reflecting shared disruptions in self-awareness and salience processing. Unique peaks in the temporoparietal junction and frontal operculum highlight CD's distinct nihilistic features.

How language shapes learning: Visual statistical learning in deaf and hearing children

Statistical learning (SL) is a domain-general learning mechanism necessary for multiple areas of cognitive development. The present study investigates whether children can simultaneously track temporal and spatial visual statistics and how individual differences in cognitive abilities and early language experience relate to SL. Fifty-eight hearing children aged 4–6 years (mean = 5.8) completed a novel visual SL paradigm, tracking the spatiotemporal statistics of four cartoon alien triplets. Cognitive control, receptive vocabulary, and auditory SL were also assessed to measure individual differences. Children achieved 56% accuracy on 2AFC test trials, performing above chance and demonstrating learning of complex patterns. For children under 6.5 years (n = 28), visual SL performance was positively associated with receptive vocabulary (r = 0.65) and cognitive control (r = 0.56). Future testing with deaf children in oral-speech or bilingual (ASL/English) programs will explore how language experience shapes SL capacities, offering insights into early cognitive development.

Assessing Early Communicative Milestones in Preterm and Full-Term Children Using the Pebbles App

Children born preterm are at increased risk of delays in early communicative development. However, studies often focus on extremely or very preterm children (< 32 weeks of gestation). To address this limitation, we use data from the Pebbles App, which reflects the gestational age distribution in the population. This app allows caregivers to document their child's development and assesses the age of attainment of 14 early communicative milestones. Preliminary analyses include more than 4000 children. For most communicative milestones, we did not find significant differences between preterm and full-term children within the first two years using corrected age. However, preterm children acquired some milestones, such as babbling, earlier than full-term children. A possible explanation is the greater communicative experience during the extended time in the extrauterine environment. These findings challenge traditional assumptions about language delays in preterm children and highlight the importance of using more representative samples.

What do we understand from experiments in language evolution: inferences from multiple-choice vs. open-ended semantic space paradigms

In our poster, we challenge the multiple-choice paradigm used in many communication experiments and the validity of conclusions that can be drawn from it. We concentrate on two well-constructed experiments that have made claims that humans can understand improvised or interspecies forms of communication (e.g., Ćwiek et al., 2021; Graham & Hobaiter, 2023). We hypothesized that participants would perform worse when asked to provide free-text answers, compared to the original multiple-choice design. Our results indicate that participants indeed performed worse than in the original studies. The post hoc analysis showed that, in many cases, relevant semantic domains were correctly identified by the participants, but hardly any responses were fully congruent with the target concept. We conclude by discussing which types of questions are better addressed with the multiple-choice vs. open-text paradigm, and how the results of each can be mapped onto a larger picture of language evolution. References: Ćwiek, A., Fuchs, S., Draxler, C., Asu, E. L., Dediu, D., Hiovain, K., Kawahara, S., et al. (2021). Novel vocalizations are understood across cultures. Scientific Reports, 11(1), 10108. https://doi.org/10.1038/s41598-021-89445-4. Graham, K. E., & Hobaiter, C. (2023). Towards a great ape dictionary: Inexperienced humans understand common nonhuman ape gestures. PLOS Biology, 21(1), e3001939. https://doi.org/10.1371/journal.pbio.3001939.

Kenyan, Chinese and US children rely on different metacognitive strategies when solving a problem

Recent theories suggest that metacognitive development is affected by cultural context. However, cross-cultural research on metacognition is sparse and often involves verbal assessment (e.g., "How sure are you that your answer is correct?"), which might not have cross-cultural validity. The present study assessed metacognition by coding children's naturalistic behavior in a problem-solving task. Participants had to assemble objects to build a track according to a model. We compared Kenyan, Chinese, and US children's metacognitive strategies (N=95; 6-10-year-olds). Results revealed that Chinese children relied more on monitoring strategies (e.g., checking the model) than Kenyan and US children, whereas Kenyan children relied more on control strategies (e.g., organizing workspace) than US and Chinese children. Moreover, in all cultures, the number of metacognitive strategies used increased with age. The results suggest differences and similarities in the preferred metacognitive strategies of children across diverse societies.

Cognitive and motor dynamics of speech processing during walking

Walking, traditionally considered an automated process, can become cognitively demanding under dual-task (DT) conditions. Up to now, language-motor interactions, such as walking while listening to speech, remain underexplored in DT studies, despite the frequent co-occurrence of these activities in daily life. In addition, research on embodied semantics points to the possibility that the meaning of certain words interacts with actual body movements. Yet no study so far has addressed this issue in relation to gait. This ongoing experiment examines a) the potential motor-cognitive interference of concurrent walking and speech processing and b) the potential semantic effects when action verbs are actively processed during walking. We tested 20 adults using motion capture with concurrent optical imaging (fNIRS) to assess gait variation along with frontal and motor cortex activation. Preliminary findings suggest that gait patterns remain consistent with and without speech processing during walking. However, processing action-related verbs while walking is associated with reduced motor cortex activation.

Iterated LASSO reveals highly distributed and variable representations of faces, places, and objects.

Recent studies have used complex regularization procedures for whole-cortex neural decoding, with results suggesting that neural representations may be much more widely distributed and variable than previously suspected. Such work typically requires extensive parallel compute infrastructure, bespoke regularizers, and complicated workflows. We considered whether comparable results can be obtained from the iterated LASSO, a simple algorithm that uses standard L1 regularization in an iterative voxel-selection scheme. We applied the procedure to decode stimulus class (face, place, or object) from whole-brain 3T-fMRI data individually in each of 8 participants, achieving a remarkable 98% classification accuracy on held-out images. The algorithm found signals in about 8% of voxels across cortex, many appearing outside the traditional occipito-temporal regions thought to support visual object representation. Moreover, the model weights revealed wide variation across participants in how and where stimulus information is neurally encoded—results consistent with prior work that deployed much more complex workflows.
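
The following is a minimal sketch of one way an iterated-LASSO-style selection loop could work, using L1-penalized logistic regression on simulated data; the stopping rule, regularization strength, and planted voxel signal are assumptions for illustration, not the exact procedure used in the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_trials, n_voxels = 200, 500
y = rng.integers(0, 3, size=n_trials)              # face / place / object labels
X = rng.normal(size=(n_trials, n_voxels))          # simulated voxel responses
X[:, :30] += 0.8 * y[:, None]                      # plant label signal in 30 "voxels"

selected, remaining = [], np.arange(n_voxels)
for _ in range(10):                                # iterate until nothing new is selected
    if remaining.size == 0:
        break
    clf = LogisticRegression(penalty="l1", solver="saga", C=0.1, max_iter=5000)
    clf.fit(X[:, remaining], y)
    nonzero = remaining[np.any(clf.coef_ != 0, axis=0)]
    if nonzero.size == 0:
        break
    selected.extend(nonzero.tolist())
    remaining = np.setdiff1d(remaining, nonzero)   # drop selected voxels and refit on the rest

print(f"{len(selected)} voxels selected across iterations")
```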

How Do Metacognition and Personality Traits Shape Self-Regulated Learning?

Self-regulated learning (SRL) is essential for directing one's own learning. While its importance is widely recognized, empirical research on how SRL interacts with cognitive, metacognitive, and personality traits remains limited. This study extends our previous work by exploring how these traits influence children's information-seeking behaviors. We interviewed 134 children (ages 8-11) about learning a concept independently and assessed metacognitive ability and personality traits using the Big Five Questionnaire – Children Version. Results revealed that, while Extraversion did not predict the chosen source, other personality traits were significantly linked to SRL components. Children higher in Agreeableness were more likely to choose to learn from human sources, and Openness to Experience was associated with greater enjoyment of learning (p=0.039). Higher metacognitive ability was slightly associated with lower achievement expectancies, consistent with literature contrasting metacognition with wishful thinking. These results underscore the role of individual variability in shaping SRL, informing tailored interventions to support SRL.

Which mindreading for ostensive communication? An Event-Related Potentials (ERPs) study of how the brain processes communicative and informative intentions

According to the ostensive-inferential model, ostensive communication (OC) is characterized by two different types of intentions: communicative intention (CI) and informative intention (II). In its classical formulation, the processing of these intentions is considered the prerogative of adult humans. In recent years, a deflationist perspective on OC has emerged: this new approach suggests that basic forms of OC can be observed in both human infants and non-human primates. Classical perspectives posit the hypothesis of high-level inferential mindreading for both CI and II. Conversely, deflationary perspectives associate basic forms of mindreading with basic forms of OC. We present an ERP study on OC. Three primary findings emerged, relating to the amplitudes of two early components, P100 and N170, and one later component, LC1. This suggests that the detection of intentions occurs within 200 milliseconds. We address the empirical and theoretical implications of these findings within the context of a deflationary perspective on OC.

Perceived musicality in an android increases positive social attributions

Social robots increasingly mimic human traits. When a human-like robot (an android) seems to engage with music, a universal human behavior, how do people judge its social attributes? Musical engagement can enhance the android's perceived human-likeness, which may increase affinity but also trigger discomfort (the uncanny valley effect). In Experiment 1 (N=192), an android showed apparent musicality through movement-music synchronization (vs. uncoordinated movement, or movement without music). In Experiment 2 (N=160), we manipulated musicality by adding (vs. not adding) headphones during movement, implying the presence of music that participants could not hear. In both experiments, participants rated the android with higher apparent musicality as warmer, more competent, and eliciting less discomfort (all p's < 0.01, measured by the RoSAS scale; Carpinella et al., 2017). These findings show that human perception of androids is shaped by cognitive schemas about, and attributions of, musicality, which can be inferred even without hearing music directly.

Working a memory – Cognitive impact of chant acquisition in children

Working memory, as proposed by Baddeley and Hitch, is a key model for understanding cognitive functions critical to learning, yet its role in language learning through phonological processes is less explored. The "Sanskrit Effect," coined by Hartzell et al., refers to the cognitive benefits derived from chanting Sanskrit, enhancing brain regions involved in language and memory. This study examines the effects of phonological training through the acquisition of a linguistically and metrically complex Sanskrit chant, the "Shiva Tandava Stotra," on the working memory and attention of children aged 5-7. Over six months, children engaged in daily mantra-based rhythmic exercises, which targeted phonological awareness, rhythmic processing, and attentional control. Pre- and post-assessments revealed significant improvements in auditory-motor synchronization, phonological decoding, working memory retention, and attention. This study underscores the potential of integrating rhythmic-linguistic frameworks into educational curricula as a tool to support cognitive, linguistic, and attentional development, particularly in children with learning challenges.

Probing Perceptual Constancy in Large Vision Language Models

Perceptual constancy is the ability to maintain stable perceptions of objects despite changes in sensory input, such as variations in distance, angle, or lighting. This ability is crucial for recognizing visual information in a dynamic world, making it essential for Vision-Language Models (VLMs). However, whether VLMs are currently and theoretically capable of mastering this ability remains underexplored. In this study, we evaluate 33 VLMs using 253 experiments across three domains: color, size, and shape constancy. The experiments include single-image and video adaptations of classic cognitive tasks, along with novel tasks in in-the-wild conditions, to evaluate the models' recognition of object properties under varying conditions. We find significant variability in VLM performance, with models excelling in shape constancy but struggling with color and size constancy. These results suggest that while VLMs are proficient in object recognition, they do not fully replicate the robustness of human perceptual constancy.

Neural correlates of mental attention in adolescents: a cross-sectional fMRI study

Mental attention, a maturational component of working memory, develops significantly during adolescence, yet its neural correlates remain unclear (Arsalidou et al., 2010). This study used fMRI to examine brain activity in adolescents (13–16 years, n = 28) performing a blocked-design Color Matching Task with increasing difficulty. Results revealed consistent activation in frontoparietal regions, including the dorsolateral prefrontal cortex, superior parietal lobule, and cerebellum, across easy and moderate difficulty levels. Higher task demands recruited additional regions, such as the middle frontal gyrus and dorsal anterior cingulate cortex, with lateralization patterns varying by difficulty and age. Whole-brain analyses highlighted distinct recruitment of attentional networks across difficulty levels. Findings align with working memory research, emphasizing the protracted maturation of the prefrontal cortex and functional reorganization of mental-attentional networks during adolescence. This study advances our understanding of cognitive development and contributes to models of working memory and attentional control in developing brains.

Toward Human-AI Co-Evolution: Integrated Learning Framework and Critical Self-Regulation Mechanisms

Human-AI Integrated Learning (HAI-IL) reconceptualizes collaborative cognition through a four-layer constructivist framework (self, cognitive, interaction, external), demonstrating how adaptive co-evolution occurs across cognitive, decision-making, and feedback dimensions. Where traditional learning systems separate human and machine roles, HAI-IL establishes an interdependent symbiosis: externally, learners operate as unified human-AI entities (hybrid intelligence), while internally, AI functions as a cognitive extension rather than a replacement. A self-regulation mechanism driving Ethical Dual-Spiral (human chain and AI chain) Regulation ensures alignment between human values and AI operations, dynamically monitoring system outputs against "AI for Social Good" principles. Our findings reveal that this framework enhances proactive human agency while enabling neural-like adaptability in AI agents. The model demonstrates particular efficacy across multiple fields, where HAI-IL mitigates the workforce polarization risks inherent to AI deployment. We suggest that AI development needs to be deeply integrated with human learning. By establishing technical benchmarks through dual-perspective measurement methodologies, HAI-IL moves beyond reactive human-AI interactions toward true mutual adaptation systems.

Synchronized development of possibility reasoning and theory-of-mind: Evidence from a cross-cultural study

Mental file theory provides a valuable framework for explaining the synchronized development of theory-of-mind (ToM) and other tasks that require taking perspectives into account. To investigate whether possibility reasoning also taps into this process, we examined its correlation with ToM in German (N=58) and Chinese preschoolers (N=47), as well as in an ongoing replication study in Austria. Our results show a positive correlation between false belief reasoning and possibility reasoning in both samples. After controlling for age, the correlation survived in the Chinese sample but not the German sample. This suggests that both kinds of reasoning may depend on common processes, which emerge around age 4. Findings from the replication study will be presented to clarify the inconclusive results. We also investigated relational reasoning, which showed no correlation with ToM and exhibited distinct developmental trajectories within cultures, indicating that it is shaped by culture rather than being a general cognitive milestone.

Spatial and Math Anxiety Differentially Predict Spatial and Math Performance

Spatial skills are crucial for math learning and success in STEM fields, including computer science and artificial intelligence. Given that math anxiety harms math learning and interest, understanding how spatial skills moderate the relationship between STEM-specific anxieties (e.g., math and spatial anxiety) and math and spatial performance is important for providing insights into STEM readiness. In a pilot study (N=41; 30 females), undergraduate students reported their levels of spatial and math anxiety, along with their spatial (block rotation and spatial relations) and math calculation performance, while controlling for general anxiety and cognitive fluency. Although spatial anxiety did not correlate with math or spatial performance, math anxiety negatively correlated with spatial relations and math performance. Thus, math anxiety seems to extend beyond mathematical domains, whereas spatial anxiety does not. Future research should explore spatial interventions aimed at improving math efficacy and reducing math anxiety, and determine whether these effects support STEM learning.

Kalulu: Evidence-Based Adapted Phonics Instruction for Literacy Across Languages

Ensuring that all children have access to evidence-based reading instruction requires scalable solutions that respect linguistic diversity (Castles et al., 2018). Kalulu is a fully automated system designed to develop phonics-based reading programs for any symbol-to-sound language. Grounded in cognitive science, it generates instructional materials—including books, paper-based games, and digital applications—tailored to the phonological structure of each language. Leveraging AI, Kalulu analyzes grapheme–phoneme correspondences to build a progression of two mappings per week, enabling 100% decodable texts while integrating reading, writing, and vocabulary instruction. Initially tested in France with over 1,000 children, Kalulu is now being deployed in Brazil (1,000+), Colombia (500+), and Mayotte (300+), with ongoing expansion into Argentina. This poster outlines the automation pipeline, field implementation strategies, and current cross-linguistic research in collaboration with local school districts and NGOs. We aim to show how cognitive science can accelerate the global scaling of evidence-based literacy instruction—providing free and open-access learning.
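
As an illustration of the decodability constraint described above, the sketch below checks, week by week, which words of a toy lexicon can be segmented entirely into taught graphemes when two grapheme-phoneme mappings are introduced per week; the progression, lexicon, and greedy segmentation are hypothetical simplifications, not the Kalulu pipeline.

```python
# Hypothetical teaching progression (two new grapheme-phoneme mappings per week)
progression = ["a", "l", "o", "s", "ch", "i", "m", "r"]
lexicon = ["ala", "solo", "chila", "mira", "rocha"]      # toy word list

def decodable(word, graphemes):
    """Greedy longest-match segmentation using only the taught graphemes."""
    i = 0
    while i < len(word):
        for g in sorted(graphemes, key=len, reverse=True):
            if word.startswith(g, i):
                i += len(g)
                break
        else:
            return False        # no taught grapheme matches at this position
    return True

for week in range(1, len(progression) // 2 + 1):
    taught = set(progression[: 2 * week])
    decodable_words = [w for w in lexicon if decodable(w, taught)]
    print(f"week {week}: decodable words = {decodable_words}")
```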

Visualizing Motion Traces Enhances Pursuit Detection in Dynamic Scenes

Detecting dynamic spatial relationships, such as pursuit, can be cognitively demanding (Scholl & Gao, 2013). Research has identified visual cues that influence pursuit detection, including the distance between the pursuer and the target (Meyerhoff, Schwan & Huff, 2014) and the number of objects in a scene (Gao, Baker, et al., 2019; Kon, Khemlani & Lovett, 2024). We turn to data visualization research to explore techniques to improve pursuit detection in dynamic scenes. For example, displaying trajectory histories can aid processing by reducing cognitive load (Heer & Robertson, 2007). We conducted a study in which participants viewed visualizations of six moving dots, either with or without trace lines, and determined whether one dot was chasing another. Preliminary findings suggest that trace lines improve the speed and accuracy of pursuit detection. Our results bridge visualization and vision science, suggesting that trace lines might enhance pursuit detection by providing less transient shape-based visual cues.

Learning from Failure and Success: Children's Achievement Emotions and Learning Choices

Failure is an unavoidable part of learning, and how children manage it influences their future learning and success (Eskreis-Winkler & Fishbach, 2019). Although children can experience both negative and positive emotions following failure (Nelson et al., 2018), it remains unclear how children's achievement emotions change following both failure and success, and whether emotional change aligns with future learning choices. A total of 107 preschool children attempted three unsolvable puzzles without help, then successfully completed one solvable puzzle with guidance as needed. ANOVA results showed a significant negative linear trend in children's affective responses, with no rebound after the success experience. When selecting a future learning task, children exhibited a strong preference for the previously successful puzzle over any of the incomplete puzzles; yet, logistic regression indicated that this choice was not predicted by changes in emotion. Findings suggest that children showed an adaptive emotional response following both failure and success.

The Role of Early Experience in Judging the Temporal Order of Visual Events: Insights from Late-Sighted Children

One of the cornerstones of sensory cognition is the ability to infer cause-and-effect relationships between entities in our sensory environment—a skill that depends on accurately perceiving the temporal order of events. Here, we asked whether early sensory input is critical for developing this proficiency in the visual domain by studying children born blind who gained sight late in childhood. Several years post-surgery, these late-sighted children performed on par with typically-sighted controls in determining the sequence in which two visual events occurred. However, this ability was not evident immediately after surgery but emerged over a protracted developmental period, underscoring the importance of subsequent visual experience. Our findings demonstrate that the neural resources necessary for this foundational aspect of sensory cognition remain accessible beyond infancy, offering fundamental insights into the role of early experience in sensory and cognitive development as well as practical guidance for clinical rehabilitation following sensory deprivation.

Developmental Trajectories of Metacognition in Auditory Serial Recall of Environmental Sounds

Auditory streams with varying features (changing-state, e.g., J, K, L) impair serial recall more than repetitive streams (steady-state, e.g., J, J, J), a phenomenon known as the Changing-State Effect (CSE). Despite extensive CSE research, metacognitive development, that is, how individuals monitor recall under auditory distraction, remains unexplored. This study tested 26 adults (Experiment 1) and 40 children (5–12 years, Experiment 2) on auditory serial recall and meta-memory accuracy (self-estimated performance). Distractors (steady-state or changing-state letters) followed target sequences of environmental sounds. Results showed that (i) both distractor types modulate meta-memory accuracy (vs. silence) but not serial recall performance; (ii) serial recall improved with age; and (iii) the discrepancy between meta-memory accuracy and recall accuracy declined with increasing age. These findings reveal distinct developmental trajectories: while serial recall improves steadily with age, metacognitive accuracy in the presence of auditory distraction becomes more precise and less disrupted by such interference as individuals grow older, suggesting divergent mechanisms for memory performance and its self-monitoring under the CSE.

Age-related changes in cognitive flexibility: fMRI meta-analysis

To examine the neural mechanisms underlying changes in cognitive flexibility with ageing, we synthesized findings from 87 fMRI studies, comprising 120 experiments with 2308 adult participants distributed across young, middle-aged, and older groups. Our meta-analysis focused on rule-retrieval and rule-discovery processes, assessed with the Task-Switching Paradigm and the Wisconsin Card Sorting Test, respectively. Activation Likelihood Estimation analyses revealed age-related decreases in brain activation related to general switching ability, particularly in posterior regions, alongside an anterior shift in older adults, consistent with the Posterior-Anterior Shift in Aging (PASA) model. Rule-retrieval tasks consistently engaged left-lateralized frontoparietal regions across all age groups, with middle-aged adults additionally recruiting the right cerebellum and medial frontal gyrus. For rule-discovery tasks, age-related decline was observed in bilateral frontoparietal regions, while older adults also showed unique activation in the left inferior frontal gyrus. These findings highlight differential ageing trajectories for rule retrieval and rule discovery, potentially reflecting compensatory neural mechanisms and dedifferentiation processes.

Children's Reasoning about Third-Party Intervention in Peer Relationship Context

Third-party intervention (TPI) has been shown to emerge early in human ontogeny. However, little is known about how social relationship information influences children's reasoning about TPI. The current study addressed this question by exploring 6- to 11-year-old (N = 108) Chinese children's reasoning about how a third-party observer would (descriptive norms) and should (prescriptive norms) intervene against unfair resource allocation, and how this reasoning was modulated by the peer relationship (friend, disliked peer, stranger) between the observer and the unfair allocator. Results showed that peer relationships affected children's expectations (would question) of TPI from age 6, with this influence strengthening with age. However, children's judgments (should question) of TPI were not affected by peer relationships or age. The results reveal that considerations of peer relationships drive children's descriptive norms regarding TPI to increasingly diverge from prescriptive norms from age 6, deepening our understanding of the development of contextualized moral cognition.

The Emergence of Name Sound Symbolism in Children

Sound symbolism refers to the finding that certain phonemes are perceived to be better fits for particular properties such as shape (e.g., the Maluma/Takete effect). Sound symbolism extends from nonwords to real first names (i.e., the Bob/Kirk effect). Children begin to show sensitivity to sound symbolism at one year of age; however, the development of name sound symbolism remains unexplored. Additionally, previous work on name sound symbolism has highlighted people's tendency to associate femaleness with round shapes and maleness with spiky shapes. We investigated the emergence of name sound symbolism and the gender-shape association in five- to seven-year-olds, in comparison to adults. We also collected measures of children's language abilities. Results indicate that both associations are stronger in adults than in children. Moreover, while the gender-shape association is observable in children, name sound symbolism develops later, and language skills did not influence its development. These findings provide insights into the development of crossmodal correspondences.

Resting-state EEG and Reading Skills in German-speaking Children

Resting-state neural oscillations have been linked with attention and language processing, yet their specific role in reading skills remains unclear. We studied the associations between resting-state EEG and reading skills among 83 German-speaking first to third graders. Our analyses focused on power spectral density across four frequency bands (delta, theta, alpha, beta) and cluster-based connectivity in children with varying reading abilities, controlling for age. Spectral analysis showed no significant associations between reading skills and frequency bands. Functional connectivity analysis revealed a negative association with delta-band activity in the central-parietal network and with theta-, beta-, and delta-band activity in the central-occipital network. Pseudoword reading and reading comprehension were negatively associated with theta-band activity in the left central-occipital network and delta-band activity in the right central-occipital network, respectively. Our findings provide insights into the complex relationship between neural oscillations and early reading and suggest resting-state functional connectivity as a neural marker of reading skills.
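
For readers unfamiliar with the band-power measure mentioned above, the following generic sketch computes Welch power spectral density for one simulated EEG channel and averages it within the four bands named in the abstract; the sampling rate, band edges, and simulated signal are assumptions, and this is not the study's analysis pipeline.

```python
import numpy as np
from scipy.signal import welch

fs = 250.0                                   # assumed sampling rate (Hz)
rng = np.random.default_rng(0)
eeg = rng.normal(size=int(60 * fs))          # 60 s of simulated single-channel EEG

freqs, psd = welch(eeg, fs=fs, nperseg=int(2 * fs))
bands = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

# Mean PSD within each band (commonly used band-power summary)
band_power = {
    name: float(psd[(freqs >= lo) & (freqs < hi)].mean())
    for name, (lo, hi) in bands.items()
}
print(band_power)
```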

Human Adaptation of Learning Strategies Resembles Policy Gradients

A hallmark of human intelligence is not only the capacity to learn from the environment, but also the ability to adapt the learning process itself in response to changing demands. This meta-learning ability, known as "learning to learn," has been extensively studied in cognitive science and artificial intelligence for decades. While task-optimized recurrent neural networks have offered qualitative accounts of biological learning-to-learn, they fall short in capturing the individual and temporal variability inherent in human decision making. To investigate how humans adjust their learning strategies over time, we introduce a neural network model that estimates dynamic changes in subjects' reinforcement learning (RL) parameters. Across four bandit tasks, we find that RL parameters change over time, indicating that humans continuously adapt their decision-making strategies at both the trial and block levels. These parameter updates are associated with greater rewards and align with policy gradients near current RL parameters, suggesting that humans refine their learning strategies based on task feedback. Taken together, our work provides a novel framework for understanding the adaptive mechanisms of biological meta-learning, with broad applicability across tasks, populations, and cognitive models.
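
To illustrate what "aligning with policy gradients near current RL parameters" could look like computationally, here is a toy sketch that estimates a REINFORCE-style gradient of expected reward with respect to a softmax inverse temperature in a two-armed bandit and nudges the parameter along it; the task, parameter values, and update rule are illustrative assumptions, not the authors' neural network model.

```python
import numpy as np

rng = np.random.default_rng(1)
p_reward = np.array([0.3, 0.7])        # true reward probabilities of the two arms
alpha, beta = 0.3, 2.0                 # learning rate and inverse temperature
Q = np.zeros(2)

grads, rewards = [], []
for t in range(500):
    probs = np.exp(beta * Q) / np.exp(beta * Q).sum()
    a = rng.choice(2, p=probs)
    r = float(rng.random() < p_reward[a])
    # Score-function term: d/d(beta) of log pi(a | Q, beta) for a softmax policy
    grads.append(Q[a] - probs @ Q)
    rewards.append(r)
    Q[a] += alpha * (r - Q[a])         # standard delta-rule value update

# REINFORCE-style estimate of d(expected reward)/d(beta), with a mean baseline
baseline = np.mean(rewards)
grad_beta = np.mean([g * (r - baseline) for g, r in zip(grads, rewards)])
beta += 0.5 * grad_beta                # block-level adjustment of the learning strategy
print(round(float(grad_beta), 3), round(float(beta), 3))
```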

Do ducks lay eggs? How interactivity shapes generic generalizations.

Generic generalizations play an important role in communication and learning. Recently, we argued that all generics favor stability across background circumstances. This, however, appears to be false for minority-characteristic generics ("ducks lay eggs" - only females do). In response to this challenge, we developed a hypothesis that highly interactive systems (like sexually-reproducing species) resist partitioning into subcomponents (male vs. female) in the course of generic evaluation. Consequently, a higher prevalence of the property ("laying eggs") in one subset does not qualify as instability, and generics attributing that property to the kind are not penalized. We evaluated this hypothesis in an experiment with 99 adults, manipulating descriptions of the manner in which novel species reproduced: interactive vs. non-interactive. As predicted, generic generalizations describing interactive mechanisms were endorsed more than non-interactive equivalents. These findings support our stability argument and highlight the key role of causal-theoretical considerations and conceptual representation in assessments of generics.

Degree of bilingualism and cognitive neural processing in adults

Effects of bilingualism on cognitive control remain highly debated. Such debates partly stem from reliance on behavioral measures alone, which may obscure subtle individual differences. Even studies that leverage brain electrophysiology report mixed results, often due to categorizing individuals as monolingual or bilingual. Here, we examined whether the degree of bilingualism was related to the P3b effect—an established electrophysiological measure of cognitive control. Young adults with heterogeneous language experiences completed the Language Social Background Questionnaire (Anderson et al., 2018). Electroencephalography data were recorded from 70 participants who completed the Active Visual Oddball paradigm (Kappenman et al., 2021), a task optimized to isolate the P3b response. We found that more bilingual language experience was associated with larger P3b effects, even in the absence of behavioral differences. These results highlight the importance of characterizing bilingualism along a continuum when investigating bilingual effects on cognitive processing.

Development and Utilization of a Continuous-Space Description of Paintings

Theories of category learning suggest that some aspects of item similarity (e.g., exemplar to exemplar; category center to category center) play a key role in predicting learning. Yet, measuring the distance between items can be difficult for real-world categories, so researchers often use contrived items (e.g., aliens with 1 to 5 circles on their chest). Here, we examined the role of similarity using more naturalistic categories: painting styles. To measure perceived distance between paintings, participants (N = 1,335) completed 512 trials of a triplet task, choosing which of two paintings was visually most similar to a target. Triplets were drawn randomly from 475 still life and landscape paintings by 40 artists. A machine learning algorithm then placed the paintings in a continuous space based on 571,286 total decisions. We used this space in a new learning task, where participants identified the artist of previously seen/unseen paintings. Initial results suggest perceived distances predicted learning outcomes.
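
As a sketch of how triplet judgments can be turned into a continuous space, the code below fits 2-D coordinates with a margin-based triplet loss in PyTorch; the random triplets, dimensionality, and loss function are illustrative assumptions, since the abstract does not specify which embedding algorithm was used.

```python
import torch

torch.manual_seed(0)
n_items, dim, margin = 20, 2, 0.5

# Hypothetical triplets: (target, painting judged more similar, rejected painting)
triplets = torch.randint(0, n_items, (300, 3))
X = torch.randn(n_items, dim, requires_grad=True)    # coordinates to be learned
opt = torch.optim.Adam([X], lr=0.05)

for epoch in range(200):
    t, c, r = triplets[:, 0], triplets[:, 1], triplets[:, 2]
    d_chosen = ((X[t] - X[c]) ** 2).sum(dim=1)       # squared distance target-chosen
    d_reject = ((X[t] - X[r]) ** 2).sum(dim=1)       # squared distance target-rejected
    loss = torch.clamp(margin + d_chosen - d_reject, min=0).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print(X.detach()[:3])   # coordinates of the first three paintings in the learned space
```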

Contextual Malleability of Empathy: Effects of Trait Level, Group

Empathy has mainly been considered a stable trait, and few studies have investigated whether it can vary across situations. This investigation explores how participants' contextual empathy varies across social group relationships and positive/negative event valence. In this study, participants were divided into high- and low-empathy groups based on their Empathy Scale scores. In/out-group membership was manipulated through a point estimation paradigm, and event valence was operationalized by the emotional state of the character in the story event. As expected, results showed main effects of all factors, indicating that participants' contextual empathy differed across conditions. More importantly, a significant ingroup bias emerged, with participants exhibiting enhanced contextual empathy toward ingroup rather than outgroup characters. Furthermore, positive story events elicited more contextual empathic responses than negative events. These findings provide empirical support for the context-dependent nature of empathy, challenging its traditional conceptualization as an invariant trait.

A Preliminary Investigation of Spatial Ability and Spatial Anxiety in Prosthetics and Orthotics Students

Spatial ability is crucial in STEM and medical fields, including prosthetics and orthotics (P&O), which focuses on the design, fabrication, and fitting of prostheses and orthoses. However, spatial ability in P&O practitioners remains unexplored. This study examined spatial ability in P&O master's students using mental rotation and cross-sectioning tasks, and assessed spatial anxiety, including imagery anxiety, mental manipulation anxiety, and navigation anxiety. At orientation, female students reported higher overall spatial anxiety, but there were no gender differences in spatial ability. Mental manipulation anxiety was negatively correlated with cross-sectioning ability in females (r = -0.44, p = 0.009) but not in males (r = -0.08, p = 0.080). After six months, female spatial anxiety decreased, and overall cross-sectioning ability improved. Gender differences in spatial anxiety decreased, though navigation anxiety remained higher in females (t(46) = 2.60, p = 0.01). P&O program participation appears to improve spatial ability and reduce spatial anxiety.

Effective connectivity analysis in children: exploring the impact of the dorsal and ventral part of inferior frontal gyrus on phonological and orthographic processing

Learning to read requires the integration of top-down and bottom-up processing of orthographic and phonological information. This study investigates how the dorsal and ventral inferior frontal gyrus (dIFG and vIFG) interact with posterior regions, including the ventral occipitotemporal cortex (vOT) and temporoparietal cortex (pSTG-SMG), during a visual word rhyming task. Using Dynamic Causal Modeling (DCM) on fMRI data from children aged 10 to 17 years, we examine directional influences among these regions under four conditions involving phonological and orthographic conflict and non-conflict. We hypothesize that dIFG exerts stronger top-down influence than vIFG, particularly under conditions of conflict. Additional hypotheses address the balance between top-down and bottom-up influences, region-specific effects of phonological and orthographic conflict, and the relationship between top-down modulation and reading skills. Data collection is complete, with 72 participants assessed, and analyses are underway. This pre-registered study aims to advance understanding of the neural mechanisms underlying reading development and inform interventions for reading difficulties.

Bounded Ecologically Rational Meta-learned Inference Explains Human Category Learning

In his metaphor of 'behavioral scissors', Herbert Simon proposed that human behavior is shaped by scissors whose two blades are the structure of task environments and the computational capabilities of the actor. However, previous work has mostly studied the two blades of the "behavioral scissors" in isolation. We introduce a new class of models called bounded ecologically rational meta-learned inference (BERMI), which allows for study of the two blades in conjunction. BERMI is rationally adapted to ecologically valid tasks generated from a large language model, but its computational capacity is bounded. We found that BERMI quantitatively explains human choices better than eight other cognitive models in two different category learning experiments. In addition, it captures several qualitative aspects of human categorization, such as learning difficulty and learning speed, much better than other competing models.

VisChatter: Enhancing Synchronous Collaboration on Data Visualization Dashboards with Visual Annotations

Online meetings around data have become integral to insight generation and collaborative decision-making. However, effectively communicating data in these settings presents significant challenges. Visualizations often include multiple patterns to perceive, and verbal descriptions of these patterns can be ambiguous, leading to potential miscommunication. Visual annotations offer a means to clarify these ambiguities and enhance user engagement with the data. Yet, existing online meeting tools often render the creation and management of these annotations cumbersome, detracting from the spontaneity of discussions. To address these challenges, we introduce VisChatter, a tool that facilitates real-time visualization annotation through a multi-modal agent. This agent integrates user speech and mouse movements to generate chart annotations, informed by a formative study that evaluated the efficacy of various annotation techniques. Our evaluation suggests that VisChatter significantly reduces cognitive and physical load during online data pattern communication while maintaining a user experience comparable to established platforms like Zoom.