AAAI Symposium on Generative Models in Cognitive Architectures

18 minute read

These are the notes I took while attending the AAAI Symposium on Generative Models in Cognitive Architectures in Washington, D.C. I had a great experience at this conference discussing use cases for generative AI models in cognitive modeling.

As with my past blog posts that include conference notes, the comments and descriptions are my own interpretation of the presentations given by the researchers below and may not necessarily reflect the thoughts or opinions of those researchers. I have tried to add relevant links to recent research by these presenters to give more context to their work.

AAAI Fall Symposium Series - Integration of Generative Models in Cognitive Architectures

Introduction (John Laird)

Introduction to Structure and Themes of Symposium: John Laird

  • Cognitive architectures need help from generative models.
  • Multiple representations of knowledge are relevant for the design and structure of cognitive architectures.
  • Best match to brain structure for dynamic resting-state brain imaging.
  • LLMs: Transformers are interesting because they can determine which features of the prompt are relevant for predicting the next words. Many believe that aspects of CAs such as metacognition will emerge naturally from LLMs.
  • Related reading: “A survey on large language model based autonomous agents” and “The rise and potential of large language model based agents”.
  • Limitations of LLMs: no online learning, and the odd situation of being ‘stuck’ in 2021 (Bard doesn’t quite have this issue). One of the points of the workshop is the promise of using cognitive architectures to address the limitations of LLMs, and vice-versa.
  • Q: What are the limitations of transformers compared to RNNs, which seem to no longer be as relevant? A: Transformers appear to have some recurrent components, but they usually are not trained online, so there appear to be large holes between generative models and what we want out of cognition.
  • Q: What is the status of the physical symbol system hypothesis? A: Symbolic processing at some level is necessary but not sufficient. Paul Rosenbloom’s paper goes into this question.

Integrating Large Language Models and Cognitive Architectures: Exploring Synergistic Approaches for AI Agents: Oscar Javier Romero López, John Zimmerman, Aaron Steinfeld and Anthony Tomasic

  • Propose three methods of integrating LLMs into CAs: the neuro-symbolic, modular, and agency approaches.
  • Running example of a cognitive agent application: assisting a blind person with navigation.
  • Modular approach: Assumes the common model of cognition.
  • Cognitively augmented large language model: the CA is used to retrieve information about the semantic relationships of words in the context of a query, such as ‘downtown’, resolved based on time and contextual information.
    • This seems like an odd application given that LLMs are traditionally much better at referent resolution than CAs. Wouldn’t it be better to just include that information into the LLM prompt?
  • Leveraging LLMs to support multi-modal perception and action: generate a semantic representation of a prompt input that can then be used by a CA to do reasoning.
  • Semantic and episodic applications: automatically populate ontologies, or generate ACT-R productions for a specific application.
  • Internal simulation for anticipation and planning: using an LLM to chain sequences of events in a perception-action-perception loop. Ask what will happen next; GPT can be used to assign a probability of what will happen next based on context (see the sketch after this list).
  • Neuro-symbolic approach: Clarion, with a focus on the action-centered subsystem, which is similar to the procedural memory of other cognitive architectures. Generate generalization or specialization rules that can be applied within the Clarion architecture.
  • Agency approach: Society of Mind (Minsky). Many agents receiving direction from the global workspace. Agents play different roles and may focus on one aspect of the task, in a pairwise or supervisory role.
  • Multiple CAs + LLMs interacting collaboratively to exchange knowledge and intentions with humans.
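
The internal-simulation idea above lends itself to a small sketch: ask a language model to score candidate next perceptions given the current perception and action, and turn those scores into a distribution over next events. `llm_score` below is a hypothetical stand-in for real LLM log-probabilities (e.g., summed token log-probs from an API), so treat this as an illustration of the idea rather than the authors' implementation.

```python
import math

def llm_score(context: str, candidate: str) -> float:
    """Hypothetical stand-in for an LLM log-probability of `candidate` given `context`.
    In practice this would come from a model API's token log-probs."""
    overlap = len(set(context.lower().split()) & set(candidate.lower().split()))
    return float(overlap) - 0.1 * len(candidate.split())  # toy heuristic so the sketch runs

def predict_next_event(perception: str, action: str, candidates: list) -> list:
    """Rank candidate next perceptions in a perception-action-perception chain."""
    context = f"The agent perceived: {perception}. The agent then did: {action}. Next,"
    scores = [llm_score(context, c) for c in candidates]
    # Softmax the scores into a probability distribution over next events.
    m = max(scores)
    exp_scores = [math.exp(s - m) for s in scores]
    total = sum(exp_scores)
    return sorted(zip(candidates, (e / total for e in exp_scores)), key=lambda p: -p[1])

if __name__ == "__main__":
    for event, prob in predict_next_event(
        perception="a crosswalk signal showing a red hand",
        action="wait at the curb",
        candidates=["the signal changes to walk", "a car stops at the light", "the agent crosses"],
    ):
        print(f"{prob:.2f}  {event}")
```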

Commonalities between all three approaches: all use perception, action, and working memory, and all use LLMs to convert symbols into natural language and vice-versa. The main difference between the approaches is the level of integration and the robustness: the neuro-symbolic approach is tightly integrated, the agency approach is loosely integrated, and the modular approach is in between.

Toolformer: how does this relate? It is close to the modular approach in that we query external APIs.

Cognitive Architectures for Language Agents: Theodore Sumers, Shunyu Yao, Karthik Narasimhan and Thomas L. Griffiths

  • Using language models combined with value functions learned by robotic systems, e.g., SayCan and Generative Agents. Interested in extending LLMs into different environments, different types of feedback and goals for the agent, etc.
  • Newell and Simon’s thermostat agent as a control-flow production system. We can think of language models as a distribution over productions. This conceptualizes prompt engineering as a bias over the initial distribution of productions, but all changes to the prompt impact the productions, some in unforeseen ways.
    • Shifted the posterior over productions by giving examples, but there is no proof that we actually shifted the posterior, and the amount of the shift isn’t quantifiable. How useful is this conceptualization?
  • CoALA: Uses memory, action space, and decision spaces as a simplified architecture of cognition. Operates in the form of natural language instead of logical formulas.
  • Both external actions interacting with environments as well as internal actions involving reasoning, retrieval, and learning.
  • Decision-making techniques: output the action directly, output several actions and evaluate each, or generate many and simulate each iteratively in multiple steps (see the sketch after this list). Ideally we can split these decisions into cycles of planning and execution. SayCan (Ahn 2022), ReAct (Yao 2022), Voyager (Wang 2023), Generative Agents (Park 2023), Tree of Thoughts (2023).
  • Gap in applications is updating agent code for decision-making. Learn new procedures.
  • CoALA abstractions to help develop language agents.
  • Raises questions of the role of memory in decision-making. Useful to think of committing memory to long term memory as an active decision making process.
  • There is a belief that more capable large language models mean less structure is required, but there needs to be some architecture, some interface to a motor system or some action processing.
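
To make the propose/evaluate/select cycle above concrete, here is a rough sketch of a single CoALA-style decision cycle, separating internal actions (reasoning, retrieval, committing to memory) from the external action that gets executed. `propose_actions` and `evaluate_action` are hypothetical placeholders for LLM calls; this is my reading of the cycle, not code from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    working_memory: list = field(default_factory=list)   # current observations and thoughts
    episodic_memory: list = field(default_factory=list)  # past cycles, committed deliberately

def propose_actions(state: AgentState, k: int = 3) -> list:
    """Hypothetical LLM call: sample k candidate actions given working memory."""
    return [f"candidate action {i} given '{state.working_memory[-1]}'" for i in range(k)]

def evaluate_action(state: AgentState, action: str) -> float:
    """Hypothetical LLM call: score how promising an action looks (higher is better)."""
    return -float(len(action))  # toy stand-in so the sketch runs

def decision_cycle(state: AgentState, observation: str) -> str:
    state.working_memory.append(observation)                          # perception into working memory
    candidates = propose_actions(state)                               # planning: propose
    best = max(candidates, key=lambda a: evaluate_action(state, a))   # planning: evaluate and select
    state.episodic_memory.append((observation, best))                 # learning as an explicit decision
    return best                                                       # execution: external action

if __name__ == "__main__":
    state = AgentState()
    print(decision_cycle(state, "the door ahead is closed"))
```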

Offline Integrations / Content (David Reitter)

Automating Knowledge Acquisition for Content-Centric Cognitive Agents Using LLMs: Sergei Nirenburg, Sanjay Sarma Oruganti Venkata, Jesse English and Marjorie McShane

  • Language-endowed intelligent systems (LEIAs) automatically collecting computational lexicons or ontologies. This typically requires human knowledge engineering, a significant bottleneck.
  • Generate text meaning representations based on lexicons, ontologies, and working memory.
  • Cognitive architectures for LEIAs optionally use an external dataset of language to supplement their knowledge when they lack the relevant information in their ontology.
  • OntoGen creates example sentences that can be used as prompts to LLMs to determine the features of new lexical entries, which can then be bootstrapped into the lexicon (see the sketch after this list).
  • Generate multi-word expressions and validate them based on whether they produce legitimate sentences. How is this done? A COCA corpus search.
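
My understanding of the bootstrapping step is roughly the following: build a prompt from example sentences containing the new word, ask an LLM for feature/value pairs, and parse them into a lexical entry. `query_llm` is a hypothetical stand-in that returns a canned answer so the sketch runs, and the feature names are illustrative rather than OntoGen's actual schema.

```python
def query_llm(prompt: str) -> str:
    """Hypothetical LLM call; returns a canned response so the sketch is runnable."""
    return "part_of_speech: noun; semantic_type: VEHICLE; agentive: no"

def build_lexicon_prompt(word: str, example_sentences: list) -> str:
    examples = "\n".join(f"- {s}" for s in example_sentences)
    return (
        f"The word '{word}' appears in the following sentences:\n{examples}\n"
        "List its lexical features as 'feature: value' pairs "
        "(part of speech, semantic type, agentive)."
    )

def bootstrap_entry(word: str, example_sentences: list) -> dict:
    """Parse the LLM response into a feature dictionary to add to the lexicon."""
    response = query_llm(build_lexicon_prompt(word, example_sentences))
    entry = {}
    for pair in response.split(";"):
        if ":" in pair:
            feature, value = pair.split(":", 1)
            entry[feature.strip()] = value.strip()
    return entry

if __name__ == "__main__":
    print(bootstrap_entry("tram", ["The tram stopped at the corner.", "She rode the tram to work."]))
```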

Generating Chunks for Cognitive Architectures: Goonmeet Kaur Bajaj, Kate Pearce, Sean Kennedy, Othalia Larue, Alexander Hough, Jayde King, Christopher Myers and Srinivasan Parthasarathy

  • Data processing needs of AI vs Cognitive Architectures.
    • CAs: require constant knowledge engineering, which is time consuming. Want to automatically generate chunks for CAs.
  • Analogical reasoning with ACT-R involves retrieval, mapping, and transfer. A structure-mapping engine is used to understand the similarity between the source domain and the application domain.
  • Want to extract subject object relations from unstructured language. LLMs are great at this.
  • Many triplets were filtered. How? Is this domain specific? If it is, then does the whole pipeline require this domain-specific knowledge to remove bad examples?
  • Another approach is to directly prompt the language model to generate chunks and rank them. This can be difficult due to low accuracy, which could be improved by ranking the triples based on their relevance to the task (see the sketch after this list).
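
A sketch of the triple-based pipeline as I understood it: extract subject-relation-object triples from text (the talk uses an LLM; here a trivial pattern keeps the sketch self-contained), rank them by relevance to the task, and emit the top ones as chunk-like slot/value structures. The scoring heuristic and chunk format are illustrative assumptions, not the authors' pipeline.

```python
import re

def extract_triples(text: str) -> list:
    """Crude subject-relation-object extraction (stand-in for LLM-based extraction)."""
    triples = []
    for sentence in re.split(r"[.!?]", text):
        words = sentence.strip().split()
        if len(words) >= 3:
            triples.append((words[0], words[1], " ".join(words[2:])))
    return triples

def relevance(triple: tuple, task_keywords: set) -> int:
    """Rank triples by keyword overlap with the task (stand-in for an LLM relevance score)."""
    return len(set(" ".join(triple).lower().split()) & task_keywords)

def triples_to_chunks(text: str, task_keywords: set, top_k: int = 2) -> list:
    ranked = sorted(extract_triples(text), key=lambda t: relevance(t, task_keywords), reverse=True)
    # Emit ACT-R-style chunks as simple slot/value dictionaries.
    return [{"isa": "fact", "subject": s, "relation": r, "object": o} for s, r, o in ranked[:top_k]]

if __name__ == "__main__":
    text = "Mercury orbits the sun. Water boils at 100 degrees. Mercury lacks an atmosphere."
    print(triples_to_chunks(text, task_keywords={"mercury", "sun", "orbits"}))
```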

Opportunities and Challenges in Applying Generative Methods to Exploring and Validating the Common Model of Cognition: Catherine Sibert and Sönke Steffen

  • Using fMRI analysis to determine the connection between different regions using dynamic causal modeling.
  • Evaluating predictions of the best fit is difficult due to the predictions covering multiple different regions of the brain. Compared against six different alternative architectures.
    • This raises the issue of evaluation: how well can this approach evaluate specific theories, like whether action impacts long-term memory and whether it is an active process? Would this be possible with this comparison?
  • Expanding the common model: perception is typically mapped to just visual perception, and they want to incorporate both visual and auditory perception, with different theories of how these two connect.
  • An issue with this analysis is that more complex models are more likely to provide a good fit for observed data.
  • There is also a data issue of using bad data to predict the observations, but there is still a clear ranking of models based on complexity. This could be used to compare models of different complexity, based on the improvement from bad to good data.

Comparing LLMs for Prompt-Enhanced ACT-R and Soar Model Development: A Case Study in Cognitive Simulation: Siyu Wu, Rodrigo Souza, Frank Ritter and Walter Lima

  • It is difficult to extend models like ACT-R to new applications; LLMs can potentially fix these issues and are used here to build models. Compared GPT-4 and Bard in their ability to develop CA models in an interactive methodology.
  • Ask the user what they know of ACT-R, educate them on the functionality of the model, and get them to produce the chunks they think might be relevant. In the human-in-the-loop setup, the model produces chunks and the human evaluates them (see the sketch after this list).
  • Human feedback can improve issues of syntax and other qualities that the GPT model gets wrong.
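
A sketch of the interactive loop I understood from this talk: the model drafts an ACT-R chunk, a minimal syntax check is applied, and a human is asked to correct anything the check flags before the chunk is accepted. `draft_chunk_with_llm`, the chunk format, and the check itself are my own illustrative placeholders, not the authors' pipeline.

```python
def draft_chunk_with_llm(description: str) -> str:
    """Hypothetical LLM call that drafts an ACT-R chunk definition from a description."""
    return f"(fact-1 isa fact topic {description.split()[0].lower()})"

def syntax_ok(chunk: str) -> bool:
    """Minimal well-formedness check: balanced parentheses and an 'isa' slot."""
    return chunk.count("(") == chunk.count(")") and " isa " in chunk

def human_feedback(chunk: str) -> str:
    """Stand-in for the human-in-the-loop step: report the draft for manual correction."""
    print(f"Please review/correct this chunk: {chunk}")
    return chunk  # in the real loop the human would edit the chunk text here

def generate_chunk(description: str, max_rounds: int = 3):
    chunk = draft_chunk_with_llm(description)
    for _ in range(max_rounds):
        if syntax_ok(chunk):
            return chunk
        chunk = human_feedback(chunk)
    return None

if __name__ == "__main__":
    print(generate_chunk("Paris is the capital of France"))
```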

Behavior Prediction using LLMs: Anthony Harrison, Laura Hiatt and Greg Trafton

  • Using CAs to help LLMs, and LLMs to help CAs, build cognitive models.
  • Smaller and more restricted tasks than language.
  • Try a smaller model using just a single transformer with a single head.
  • Training GPT with the implicit data required to complete tasks, and seeing if it can then predict human behavior while people play that same task.
  • Is there a benefit to mixing human and synthetic data in pre-training GPT models? This is what I am currently testing for phishing email classification.

Individual Module Extensions to Cognitive Architectures (Christian Lebiere)

Exploiting Language Models as a Source of Knowledge for Cognitive Agents: James Kirk, Robert Wray and John Laird

  • Can language models be an effective source of knowledge?
    • Indirect extraction
    • Direct extraction
    • Model control
  • Interactive task learning: the user would need to describe the actions for an AI agent to take or the goals that it needs to achieve. With LLMs, we can instantiate the high-level goal from the human with relevant information provided by the cognitive architecture.
  • To solve this task the model asks the human for in the loop responses of instructions for sub-tasks of actions, which allows it to fill in gaps of its knowledge.
  • Challenges of direct extraction of information: accuracy, relevance, interpretability, and other factors. Matching these goals with human expectations is done through human oversight.
  • The Soar model is augmented with the ability to query the LLM API for symbolic working memory information.
  • Search tree analysis and repair selection (STARS): beam search for many responses, evaluate them and repair mismatches with re-prompting, prompt the LLM to select a candidate, and query the human in the loop for conflict resolution (see the sketch below).
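
A rough sketch of the generate-evaluate-repair-select loop described above: sample several responses, keep the ones that match the expected symbolic form, re-prompt to repair the rest, and fall back to the human when multiple candidates remain. The function names and checks are my own placeholders for the STARS components, not the authors' code.

```python
def sample_llm_responses(prompt: str, n: int = 4) -> list:
    """Hypothetical beam/sampling step; returns canned variants so the sketch runs."""
    return [f"{prompt} -> option {i}" for i in range(n)]

def matches_expected_form(response: str) -> bool:
    """Check whether a response fits the symbolic form the architecture can consume."""
    return "option" in response

def repair(prompt: str, bad_response: str) -> str:
    """Re-prompt the LLM to fix a malformed response (stand-in implementation)."""
    return bad_response + " (repaired)"

def ask_human_to_resolve(candidates: list) -> str:
    """Human-in-the-loop conflict resolution; here we simply take the first candidate."""
    return candidates[0]

def stars_like_select(prompt: str) -> str:
    candidates = sample_llm_responses(prompt)
    good = [c for c in candidates if matches_expected_form(c)]
    repaired = [repair(prompt, c) for c in candidates if not matches_expected_form(c)]
    candidates = good + [c for c in repaired if matches_expected_form(c)]
    if len(candidates) == 1:
        return candidates[0]
    return ask_human_to_resolve(candidates)

if __name__ == "__main__":
    print(stars_like_select("place the red block on the blue block"))
```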

Using Large Language Models in the Companion Cognitive Architecture: A Case Study and Future Prospects: Constantine Nakos and Kenneth Forbus

  • Companion cognitive architecture: software social organisms.
  • Uses semantic information in the form of lexical statements about facts that are used for performing reasoning.
  • It is difficult for Companion to disambiguate which of the candidate theories it generates are correct.
  • Align Wikipedia sentences into facts, in the form of query cases that are rule-like concepts mapping semantics onto facts.
  • LLMs are used for disambiguation, to whittle down the number of interpretations that could explain something.

On Using Generative Models in a Cognitive Architecture for Embodied Agents: Dongkyu Choi

  • Vision-language-action transformer-based models could be used in robotics settings to control embodied agents while leveraging foundation models that are pretrained on a variety of domains.
  • Some of the knowledge contained in LLMs is ‘generalist’ rather than ‘common sense’, in that it is highly specific knowledge that can’t be reached through common-sense reasoning; the model knows it only because there are many texts about those topics online.
  • GAI and GMs for embodied agents are typically either planning based or action based, or hybrid approaches.
  • A newer task is to form executable action commands through an action translator, which may require some common-sense reasoning for applications in household or real-world settings, but not necessarily in industrial or factory domains, where on-the-fly learning is more relevant.

Proposal for Cognitive Architecture and Transformer Integration: Online Learning from Agent Experience: John Laird, Robert Wray, Steven Jones, James Kirk and Peter Lindes

Thursday, 26 Oct

Impact Across Multiple Traditional Cognitive Architecture Modules (Andrea Stocco)

Psychologically-Valid Generative Agents: A Novel Approach to Agent-Based Modeling in Social Sciences: Konstantinos Mitsopoulos, Ritwik Bose, Brodie Mather, Archna Bhatia, Kevin Gluck, Bonnie Dorr, Christian Lebiere and Peter Pirolli

  • Public policy impacts individual behavior, but epidemiological models don’t take into account the impact of policy on the behavior of individuals.
  • Simulated agents interacting in a social network to predict the spread of diseases like COVID-19. Psychologically grounded behavior predictions that impact the network of social interactions.
  • LLMs are used to generate the natural language used for communication between members of the network.
  • Extracting beliefs and sentiments from online social media sources.
  • Agent-based modeling is used to explain complex systems without using mathematical equations; simple rules define the interaction of agents in a network.
  • Emergent dynamics resulting from the interaction of the agents.
  • Nodes of the network perceive the global and local information.
    • It seems like in the COVID case, people aren’t basing their decision making on experience when making a choice to wear a mask or get a vaccine.
  • Predictions of the proportion of the population wearing masks are accurate to real-world observations in California.
  • Generative Agents: Interactive Simulacra of Human Behavior. The system changes what agents perceive based on their memory stream. These have been applied to modeling virus spread in agent-based models of social networks. This network can’t be scaled to large sizes due to the computational requirements of the LLM.
  • CAs are instantiated based on social media posts, their beliefs and attitudes, and then updated by taking in new social media and agent communication. Stance detection is used to analyze the beliefs of other agents, i.e., their theory of mind.

Bridging Generative Networks with the Common Model of Cognition: Robert West, Spencer Eckler, Brendan Conway-Smith, Nico Turcas, Eilene Tomkins-Flanagan and Mary Kelly

  • Representation agnostic: Neural-symbolic, neural networks, symbolic, nets, etc.
  • Pipeline model of the common model of cognition involves the integration of different perceptual faculties into working memory.
  • One issue with how working memory integrates this is that spontaneous memory retrievals can be disruptive, so we can add a middle memory system to bridge the two.
  • This has been used to model emotional changes: shadow productions recommend productions (of emotions?) but can’t fully control the system.
  • Middle memory productions can be combined to connect things like visual observations and motor control affordances or emotional reactions to perception.
  • Shadow productions can be improved through utility feedback, so that productions that lead to highly rewarding situations become more common and more likely to continue firing (see the sketch below).
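
The utility-feedback idea reads to me like ACT-R-style utility learning, where a production's utility moves toward the reward it receives, so productions that keep paying off fire more often. A minimal sketch of that update rule (the learning rate and reward values are illustrative):

```python
def update_utility(utility: float, reward: float, alpha: float = 0.2) -> float:
    """ACT-R-style utility learning: U <- U + alpha * (R - U)."""
    return utility + alpha * (reward - utility)

if __name__ == "__main__":
    utilities = {"shadow-production-a": 0.0, "shadow-production-b": 0.0}
    # Production A keeps leading to rewarding situations; B does not.
    for _ in range(10):
        utilities["shadow-production-a"] = update_utility(utilities["shadow-production-a"], reward=1.0)
        utilities["shadow-production-b"] = update_utility(utilities["shadow-production-b"], reward=0.0)
    print(utilities)  # A's utility climbs toward 1.0, making it more likely to be selected again
```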

A Proposal for a Language Model Based Cognitive Architecture: Kobe Knowles, Michael Witbrock, Gillian Dobbie and Vithya Yogarajan

  • Language model based cognitive architecture (LMCA).
  • GPT can’t identify programming problems where you switch the definition of terms like len and print so that they are opposites.
  • Confabulation, overfitting to training data, and data inefficiency.
  • Thoughts generated by cognitive architecture can impact the generation of thought.

Alternative Integrations/Unifications of Generative Models and Cognitive Architecture (Paul Rosenbloom)

The Grounding Problem: An Approach to the Integration of Cognitive and Generative Models: Mary Lou Maher, Dan Ventura and Brian Magerko

  • Symbolic systems in cognitive architectures vs. connectionist methods.
  • How can AI systems be grounded in a world with which they have no direct interaction? The claim was that grounding is between symbols and internal grounding, without external interaction.
  • The vector grounding problem: performing numerical operations on a numerical representation of symbols such that, when decoded, the meaning is retained through those mathematical operations. This is the method of achieving grounding in connectionist methods, where the inputs are words, tokens, etc., associated with some external information like sounds, visual information, etc.
  • Referential grounding connects words/symbols to some notion that exists in the world.
  • Can either ground concepts in symbols or vectors.
  • Integrating CA and Generative models to replicate human creativity or supplement humans doing creative tasks.
  • Wundt curve: if something is highly similar or dissimilar then it is uninteresting or difficult, and the midpoint aids in creativity.
  • Combinatorial creativity, where two or more concepts are combined into a midpoint between them, is easier for AI; transformational creativity, where the boundary is expanded and the new idea lies outside the ‘box’, is more difficult.
  • VAE embeddings for symbols, trying to keep the embeddings grounded throughout the formation of the representations of symbols.

Cognitive Architecture Toward Common Ground Sharing Among Humans and Generative AIs: Trial Modeling on Model-Model Interaction in Tangram Naming Task: Junya Morita, Tatsuya Yui, Takeru Amaya, Ryuichiro Higashinaka and Yugo Takeuchi

  • This talk focuses on concretizing cognitive models with generative AI, working toward common ground between humans and AIs.
  • Representing the information used by cognitive models in a way that is easier for humans to understand.
  • Shannon communication channel for communication between humans and AI, and vice-versa. Miscommunication is explained by different interpretations of ambiguous communicated information.
  • Tangram naming task between human participants: they verbally explain the visual information that they are both observing, with some slight differences in the rotation of the abstract images.
  • Holistic vs analytical utterances used in communication to create a common grounding between two humans.

Generative Environment-Representation Instance-Based Learning: A Cognitive Model: Tyler Malloy, Yinuo Du, Fei Fang and Cleotilde Gonzalez

  • My talk

A Neuro-Mimetic Realization of the Common Model of Cognition via Hebbian Learning and Free Energy Minimization: Alexander Ororbia and Mary Alexandria Kelly

  • Online learning across different tasks without catastrophic interference in a cognitive neural generative system.
  • Inspired by predictive processing.
  • Global workspace theory

Bridging Cognitive Architectures and Generative Models with Vector Symbolic Algebras: P. Michael Furlong and Chris Eliasmith

  • VSAs can be used to encode data structures like lists, trees, graphs, and hash maps (see the sketch after this list).
  • Holographic reduced representations of these structures
  • Local smoothness of representations, even when representing categorical data.
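
A minimal sketch of holographic reduced representations, the binding operation behind the VSA data structures mentioned above: circular convolution binds a role vector to a filler vector, superposition stores several bindings in one trace, and convolving with the approximate inverse of a role retrieves a noisy copy of its filler. This is a generic HRR illustration, not the speakers' implementation.

```python
import numpy as np

def bind(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Circular convolution: binds two vectors into one of the same dimensionality."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def approx_inverse(a: np.ndarray) -> np.ndarray:
    """Approximate inverse for HRRs: keep the first element, reverse the rest."""
    return np.concatenate(([a[0]], a[1:][::-1]))

def unbind(trace: np.ndarray, role: np.ndarray) -> np.ndarray:
    """Retrieve a noisy copy of the filler bound to `role` in `trace`."""
    return bind(trace, approx_inverse(role))

def cosine(x: np.ndarray, y: np.ndarray) -> float:
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 1024
    role_subject, role_object = rng.normal(0, 1 / np.sqrt(d), (2, d))
    dog, ball = rng.normal(0, 1 / np.sqrt(d), (2, d))
    # A simple structure {subject: dog, object: ball}, stored as a superposition of bindings.
    trace = bind(role_subject, dog) + bind(role_object, ball)
    retrieved = unbind(trace, role_subject)
    print("similarity to dog :", cosine(retrieved, dog))   # high
    print("similarity to ball:", cosine(retrieved, ball))  # near zero
```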