Artificial General Intelligence
— A gentle introduction

Pei Wang

[This page contains up-to-date information about the field of Artificial General Intelligence (AGI), collected and organized according to my judgment, though efforts are made to avoid personal biases.]

From AI to AGI

AI: in different directions, and through seasonal cycles

Artificial Intelligence (AI) started with "thinking machine" or "human-comparable intelligence" as the ultimate goal, as documented by the following literature:

In the past, there were some ambitious projects aiming at this goal, though they all failed. The best-known examples include the following ones:

Partly due to the recognized difficulty of the problem, in the 1970s-1980s mainstream AI gradually moved away from general-purpose intelligent systems, and turned to domain-specific problems and special-purpose solutions, though there are opposite attitudes toward this change:

Consequently, the field currently called "AI" consists of many loosely related subfields without a common foundation or framework, and suffers from an identity crisis:
  • External recognition: As soon as a problem is solved, it is no longer considered to require "intelligence", so the AI community rarely gets credit.
  • Internal fragmentation: The subfields of AI have become less and less associated with one another, even though their problems are closely related.

A new spring

Roughly in the period of 2004 to 2007, calls for research on general-purpose systems returned, both inside and outside mainstream AI.

Anniversaries are a good time to review the big picture of the field. In the following collections and events, many well-established AI researchers raised the topic of general-purpose and human-level intelligence:

More or less coincidentally, from outside mainstream AI, there were several books with bold titles and novel technical approaches to produce intelligence as a whole in computers:

There were also several less technical but more influential books, with the same optimism on the possibility of building general-purpose AI:

So after several decades, "general-purpose system", "integrated AI", and "human-level AI" have become less taboo (though still far from popular) topics, as shown by several related meetings:

It's summer again

Since 2008, several research communities have emerged, with similar focuses and overlapping participants:

More research books have been published:

In mainstream AI, deep learning has made impressive progress in recent years, which has raised many people's hopes for "human-level" AI once again. The claim "The Turing Test has been passed" and the success of AlphaGo in the board game Go renewed the discussion on what "artificial intelligence" is really about, and how to reach it. There is still no consensus, and the opinions are not even converging.

Several large companies have labeled their results as "steps towards AGI"; their approaches are either extensions of deep learning or integrations of existing AI techniques. This approach is exemplified by large language models like GPT-4, which its creator claims to be "a significant step towards AGI".

Partly triggered by the recent progress, more and more people consider AGI, or whatever it is called, to be really possible. As a consequence, its risks and safety have become a hot topic:

AGI Basics

The most general questions every AGI researcher needs to answer include:
  1. What is AGI, accurately specified?
  2. Is it possible to build the AGI as specified?
  3. If AGI is possible, what is the most plausible way to achieve it?
  4. Even if we know how to achieve AGI, should we really do it?
[My own answers to these questions are here.]
The major answers in the field of AGI are summarized below.

What is AGI

Roughly speaking, Artificial General Intelligence (AGI) research has the following features:
  • Stressing the general-purpose nature of intelligence,
  • Taking a holistic or integrative viewpoint on intelligence,
  • Believing the time has come to build an AI that is comparable to human intelligence.
Therefore, "AGI" is closer to the original meaning of "AI", while very different from the current mainstream "AI research", which focuses on domain-specific and problem-specific methods. "AGI" is similar or related to notions like "strong AI", "human-level AI", "complete AI", "thinking machine", "cognitive computing", and some others. Here is an explanation about the selection of the term "AGI".

A complete work of AGI should consist of

  1. A theory of intelligence (described in a human language),
  2. A model of the theory (described in a symbolic/mathematical language),
  3. A computer implementation of the model (realized in software/hardware).

Even though there is a vague consensus on the objective of reproducing "intelligence" as a whole in computers, the current AGI projects are not aimed at exactly the same goal. Though every AGI approach gets its inspiration from the same source, that is, human intelligence, here "intelligence" is understood in several senses. Consequently, AGI projects attempt to duplicate human intelligence at different levels of abstraction:

  • Structure
    Rationale: Intelligence is produced by the human brain. Therefore, to build an intelligent computer means to simulate the brain structure as faithfully as possible.
    Background: Neuroscience, biology, etc.
    Examples: HTM, Vicarious
    Challenge: There may be biological details that are neither possible nor necessary to be reproduced in AI systems.
  • Behavior
    Rationale: Intelligence is displayed in how human beings behave. Therefore, the goal should be to make a computer behave exactly like a human.
    Background: Psychology, linguistics, etc.
    Examples: Turing Test, ChatGPT
    Challenge: There may be psychological or social factors that are neither possible nor necessary to be reproduced in AI systems.
  • Capability
    Rationale: Intelligence is evaluated by problem-solving capability. Therefore, an intelligent system should be able to solve certain practical problems that are currently solvable only by humans.
    Background: Computer application guided by domain knowledge
    Examples: IBM Watson, AlphaGo
    Challenge: There are no defining problems of intelligence, and the special-purpose solutions lack generality and flexibility.
  • Function
    Rationale: Intelligence is associated with a collection of cognitive functions, such as perceiving, reasoning, learning, acting, communicating, problem solving, etc. Therefore the goal is to reproduce these functions in computers.
    Background: Computer science
    Examples: Mainstream AI textbooks, Soar
    Challenge: The AI techniques developed so far are highly fragmented and rigid, and it is hard for them to work together.
  • Principle
    Rationale: Intelligence is a form of rationality or optimality. Therefore, an intelligent system should always "do the right thing" according to certain general principles.
    Background: Logic, mathematics, etc.
    Examples: AIXI, NARS
    Challenge: There are too many aspects in intelligence and cognition to be explained and reproduced by a simple theory.

From top to bottom, these objectives correspond to descriptions of human intelligence at increasingly general levels, with the aim of reproducing that description in computer systems. Since different descriptions have different granularity and scope, the above objectives are related but still very different, and do not subsume each other. The best way to achieve one is usually not a good choice for the others. [A more detailed discussion of this issue can be found here.]

Not only is the "I" in AGI understood in different ways; even the "G" has been interpreted differently, as referring to AI systems that
  1. Can solve all problems — though no AGI researcher has taken this position, such a 'strawman' target has been used by some people to claim the impossibility of AGI,
  2. Can solve all human-solvable problems — this is basically why the Large Language Models (LLMs) are considered as AGI,
  3. Can solve all computable problems — this is roughly why models like AIXI are considered as AGI,
  4. Can try to solve all representable problems — this is roughly why models like NARS are considered as AGI.
These "AGIs" are after different goals, as clearly shown in their roadmaps:

Because of this diversity in research objectives, the achievements of LLMs have not dominated the current AGI research (as shown in the annual AGI conferences and the Journal of AGI), though to many people outside this research community, "AGI" means "LLM".

Limitations and objections

Since the idea of AI or "thinking machine" appeared, there have been various objections against its possibility. Some people claimed that they have proved that AGI, or whatever it is called, is theoretically impossible, due to certain fundamental limitations of computers.

Many researchers have argued against these objections. Classical arguments can be found in the following works:

Obviously, all AGI researchers believe that AGI can be achieved (though they have different interpretations of the term). In the introductory chapter of the AGI 2006 Workshop Proceedings, Ben Goertzel and I responded to the following common doubts and objections about this research:
  • AGI is impossible.
  • There is no such thing as general intelligence.
  • General-purpose systems are not as good as special-purpose ones.
  • AGI is already included in the current AI.
  • It is too early to work on AGI.
  • AGI is nothing but hype.
  • AGI research is not fruitful.
  • AGI is dangerous.

Some of the doubts about the possibility of AGI come from misconceptions about what AGI attempts to achieve or what computers can do. The previous subsection has clarified the former issue, while an analysis of the latter issue can be found here.

Strategies and techniques

On one hand, the ultimate goal of AGI is to reproduce intelligence as a whole, while on the other hand, engineering practice must be step-by-step. To resolve this dilemma, three overall strategies have been proposed:
  • Hybrid
    Approach: To develop individual functions first (using different theories and techniques), then to connect them together.
    Argument: (AA)AI: More than the Sum of Its Parts, Ronald Brachman
    Difficulty: Compatibility of the theories and techniques
  • Integrated
    Approach: To design an architecture first, then to design its modules (using various techniques) accordingly.
    Argument: Cognitive Synergy: A Universal Principle for Feasible General Intelligence?, Ben Goertzel
    Difficulty: Isolation, specification, and coordination of the functions
  • Unified
    Approach: Using a single technique to start from a core system, then to extend and augment it incrementally.
    Argument: Toward a Unified Artificial Intelligence, Pei Wang
    Difficulty: Versatility and extensibility of the core technique
Obviously, the selection of development strategy partially depends on the selection of the research objective.

At the current time, the major techniques used in AGI projects include, though are not limited to:
  • logic
  • probability theory
  • production system
  • graph theory
  • knowledge base
  • learning algorithm
  • neural network
  • evolutionary computation
  • robotics
  • multi-agent system
Though each of these techniques is also explored in mainstream AI, using it in a general-purpose system leads to very different design decisions in the technical details.

The ethics of AGI

Even if we have found out how to achieve AGI, it does not necessarily mean we really want to do it. Like all major scientific discoveries and technical breakthroughs, AGI has the potential to revolutionize our life and even the fate of the human species, either in a desired way or an undesired way — or, as things usually go, a mixture of the two.

AGI researchers are aware of their responsibility on this topic, though most of them think that, according to the currently available evidence, progress in AGI research will benefit the human species rather than destroy it. Discussions on how to make AGI "safe" have existed in AGI meetings since the very beginning. Sample discussions include

Of course, many crucial problems remain open, but to find their solutions, AGI research should be sped up rather than slowed down. Once again, some widespread concerns and fears about AGI are based on misconceptions about the nature of AGI.

Representative AGI Projects

The following projects are selected to represent the current AGI research, as for each of them, it can be said that
  1. It is clearly oriented to AGI (that is why IBM's Watson and DeepMind's AlphaGo are not included)
  2. It is still very active (that is why Pollock's OSCAR and Brooks' Cog are no longer included)
  3. It has ample publications on technical details (that is why many recent AGI projects are not included yet, except GPT-4 that is used to represent various deep learning projects toward AGI)

The projects are listed in alphabetical order. Each project name is linked to the project website, from which the following quotations are extracted. The focus of the quotations is on the research goal (the 1st question) and technical path (the 3rd question). Two publications on the project are selected, usually one brief introduction and one detailed description.

ACT-R [An Integrated Theory of the Mind; The Atomic Components of Thought]
ACT-R is a cognitive architecture: a theory for simulating and understanding human cognition. Researchers working on ACT-R strive to understand how people organize knowledge and produce intelligent behavior. As the research continues, ACT-R evolves ever closer into a system which can perform the full range of human cognitive tasks: capturing in great detail the way we perceive, think about, and act on the world.

On the exterior, ACT-R looks like a programming language; however, its constructs reflect assumptions about human cognition. These assumptions are based on numerous facts derived from psychology experiments. Like a programming language, ACT-R is a framework: for different tasks (e.g., Tower of Hanoi, memory for text or for list of words, language comprehension, communication, aircraft controlling), researchers create models (aka programs) that are written in ACT-R and that, beside incorporating the ACT-R's view of cognition, add their own assumptions about the particular task. These assumptions can be tested by comparing the results of the model with the results of people doing the same tasks.

ACT-R is a hybrid cognitive architecture. Its symbolic structure is a production system; the subsymbolic structure is represented by a set of massively parallel processes that can be summarized by a number of mathematical equations. The subsymbolic equations control many of the symbolic processes. For instance, if several productions match the state of the buffers, a subsymbolic utility equation estimates the relative cost and benefit associated with each production and decides to select for execution the production with the highest utility. Similarly, whether (or how fast) a fact can be retrieved from declarative memory depends on subsymbolic retrieval equations, which take into account the context and the history of usage of that fact. Subsymbolic mechanisms are also responsible for most learning processes in ACT-R.
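
The conflict-resolution step described in the quotation above can be illustrated with a minimal sketch. It assumes the classic expected-gain form of utility, U = P*G - C (estimated probability of success times the value of the goal, minus estimated cost). The class and function names are hypothetical, and this is not ACT-R's actual code, which among other things adds noise to the utilities and learns the estimates from experience.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Production:
    """A hypothetical production rule with subsymbolic estimates attached."""
    name: str
    success_probability: float  # P: estimated chance that firing leads to the goal
    cost: float                 # C: estimated cost (e.g. time) of firing


def utility(production: Production, goal_value: float) -> float:
    """Assumed expected-gain utility: U = P * G - C."""
    return production.success_probability * goal_value - production.cost


def select_production(matching: Iterable[Production], goal_value: float) -> Production:
    """Among the productions that match the buffers, pick the one with the highest utility."""
    candidates = list(matching)
    if not candidates:
        raise ValueError("no production matches the current buffer state")
    return max(candidates, key=lambda p: utility(p, goal_value))


if __name__ == "__main__":
    matching = [
        Production("retrieve-answer", success_probability=0.90, cost=2.0),
        Production("count-up", success_probability=0.99, cost=5.0),
    ]
    chosen = select_production(matching, goal_value=20.0)
    print(chosen.name)
```

With these illustrative numbers the cheaper but slightly less reliable production wins, which shows how the subsymbolic estimates, rather than the symbolic match alone, decide what fires.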

AERA [Anytime Bounded Rationality; Autocatalytic Endogenous Reflective Architecture]
AERA is a cognitive architecture - and a blueprint - for constructing agents with high levels of operational autonomy, starting from only a small amount of designer-specified code – a seed. Using a value-driven dynamic priority scheduling to control the parallel execution of a vast number of lines of reasoning, the system accumulates increasingly useful models of its experience, resulting in recursive self-improvement that can be autonomously sustained after the machine leaves the lab, within the boundaries imposed by its designers.

AERA demonstrates domain-independent self-supervised cumulative learning of complex tasks. Unlike contemporary AI systems, AERA-based agents excel at handling novelty - situations, information, data, tasks - that their programmers could not anticipate. It is the only implementable / implemented system in existence for achieving bounded recursive self-improvement.

AERA-based agents learn cumulatively from experience by interacting with the world and generating compositional causal-relational micro-models of its experience. Using non-axiomatic abduction and deduction, it constantly predicts how to achieve its active goals and what the future may hold, generating a flexible opportunistically-interruptable plan for action.

AIXI [Universal Algorithmic Intelligence: A mathematical top->down approach; Universal Artificial Intelligence]
An important observation is that most, if not all known facets of intelligence can be formulated as goal driven or, more precisely, as maximizing some utility function.

Sequential decision theory formally solves the problem of rational agents in uncertain worlds if the true environmental prior probability distribution is known. Solomonoff's theory of universal induction formally solves the problem of sequence prediction for unknown prior distribution. We combine both ideas and get a parameter-free theory of universal Artificial Intelligence. We give strong arguments that the resulting AIXI model is the most intelligent unbiased agent possible.

The major drawback of the AIXI model is that it is uncomputable, ... which makes an implementation impossible. To overcome this problem, we constructed a modified model AIXItl, which is still effectively more intelligent than any other time t and length l bounded algorithm.
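
For orientation, the combination of sequential decision theory and universal induction described in the quotation is usually summarized in one expression. The notation below follows the standard presentation of AIXI rather than the quoted passage: a_i, o_i, and r_i are the actions, observations, and rewards at step i, m is the horizon, U is a universal Turing machine, q ranges over programs, and ℓ(q) is the length of q.

```latex
a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
       \left[ r_k + \cdots + r_m \right]
       \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

The innermost sum ranges over all programs consistent with the interaction history, which is the source of the uncomputability mentioned above.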

Cyc [Cyc: A Large-Scale Investment in Knowledge Infrastructure; Building Large Knowledge-Based Systems]
Vast amounts of commonsense knowledge, representing human consensus reality, would need to be encoded to produce a general AI system. In order to mimic human reasoning, Cyc would require background knowledge regarding science, society and culture, climate and weather, money and financial systems, health care, history, politics, and many other domains of human experience. The Cyc Project team expected to encode at least a million facts spanning these and many other topic areas.

The Cyc knowledge base (KB) is a formalized representation of a vast quantity of fundamental human knowledge: facts, rules of thumb, and heuristics for reasoning about the objects and events of everyday life. The medium of representation is the formal language CycL. The KB consists of terms -- which constitute the vocabulary of CycL -- and assertions which relate those terms. These assertions include both simple ground assertions and rules.
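
The distinction between ground assertions and rules can be illustrated with a toy knowledge base. This is a minimal sketch, not CycL and not Cyc's inference engine: the `Assertion` and `Rule` types, the `?x` variable convention, and the single-premise forward chainer are all assumptions made for illustration.

```python
from typing import Dict, Iterable, NamedTuple, Optional, Set, Tuple


class Assertion(NamedTuple):
    """A predicate applied to terms; terms starting with '?' are variables."""
    predicate: str
    args: Tuple[str, ...]


class Rule(NamedTuple):
    """A hypothetical single-premise rule: if premise holds, so does conclusion."""
    premise: Assertion
    conclusion: Assertion


def is_variable(term: str) -> bool:
    return term.startswith("?")


def match(pattern: Assertion, fact: Assertion) -> Optional[Dict[str, str]]:
    """Return variable bindings that make pattern equal to fact, or None."""
    if pattern.predicate != fact.predicate or len(pattern.args) != len(fact.args):
        return None
    bindings: Dict[str, str] = {}
    for p, f in zip(pattern.args, fact.args):
        if is_variable(p):
            if bindings.setdefault(p, f) != f:
                return None  # same variable bound to two different terms
        elif p != f:
            return None
    return bindings


def substitute(pattern: Assertion, bindings: Dict[str, str]) -> Assertion:
    return Assertion(pattern.predicate, tuple(bindings.get(a, a) for a in pattern.args))


def forward_chain(facts: Iterable[Assertion], rules: Iterable[Rule]) -> Set[Assertion]:
    """Apply the rules to the ground assertions until nothing new is derived."""
    known = set(facts)
    rules = list(rules)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            for fact in list(known):
                bindings = match(rule.premise, fact)
                if bindings is None:
                    continue
                derived = substitute(rule.conclusion, bindings)
                if derived not in known:
                    known.add(derived)
                    changed = True
    return known


if __name__ == "__main__":
    ground = [Assertion("isa", ("Fido", "Dog"))]
    rules = [
        Rule(Assertion("isa", ("?x", "Dog")), Assertion("isa", ("?x", "Mammal"))),
        Rule(Assertion("isa", ("?x", "Mammal")), Assertion("isa", ("?x", "Animal"))),
    ]
    for assertion in sorted(forward_chain(ground, rules)):
        print(assertion)
```

The terms ("Fido", "Dog", "isa") play the role of vocabulary, the ground assertion relates specific terms, and the rules generalize over variables.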

GPT-4 [GPT-4 Technical Report; Sparks of Artificial General Intelligence]
We’ve created GPT-4, the latest milestone in OpenAI’s effort in scaling up deep learning. GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks.

The combination of the generality of GPT-4's capabilities, with numerous abilities spanning a broad swath of domains, and its performance on a wide spectrum of tasks at or beyond human-level, makes us comfortable with saying that GPT-4 is a significant step towards AGI.

HTM [Hierarchical Temporal Memory; On Intelligence]
At the core of every Grok model is the Cortical Learning Algorithm (CLA), a detailed and realistic model of a layer of cells in the neocortex. Contrary to popular belief, the neocortex is not a computing system, it is a memory system. When you are born, the neocortex has structure but virtually no knowledge. You learn about the world by building models of the world from streams of sensory input. From these models, we make predictions, detect anomalies, and take actions.

In other words, the brain can best be described as a predictive modeling system that turns predictions into actions. Three key operating principles of the neocortex are described below: sparse distributed representations, sequence memory, and on-line learning.

LIDA [The LIDA Architecture; LIDA Tutorial]
Implementing and fleshing out a number of psychological and neuroscience theories of cognition, the LIDA conceptual model aims at being a cognitive "theory of everything." With modules or processes for perception, working memory, episodic memories, "consciousness," procedural memory, action selection, perceptual learning, episodic learning, deliberation, volition, and non-routine problem solving, the LIDA model is ideally suited to provide a working ontology that would allow for the discussion, design, and comparison of AGI systems. The LIDA technology is based on the LIDA cognitive cycle, a sort of "cognitive atom." The more elementary cognitive modules play a role in each cognitive cycle. Higher-level processes are performed over multiple cycles.

The LIDA architecture represents perceptual entities, objects, categories, relations, etc., using nodes and links .... These serve as perceptual symbols acting as the common currency for information throughout the various modules of the LIDA architecture.

MicroPsi [The MicroPsi Agent Architecture; Principles of Synthetic Intelligence]
The MicroPsi agent architecture describes the interaction of emotion, motivation and cognition of situated agents, mainly based on the Psi theory of Dietrich Dorner. The Psi theory addresses emotion, perception, representation and bounded rationality, but being formulated within psychology, has had relatively little impact on the discussion of agents within computer science. MicroPsi is a formulation of the original theory in a more abstract and formal way, at the same time enhancing it with additional concepts for memory, building of ontological categories and attention.

The agent framework uses semantic networks, called node nets, that are a unified representation for control structures, plans, sensory and action schemas, Bayesian networks and neural nets. Thus it is possible to set up different kinds of agents on the same framework.

NARS [Intelligence: From Definition to Design; Rigid Flexibility: The Logic of Intelligence]
What makes NARS different from conventional reasoning systems is its ability to learn from its experience and to work with insufficient knowledge and resources. NARS attempts to uniformly explain and reproduce many cognitive facilities, including reasoning, learning, planning, etc, so as to provide a unified theory, model, and system for AI as a whole. The ultimate goal of this research is to build a thinking machine.

The development of NARS takes an incremental approach consisting four major stages. At each stage, the logic is extended to give the system a more expressive language, a richer semantics, and a larger set of inference rules; the memory and control mechanism are then adjusted accordingly to support the new logic.

In NARS the notion of "reasoning" is extended to represent a system's ability to predict the future according to the past, and to satisfy the unlimited resources demands using the limited resources supply, by flexibly combining justifiable micro steps into macro behaviors in a domain-independent manner.

OpenCog [The General Theory of General Intelligence: A Pragmatic Patternist Perspective; Engineering General Intelligence, Part 1 and Part 2]
OpenCog, as a software framework, aims to provide research scientists and software developers with a common platform to build and share artificial intelligence programs. The long-term goal of OpenCog is acceleration of the development of beneficial AGI.

OpenCogPrime is a specific AGI design being constructed within the OpenCog framework. It comes with a fairly detailed, comprehensive design covering all aspects of intelligence. The hypothesis is that if this design is fully implemented and tested on a reasonably-sized distributed network, the result will be an AGI system with general intelligence at the human level and ultimately beyond.

While an OpenCogPrime based AGI system could do a lot of things, we are initially focusing on using OpenCogPrime to control simple virtual agents in virtual worlds. We are also experimenting with using it to control a Nao humanoid robot. See http://novamente.net/example for some illustrative videos.

Sigma [Lessons from Mapping Sigma onto the Standard Model of the Mind; The Sigma Cognitive Architecture and System]
The goal of this effort is to develop a sufficiently efficient, functionally elegant, generically cognitive, grand unified, cognitive architecture in support of virtual humans (and hopefully intelligent agents/robots – and even a new form of unified theory of human cognition – as well).

Our focus is on the development of the Sigma (∑) architecture, which explores the graphical architecture hypothesis that progress at this point depends on blending what has been learned from over three decades worth of independent development of cognitive architectures and graphical models, a broadly applicable state-of-the-art formalism for constructing intelligent mechanisms. The result is a hybrid (discrete+continuous) mixed (symbolic+probabilistic) approach that has yielded initial results across memory and learning, problem solving and decision making, mental imagery and perception, speech and natural language, and emotion and attention.

SNePS [The GLAIR Cognitive Architecture; SNePS Tutorial]
The long term goal of the SNePS Research Group is to understand the nature of intelligent cognitive processes by developing and experimenting with computational cognitive agents that are able to use and understand natural language, reason, act, and solve problems in a wide variety of domains.

The SNePS knowledge representation, reasoning, and acting system has several features that facilitate metacognition in SNePS-based agents. The most prominent is the fact that propositions are represented in SNePS as terms rather than as logical sentences. The effect is that propositions can occur as arguments of propositions, acts, and policies without limit, and without leaving first-order logic.
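
The idea that propositions are terms, and can therefore appear as arguments of other propositions, can be shown with a small data-structure sketch. This is not the SNePS system or its syntax; the `Proposition` class and the `nesting_depth` helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Proposition:
    """A proposition represented as a term: its arguments may be plain
    terms (strings here) or other propositions."""
    predicate: str
    args: Tuple[Union[str, "Proposition"], ...]


def nesting_depth(term: Union[str, Proposition]) -> int:
    """Count how many layers of propositions a term contains."""
    if isinstance(term, str):
        return 0
    return 1 + max((nesting_depth(a) for a in term.args), default=0)


# A proposition about a proposition about a proposition.
loves = Proposition("Loves", ("Mary", "Bill"))
believes = Proposition("Believes", ("John", loves))
doubts = Proposition("Doubts", ("Sue", believes))

if __name__ == "__main__":
    print(nesting_depth(doubts))
```

Because each proposition is an ordinary hashable value, an agent built this way could store, compare, and reason about its own beliefs using the same machinery it uses for any other term.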

Soar [A Gentle Introduction to Soar; The Soar Cognitive Architecture]
The ultimate in intelligence would be complete rationality which would imply the ability to use all available knowledge for every task that the system encounters. Unfortunately, the complexity of retrieving relevant knowledge puts this goal out of reach as the body of knowledge increases, the tasks are made more diverse, and the requirements in system response time more stringent. The best that can be obtained currently is an approximation of complete rationality. The design of Soar can be seen as an investigation of one such approximation.

For many years, a secondary principle has been that the number of distinct architectural mechanisms should be minimized. Through Soar 8, there has been a single framework for all tasks and subtasks (problem spaces), a single representation of permanent knowledge (productions), a single representation of temporary knowledge (objects with attributes and values), a single mechanism for generating goals (automatic subgoaling), and a single learning mechanism (chunking). We have revisited this assumption as we attempt to ensure that all available knowledge can be captured at runtime without disrupting task performance. This is leading to multiple learning mechanisms (chunking, reinforcement learning, episodic learning, and semantic learning), and multiple representations of long-term knowledge (productions for procedural knowledge, semantic memory, and episodic memory).

Two additional principles that guide the design of Soar are functionality and performance. Functionality involves ensuring that Soar has all of the primitive capabilities necessary to realize the complete suite of cognitive capabilities used by humans, including, but not limited to reactive decision making, situational awareness, deliberate reasoning and comprehension, planning, and all forms of learning. Performance involves ensuring that there are computationally efficient algorithms for performing the primitive operations in Soar, from retrieving knowledge from long-term memories, to making decisions, to acquiring and storing new knowledge.

A rough classification

The above AGI projects are roughly classified in the following table, according to the type of their answers to the previously listed 1st question (on research goal) and 3rd question (on technical path).

goal \ path | hybrid | integrated            | unified
principle   |        |                       | AERA, AIXI, NARS
function    |        | OpenCog, Sigma, Soar  | SNePS
capability  |        |                       | Cyc
behavior    |        | ACT-R, LIDA, MicroPsi | GPT-4
structure   |        |                       | HTM

Since this classification is made at a high level, projects in the same entry of the table are still quite different in the details of their research goals and technical paths.

In summary, the current AGI projects are based on very different theories and techniques.

AGI Literature and Resources

AGI collections:

The annual AGI international conference series was started in 2008. The conference websites link to all accepted papers, plus additional materials like presentation files and video recordings.

Journal of Artificial General Intelligence (JAGI) is a peer-reviewed journal with open access, started in 2009.

The AGI conference and journal are managed by the Artificial General Intelligence Society (AGIS). Everyone interested in AGI can become a member.

Communication venues and social media dedicated to AGI or related research:

Educational materials for students:

Other AGI resources: