The Bayesian Brain and Predictive Processing: A Critique
Joseph Wayne Smith and N. Stocks
(Medical Xpress, 2024)
1. Introduction
The Bayesian brain hypothesis and the predictive processing framework have become
central to contemporary discussions of perception, cognition, and action. In this essay,
we outline these influential approaches, exploring their mathematical roots, empirical
grounding, and implications for neuroscience and cognitive science, while also noting
the challenges they face. The Bayesian brain idea views the mind as a statistical
inference system, continually updating its understanding of the world through
probabilistic reasoning. Predictive processing extends this logic, proposing that the
brain functions as a prediction engine that constantly works to minimize errors
between its expectations and sensory input. Taken together, the two frameworks aim
to provide a unified explanation of how perception, learning, and action interact, all
expressed through the mathematics of inference and information theory. Yet, as will
be argued, this unification—though elegant—raises questions about computational
tractability, empirical specificity, and the limits of theoretical reach.
2. Part A: Exposition
2.1 From Passive Reception to Active Inference
Earlier models of perception tended to treat the brain as a passive receiver of sensory
information, building up internal representations from the raw data of the senses. This
bottom-up view, dominant throughout much of the twentieth century (Marr, 2010),
imagined perception as a stepwise construction of complex mental images from
simple features. However, such models struggled to explain persistent puzzles—why
illusions occur, how prior knowledge shapes what we see, and how the brain
produces coherent experience from fragmentary or noisy data.
In response, a very different idea took shape: the Bayesian brain hypothesis. It
suggests that the brain is not a passive receiver at all, but an active inference machine,
constantly forming and updating beliefs about the world (Knill and Richards, 1996;
Rao and Ballard, 1999). Perception, on this view, is an inferential process—an ongoing
negotiation between what we expect and what we actually sense, governed by the
rules of probability.
Building on this foundation, Karl Friston and others developed the predictive
processing framework, which claims that the brain’s central function is prediction itself
(Friston, 2005, 2010; Clark, 2013, 2016). The brain, they argue, generates internal
models that anticipate incoming sensory signals and adjusts those models when the
predictions fail. In short, perception and learning are not so much about recording the
world as about constantly guessing and correcting.
Over the past two decades, these ideas have inspired extensive empirical and
theoretical work across neuroscience and psychology. Researchers have explored their
implications for perception, action, attention, consciousness, and even mental illness.
The result is a body of work that aspires to unify many aspects of mind under a single
computational principle.
The purpose of this essay is to examine these claims systematically. We review
the mathematical and conceptual foundations of the Bayesian brain and predictive
processing approaches, survey the evidence that supports them, and finally consider
the problems and limits that come with such a grand theoretical vision.
2.2 Mathematical Foundations: Bayesian Inference and the Brain
Bayes’s Theorem and Probabilistic Reasoning
At the core of the Bayesian brain hypothesis lies Bayes’s theorem, a simple yet
remarkably powerful rule for updating beliefs when new information arrives. In its
familiar form, the theorem states:
Bayes’s Theorem
P(H|E) = [P(E|H) × P(H)] / P(E)
Here, P(H|E) is the posterior probability—our revised belief about hypothesis H after
seeing the evidence E. The term P(E|H) is the likelihood, expressing how probable the
evidence would be if H were true. P(H) represents the prior belief before new evidence
appears, and P(E) is the overall probability of the evidence itself.
Applied to perception, this framework suggests that the brain treats the world
as a set of competing hypotheses and sensory inputs as evidence that weighs for or
against each. The perceptual task, then, is to estimate which state of the world is most
likely, given noisy and ambiguous data. This idea, developed in foundational work
by Knill and Pouget (2004), recasts perception as an inferential process governed by
probability rather than by deterministic feature extraction.
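As a worked illustration of the theorem applied to a perceptual choice, the sketch below updates beliefs over two competing interpretations of an ambiguous shaded patch. The function name `posterior`, the hypothesis labels, and all probabilities are hypothetical values chosen for illustration; they are not taken from any cited study.

```python
def posterior(priors, likelihoods):
    """Apply Bayes's theorem over a discrete set of hypotheses.

    priors:      dict mapping hypothesis -> P(H)
    likelihoods: dict mapping hypothesis -> P(E|H)
    returns:     dict mapping hypothesis -> P(H|E)
    """
    # Unnormalized posterior: P(E|H) * P(H) for each hypothesis.
    joint = {h: likelihoods[h] * priors[h] for h in priors}
    # P(E), the overall probability of the evidence, normalizes the result.
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("evidence has zero probability under every hypothesis")
    return {h: p / evidence for h, p in joint.items()}


# Hypothetical numbers: a shaded patch could be a bump or a dent.
priors = {"convex": 0.7, "concave": 0.3}        # P(H)
likelihoods = {"convex": 0.4, "concave": 0.6}   # P(E|H) for the observed shading
beliefs = posterior(priors, likelihoods)
print(beliefs)
```

With these numbers the evidence favors "concave", yet the prior still leaves "convex" as the more probable interpretation, which is the sense in which prior knowledge can shape what is seen.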
Hierarchical Bayesian Models
Real-world perception is far too complex to be handled by a single level of inference.
Hierarchical Bayesian models therefore propose that the brain operates through layers
of representation, where higher levels encode abstract, general patterns that generate
predictions for the lower levels. Each level attempts to anticipate the one beneath it,
while discrepancies—prediction errors—are passed upward to refine the higher-level
models (Lee and Mumford, 2003; Yuille and Kersten, 2006).
This notion fits neatly with what is known about cortical organization. The
brain’s anatomy reveals an abundance of feedback pathways: higher cortical areas
send far more signals back down than they receive from below. Such asymmetry
implies that perception involves substantial top-down influence, not merely bottom-
up construction (Mumford, 1992; Felleman and Van Essen, 1991).
In this view, perception is an ongoing conversation between levels of the brain’s
hierarchy. Each level proposes a prediction; each lower level either confirms it or sends
back an error signal. Over time, the entire system converges on a stable interpretation
of the world.
Predictive Coding
Predictive coding translates these mathematical ideas into a potential neural
mechanism (Rao and Ballard, 1999; Friston, 2005). The story goes as follows: higher
cortical layers send predictions downward; lower layers compare those predictions
with actual inputs; the mismatch between the two—the prediction error—is then
transmitted upward to adjust the model.
This creates a continuous loop: predictions flowing down, errors flowing up.
Distinct populations of neurons may even specialize in
carrying these two kinds of information, with feedback connections carrying the
predictions and feedforward connections carrying the error signals (Bastos et al., 2012).
Seen from this perspective, the brain’s apparent complexity begins to make
sense as an adaptive network striving to minimize discrepancy between expectation
and sensation. The process is dynamic and recursive, rather than linear. Each act of
perception becomes a kind of hypothesis test, a negotiation between the brain’s best
guess and the sensory evidence that corrects it.
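The loop of predictions flowing down and errors flowing up can be sketched in a few lines. This is a minimal single-level, linear toy in the general spirit of predictive coding, not a reimplementation of the Rao and Ballard model: the weight matrix, step count, learning rate, and function names (`predict`, `infer`) are all assumptions made for illustration.

```python
def predict(weights, causes):
    """Top-down prediction of each input unit given the current cause estimates."""
    return [sum(w * c for w, c in zip(row, causes)) for row in weights]


def infer(weights, sensory_input, steps=200, rate=0.1):
    """Settle on cause estimates by repeatedly reducing prediction error."""
    causes = [0.0] * len(weights[0])
    for _ in range(steps):
        prediction = predict(weights, causes)                        # flows down
        error = [x - p for x, p in zip(sensory_input, prediction)]   # flows up
        # Gradient step on squared error: each cause moves in proportion
        # to the error on the inputs it helps to predict.
        for j in range(len(causes)):
            causes[j] += rate * sum(weights[i][j] * error[i]
                                    for i in range(len(error)))
    residual = [x - p for x, p in zip(sensory_input, predict(weights, causes))]
    return causes, residual


# Hypothetical generative weights: three input units driven by two causes.
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sensory_input = [2.0, -1.0, 1.0]     # produced by causes (2, -1)
causes, residual = infer(weights, sensory_input)
print(causes, residual)
```

Here the estimate settles near the causes that generated the input and the residual error shrinks toward zero. Real proposals add multiple levels, nonlinearities, and learned weights, none of which this sketch attempts.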
2.3 Core Principles of Predictive Processing: The Brain as a Prediction Machine
Predictive processing extends Bayesian reasoning into a comprehensive account of
brain function (Clark, 2013, 2016; Hohwy, 2013). The central idea is that the brain’s
primary role is to minimize prediction error—the mismatch between expected and
actual sensory input. By reducing this discrepancy, the brain continually refines its
internal model of the world, allowing it to anticipate future events more accurately.
This minimization occurs through two complementary strategies. Perceptual
inference updates internal models to better match incoming sensory information,
effectively changing beliefs to fit the world. Active inference, on the other hand,
involves acting to align the world with predictions, thereby changing sensory input to
match expectations. Together, these mechanisms unify perception and action within
a single predictive framework. Perception “explains away” sensory data by improving
predictions, while action generates sensory outcomes that confirm them (Friston et al.,
2009).
Precision Weighting and Attention
A key feature of predictive processing is precision weighting—the system’s assessment
of how reliable its predictions are relative to incoming sensory signals (Feldman and
Friston, 2010). Predictions with high precision (low uncertainty) dampen the influence
of errors, while low-precision predictions (high uncertainty) allow errors to drive
significant model updates.
This mechanism naturally accounts for attention. Attention can be seen as
selectively increasing the precision of prediction errors in specific domains,
heightening sensitivity to unexpected events while reducing sensitivity to irrelevant
information (Feldman and Friston, 2010; Hohwy, 2012). In essence, attention reflects
the brain’s tuning of its error signals, aligning with information-theoretic perspectives
that define attention as optimizing precision.
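For the simplest Gaussian case, precision weighting reduces to a gain on the prediction error. The sketch below assumes a single Gaussian prediction and a single Gaussian observation, and it models attention as nothing more than a rise in sensory precision; the function name and the numbers are hypothetical.

```python
def precision_weighted_update(prior_mean, prior_precision,
                              observation, sensory_precision):
    """Combine a Gaussian prediction with a Gaussian observation.

    Precision is inverse variance. Returns (posterior_mean, posterior_precision).
    """
    prediction_error = observation - prior_mean
    # Gain on the error: the share of total precision held by the senses.
    gain = sensory_precision / (prior_precision + sensory_precision)
    posterior_mean = prior_mean + gain * prediction_error
    posterior_precision = prior_precision + sensory_precision
    return posterior_mean, posterior_precision


# Same prediction (0) and same observation (10) in both cases.
unattended = precision_weighted_update(0.0, 4.0, 10.0, 1.0)  # gain 0.2
attended = precision_weighted_update(0.0, 4.0, 10.0, 4.0)    # gain 0.5
print(unattended, attended)
```

A precise prediction paired with an imprecise signal barely moves, while raising the precision of the signal lets the same error drive a larger update.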
The Free Energy Principle
Karl Friston’s free energy principle extends predictive processing into a general theory
of biological systems (Friston, 2010; Friston et al., 2017). According to this principle,
all living organisms act to minimize variational free energy, an information-theoretic
upper bound on surprise, where surprise is the negative log probability of sensory data
under an internal model.
To reduce free energy, a system can:
• Refine its predictions (perceptual inference),
• Act to shape sensory input (active inference), and
• Update its internal model through learning.
This principle unifies perception, action, and learning as complementary strategies for
maintaining coherence with the environment (Friston, 2010).
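The sense in which free energy bounds surprise can be checked numerically on a small discrete example. The sketch below uses the standard definition F = sum over s of q(s) [ln q(s) - ln p(o, s)], which equals surprise plus the Kullback-Leibler divergence between the approximate belief q and the exact posterior. The two hidden states and their joint probabilities are hypothetical.

```python
import math


def free_energy(q, joint):
    """Variational free energy: sum_s q(s) * (ln q(s) - ln p(o, s))."""
    return sum(qs * (math.log(qs) - math.log(joint[s]))
               for s, qs in q.items() if qs > 0)


def surprise(joint):
    """Negative log probability of the observation, -ln p(o)."""
    return -math.log(sum(joint.values()))


# Hypothetical joint p(o, s) for one fixed observation o and two hidden states.
joint = {"rain": 0.27, "sprinkler": 0.14}
evidence = sum(joint.values())

guess = {"rain": 0.5, "sprinkler": 0.5}                # crude approximate belief
exact = {s: p / evidence for s, p in joint.items()}    # true posterior p(s|o)

print(surprise(joint), free_energy(guess, joint), free_energy(exact, joint))
```

The crude belief yields a free energy above the surprise, and the gap closes only when the belief equals the exact posterior. Improving q is the perceptual-inference route to lower free energy; changing the observation itself would correspond to the active-inference route, which this sketch does not model.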
2.4 Empirical Evidence for Bayesian and Predictive Processing
Psychophysical Studies
Many psychophysical experiments support Bayesian principles in perception.
Cue Integration: When multiple sensory cues provide information about the
same property—such as depth from stereo disparity and texture gradients—
humans combine them in a way that approximates statistically optimal
Bayesian integration, weighting each cue by its reliability (Ernst and Banks,
2002; Alais and Burr, 2004).
Prior Effects: Prior expectations influence perception. The light-from-above prior,
for example, causes ambiguous shading to appear convex when illuminated
from above and concave when lit from below (Mamassian and Goutcher, 2001).
Motion perception also incorporates priors favoring slow, smooth movement,
reflecting assumptions about natural motion in the environment (Weiss,
Simoncelli, and Adelson, 2002).
Contour Integration: The brain integrates discrete edge elements into
continuous contours according to the likelihood that these elements belong
together, consistent with Bayesian predictions (Geisler et al., 2001).
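The reliability weighting described under cue integration has a simple closed form when the cues are assumed to be independent and Gaussian: each cue is weighted by its inverse variance. The sketch below illustrates that rule; the depth values and variances are hypothetical and are not data from the cited experiments.

```python
def combine_cues(estimates, variances):
    """Reliability-weighted average of independent Gaussian cues.

    Returns (combined_estimate, combined_variance, weights).
    """
    reliabilities = [1.0 / v for v in variances]   # reliability = inverse variance
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]
    combined = sum(w * e for w, e in zip(weights, estimates))
    combined_variance = 1.0 / total                # never exceeds the best single cue
    return combined, combined_variance, weights


# Hypothetical depth estimates in centimetres: stereo disparity is the
# more reliable cue here, texture the noisier one.
depth, variance, weights = combine_cues(estimates=[50.0, 60.0],
                                        variances=[1.0, 4.0])
print(depth, variance, weights)
```

The combined estimate sits closer to the more reliable cue, and its variance is lower than that of either cue alone, which is the behavioral signature these experiments test for.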
Neurophysiological Evidence
Neural studies provide direct evidence for predictive coding mechanisms.
Prediction Error Signals: Neurons in auditory cortex respond strongly to
unexpected sounds but less to predictable ones, indicating that neural activity
encodes prediction errors rather than raw input (Winkler et al., 1996; Garrido et
al., 2009).
Hierarchical Processing: fMRI studies show prediction errors propagating
upward through cortical hierarchies while predictions flow downward. Higher
cortical areas respond more to surprising events, while lower areas encode the
discrepancy between predicted and actual input (den Ouden, Kok, and de
Lange, 2012).
Repetition Suppression: Reduced neural activity to repeated stimuli—
repetition suppression—aligns with predictive coding, as expected stimuli
generate smaller prediction errors (Summerfield et al., 2008; Kok et al., 2012).
Neuroimaging Studies
Neuroimaging studies further support hierarchical predictive processing.
Predictive Context Effects: The fusiform face area shows reduced activation to
faces predicted by context, consistent with reduced prediction error (Egner et
al., 2010).
Violation Responses: Unexpected violations of learned patterns activate
prefrontal and parietal regions, reflecting hierarchical prediction error
signaling (Bubic et al., 2010).
Precision Manipulation: Manipulating uncertainty systematically modulates
neural responses, with increased uncertainty amplifying the processing of
prediction errors (Hesselmann et al., 2010).
2.5 Applications to Cognitive Phenomena
Perception and Illusions
Predictive processing explains perceptual illusions as cases where strong expectations
override ambiguous sensory input. The hollow-mask illusion, in which a concave mask
appears convex, demonstrates how robust priors about facial structure dominate
conflicting evidence (Gregory, 1980). Similarly, bistable phenomena like the Necker
cube arise from competing interpretations producing comparable prediction errors,
leading to perceptual alternation (Hohwy et al., 2008).
Action and Motor Control
Through active inference, motor control is recast as prediction-driven. The brain
anticipates sensory consequences of movements, particularly proprioceptive feedback,
and reflexes act to minimize the resulting prediction errors (Friston et al., 2009). This
framework elegantly accounts for motor learning, adaptation, and sensorimotor
integration. Disruptions in prediction or precision weighting can explain disorders
such as apraxia or other motor impairments (Edwards et al., 2012).
Learning and Development
Learning involves updating the generative model to reflect causal regularities in the
environment. Synaptic plasticity and structural changes reduce long-term prediction
error (Friston, 2010). Development represents the refinement of generative models
over time. Early in life, priors are imprecise, allowing rapid learning. As experience
accumulates, models stabilize, slowing learning and increasing predictive confidence
(Gopnik and Wellman, 2012).
Attention and Consciousness
Attention emerges from precision weighting, selectively amplifying prediction errors
in relevant domains (Feldman and Friston, 2010). Bottom-up attention reflects
unexpected error signals, while top-down attention represents goal-directed precision
modulation. Conscious experience may correspond to high-level predictions that best
account for lower-level sensory input, with disorders of consciousness arising from
failures in hierarchical prediction or precision regulation (Hohwy, 2013; Clark, 2013).
2.6 Applications to Psychopathology
Schizophrenia and Psychosis
Psychotic symptoms can be understood as disturbances in prediction error signalling
and precision weighting (Fletcher and Frith, 2009; Adams et al., 2013). Hallucinations
can arise when internally generated predictions are assigned excessive precision,
experienced as external events. Delusions can result from attempts to explain
anomalous prediction errors. Dopamine appears to encode precision, and
dysregulation can lead to aberrant salience of irrelevant stimuli (Kapur, 2003; Corlett
et al., 2009).
Autism Spectrum Disorders
Autism has been conceptualized as involving overly precise sensory prediction errors, combined
with imprecise higher-level predictions, leading to heightened sensory sensitivity and
difficulty forming contextual expectations (Pellicano and Burr, 2012; Van de Cruys et
al., 2014). This “hypo-priors” account helps explain detail-focused processing, social
prediction difficulties, and challenges in interpreting environmental regularities.
Anxiety and Depression
Anxiety can reflect over-precise threat priors or excessive uncertainty, driving
hypervigilance (Paulus and Stein, 2006). Depression may involve flattened precision
for positive prediction errors, reducing responsiveness to rewarding outcomes and
reinforcing low mood (Clark et al., 2018).
2.7 Theoretical Implications and Unification
Unifying Perception and Action
Predictive processing integrates perception and action into a single framework, where
both aim to minimize prediction error (Friston et al., 2009). Perception updates
internal models, and action modifies sensory input to meet predictions. Motor
commands can thus be viewed as predicted proprioceptive states that reflexes
implement, resolving traditional questions about sensorimotor coordination.
Connecting Neuroscience and Cognitive Science
The framework bridges computational, algorithmic, and neural levels (Marr, 1982),
linking Bayesian inference, predictive coding algorithms, and hierarchical cortical
structures. Minimizing prediction error aligns with minimizing free energy, connecting
cognitive processes to both neural activity and thermodynamic principles (Friston,
2010).
Evolutionary and Developmental Perspectives
From an evolutionary perspective, minimizing surprise ensures survival, favoring
accurate generative models (Friston et al., 2017). Development mirrors this process
ontogenetically: early life emphasizes exploration and high learning rates, while later
stages exploit stabilized models for efficient prediction and action.
2.8 Summary
The Bayesian brain hypothesis and predictive processing framework fundamentally
shift how we understand cognition. By portraying the brain as an active, anticipatory
system, these theories unify perception, action, learning, attention, and
psychopathology. Empirical studies across psychophysics, neurophysiology, and
neuroimaging consistently support prediction error minimization, hierarchical
processing, and precision weighting.
Predictive processing provides a principled bridge between computational
theory, neural implementation, and observable behavior, reframing classical problems
in perception, motor control, development, and consciousness. Clinical conditions
such as schizophrenia, autism, anxiety, and depression can be interpreted in terms of
disrupted prediction or imbalanced precision weighting.
Overall, this theory portrays the brain as a proactive system, continuously
striving to reduce surprise and maintain coherence with the environment, offering a
unified, mathematically grounded framework that connects theory, neural dynamics,
and lived experience.
3. Part B: Critique
3.1 Critical Challenges to Bayesian Brain and Predictive Processing Frameworks
While Bayesian brain and predictive processing frameworks have gained substantial
followings and influence in cognitive neuroscience, they face significant theoretical,
empirical, and philosophical challenges. We now examine fundamental critiques,
including the computational intractability of Bayesian inference in realistic neural
systems; the lack of clear neural mechanisms for implementing probabilistic
computations; unfalsifiability concerns stemming from post-hoc explanatory
flexibility; alternative explanations for supposedly supportive evidence; neglect of
action-first and embodied approaches; reductionism about consciousness and
phenomenology; and conceptual confusion between descriptive and mechanistic
claims. These challenges suggest that while Bayesian and predictive processing
approaches offer valuable heuristics, their status as fundamental theories of brain
function remains highly questionable.
The Limits of Theoretical Unification
The Bayesian brain hypothesis and predictive processing framework have achieved
remarkable influence, offering seemingly unified accounts of perception, action,
learning, attention, and psychopathology. However, this theoretical unification may
come at the cost of empirical specificity and mechanistic clarity. As these frameworks
have expanded to encompass increasingly diverse phenomena, critical questions have
emerged about their explanatory power, falsifiability, and relationship to actual
neural mechanisms.
We examine major challenges to Bayesian and predictive processing
approaches, organized into computational, empirical, conceptual, and philosophical
categories. Rather than dismissing these frameworks entirely, this critique aims to
clarify their limitations and identify where claims exceed supporting evidence.
3.2 Computational Intractability and the Tractability Problem
The Curse of Dimensionality
A fundamental challenge for Bayesian brain theories concerns computational
tractability. Exact Bayesian inference is computationally intractable for realistic
problems involving high-dimensional state spaces and complex generative models
(Kwisthout, Wareham, and van Rooij, 2011). The number of competing possible
hypotheses grows exponentially with the dimensionality of the problem, creating
what computer scientists call “the curse of dimensionality.”
Consider visual perception: estimating the three-dimensional structure of a
scene from two-dimensional retinal images involves solving an inverse problem with
astronomical numbers of possible interpretations. Even with hierarchical structure,
the combinatorial explosion of possible hypotheses at multiple levels makes exact
Bayesian inference computationally prohibitive (Yuille and Kersten, 2006; Bowers and
Davis, 2012).
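The exponential growth is easy to make concrete. The sketch below counts joint hypotheses over binary scene features and runs an exact posterior by brute-force enumeration for a tiny case. The uniform prior and the toy likelihood (`match_prob`) are assumptions for illustration; because this toy likelihood factorizes, it would not actually need enumeration, and the loop is there only to show how many candidates an unstructured exact computation must visit.

```python
from itertools import product


def count_hypotheses(n_features, values_per_feature=2):
    """Number of joint hypotheses when every feature can vary independently."""
    return values_per_feature ** n_features


def exact_posterior(observed, match_prob=0.8):
    """Brute-force posterior over all binary feature assignments.

    Assumes a uniform prior and a toy likelihood in which each observed
    feature matches the true one with probability `match_prob`.
    """
    scores = {}
    for hypothesis in product((0, 1), repeat=len(observed)):  # all 2**n candidates
        likelihood = 1.0
        for h, o in zip(hypothesis, observed):
            likelihood *= match_prob if h == o else 1.0 - match_prob
        scores[hypothesis] = likelihood        # the uniform prior cancels out
    evidence = sum(scores.values())
    return {h: s / evidence for h, s in scores.items()}


beliefs = exact_posterior((1, 0, 1))
print(len(beliefs), beliefs[(1, 0, 1)])        # 8 hypotheses for 3 features
for n in (10, 100, 300):
    print(n, count_hypotheses(n))
```

Three features give eight candidates; three hundred give more than 10^80, which is why enumeration of this kind is not an option at realistic scale.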
Proponents respond that the brain implements approximate Bayesian inference
using sampling methods, variational approximations, or heuristics (Sanborn and
Chater, 2016). However, this response weakens the theory's explanatory power. If the
brain uses approximations that deviate systematically from optimal Bayesian
inference, then behavioral data showing suboptimal performance cannot distinguish
between genuine Bayesian computation and alternative non-Bayesian mechanisms
that produce similar outputs (Jones and Love, 2011).
The Problem of Priors
Bayesian inference requires specifying prior probability distributions over hypotheses.
But where do these priors come from, and how are they represented neurally? The
theory faces a dilemma: either priors are innate (raising evolutionary implausibility
for highly specific distributions), or they are learned from experience (creating
circularity, as learning itself requires priors) (Bowers and Davis, 2012).
Moreover, realistic predictive processing models require structured, often
hierarchical priors that encode sophisticated knowledge about causal structure in the
environment. How such complex probabilistic knowledge is acquired, represented,
and updated neurally remains deeply unclear (Marcus and Davis, 2013). The
specification of appropriate priors often requires substantial domain expertise, raising
questions about whether unaided neural learning could discover such priors.
3.3 Neural Implementation Mysteries
Perhaps the most serious computational challenge concerns neural implementation.
Despite decades of research, no clear neural mechanisms have been identified for
representing and manipulating probability distributions (Rahnev and Denison, 2018).
How do neurons encode probability distributions? How are Bayesian computations—
multiplication of likelihoods and priors, normalization by marginal probabilities—
implemented in neural circuits?
Probabilistic population codes have been proposed as implementation
mechanisms (Ma, Beck, Latham, and Pouget, 2006), but these proposals face
difficulties. Population codes require precise tuning of neural variability and
correlation structures that may not exist in real neural populations (Beck, Ma, Pitkow,
Latham, and Pouget, 2012). Alternative proposals invoke sampling mechanisms, but
these face their own implementation challenges and timing problems (Orbán, Berkes,
Fiser, and Lengyel, 2016).
3.4 Empirical Challenges and Alternative Explanations
Underdetermination and Flexibility
A pervasive criticism of Bayesian and predictive processing frameworks concerns
their explanatory flexibility. Because these theories involve multiple free parameters
(prior distributions, likelihood functions, precision weightings), they can be fit to
almost any behavioral or neural data after the fact (Bowers and Davis, 2012; Jones and
Love, 2011).
This flexibility undermines falsifiability. When predictions fail, defenders can
always invoke different priors, alternative precision weightings, or approximations
that deviate from optimal inference. The theory becomes unfalsifiable—capable of
accommodating any outcome through parameter adjustment (Marcus and Davis,
2013).
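One way to see the worry is with a toy Gaussian observer, which is an assumption of this sketch rather than a model from the cited critiques. If the prior precision is left free, any reported percept lying between the prior mean and the observation can be reproduced exactly after the fact. The function names and numbers are hypothetical.

```python
def bayes_estimate(prior_mean, prior_precision, observation, sensory_precision):
    """Posterior mean of a Gaussian observer (precision-weighted average)."""
    total = prior_precision + sensory_precision
    return (prior_precision * prior_mean + sensory_precision * observation) / total


def fit_prior_precision(prior_mean, observation, sensory_precision, reported):
    """Prior precision that makes the Gaussian observer reproduce `reported`."""
    if observation == prior_mean:
        raise ValueError("observation equals prior mean; weight is undefined")
    # Weight the observer must have placed on the observation.
    weight = (reported - prior_mean) / (observation - prior_mean)
    if not 0.0 < weight < 1.0:
        raise ValueError("report lies outside the prior-observation interval")
    # Solve weight = sensory_precision / (sensory_precision + prior_precision).
    return sensory_precision * (1.0 - weight) / weight


# Three very different hypothetical reports, all fitted by one free parameter.
fitted = {}
for reported in (2.0, 5.0, 9.0):
    precision = fit_prior_precision(0.0, 10.0, 1.0, reported)
    fitted[reported] = bayes_estimate(0.0, precision, 10.0, 1.0)
print(fitted)
```

In this toy, reports outside the interval cannot be fitted by the prior precision alone, but freeing the prior mean as well removes even that restriction, which is the kind of post-hoc adjustment the critics have in mind.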
Consider schizophrenia: predictive processing accounts explain hallucinations
as excessive precision on internally generated predictions (Adams et al., 2013), but
delusions as failures to update beliefs despite prediction errors. The theory thus
explains both excessive and insufficient belief updating as manifestations of the same
underlying framework, raising questions about what evidence could falsify these
accounts.
Alternative Explanations for Key Evidence
Much evidence cited in support of Bayesian and predictive processing admits
alternative explanations:
Cue Integration: While optimal cue integration appears to support Bayesian
inference (Ernst and Banks, 2002), simple weighted averaging mechanisms
without probabilistic representations can produce similar behavior (Landy,
Banks, and Knill, 2011). The match to Bayesian optimality may reflect task-
specific learning rather than general Bayesian principles.
Repetition Suppression: Neural adaptation to repeated stimuli, interpreted as
reduced prediction error (Summerfield et al., 2008), can equally be explained
by fatigue, habituation, or resource optimization without invoking predictions
(Grotheer and Kovács, 2016). Single-cell recordings show repetition
suppression even in early sensory areas where predictive coding accounts seem
implausible.
Contextual Effects: Prior knowledge effects on perception, often cited as
Bayesian priors, could reflect simpler associative mechanisms or learned
affordances without probabilistic computation (Bowers and Davis, 2012).
Associative learning can produce behavior superficially resembling Bayesian
inference without implementing probabilistic calculations.
Failures of Bayesian Optimality
Extensive research documents systematic deviations from Bayesian optimality in
human judgment and perception (Kahneman, 2011; Gigerenzer and Gaissmaier, 2011).
People show base-rate neglect, conjunction fallacies, and numerous other systematic
biases that violate Bayesian principles. While some deviations can be explained as
rational responses to computational constraints, many biases appear robustly
irrational even when computational costs are minimal.
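Base-rate neglect can be illustrated with a standard diagnostic-test problem. The figures below are hypothetical, and modelling neglect as the use of a 50/50 prior is one simplifying assumption among several possible ones.

```python
def posterior_given_positive(base_rate, hit_rate, false_alarm_rate):
    """P(condition | positive test) by Bayes's theorem."""
    true_positives = hit_rate * base_rate
    false_positives = false_alarm_rate * (1.0 - base_rate)
    return true_positives / (true_positives + false_positives)


# Hypothetical test: 1% prevalence, 90% hit rate, 9% false-alarm rate.
bayesian = posterior_given_positive(0.01, 0.90, 0.09)

# Neglecting the base rate, modelled here as treating both outcomes
# as equally likely before the test.
neglect = posterior_given_positive(0.50, 0.90, 0.09)

print(round(bayesian, 3), round(neglect, 3))
```

The Bayesian answer is under one in ten, while ignoring the base rate gives an answer above nine in ten. The arithmetic itself is trivial, which is why errors of this kind are hard to attribute to computational cost.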
Moreover, developmental research shows children often fail to integrate
evidence in Bayesian-optimal ways, even in simple tasks (Téglás et al., 2011). If
Bayesian inference is fundamental to brain function, why does it emerge slowly and
incompletely through development?
3.5 Conceptual and Theoretical Problems
The Representation Problem
Predictive processing claims the brain represents hierarchical generative models, but
what does this representation consist in? The theory remains vague about what neural
states count as representing probability distributions versus merely correlating with
environmental statistics (Gładziejewski, 2016).
This vagueness allows predictive processing to avoid empirical constraints.
Any neural activity that varies with environmental statistics can be interpreted as
representing a generative model, making the theory difficult to distinguish from
claims that neural activity simply responds to statistical regularities (Williams, 2018).
The Dark Room Problem
If organisms minimize prediction error, why don’t they seek dark, unchanging rooms
where sensory input is perfectly predictable (Friston, Thornton, and Clark, 2012)?
Proponents respond that organisms have homeostatic setpoints requiring regular
sensory inputs (hunger, thirst, etc.), but this response undermines claims that
prediction error minimization is fundamental—it shows organisms pursue goals that
sometimes increase prediction error.
Active inference attempts to resolve this by claiming organisms sample sensory
data to reduce uncertainty about hidden states (Friston et al., 2012). However, this
adds complexity and raises new questions about how organisms balance exploration
(seeking informative prediction errors) with exploitation (confirming existing
predictions).
The Explanatory Span Problem
As the predictive processing model has been extended to explain increasingly diverse
phenomena, it risks becoming so general that it explains everything and therefore
nothing (Klein, 2018). When a theory explains perception, action, attention,
consciousness, emotion, learning, development, and psychiatric disorders through the
same core mechanism, skepticism is warranted about whether genuine explanatory
work is being done versus post-hoc redescription.
Different phenomena may require different explanatory frameworks rather
than forced unification under a single principle. The drive for theoretical parsimony
may obscure important mechanistic differences between perceptual inference, motor
control, and high-level cognition.
3.6 Alternative Frameworks and Neglected Perspectives
Ecological and Action-First Approaches
Ecological psychology, in the tradition of J.J. Gibson, challenges the assumption that
perception requires inference from impoverished sensory data (Gibson, 1979;
Chemero, 2009). According to this view, ambient energy arrays contain rich
information that specifies environmental properties directly, without requiring
probabilistic inference.
Predictive processing assumes poverty of the stimulus—that sensory data is
ambiguous and requires top-down disambiguation. But ecological approaches argue
that organisms actively sample informative aspects of structured environments,
reducing inferential demands (Noë, 2006). The theory may overestimate the brain’s
inferential burden by underestimating environmental information structure.
Enactive and sensorimotor approaches similarly emphasize action and
embodiment over internal representation (Varela et al., 1991; O'Regan and Noë, 2001).
Rather than constructing internal models, organisms enact perceptual experience
through skilled sensorimotor engagement with environments. These approaches
question whether rich internal generative models are necessary or whether simpler
sensorimotor contingencies suffice.
Simple Heuristics and Fast-and-Frugal Cognition
Research on ecological rationality demonstrates that simple heuristics often
outperform complex Bayesian computations in realistic environments (Gigerenzer
and Gaissmaier, 2011). Fast-and-frugal heuristics—simple decision rules that ignore
information—frequently match or exceed Bayesian performance while requiring
vastly less computation.
This suggests that apparent Bayesian optimality in behavior may reflect
evolution selecting simple heuristics that perform well in natural environments, rather
than implementing general Bayesian inference machinery (Todd and Gigerenzer,
2012). The brain may consist of many specialized systems using simple rules rather
than a unified Bayesian inference engine.
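A well-known example of such a rule is the take-the-best heuristic: check cues in order of validity and decide on the first one that discriminates, ignoring the rest. The sketch below is a minimal version of that idea; the cue names, their ordering, and the two cities are hypothetical.

```python
def take_the_best(option_a, option_b, cues_by_validity):
    """Choose between two options using the first cue that discriminates.

    Returns "a", "b", or None when no cue discriminates (the caller must guess).
    """
    for cue in cues_by_validity:
        a, b = option_a[cue], option_b[cue]
        if a != b:
            # Stop at the first difference; all later cues are ignored.
            return "a" if a > b else "b"
    return None


# Hypothetical question: which of two cities is larger?
cue_order = ["is_capital", "has_major_team", "has_university"]
city_a = {"is_capital": 0, "has_major_team": 1, "has_university": 1}
city_b = {"is_capital": 0, "has_major_team": 0, "has_university": 1}

print(take_the_best(city_a, city_b, cue_order))
```

The rule involves no probabilities, no weighting, and no integration across cues, which is what makes it a useful contrast case for claims about general Bayesian machinery.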
Reinforcement Learning Alternatives
Standard reinforcement learning (RL) provides alternative accounts of learning and
decision-making without requiring probabilistic inference over generative models
(Sutton and Barto, 2018). While predictive processing proponents argue that RL is a
special case of active inference (Friston et al., 2009), critics maintain that standard RL
better captures actual neural mechanisms in dopaminergic systems and basal ganglia
(Niv, 2009).
RL models make specific predictions about neural signals (reward prediction
errors) that are empirically supported, whereas predictive processing's claims about
neural precision weighting and hierarchical prediction errors remain more speculative
(Rescorla, 2016).
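The reward prediction error at issue can be shown with a simple delta rule of the Rescorla-Wagner kind, which is a deliberately simpler stand-in for full temporal-difference learning. The learning rate, the reward sequence, and the function name are hypothetical.

```python
def learn_value(rewards, learning_rate=0.1, initial_value=0.0):
    """Track expected reward with a delta rule.

    Returns the final value estimate and the prediction error on each trial.
    """
    value = initial_value
    errors = []
    for reward in rewards:
        delta = reward - value           # reward prediction error
        value += learning_rate * delta   # move the estimate toward the outcome
        errors.append(delta)
    return value, errors


# Fifty rewarded trials: the error shrinks as the reward becomes expected.
value, errors = learn_value([1.0] * 50)
# Omitting an expected reward produces a negative error.
omission_error = 0.0 - value

print(errors[0], errors[-1], omission_error)
```

The error is large when the reward is unexpected, fades as it becomes predicted, and turns negative when an expected reward fails to arrive. No generative model of the environment is involved at any step.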
3.7 Philosophical and Phenomenological Critiques
The Phenomenological Inadequacy Problem
Phenomenological philosophers argue that predictive processing fundamentally
mischaracterizes conscious experience (Ratcliffe, 2008; Gallagher and Allen, 2016).
Conscious perception feels like direct openness to the world, not like unconscious
inference from sensory data. The lived experience of perception includes a sense of
presence and engagement that seems incompatible with treating perception as
hypothesis testing.
Merleau-Ponty’s phenomenology emphasizes the primacy of perceptual
engagement over cognitive representation (Merleau-Ponty, 1945/2012). Perception
involves skilful bodily engagement with meaningful environments, not
computational inference over abstract representations. Predictive processing might
intellectualize perception, mistaking scientific models for lived experience.
The Hard Problem Remains
Despite claims that predictive processing illuminates consciousness (Hohwy, 2013;
Clark, 2016), “the hard problem of consciousness”—why there is subjective experience
at all—remains untouched (Chalmers, 1996). Explaining how prediction error
minimization generates particular patterns of neural activity does not explain why
these patterns should feel like anything.
Predictive processing might confuse explanations of cognitive function (access
consciousness, reportability) with explanations of phenomenal consciousness
(subjective experience). The theory addresses the former while leaving the latter
mysterious, despite suggestions that it offers progress on consciousness (Seth, 2021).
Idealism and Anti-Realism Worries
Some critics argue that predictive processing implies problematic idealism or
anti-realism about the external world (Bruineberg et al., 2018). If
perception constructs reality from internal models rather than detecting mind-
independent features, what grounds our confidence in realism?
While proponents argue that prediction error keeps internal models anchored
to reality (Clark, 2016), critics maintain that this response doesn't fully address the
problem. If all we access are our own predictions, how do we know these predictions
track a mind-independent world rather than merely achieving internal coherence?
3.8 Evaluating Neural Evidence
Ambiguous Neural Signals
Neural signals interpreted as prediction errors could alternatively reflect:
• Novelty detection: Responses to unexpected stimuli without predictive coding.
• Attention effects: Enhanced processing of surprising events through different
mechanisms.
• Memory mismatch: Comparison with memory traces rather than predictions.
• Adaptation: Habituation to repeated stimuli without prediction.
Single-cell and population recordings rarely provide sufficient detail to distinguish
these alternatives (Heilbron and Chait, 2018). The interpretation of neural responses
as prediction errors often reflects theoretical commitment rather than empirical
necessity.
Missing Neural Evidence
Despite decades of research, several key predictions of predictive coding remain
unsupported:
• Separate error and prediction units: Clear anatomical separation of error and
prediction neurons has not been consistently demonstrated (Keller and
Mrsic-Flogel, 2018).
• Precision modulation mechanisms: Neural mechanisms for implementing
precision weighting remain unclear (Moran et al., 2013).
• Hierarchical error propagation: Direct evidence for error signals propagating
up cortical hierarchies is limited (Walsh et al., 2020).
The theory makes specific architectural predictions about cortical microcircuits that
have not been confirmed. While some evidence is consistent with predictive coding,
alternative architectures could produce similar macroscopic patterns.
3.9 Methodological Concerns
Model Comparison Problems
Studies claiming to support Bayesian models often fail to compare them against
adequately-specified alternatives (Jones and Love, 2011). Non-Bayesian models could
fit the same data equally well or better if given equivalent flexibility and fitting
procedures.
Meta-analyses suggest that Bayesian models sometimes fit behavior through
parameter flexibility rather than capturing genuine probabilistic reasoning (Eberhardt
and Danks, 2011). More rigorous model comparison using techniques like cross-
validation, out-of-sample prediction, and comparison against deliberately constructed
alternative models is needed.
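As an illustration of what out-of-sample comparison involves, here is a minimal k-fold cross-validation sketch. The two candidate models (a constant mean and a least-squares line) and the synthetic data are stand-ins chosen for simplicity. They are not the Bayesian and non-Bayesian models discussed in the literature, and a real comparison would substitute those.

```python
import random


def fit_mean(xs, ys):
    """Candidate model 1: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Candidate model 2: ordinary least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def cv_error(fit, xs, ys, k=5):
    """Mean squared error on held-out folds (fold = index modulo k)."""
    total = 0.0
    for fold in range(k):
        train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % k != fold]
        test = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % k == fold]
        model = fit([x for x, _ in train], [y for _, y in train])
        total += sum((model(x) - y) ** 2 for x, y in test)
    return total / len(xs)


# Synthetic data (an assumption for illustration): a noisy linear trend.
rng = random.Random(0)
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]

mean_error = cv_error(fit_mean, xs, ys)
line_error = cv_error(fit_line, xs, ys)
print(mean_error, line_error)
```

Each model is scored only on data it was not fitted to, so extra parameter flexibility is rewarded only when it generalizes.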
Publication Bias and Confirmatory Research
The field may suffer from publication bias favoring positive results supporting
Bayesian and predictive processing frameworks (Ioannidis, 2005). Studies showing
failures of Bayesian optimality or alternative explanations for supposedly supportive
evidence may be underrepresented in the literature.
Additionally, much research is confirmatory rather than exploratory—
designed to demonstrate Bayesian principles rather than rigorously test whether
Bayesian models outperform alternatives. This confirmatory emphasis can create an
illusion of stronger support than warranted.
3.10 Integration Challenges
Relationship to Neuroscience
The gap between high-level computational principles and detailed neural
mechanisms remains wide (Carandini, 2012). Merely claiming that the brain
“implements” Bayesian inference or predictive coding does not itself specify actual
neural algorithms and circuits. The theory risks remaining at Marr’s computational
level without successfully connecting to algorithmic and implementation levels.
Neuroscience has identified numerous specialized systems (sensory processing,
motor control, memory, attention) with distinct neural mechanisms. Predictive
processing's claim that all these systems share a common computational principle may
underestimate neural heterogeneity and specialization (Anderson, 2014).
Relationship to Evolution
Evolutionary considerations raise questions about whether unified Bayesian inference
would evolve versus collections of specialized mechanisms (Barrett and Kurzban,
2006). Natural selection typically produces specialized adaptations rather than
domain-general solutions. The claim that the brain implements general Bayesian
principles across domains may be evolutionarily implausible.
Moreover, evolution operates through satisficing rather than optimizing—
selecting “good enough” solutions rather than optimal ones. The brain's mechanisms
may reflect evolutionary tinkering more than principled optimization, contrary to
predictive processing’s emphasis on minimizing free energy (Godfrey-Smith, 1996).
3.11 Constructive Paths Forward
Limited Domain Applications
Rather than universal theories of brain function, Bayesian and predictive processing
approaches might be most useful as frameworks for understanding specific domains
where inference over internal models is plausible—perhaps aspects of perception,
certain types of learning, and some high-level cognitive processes (Rahnev and
Denison, 2018).
Affirming a more limited scope for the Bayesian theory would increase
empirical testability and allow more productive engagement with domain-specific
neural mechanisms. The frameworks could serve as useful tools for specific
applications without claiming to be fundamental theories of all brain function.
Integration with Alternative Approaches
Rather than asserting hegemony, predictive processing could be integrated with
ecological, enactive, and reinforcement learning approaches (Bruineberg et al., 2018).
Different brain systems may use different computational strategies, with some
implementing prediction-based inference while others use simpler heuristics, reactive
mechanisms, or action-based learning.
Pluralistic approaches that recognize mechanistic diversity may better capture
neural reality than attempts to subsume everything under unified principles
(Anderson, 2014).
Improved Empirical Rigor
Future research should employ:
• Rigorous model comparison against well-specified alternatives.
• Direct neural tests of specific mechanistic predictions.
• Cross-validation and out-of-sample prediction to test generalization.
• Adversarial collaboration between proponents and critics.
• Transparency about model flexibility and post-hoc adjustments.
These methodological improvements would strengthen the empirical foundations
and clarify the scope of valid applications.
4. Conclusion
Bayesian brain and predictive processing frameworks have generated valuable
research and theoretical insights, but face substantial challenges that should temper
enthusiasm about their status as fundamental theories of brain function.
Computational intractability, implementation mysteries, explanatory flexibility,
alternative explanations for supporting evidence, and philosophical difficulties all
raise serious questions.
Rather than universal theories, these approaches may be most useful as
heuristic frameworks for specific domains where probabilistic inference over
generative models is appropriate. Claims that prediction error minimization explains
all brain function, from basic perception to consciousness and psychopathology,
overreach current evidence.
The field would benefit from greater recognition of these frameworks’
limitations, more rigorous empirical testing against well-specified alternatives, and
openness to pluralistic approaches that acknowledge mechanistic diversity in neural
systems. Bayesian and predictive processing theories have contributed importantly to
cognitive neuroscience, but their ultimate scope and validity remain open questions
requiring continued critical examination rather than uncritical acceptance.
Appendix: Computational Intractability in Bayesian Brain and
Predictive Processing Frameworks
As we have seen, a central claim of the Bayesian brain hypothesis and the predictive
processing framework is that predictive processing provides a computationally
tractable neural implementation of Bayesian inference. As Andy Clark put it:
It is thus a major virtue of the hierarchical predictive coding account that it
effectively implements a computationally tractable version of the so-called
Bayesian Brain Hypothesis. (Clark, 2013)
This tractability claim has been crucial for the framework's appeal, suggesting that the
rich Bayesian models developed in cognitive science might finally have a plausible
neural implementation story. However, the computational complexity analysis by
Kwisthout and van Rooij (2020) demonstrates that this central claim is fundamentally
mistaken. Their rigorous mathematical proofs reveal that each key subcomputation
postulated by predictive processing is computationally intractable, and that the
standard appeals to “approximation” fail to resolve these intractability problems. This
critique strikes at the heart of predictive processing’s claims to provide a unified
theory of cortical computation.
The Core Intractability Results
Kwisthout and van Rooij's analysis begins by formalizing the key computational
transformations that predictive processing frameworks postulate. These include
prediction (computing expected sensory inputs from hierarchical hypotheses), error
computation (calculating discrepancies between predictions and observations), and
explaining away prediction errors through various mechanisms including hypothesis
updating, model revision, active inference, and adding observations. By
characterizing these computations at Marr’s computational level—specifying their
input-output transformations independent of implementation details—the authors
can apply computational complexity theory to determine their inherent tractability.
The results are devastating for claims about predictive processing’s
computational efficiency. Prediction, the computation of marginal probability
distributions over prediction variables given distributions over hypothesis variables,
is proven NP-hard even when restricted to networks with only single binary
hypothesis and prediction variables. This means that no algorithm can compute
predictions in polynomial time for all possible inputs, unless the widely accepted
computational complexity assumption that P ≠ NP is false. The hierarchical structure
that predictive processing emphasizes does not alleviate this intractability—it merely
distributes intractable computations across multiple levels of the hierarchy.
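The following sketch illustrates why naive computation of a prediction marginal grows exponentially. It does not reproduce Kwisthout and van Rooij's proof, and the noisy-OR likelihood and the independent priors are assumptions chosen so that the result can be checked against a closed form. In unrestricted networks no such shortcut is guaranteed to exist.

```python
from itertools import product


def marginal_prediction(priors, likelihood):
    """Compute P(pred = 1) by summing over every joint assignment.

    priors[i] is P(h_i = 1) for independent binary variables.
    likelihood(assignment) returns P(pred = 1 | assignment).
    Returns (probability, number_of_terms_summed).
    """
    total, terms = 0.0, 0
    for assignment in product([0, 1], repeat=len(priors)):
        p_assignment = 1.0
        for value, p in zip(assignment, priors):
            p_assignment *= p if value == 1 else 1.0 - p
        total += p_assignment * likelihood(assignment)
        terms += 1
    return total, terms


def noisy_or(assignment, strength=0.3):
    """Hypothetical likelihood: each active variable independently
    triggers the prediction with probability `strength`."""
    return 1.0 - (1.0 - strength) ** sum(assignment)


probability, terms = marginal_prediction([0.5] * 10, noisy_or)
print(probability, terms)
```

Ten binary variables already require 1,024 terms, and each additional variable doubles the count.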
Hypothesis updating, whether formalized as belief updating (computing full
posterior distributions via Bayesian conditioning) or belief revision (revising
hypotheses to minimize prediction error), is similarly intractable. The proof
demonstrates NP-hardness for both interpretations and for both “SUM” variants
(computing probability distributions) and “MAX” variants (computing most probable
joint value assignments). Crucially, the intractability of belief revision persists even
when the prediction error is arbitrarily small. This undermines a common intuition in
the predictive processing literature that small prediction errors should make inference
easier—the mathematical reality is that minimizing even tiny prediction errors
remains computationally intractable.
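The distinction between the SUM and MAX variants can be seen in a toy case. The hypotheses and probabilities below are hypothetical. With a single three-valued hypothesis variable both computations are trivial; the hardness results concern joint assignments over many interacting variables.

```python
def posterior(prior, likelihood, observation):
    """SUM variant: full posterior distribution via Bayesian conditioning."""
    joint = {h: prior[h] * likelihood[h][observation] for h in prior}
    evidence = sum(joint.values())
    return {h: p / evidence for h, p in joint.items()}


def most_probable(prior, likelihood, observation):
    """MAX variant: the single most probable hypothesis.

    No normalization is needed, because dividing by the evidence
    does not change which hypothesis scores highest.
    """
    return max(prior, key=lambda h: prior[h] * likelihood[h][observation])


# Hypothetical numbers for illustration only.
prior = {"rain": 0.2, "sprinkler": 0.3, "dry": 0.5}
likelihood = {
    "rain": {"wet": 0.9, "not_wet": 0.1},
    "sprinkler": {"wet": 0.8, "not_wet": 0.2},
    "dry": {"wet": 0.05, "not_wet": 0.95},
}

full = posterior(prior, likelihood, "wet")
best = most_probable(prior, likelihood, "wet")
print(full, best)
```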
Model revision, the process of adjusting parameters in generative models to
accommodate unexpected observations, is proven NP-hard even when only a single
parameter probability is subject to revision. This result challenges claims about how
brains might learn hierarchical generative models through experience, as the
computational problem of determining which parameter adjustments minimize
prediction error is itself intractable. Similarly, active inference (selecting actions to
minimize prediction error) and adding observations (choosing which aspects of the
environment to sample) are both proven NP-hard, even with highly restricted action
repertoires or observation spaces. These results collectively demonstrate that every
major mechanism postulated by predictive processing for explaining away prediction
errors faces fundamental computational barriers.
Why Approximation Fails to Rescue the Theory
A natural response to these intractability results might be that brains implement
approximate rather than exact Bayesian inference, and that approximation could
render the computations tractable. Indeed, proponents of predictive processing
frequently invoke approximation as the key to neural plausibility. However,
Kwisthout and van Rooij systematically dismantle this defense by demonstrating that
approximation itself does not guarantee tractability.
The critical insight is that approximate Bayesian inference is also provably
intractable in general. Previous work by Kwisthout et al. (2011) and Dagum and Luby
(1993) established that approximate inference in Bayesian networks remains NP-hard,
meaning that even computing approximate posteriors cannot be done in polynomial
time for arbitrary networks. This includes the sampling-based approximation
methods commonly invoked in discussions of neural implementation. While
sampling algorithms can approximate Bayesian inference under certain assumptions,
they require super-polynomial time in the worst case and frequently in typical cases
for complex networks.
Heuristic approximation methods such as Laplace approximation or
variational mean field methods face their own severe limitations. These approaches
make simplifying assumptions—such as Gaussian probability distributions or
independence between variables—that fundamentally alter the computational
problem being solved. While such simplified problems may be tractable, they no
longer involve the structured representations that Bayesian cognitive modelers argue
are essential for explaining human cognition. A theory that achieves tractability by
assuming away representational structure has abandoned the very explanatory target
it claimed to address.
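What an independence assumption discards can be shown with two correlated binary variables. The sketch uses the product of the marginals as the simplest fully factorized approximation. This is an assumption made for brevity and is not the same as an optimized variational mean-field solution. The joint distribution is hypothetical.

```python
from math import log


def marginals(joint):
    """Marginal distributions of X and Y from a joint over (x, y)."""
    px = {x: sum(p for (a, _), p in joint.items() if a == x) for x in (0, 1)}
    py = {y: sum(p for (_, b), p in joint.items() if b == y) for y in (0, 1)}
    return px, py


def factorized(joint):
    """Fully factorized approximation: q(x, y) = p(x) * p(y)."""
    px, py = marginals(joint)
    return {(x, y): px[x] * py[y] for x in (0, 1) for y in (0, 1)}


def mutual_information(joint):
    """Mutual information in nats: how much X tells us about Y."""
    px, py = marginals(joint)
    return sum(
        p * log(p / (px[x] * py[y])) for (x, y), p in joint.items() if p > 0
    )


# Hypothetical joint: the two variables usually agree.
joint = {(0, 0): 0.45, (1, 1): 0.45, (0, 1): 0.05, (1, 0): 0.05}
approx = factorized(joint)

print(mutual_information(joint), mutual_information(approx))
```

The factorized distribution keeps both marginals but carries no information about how the variables relate, which is the relational structure the paragraph above describes as lost.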
The implications are stark: predictive processing cannot appeal to
approximation as a general solution to intractability. If neural implementations of
predictive processing are to be tractable, they must satisfy severe structural constraints
on the generative models they encode. The burden shifts to specifying what these
constraints are and demonstrating that biological neural networks actually satisfy
them.
The Constraint Requirements for Tractability
Kwisthout and van Rooij's fixed-parameter tractability analysis reveals what
constraints would be sufficient to render predictive processing computations tractable.
A computation is fixed-parameter tractable if it can be computed in time
f(k₁, k₂, ..., kₘ) × n^a, where k₁, k₂, ..., kₘ are parameters of the input, n is the input size, a is a
constant, and f is some function of the parameters. Such algorithms can be efficient for
large inputs provided the parameters remain small.
The analysis identifies several parameters that must be constrained for
tractability. The treewidth of the Bayesian network—a graph-theoretic measure of
how locally connected the network structure is—must be small. The number of values
each variable can take must be small. The size of hypothesis and prediction spaces
must be small. Alternatively, for MAX variants (computing most probable
assignments rather than full distributions), the most probable hypothesis must have
very high probability, meaning near-deterministic inference with minimal uncertainty.
Concretely, prediction is computable in fixed-parameter tractable time
O(c^|Pred| × c^t × n) and hypothesis updating in time O(c^|Hyp| × c^t × n), where
|Pred| and |Hyp| are the sizes of the prediction and hypothesis spaces, c is the
maximum number of values per variable, t
is treewidth, and n is network size. For these algorithms to be efficient, c, |Pred|,
|Hyp|, and t must all be small. As a rough heuristic, parameters in the range 2-10
might be considered small, while values like 100-10,000 would be large. The
exponential dependence on these parameters means that even moderate increases
quickly lead to computational explosion.
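A short calculation makes the exponential dependence concrete. The function below simply evaluates the bound c^|Hyp| × c^t × n, ignoring constant factors. The specific parameter values are illustrative picks from the "small" and "large" ranges mentioned above.

```python
def fpt_cost(c, space_size, treewidth, n):
    """Evaluate c**space_size * c**treewidth * n, ignoring constants.

    c          -- maximum number of values per variable
    space_size -- size of the hypothesis (or prediction) space
    treewidth  -- treewidth of the network
    n          -- network size
    """
    return c ** space_size * c ** treewidth * n


small = fpt_cost(c=2, space_size=3, treewidth=3, n=100)
moderate = fpt_cost(c=10, space_size=10, treewidth=10, n=100)

print(small)     # 6400
print(moderate)  # 10**22
```

Network size n enters only linearly, so the gap between the two cases comes entirely from the exponentiated parameters.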
These constraints are extraordinarily severe. Low treewidth requires highly
local connectivity patterns where variables interact only with small neighborhoods.
Small cardinality restricts representational richness—variables can encode only a
handful of possibilities. Small hypothesis and prediction spaces limit the number of
competing interpretations the system can consider. These restrictions are
fundamentally incompatible with the rich, structured, hierarchical representations
that both Bayesian cognitive models and predictive processing theorists claim are
necessary for explaining human cognition.
The Structured Representation Dilemma
This creates a devastating dilemma for predictive processing. The framework’s appeal
partly rests on its claimed ability to implement the sophisticated Bayesian models that
cognitive scientists have developed to explain reasoning, learning, perception, and
other cognitive capacities. These models typically involve structured representations
encoding complex relational information—hierarchical taxonomies, compositional
structure, causal relationships, and abstract schemas. Such representations are
precisely what give Bayesian cognitive models their explanatory power, allowing
them to capture how humans generalize from limited data, perform analogical
reasoning, and understand novel situations.
However, structured representations necessarily involve high-dimensional
hypothesis spaces, complex dependencies between variables (high treewidth), and
large cardinalities to encode rich information. These are exactly the properties that
render inference intractable. A Bayesian network encoding realistic knowledge about,
say, visual scene understanding, social cognition, or language comprehension will
inevitably have the complexity that leads to intractability.
Proponents might respond by invoking the simplified representational
assumptions that make some approximation methods tractable—Gaussian
distributions, independence assumptions, or other restrictions. But accepting these
simplifications abandons the structured representations that made Bayesian
approaches explanatorily powerful. Gaussian assumptions work for simple
continuous variables but cannot capture the discrete, compositional structure of
concepts, the hierarchical organization of knowledge, or the complex dependencies
that characterize real-world causal models. Independence assumptions eliminate the
very relational structure that structured representations are meant to encode.
The dilemma is this: either accept rich structured representations and face
intractability, or adopt simplified representations that are tractable but explanatorily
impoverished. Predictive processing cannot have it both ways: it cannot claim to
implement the sophisticated Bayesian models of cognition, while also maintaining
computational tractability. The mathematical proofs establish that these goals are
incompatible for unconstrained networks.
Implications for Neural Plausibility
The intractability results have profound implications for claims about neural
plausibility. The computational complexity analysis by Pecevski et al. (2011) on
spiking neural network implementations of Bayesian inference reveals that the
number of neurons required scales exponentially with network treewidth—
specifically, proportional to c^t. For networks with the structural complexity required
for cognitive modelling, this implies astronomical neural resource requirements far
exceeding what biological brains contain.
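To give a rough sense of scale, the sketch below finds the smallest treewidth at which c^t exceeds a fixed neuron budget. It assumes a proportionality constant of 1 and uses a commonly cited estimate of about 86 billion neurons in the human brain. Both are assumptions introduced here for illustration and are not figures from Pecevski et al.

```python
# Assumed budget: a commonly cited estimate for the human brain.
ASSUMED_NEURON_BUDGET = 86_000_000_000


def smallest_treewidth_exceeding(c, neuron_budget=ASSUMED_NEURON_BUDGET):
    """Return the smallest t such that c**t > neuron_budget.

    Assumes c >= 2 and that neuron count is exactly c**t
    (proportionality constant of 1).
    """
    t = 0
    while c ** t <= neuron_budget:
        t += 1
    return t


print(smallest_treewidth_exceeding(2))   # 37
print(smallest_treewidth_exceeding(10))  # 11
```

Under these assumptions, a network of ten-valued variables exhausts the budget at a treewidth of 11.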
Moreover, as we noted above, no neural mechanisms have been clearly
identified for representing probability distributions or implementing the
computational operations required for Bayesian inference. Probabilistic population
codes, often invoked as candidate mechanisms, require precise tuning of neural
variability and correlation structures that may not exist in biological populations. The
theory requires neurons or neural populations to encode probability distributions,
update these distributions through Bayesian conditioning, marginalize over
intermediate variables, and compute prediction errors between distributions—all
without clear evidence for how neural circuits could accomplish these operations.
The fixed-parameter tractability analysis suggests specific empirical
predictions that could test whether biological networks satisfy the constraints
necessary for tractable inference. Is cortical connectivity sufficiently local to yield low
treewidth? Do neural representations encode only small numbers of discrete
possibilities? Do brain regions consider only small numbers of competing hypotheses
simultaneously? Are neural representations highly certain, effectively deterministic?
Current neuroscientific evidence suggests the opposite. Cortical networks
exhibit rich, distributed connectivity patterns rather than purely local organization.
Neural representations appear to encode high-dimensional information spaces.
Brain activity shows considerable variability and uncertainty rather than near-
deterministic selection of single hypotheses. These observations suggest that
biological neural networks may not satisfy the severe constraints required for tractable
predictive processing.
From Universal Theory to Domain-Specific Tool
The intractability results force a fundamental reassessment of predictive processing’s
scope and status. The framework has been promoted as “a unified brain theory”
(Friston, 2010) explaining all cortical processing—from basic sensory perception to
high-level cognition, consciousness, and psychopathology. Clark characterized it as
potentially “the future of cognitive science,” offering a unified computational
principle underlying diverse cognitive phenomena (Clark, 2013).
However, the mathematical proofs demonstrate that this universal ambition is
impossible for unconstrained networks with structured representations. At most,
predictive processing might apply to narrowly constrained domains where the
required conditions hold—perhaps some aspects of low-level sensory processing
involving local computations over small state spaces with minimal uncertainty. Even
here, alternative explanations not invoking Bayesian inference might be equally or
more plausible.
What must be excluded from the framework’s scope includes precisely those
cognitive capacities that Bayesian models were developed to explain: planning and
reasoning over complex state spaces, learning rich generative models from limited
data, language comprehension and production involving compositional structure,
analogical reasoning requiring relational representations, and creative problem-
solving considering multiple competing hypotheses. These capacities require the
representational richness and computational flexibility that lead inexorably to
intractability.
The scope limitation is not a minor adjustment but a fundamental
reconceptualization. Rather than a unified theory of cortical computation, predictive
processing becomes at most a domain-specific framework applicable where its severe
constraints happen to be satisfied. The burden shifts entirely to framework
proponents to specify precisely which cognitive domains satisfy these
constraints and to provide empirical evidence that neural implementations in those
domains actually exhibit the required properties.
Conceptual Problems Revealed
The intractability critique reveals deeper conceptual problems with predictive
processing’s theoretical structure. The tractability claim was not peripheral, but rather
central to the framework’s appeal. Much of the enthusiasm for predictive processing
stemmed from its promise finally to provide a computationally feasible story about
how brains implement the sophisticated Bayesian inference that cognitive models
postulate. Without this promise, the framework loses much of its theoretical
motivation.
Moreover, a troubling circularity emerges. Predictive processing claims to
explain how brains implement Bayesian inference, positioning itself as bridging
computational-level theories in cognitive science with neural implementation.
However, the implementation requires constraints so severe that they preclude the
very cognitive capacities that Bayesian models were developed to explain. The
framework thus undermines its own explanatory target—it cannot explain how brains
implement the computations it purports to explain.
Additionally, serious measurement problems arise. The computational-level
parameters that determine tractability—treewidth, variable cardinality, hypothesis
space size—do not directly correspond to measurable neural properties. How would
neuroscientists test whether cortical networks have treewidth less than 5? How would
they measure the effective size of hypothesis spaces encoded by neural populations?
The gap between abstract computational parameters and concrete neural
measurements remains vast, rendering the theory’s empirical testability questionable.
Evaluating Possible Responses
Defenders of predictive processing theories might offer several responses to the
intractability critique, but each faces serious difficulties.
The first possible response might be that brains use fundamentally different
algorithms than those analyzed by Kwisthout and van Rooij—algorithms that avoid
the intractability problems while still implementing something recognizable as
predictive processing. However, this response effectively abandons the framework's
explanatory claims. If the actual neural algorithms differ fundamentally from the
computational transformations that define predictive processing, then predictive
processing does not actually explain neural implementation. The response amounts
to saying “brains do something else,” which concedes the critique, while also offering
no alternative account.
A second possible response might invoke evolutionary adaptation, suggesting
that natural selection has discovered constraints and network structures that render
inference tractable. This is certainly possible, but represents an additional empirical
claim requiring demonstration rather than assumption. More problematically, the
required constraints may be incompatible with known cognitive capacities. If
tractability requires severely restricted representations, then evolutionary selection
for tractable inference would constrain cognitive abilities in ways that seem
inconsistent with observed human cognition. The burden of proof lies with
framework proponents to show both that the required constraints exist in biological
networks and that they are compatible with cognitive capacities.
A third possible response appeals to environmental structure, suggesting
that the statistical regularities in natural environments might reduce effective
computational complexity. Real-world causal structures might have special
properties—sparsity, modularity, hierarchical organization—that make inference
tractable despite worst-case intractability. This is an interesting possibility deserving
investigation, but remains speculative without concrete evidence. Moreover,
worst-case hardness results do not by themselves show that typical cases are easy;
that has to be established separately. Appeals to environmental structure therefore
require demonstrating that natural environments fall into the special cases where
inference becomes tractable, a strong empirical claim yet to be substantiated.
A Framework in Crisis
The computational intractability analysis reveals predictive processing to be a
framework that’s fundamentally in crisis. Its central computational claims—that it
provides tractable implementation of Bayesian inference, that prediction error
minimization enables efficient inference, that hierarchical structure resolves
computational problems—are demonstrably false for networks with the structured
representations required for cognitive modelling. The standard defense invoking
approximation fails, as approximate inference is itself intractable without severe
constraints.
What remains viable is dramatically limited. As noted above, predictive
processing might at most serve as a heuristic or metaphor for understanding certain
neural phenomena in narrowly constrained domains. It might describe some aspects
of neural processing where the required constraints happen to be satisfied. But it
cannot sustain claims to be a unified theory of cortical computation, a general
implementation mechanism for Bayesian cognitive models, or a comprehensive
account of perception, action, learning, and consciousness.
The path forward requires intellectual honesty about these limitations.
Proponents must abandon universalist rhetoric about unified brain theories and grand
explanatory scope. They must specify precisely which cognitive domains and neural
systems satisfy the constraints necessary for tractable inference. They must develop
testable empirical predictions about how neural implementations encode these
constraints. Most importantly, they must accept theoretical pluralism—
acknowledging that different brain systems likely use radically different
computational strategies rather than all implementing variants of predictive
processing.
The computational complexity objection does not prove that brains cannot
perform sophisticated inference or that Bayesian models of cognition are wrong. It
proves that predictive processing, as currently formulated, cannot be the general
mechanism by which brains implement such inference. The search for neural
implementations of probabilistic reasoning must look elsewhere, perhaps to hybrid
architectures combining multiple computational strategies, domain-specific
mechanisms exploiting particular environmental regularities, or entirely different
approaches to understanding neural computation. The sooner the field acknowledges
predictive processing’s fundamental theoretical limitations, the sooner it can pursue
more promising alternative frameworks for understanding how brains generate
intelligent behavior.
REFERENCES
(Adams et al., 2013). Adams, R.A. et al. “The Computational Anatomy of Psychosis.”
Frontiers in Psychiatry 4: 47.
(Alais and Burr, 2004). Alais, D., and Burr, D. “The Ventriloquist Effect Results from
Near-Optimal Bimodal Integration.” Current Biology 14, 3: 257-262.
(Anderson, 2014). Anderson, M.L. After Phrenology: Neural Reuse and the Interactive
Brain. MIT Press.
(Arora and Barak, 2009). Arora, S. and Barak, B. Computational Complexity: A Modern
Approach. Cambridge: Cambridge Univ. Press.
(Barrett and Kurzban, 2006). Barrett, H. C. and Kurzban, R. “Modularity in Cognition:
Framing the Debate.” Psychological Review 113, 3: 628-647.
(Barrett et al., 2004). Barrett, L.F. et al. “Arousal Focus and Interoceptive Sensitivity.”
In L. F. Barrett et al. (eds.), Emotion and Consciousness. New York: Guilford Press.
Pp. 217-244.
(Bastos et al., 2012). Bastos, A.M. et al. “Canonical Microcircuits for Predictive
Coding.” Neuron 76, 4: 695-711.
(Beck et al., 2012). Beck, J.M. et al. “Not Noisy, Just Wrong: The Role of Suboptimal
Inference in Behavioral Variability.” Neuron 74, 1: 30-39.
(Bowers and Davis, 2012). Bowers, J.S., and Davis, C.J. “Bayesian Just-So Stories in
Psychology and Neuroscience.” Psychological Bulletin 138, 3: 389-414.
(Bruineberg et al., 2018). Bruineberg, J. et al. “The Anticipating Brain is Not a Scientist:
The Free-Energy Principle from an Ecological-Enactive Perspective.” Synthese 195, 6:
2417-2444.
(Bubic et al., 2010). Bubic, A. et al. “Prediction, Cognition and the Brain.” Frontiers in
Human Neuroscience 4: 25.
(Carandini, 2012). Carandini, M. “From Circuits to Behavior: A Bridge too Far?”
Nature Neuroscience 15, 4: 507-509.
(Chalmers, 1996). Chalmers, D.J. The Conscious Mind. Oxford: Oxford Univ. Press.
(Chemero, 2009). Chemero, A. Radical Embodied Cognitive Science. Cambridge MA:
MIT Press.
(Clark, 2013). Clark, A. “Whatever Next? Predictive Brains, Situated Agents, and the
Future of Cognitive Science.” Behavioral and Brain Sciences 36, 3: 181-204.
(Clark, 2016). Clark, A. Surfing Uncertainty: Prediction, Action, and the Embodied Mind.
Oxford: Oxford Univ. Press.
(Clark et al., 2018). Clark, J.E. et al. “What is Mood? A Computational Perspective.”
Psychological Medicine 48, 14: 2277-2284.
(Corlett et al., 2009). Corlett, P.R. et al. “From Drugs to Deprivation: A Bayesian
Framework for Understanding Models of Psychosis.” Psychopharmacology 206, 4: 515-
530.
(Dagum and Luby, 1993). Dagum, P., and Luby, M. “Approximating Probabilistic
Inference in Bayesian Belief Networks is NP-hard.” Artificial Intelligence 60, 1: 141-153.
(den Ouden et al., 2012). den Ouden, H.E. et al. “How Prediction Errors Shape
Perception, Attention, and Motivation.” Frontiers in Psychology 3: 548.
(Eberhardt and Danks, 2011). Eberhardt, F., and Danks, D. “Confirmation in the
Cognitive Sciences: The Problematic Case of Bayesian Models.” Minds and Machines
21, 3: 389-410.
(Edwards et al., 2012). Edwards, M.J. et al. “A Bayesian Account of ‘Hysteria’.” Brain
135, 11: 3495-3512.
(Egner et al., 2010). Egner, T. et al. “Expectation and Surprise Determine Neural
Population Responses in the Ventral Visual Stream.” Journal of Neuroscience 30, 49:
16601-16608.
(Ernst and Banks, 2002). Ernst, M.O. and Banks, M.S. “Humans Integrate Visual and
Haptic Information in a Statistically Optimal Fashion.” Nature 415, 6870: 429-433.
(Feldman and Friston, 2010). Feldman, H., and Friston, K.J. “Attention, Uncertainty,
and Free-Energy.” Frontiers in Human Neuroscience 4: 215.
(Felleman and Van Essen, 1991). Felleman, D.J. and Van Essen, D.C. “Distributed
Hierarchical Processing in the Primate Cerebral Cortex.” Cerebral Cortex 1, 1: 1-47.
(Fletcher and Frith, 2009). Fletcher, P.C. and Frith, C.D. “Perceiving is Believing: A
Bayesian Approach to Explaining the Positive Symptoms of Schizophrenia.” Nature
Reviews Neuroscience 10, 1: 48-58.
(Friston, 2005). Friston, K. “A Theory of Cortical Responses.” Philosophical Transactions
of the Royal Society B. 360, 1456: 815-836.
(Friston, 2010). Friston, K. “The Free-Energy Principle: A Unified Brain Theory?”
Nature Reviews Neuroscience 11, 2: 127-138.
(Friston et al., 2009). Friston, K. et al. “Reinforcement Learning or Active Inference?”
PLoS ONE 4, 7: e6421.
(Friston et al., 2012). Friston, K. et al. “Free-Energy Minimization and the Dark-Room
Problem.” Frontiers in Psychology 3: 130.
(Friston et al., 2017). Friston, K. et al. “Active inference: A Process Theory.” Neural
Computation 29, 1: 1-49.
(Gallagher and Allen, 2016). Gallagher, S., and Allen, M. “Active Inference, Enactivism
and the Hermeneutics of Social Cognition.” Synthese 195, 6: 2627-2648.
(Garey and Johnson, 1979). Garey, M., and Johnson, D. Computers and Intractability: A
Guide to the Theory of NP-Completeness. W.H. Freeman and Co.
(Garrido et al., 2009). Garrido, M.I. et al. “The Mismatch Negativity: A Review of
Underlying Mechanisms.” Clinical Neurophysiology 120, 3: 453-463.
(Geisler et al., 2001). Geisler, W.S. et al. “Edge Co-Occurrence in Natural Images
Predicts Contour Grouping Performance.” Vision Research 41, 6: 711-724.
(Gibson, 1979). Gibson, J.J. The Ecological Approach to Visual Perception. Boston MA:
Houghton Mifflin.
(Gigerenzer and Gaissmaier, 2011). Gigerenzer, G., and Gaissmaier, W. “Heuristic
Decision Making.” Annual Review of Psychology 62: 451-482.
(Gładziejewski, 2016). Gładziejewski, P. “Predictive Coding and Representationalism.”
Synthese 193, 2: 559-582.
(Godfrey-Smith, 1996). Godfrey-Smith, P. Complexity and the Function of Mind in Nature.
Cambridge: Cambridge Univ. Press.
(Gopnik and Wellman, 2012). Gopnik, A., and Wellman, H. M. “Reconstructing
Constructivism: Causal Models, Bayesian Learning Mechanisms, and the Theory
Theory.” Psychological Bulletin 138, 6: 1085-1108.
(Gregory, 1980). Gregory, R. L. “Perceptions as Hypotheses.” Philosophical Transactions
of the Royal Society B. 290, 1038: 181-197.
(Griffiths et al., 2010). Griffiths, T. et al. “Probabilistic Models of Cognition: Exploring
Representations and Inductive Biases.” Trends in Cognitive Sciences 14, 8: 357-364.
(Grotheer and Kovács, 2016). Grotheer, M., and Kovács, G. “Can Predictive Coding
Account for Repetition Suppression?” Cortex 80: 113-124.
(Heilbron and Chait, 2018). Heilbron, M., and Chait, M. “Great Expectations: Is There
Evidence for Predictive Coding in Auditory Cortex?” Neuroscience 389: 54-73.
(Hesselmann et al., 2010). Hesselmann, G. et al. “Predictive Coding or Evidence
Accumulation? False Inference and Neuronal Fluctuations.” PLoS ONE 5, 3: e9926.
(Hohwy, 2012). Hohwy, J. “Attention and Conscious Perception in the Hypothesis
Testing Brain.” Frontiers in Psychology 3: 96.
(Hohwy, 2013). Hohwy, J. The Predictive Mind. Oxford: Oxford Univ. Press.
(Hohwy et al., 2008). Hohwy, J. et al. “Predictive Coding Explains Binocular Rivalry:
An Epistemological Review.” Cognition 108, 3: 687-701.
(Ioannidis, 2005). Ioannidis, J.P. “Why Most Published Research Findings are False.”
PLoS Medicine 2, 8: e124.
(Jones and Love, 2011). Jones, M., and Love, B.C. “Bayesian Fundamentalism or
Enlightenment? On the Explanatory Status and Theoretical Contributions of Bayesian
Models of Cognition.” Behavioral and Brain Sciences 34, 4: 169-188.
(Kahneman, 2011). Kahneman, D. Thinking, Fast and Slow. New York: Farrar, Straus
and Giroux.
(Kapur, 2003). Kapur, S. “Psychosis as a State of Aberrant Salience: A Framework
Linking Biology, Phenomenology, and Pharmacology in Schizophrenia.” American
Journal of Psychiatry 160, 1: 13-23.
(Keller and Mrsic-Flogel, 2018). Keller, G.B., and Mrsic-Flogel, T.D. “Predictive
Processing: A Canonical Cortical Computation.” Neuron 100, 2: 424-435.
(Klein, 2018). Klein, C. “What do Predictive Coders Want?” Synthese 195, 6: 2541-2557.
(Knill and Pouget, 2004). Knill, D.C., and Pouget, A. “The Bayesian Brain: The Role of
Uncertainty in Neural Coding and Computation.” Trends in Neurosciences 27, 12:
712-719.
(Knill and Richards, Eds, 1996). Knill, D.C. and Richards, W. (eds.) Perception as
Bayesian Inference. Cambridge: Cambridge Univ. Press.
(Kok et al., 2012). Kok, P. et al. “Less is More: Expectation Sharpens Representations
in the Primary Visual Cortex.” Neuron 75, 2: 265-270.
(Kwisthout and van Rooij, 2020). Kwisthout, J., and van Rooij, I. “Computational
Resource Demands of a Predictive Bayesian Brain.” Computational Brain and Behavior
3: 174-188.
(Kwisthout et al., 2011). Kwisthout, J. et al. “Bayesian Intractability is Not an Ailment
Approximation Can Cure.” Cognitive Science 35, 5: 779-784.
(Landy et al., 2011). Landy, M.S. et al. “Ideal-Observer Models of Cue Integration.” In
J. Trommershäuser, K. Kording, and M. S. Landy (eds.), Sensory Cue Integration.
Oxford: Oxford Univ. Press. Pp. 5-29.
(Lee and Mumford, 2003). Lee, T.S., and Mumford, D. “Hierarchical Bayesian
Inference in the Visual Cortex.” Journal of the Optical Society of America A 20, 7: 1434-
1448.
(Ma et al., 2006). Ma, W.J. et al. “Bayesian Inference with Probabilistic Population
Codes.” Nature Neuroscience 9, 11: 1432-1438.
(Mamassian and Goutcher, 2001). Mamassian, P. and Goutcher, R. “Prior Knowledge
on the Illumination Position.” Cognition 81, 1: B1-B9.
(Marcus and Davis, 2013). Marcus, G.F., and Davis, E. “How Robust are Probabilistic
Models of Higher-Level Cognition?” Psychological Science 24, 12: 2351-2360.
(Marr, 2010). Marr, D. Vision: A Computational Investigation into the Human
Representation and Processing of Visual Information. New York: W.H. Freeman.
(Medical Xpress, 2024). Netherlands Institute for Neuroscience. “How the Brain Plans
Ahead to Predict the World.” Medical Xpress. 30 October. Available online at URL =
<[Link]
(Merleau-Ponty, 1945/2012). Merleau-Ponty, M. Phenomenology of Perception. Trans. D.
Landes. London: Routledge.
(Moran et al., 2013). Moran, R.J. et al. “Free Energy, Precision and Learning: The Role of
Cholinergic Neuromodulation.” Journal of Neuroscience 33, 19: 8227-8236.
(Mumford, 1992). Mumford, D. “On the Computational Architecture of the Neocortex.
II. The Role of Cortico-Cortical Loops.” Biological Cybernetics 66, 3: 241-251.
(Niv, 2009). Niv, Y. “Reinforcement Learning in the Brain.” Journal of Mathematical
Psychology 53, 3: 139-154.
(Noë, 2006). Noë, A. Action in Perception. Cambridge MA: MIT Press.
(O'Regan and Noë, 2001). O'Regan, J. K., and Noë, A. “A Sensorimotor Account of
Vision and Visual Consciousness.” Behavioral and Brain Sciences 24, 5: 939-973.
(Orbán et al., 2016). Orbán, G. et al. “Neural Variability and Sampling-Based
Probabilistic Representations in the Visual Cortex.” Neuron 92, 2: 530-543.
(Paulus and Stein, 2006). Paulus, M.P., and Stein, M.B. “An Insular View of Anxiety.”
Biological Psychiatry 60, 4: 383-387.
(Pecevski et al., 2011). Pecevski, D. et al. “Probabilistic Inference in General Graphical
Models Through Sampling in Stochastic Networks of Spiking Neurons.” PLoS
Computational Biology 7, 12: e1002211.
(Pellicano and Burr, 2012). Pellicano, E., and Burr, D. “When the World Becomes ‘Too
Real’: A Bayesian Explanation of Autistic Perception.” Trends in Cognitive Sciences 16,
10: 504-510.
(Rahnev and Denison, 2018). Rahnev, D., and Denison, R.N. “Suboptimality in
Perceptual Decision Making.” Behavioral and Brain Sciences 41: e223.
(Rao and Ballard, 1999). Rao, R.P. and Ballard, D.H. “Predictive Coding in the Visual
Cortex: A Functional Interpretation of Some Extra-Classical Receptive-Field Effects.”
Nature Neuroscience 2, 1: 79-87.
(Ratcliffe, 2008). Ratcliffe, M. “Touch and Situatedness.” International Journal of
Philosophical Studies 16, 3: 299-322.
(Rescorla, 2016). Rescorla, M. “Bayesian Perceptual Psychology.” In M. Matthen (ed.),
The Oxford Handbook of Philosophy of Perception. Oxford: Oxford Univ. Press. Pp. 694-
716.
(Sanborn and Chater, 2016). Sanborn, A.N., and Chater, N. “Bayesian Brains Without
Probabilities.” Trends in Cognitive Sciences 20, 12: 883-893.
(Seth, 2021). Seth, A.K. Being You: A New Science of Consciousness. Boston MA: Dutton.
(Summerfield et al., 2008). Summerfield, C. et al. “Neural Repetition Suppression Reflects Fulfilled
Perceptual Expectations.” Nature Neuroscience 11, 9: 1004-1006.
(Sutton and Barto, 2018). Sutton, R.S., and Barto, A.G. Reinforcement Learning: An
Introduction. 2nd edn., Cambridge MA: MIT Press.
(Téglás et al., 2011). Téglás, E. et al. “Pure Reasoning in 12-Month-Old Infants as
Probabilistic Inference.” Science 332, 6033: 1054-1059.
(Tenenbaum, 2011). Tenenbaum, J.B. et al. “How to Grow a Mind: Statistics, Structure, and
Abstraction.” Science 331, 6022: 1279-1285.
(Todd and Gigerenzer, 2012). Todd, P.M., and Gigerenzer, G. Ecological Rationality:
Intelligence in the World. Oxford: Oxford Univ. Press.
(Van de Cruys et al., 2014). Van de Cruys, S. et al. “Precise Minds in Uncertain Worlds:
Predictive Coding in Autism.” Psychological Review 121, 4: 649-675.
(van Rooij, 2008). van Rooij, I. “The Tractable Cognition Thesis.” Cognitive Science 32:
939-984.
(van Rooij et al., 2019). van Rooij, I. et al. Cognition and Intractability: A Guide to Classical and
Parameterized Complexity Analysis. Cambridge: Cambridge Univ. Press.
(Varela et al., 1991). Varela, F.J. et al. The Embodied Mind: Cognitive Science and Human
Experience. Cambridge MA: MIT Press.
(Walsh et al., 2020). Walsh, K.S. et al. “Evaluating the Neurophysiological Evidence
for Predictive Processing as a Model of Perception.” Annals of the New York Academy of
Sciences 1464, 1: 242-268.
(Weiss et al., 2002). Weiss, Y. et al. “Motion Illusions as Optimal Percepts.” Nature
Neuroscience 5, 6: 598-604.
(Williams, 2018). Williams, D. “Predictive Processing and the Representation Wars.”
Minds and Machines 28, 1: 141-172.
(Winkler et al., 1996). Winkler, I. et al. “Adaptive Modeling of the Unattended Acoustic
Environment Reflected in the Mismatch Negativity Event-Related Potential.” Brain
Research 742, 1-2: 239-252.
(Yuille and Kersten, 2006). Yuille, A., and Kersten, D. “Vision as Bayesian Inference:
Analysis by Synthesis?” Trends in Cognitive Sciences 10, 7: 301-308.