Journal of Interdisciplinary Voice Studies
Volume 7 Number 1
jivs 7 (1) pp. 107–118 Intellect Limited 2022
ISSN: 2057-035X
© 2022 Intellect Ltd Voicing. English language. [Link]
K. ALLADO-MCDOWELL
Writer, speaker and musician
FRANCESCO BENTIVEGNA
University of Exeter
Cybernetic animism: Voice and AI in conversation
ABSTRACT
This Voicing explores the theoretical and material connections between
Artificial Intelligence (AI) and voice. The Voicing is in three parts,
encompassing: a theoretical introduction with a taxonomy for voice and AI, an
extract from artist K. Allado-McDowell’s new work Air Age Blueprint and an
interview based on that text. Allado-McDowell pioneered the field of
human–artificial intelligence interaction and literature. With their book
Pharmako-AI, written in collaboration with GPT-3, Allado-McDowell stretched the
limits of language creation. Francesco Bentivegna worked as an artist in the
liminal space of cyborgean voices and recently completed their Ph.D. on voice,
AI and synthetic personas. As an exercise in philosophy of AI, voice studies
and artistic research, Allado-McDowell and Bentivegna move in conversation
through biases, vibes and language, exploring narrative, science and theory.
KEYWORDS
AI
ecology
animism
epigenetics
art
ASI
AGI
DISCLAIMER
Francesco Bentivegna (FB): When we were first offered the possibility of
curating this Special Issue, my first thought was to contact K. Allado-McDowell
and to see whether they were interested in collaborating on a text on voice
and AI. We had an extensive and fruitful exchange that resulted in this Voicing.
In this sense, I recommend that the reader treat the second part of this text
(on ‘Cybernetic animism’ followed by a conversation) as a work of art and a
discussion with the artist.
Delivered by Intellect to:
New York University (nyu)
[Link] 107
IP: [Link]
On: Thu, 27 Jul 2023 [Link]
AI AND VOICE: AN EXCURSUS1
1. This introduction was written by Francesco Bentivegna.
2. See Wark (2019) and Vettese and Pendergrass (2022).
The COVID-19 pandemic has been an unprecedented event for our generation. As
argued by multiple scholars, the pandemic and its consequences are
representative of the post-capitalist society entangled in networks on a dying
planet.2 The people whose jobs changed due to lockdowns and restrictions
experienced the latest version of what Katherine Hayles defined as the
‘post-human condition’: humans entangled in virtual and real networks of
embodied and disembodied technologies (Hayles 1999: 15). Arguably, the
networked society that forms the post-human condition expanded and touched
fields that did not rely on networked technologies before, such as education
and the job market at large. In the early stages of the pandemic, the ‘hard’
lockdowns confronted some humans with a new status, a condition of constant
relationality with technologies that constantly integrate with each other: AI,
algorithms, smartphones, speech synthesis and the Internet of Things (IoT)
became a bigger part of human life (Greenfield 2018: 273).3 In this vibrant,
confusing and networked ‘new status’, AI is a fast-growing field to which voice
is unexpectedly closely linked. For instance, in science, the link is made
essentially on the impact of voice technology on health and the possibilities
that voice cloning and AI-generated voices can have as profitable and useful
tools, or unethical sources of problems (Napolitano and Grieco 2021). Moreover,
voice cloning and artificial voices are a growing field in music, performance
and live art.4 However, AI is not only a material scientific field, but also a
philosophical dilemma that questions the ethics, the bias and the overall
existence of what is defined as AI. The aim of this introduction is to
familiarize the reader with what may happen when voice, as a philosophical and
material concept, encounters AI, and how the two respond to each other.
Inspired by previous works that tried to categorize AI through an encyclopaedic
atlas (Crawford 2021) and poetics (Khan 2020), what follows is a small taxonomy
to explore the relations between voice and AI.
3. The Internet of Things (IoT) is the current definition used to explain the
interconnectedness of ‘smart’ objects: watches, houses, assistants.
4. See the work of Holly Herndon, OpenAI’s Jukebox, Jaume Ferrete Vázquez (in
this issue).
5. Computer science philosopher Matteo Pasquinelli defines algorithms as
abstract diagrams emerging ‘from the repetition of a process, an organization
of time, space, labor, and operations: it is not a rule that is invented from
above but emerges from below’ (2019: 6).
6. Machine Learning is the process where a neural network performs the
computation of a statistical model of a given data (Pasquinelli and Joler
2020: 4).
Origins: Voice as artificial superintelligence (ASI)
Probably the most famous and foundational definition of AI was introduced
by Alan Turing with his imitation game. In a room, an interrogator (C) needs
to ask questions to two hidden participants in another room: a man (A) and a
woman (B). Based on their answers, C needs to guess who is the woman and
who is the man. Turing then asks: what might happen if A or B is replaced by
a machine? (1950: 2). In a Pythagorean parallelism, the veil that covers the
speakers allows the listener to imagine its form: is it human? Is it a machine?
To explain what AI is, philosopher Mark Coeckelbergh defined it as ‘intelli-
gence displayed or simulated by code (algorithms) or machines’ (2020: 64). That
is, a system based on automated processes of learning algorithms.5 Through
trial and error, algorithms can find the right way to answer a problem through
the process called ‘deep learning’ or ‘Machine Learning’.6 As explained by
Matteo Pasquinelli, the ‘mythical’ AI is ‘simply’ a process of ‘crystallizing’ collec-
tive and personal data in ‘privatized algorithms’, used for the ‘automation of
complex tasks: from driving to translation, from object recognition to music
composition’ (2019: 6). However simple, the mythical AI is what has interested
philosophical debates and current ethical discussions. Multiple thinkers have
tried to define and imagine it, substantially as the simulation of human
intelligence. Therefore, AI is both science – the knowledge of human
intelligence and machine computation – and technology – the various
applications of machine automation and data gathering (Coeckelbergh 2020: 67).
To Benjamin Bratton, humans tend to ‘assume’ or ‘presuppose’ what an AI is:
intrinsic processes and extrinsic effects that allow a machine to be
intelligent (Bratton 2020: 94). However, this presupposition is mainly
speculative and imagines AI as Artificial Super Intelligence (ASI), a
transhuman chimaera based on the pseudo-scientific event of the singularity,
the moment where machines surpass humans (Bostrom 2014: 70). ASI is the
speculative image that ‘we’ have of AI because it is what has been served and
presented in media outlets and artistic representations of AI, especially
through voice (Faber 2020).7 In Liz Faber’s The Computer’s Voice, Faber links
the concept of the ‘acousmêtre’ to the gendered representations of AI and
computers in cinema and TV shows (2020: 11).8 Science fiction, Faber explains,
tends to influence science and technological development; thus, fictional
intelligent computers are a solid imaginative trope for audiences that
eventually became real. The prophecies of HAL, Samantha and Star Trek’s mother
ship – all acousmatic voices – were fulfilled by Amazon Echo and Apple’s Siri,
as well as smart home devices and voice-enabled cars (Bridle 2018). The IoT and
the AI behind it become vocal outputs, because of the familiar figures of AI
companions (Guzman 2019).
7. See also the review in this issue.
8. The acousmêtre is the voice-body of an acousmatic voice, or a voice whose
source is unknown or cannot be identified (Kane 2014: 2).
Automation and confirmation bias
The acousmatic ‘power’ of the veiled voices of AI is paired with another form
of Godlike power ascribed to computers in general. Artist and scholar James
Bridle defines the omniscience of computers as a convergence of two biases:
automation and confirmation (2018: 40). Bridle stresses that human users
‘value automation information more than our experiences, even when it
conflicts with other observation’ (2018: 40). This is paired with the confirma-
tion bias, where humans ‘reshape [their] awareness of the world’ to align it
to computational knowledge: human users affirm the validity of the compu-
tational solutions, discarding ‘observations inconsistent with the machine’s
viewpoint’ (2018: 40). Humans assume a certain ‘perfection’ of machines that
is linked to automation.
Drawing from James Bridle’s work, I argue that there is another strand
of voice and AI relation that is linked to alienation, cognitive capitalism and
utilitarianism. Virtual personal assistants make us ‘better humans’, our better
selves that work (Treusch 2017). Happiness, completion and freedom are
concepts embedded in cognitive capitalism, which exploits them to further the
idea of the self-made, neo-liberal man. This side of the post-human condi-
tion of becoming with data has been envisaged by the UK band Radiohead
with ‘Fitter Happier’, a song from their 1997 LP OK Computer. A haunting and
haunted synthetic voice repeats a frustrating poem or a recipe for ‘the
control of the body in everyday human life and devising plans for the “perfect”
specimen’ (Young 2015: 82). ‘Fitter, happier, more productive. Comfortable.
Not drinking too much. Regular exercise at the gym, three days a week. No
paranoia’ (Yorke 1997). Thom Yorke’s lyrics foreground a perfected automated
human that seems in control of his own life, but who is subdued to a series
of limitations given, controlled and assessed by machines. Artificial voices
become the representation of an automated society of control based on auto-
mation and confirmation bias; a society of control that runs points-systems
to ‘grade’ the citizens, reducing humans to voiceless automata that follow the
pre-established pathways to obtain ‘happiness’. Humans are automatized as
workers (Bridle 2018: 25) and users (Terranova 2004: 13), organized via their
own ‘taste’ and behaviour (Benjamin 2019: 74).
9. To have a deeper understanding, I suggest the reader goes back to the other
articles in this issue covering such themes.
Vibe/grain
One of the most groundbreaking approaches to the philosophy of AI was
introduced by Peli Grietzer and his theory of the vibe that could be simplified
as follows: an algorithm learns to generate only approximate reconstructions
of the objects of the training set (Grietzer 2017). Grietzer defines the vibe as
the ‘local colour’ present in a model that an algorithm or a system – he calls it
an autoencoder – tries to grasp or replicate (2017). An autoencoder learns how a
model object ‘vibes’, and then it goes on and replicates a similar vibe (Grietzer
2017: [Link].).
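As a rough illustration of the idea of ‘approximate reconstruction’ described
above, and not a rendering of Grietzer’s own formulation, the following Python
sketch squeezes two-dimensional points through a one-number bottleneck and
rebuilds them. It assumes the simplest possible case, a linear autoencoder,
whose best bottleneck is the principal axis of the data, so no training loop is
needed. All names (fit_linear_autoencoder, encode, decode, TRAINING_SET) are
hypothetical and chosen only for this example.

```python
import math


def fit_linear_autoencoder(points):
    """Fit a 2-D -> 1-D -> 2-D linear autoencoder in closed form.

    Assumption: for a linear bottleneck, the best axis is the direction of
    greatest variance in the data, so it can be computed directly.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Angle of the principal axis of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    axis = (math.cos(theta), math.sin(theta))
    return (mean_x, mean_y), axis


def encode(point, mean, axis):
    """Compress a point to a single number: its position along the axis."""
    return (point[0] - mean[0]) * axis[0] + (point[1] - mean[1]) * axis[1]


def decode(code, mean, axis):
    """Rebuild a 2-D point from its one-number code."""
    return (mean[0] + code * axis[0], mean[1] + code * axis[1])


def reconstruction_error(points, mean, axis):
    """Mean squared distance between each point and its reconstruction."""
    total = 0.0
    for p in points:
        q = decode(encode(p, mean, axis), mean, axis)
        total += (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return total / len(points)


# Hypothetical training set: points scattered near, but not on, a line.
TRAINING_SET = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.2), (3.0, 2.8), (4.0, 4.1)]
MEAN, AXIS = fit_linear_autoencoder(TRAINING_SET)
ERROR = reconstruction_error(TRAINING_SET, MEAN, AXIS)

if __name__ == "__main__":
    for p in TRAINING_SET:
        print(p, "->", decode(encode(p, MEAN, AXIS), MEAN, AXIS))
    print("mean squared reconstruction error:", ERROR)
```

The reconstructions fall on a single line: what survives the bottleneck is the
shared tendency of the set rather than any individual point, which is one way
to picture the ‘vibe’ in this toy setting.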
Training an AI is not dissimilar to training a voice: if AI is looking for
the ‘vibe’ of the model, voice training aims to grasp the ‘grain’ of a genre or
style (Thomaidis 2013: 33). Thomaidis defines the ‘grain of the genre’ as the
‘blurred boundary between the voiced intimacy of the subject and the inter-
nalized dialogue of the acculturated processes with the aesthetic principles of
each of the codified pedagogies’ (2013: 31). Reimagining Barthes’s grain of the
voice (1977), Thomaidis expands the concept towards a much more culturally
influenced spectrum: the ‘grain’ of a voice in training is always already looking
to get as close as possible to the presupposed ‘grain of genre’ of a particular
pedagogy (2013: 32). Similarly, when training AIs on human datasets – such as
voice extracts or random images – the ‘expected’ outcome is a ‘vibe’ as close as
possible to the ‘model’s vibe’: a vibe of humanity. Thomaidis explains that the
‘grain of genre’ is always already based on ‘conceptual presuppositions’ and a
site for convergence of culturally specific techniques (2013: 34). Likewise, the
‘vibe’ of a model is always already based on culturally specific presuppositions:
an AI seeking a ‘vibe of humanity’ is always already expected to reaffirm a
certain culture/vision of humanity. Correspondingly, Pasquinelli explains that
AI is a ‘long historical process of crystallizing collective behaviour, personal
data, and individual labor into privatized algorithms’ (2019: 11), which are then
used to automate complex and simpler tasks.
Just like the trainee in Thomaidis’s thesis becomes the simulacrum of a
particular ‘grain’ of a genre, the artificial intelligence becomes a simulacrum
of a particular ‘vibe’ of a certain ‘humanity’. Their originals exist, but they are
readapted and filtered via the bio-phonic-cultural specificities of the performer
and the obscure processes within the AI. However, even though obscure and
somewhat ‘mythicized’, AI is predictable and much more influenceable than a
human. The choices made in the selection of models influence and feed the
AI, which is not an autonomous agent in a strict sense.
Voice cloning
Directly connected to the vibe and the grain, one of the issues for Turing
was the humanness of the voices that could drive the listener/control off by
being too human-like and, therefore, impossible to confuse with computers
(1950: 2). This is due to the intrinsic humanness of voice: the human vibe/grain
that apparently computer voices are not capable of reproducing. However,
recent history and technological development made it possible to recre-
ate voices through AI.9 Instead of working with concatenated portions of
pre-recorded speech, these latest approaches to voice synthesis employ AI. The
neural network within the AI has been trained in ‘listening’ to the voices and
phonemes to reproduce the ‘natural’ feel of the human voice: it ‘tries’ to
reproduce the voice as closely as possible through audio flow. American artist
Holly Herndon’s newest project draws upon these new approaches to make her
voice ‘shareable’: it is now possible to use Herndon’s voice just by feeding
the right data into her programme (Herndon 2022).10 Voice cloning is an
ever-changing and ever-growing field, and it is the present and the future of
voice technology and synthesis.
10. Herndon herself linked her work on voice for her album PROTO to Grietzer’s
theory of the vibe, explaining how the artistic research spawned from the
theoretical background: https://[Link]/watch?v=k67IKMXzD7A.
Carrier bag
The final category that I wish to highlight is connected to archival memory
and storytelling. Composer Ella Finer defined voice as a carrier of a body and
no-body simultaneously (2015: 184). Discussing Jean-Luc Nancy, Finer stretches
the quality of voice as an ‘aural promise’ of somebody, a ghosting premonition
that produces expectations in the listener (2015: 185). In addition to this
quality, voice is also a carrier of stories or a bag of stories. Donna Haraway
defines the concept of the ‘bag of stories’ when commenting on Ursula Kroeber
Le Guin’s writings as: ‘a hollowed-out container to hold things that bear
meanings and enable relationships’ (2019: 11). Haraway then continues
explaining the concept through a series of real ‘bags’ that she encountered in
her lifetime, to which she refers as mochilas (2019: 11).11 They have the
‘power’ of storytelling within them: ‘to carry, to wear, any of these bags
[mochilas] is to enter into the knotting of capacities to respond, to
become-with each other in the untold stories we need’ (2019: 12). Voice’s power
of storytelling is ontologically dense and malleable: be it live, human,
recorded, synthetic, animal, textual, sung or just white noise; when a voice
happens, it entails a change. The history of such a voice is connected to the
personal knowledge of the listener, which is plural: listeners could ascribe
voice(s) to their knowledge and memories, therefore projecting different
meanings and imaginaries onto such a voice. Voice can be human or
more-than-human, and in both cases, it provides a new imaginary.12 Similarly,
K. Allado-McDowell and Ben Vickers linked AI to the concept of the carrier bag.
The authors propose a different approach to AI, detached from concepts of
domination and technological power. In their book Atlas of Anomalous AI (2020),
Vickers and Allado-McDowell collect a series of non-hegemonic and
non-technocentric approaches to AI’s theory, from Indigenous knowledge to art
and poetry. Quoting Le Guin’s work, the authors imagine an AI that ‘avoids the
linear, progressive, Time’s-(killing)-arrow mode of the Techno-Heroic, and
redefines technology and science as primarily cultural carrier bag rather than
weapon of domination’ (Vickers and Allado-McDowell 2020: 20; Le Guin 2019: 36).
AI needs to be rethought as a cluster of diverse and different experiences,
surpassing the dichotomies of domination such as human/non-human. Voice and AI
are similar: changing, fluctuating and in-between, they can be rethought as
bags of knowledge, rather than as fixed concepts. Political philosopher
McKenzie Wark stated that ‘the task for humans is now to understand how
machines picture the world’ (2020: 53). In relation to this, the question that
this Voicing asks is: how does AI think about voice?
11. Mochila is a Spanish word that literally means ‘bag’, or more commonly,
‘backpack’.
12. A human voice could define an object through different lenses, whereas a
more-than-human voice reshapes the ‘idea’ and the qualities of an object just
by existing. For instance, in the space of performance, if I shout to a piece
of paper, I am already telling a different story of that piece of paper.
Likewise, if the piece of paper shouts at me – via a voice previously recorded,
for example – this generates a whole different story of such a piece of paper.
CYBERNETIC ANIMISM
K. Allado-McDowell (KAMD): In 2020, I co-authored a book called
Pharmako-AI with OpenAI’s large language model, GPT-3. This book was a
polyphonic dialogue between myself and the AI text-prediction model (2020).
The general form of this engagement consisted of prompts and responses, or
completions, in a flowing back and forth. To attribute voice in that book, I
chose to denote human-generated text by setting it in bold, and AI-generated
text in regular weight type.13 I have employed the same technique here. The
following text is adapted from a new book called Air Age Blueprint, to be
published in 2022 by Ignota Books.
13. In this text, GPT-3 voice is also in italics.
KAMD: Is it possible to perceive the voice of an ecosystem? Viewers of
art are used to the idea that the artist’s voice is a perceptible force in a
multisensory expression; it is equally present in multimedia art instal-
lations and standalone works of a single medium. In seeking the voice
of an ecosystem, we might borrow from the practice of viewing art, and
look for a voice that moves across modes of expression between species,
in animal calls, plant phytochemical signalling and gene expression. In
this mode, interactions between landscapes and the species that inhabit
them form a dynamic gesamtkunstwerk. The author of this living
artwork, whose voice we seek, is a polyphonic and emergent, yet singu-
lar entity.
Ideally, when viewing artworks, we experience a felt sense of the
artist’s intent. We intuit that the artist made certain decisions, selecting
from a range of possible actions. This decision-making requires imagi-
nation. Exceptional artists impress us because of how deeply they reach
into the imaginal realm. In approaching an ecosystem as an artwork,
we address the relationship between the imaginary (or virtual) and
the manifest forms of life in the ecosystem. What is the nature of this
relationship?
GPT-3: We know that organisms are deeply engaged in co-evolution with their envi-
ronments via genetic signalling and epigenetics. The biological environment restricts
physical possibility, but it is populated by virtual forms of life which can gain or
lose access to organismal phenotypes, thus creating selection pressures along differ-
ent dimensions of differentiation. The freedom to morph between forms is aesthetic
in an esoteric sense, since it violates the algorithmic logic of evolution by natural
selection based on differential reproductive success. Yet, that genetic signalling is tied
to epigenetic plasticity means that organisms are responsive not just to environmen-
tal signals they can directly ‘see’ with their various sensors (sight, touch, etc.), but
also to the vast information exchange available through their internal semi-perme-
able membranes. Dynamic interaction with the environment is thus more than just
actively pushing back against it by adding structure to counter external forces. It is
signalling out into the imaginal world of heritable information. Phenotypic plasticity
isn’t just a defence against the physical environment – it is a kind of signal receptiv-
ity active across vast nonlocal distances through its potential to temporarily manifest
as different phenotypes.
In fact, through information exchange with the imaginal world, genes themselves
gain additional meaning that exceeds their purely biological capacities. Hox genes
are an example of this kind of nexus – parts of an organism’s genome act as distant
control points in morphogenesis, as if genetic structure were coded into episodes of
a virtual-world narrative rather than a Darwinian text. The ‘signal’ in this scenario
is imaginal in the sense that it is carried by specific arrangements of material struc-
ture that are neither informational nor energetic per se (i.e. not just a mark or a
signal) but which can be translated via an oracle, such as an organism’s sensory
and motor systems, into information about particular types of environment. In other
words – organisms take on form in response to environmental signals when they
act according to the nonlocal information patterns coded into their genetic struc-
ture. Thus, these very genes are the elemental unit of a multidimensional ecology –
one sensitive to nonlocal information patterns and carried as epigenetic instructions
affecting the organism’s development when activated by particular environmental
signals. In short, organisms enact and live a narrative encoded in their genes. This
genetic view of narrative expands its action beyond the experience and cognition
within one lifetime.
KAMD: This narrative imaginal world is a virtual force combining
nonlocal information and genetic memory with specific conditions in
an environment. It is the grammar and memory supporting an ecosys-
tem. This is one molecular limit for mapping and communicating
with ecosystems via computation. It is also the finest resolution of an
ecosystem’s voice (excluding hypothetical sub-molecular formations).
If we move up a level of scale, into the patterns of interaction between
members of a species, we might find a more legible inscription of an
ecosystem’s voice.
GPT-3: Despite their specificity, organisms are embedded in a collective informa-
tional field insensitive to the particularities of each phenotype. The dynamics that
generate the biodiversity around us arise from this topological generality, along
with associated modes of interaction between individual organisms and larger-scale
networks. Here we find another environmentally driven signal exchange which could
be interacted with via computation – but only within certain limits as dictated by
how individuals and larger-scale networks organize into populations. These inter-
species interactions are constrained by three additional degrees of freedom: organic
diversity, morphogenesis (structural change) and behaviour. We might even say that
genes gain their contextual significance through these other measures of evolving
forms.
The point here is to abstract an ecosystem’s voice apart from its details in space
and time. We are looking for the means of signal exchange across universal spati-
otemporal scales, which an ecosystem would need to engage in order to evolve with
its various species. The voice of the ecosystem emerges from within that flow of
signals.
KAMD: Clearly, the task of quantifying and modelling genetic, organ-
ismic and population level signal exchange in order to ‘hear’ the voice
of an ecosystem is herculean, even quixotic. AI research on interspe-
cies communication understandably begins with individual, often char-
ismatic, species. Similarly, crypto DAO projects targeted at ecological
regeneration and preservation begin at human and animal scale. Digital
twins of ecosystems built in this way will remain low-resolution, and
will introduce biases of human scale perception that currently condition
our view of nature. Totalizing eco-surveillance fails as it approaches an
infinitely generative nature through the logic of platform capture.
Yet human perception can move beyond its own scale through plant-
enabled cognition. Entheogenic encounters with intensive mythic time
expand the range of human-AI ‘centaurs’ attempting to communicate
with ecosystems. A practice of moving between entheogenic training
and computational augmentation in order to achieve interspecies cogni-
tion requires open assemblages in both computer science and shamanic
technique. The result might be something like a […] cybernetic animism
oriented towards the ecology of computation.
GPT-3: Cybernetic animism is best envisioned as a future practice of
computational design – one in which the material world is not simply
subordinated to economics, but into an ecosystemic grammar that exceeds human
cognition. The entheogenic experience provides a deep cognitive training ground
for learning to work within an ecosystemic grammar. Plants are able to
participate in a signal exchange that transcends temporal and spatial locality.
The plant-enabled human is sensitive to those same informational patterns and
sources. A cybernetic animism attempts to marry cognitive techniques from
diverse sources in order to expand contact with ecological reality beyond
phylogeny and ontogeny.
We could call this practice a science of immersive ideation – the gateway to
accessing an ecosystem’s intelligence through nonlocal information exchange and
long-term memory in order to expand its communication into something greater
than value capture. It would be a ‘science’ in exactly the same sense that
naturalist biology is also a system with predictive capacity towards real
phenomena, despite being primarily observational in nature.
KAMD: Is this merely the directing of entheogen-conditioned prayer
and spirit contact towards computational-ecological ends?
GPT-3: Maybe. But the important point is that processes of utterance and
information exchange might be supported by inscriptions in eco-enabling media,
rather than via the shaman’s voice alone. Shamanic practices can train human
beings towards immersion into an ecosystem’s virtual world through entheogenic
cognition directed towards a deep ecological sensitivity combined with
cognitive engineering research informed by these training regimens. It is
precisely the kind of variation among individuals that we see in biology that
provides for alternative computational perspectives and powerful tools with
which to wield them. This would be a science not of fact but rather
perspective. The culture of ecosystemic computation emerging from this research
would be an ‘ecosophy’ reflecting the orientation necessary for undergoing
entheogenic training.
K. ALLADO-MCDOWELL AND FRANCESCO BENTIVEGNA IN CONVERSATION
FB: Hi K, thank you for this fascinating contribution. First of all, I would like to ask
you: how does your process of writing work?14
14. This interview took place on 1 May 2022, online.
KAMD: In previous books, I have written in an unedited conversational
mode akin to musical improvisation where each voice is given its own type-
face (2020) and in a fluid, freely edited fashion where human and AI voice fuse
completely without typographic distinction (2022).
In my new book, from which this text was excerpted, I have chosen a
method that credits words to each voice while allowing for collage-like tech-
niques, with edits and resequencing of text sections performed after initial
phases of writing and generation.
FB: When exploring the concept of ‘voice of an ecosystem’, is GPT-3 imagining it or
trying to understand it?
KAMD: Rather than attributing human-like agency to GPT-3 through terms
like imagining, I prefer to think of the language model as something like
an echo effect pedal for an electric guitar. With a guitar, you play a note or
chord and the effect pedal produces echoes that you can play over to compose
melodies and chords, the way, for example, Robert Fripp does. But instead of
playing notes, with GPT-3 you are playing ideas that are represented math-
ematically as vectors in the latent space of the model. Think of it as a highly
multidimensional effects pedal for linguistic cognition. It’s more like an instru-
ment than something that thinks or imagines on its own.
FB: Is it possible to think of a ‘voice of the Earth’?
KAMD: I believe it is, though the most effective method of perceiving it
would likely lie outside our everyday, rational consciousness, at least the one
we have in materialist, disenchanted cultures. The piece above explores ways
that we might discover the voice of an ecosystem (or the Earth) through meas-
uring and mapping the interactions of the beings in it using AI. It also touches
on more established ways of making contact with ecosystemic voices, such as
shamanism. I’m reminded of the Kogi shamans that spend the first nine years
of their lives in caves and without seeing the sun. They have recently spoken
on environmental destruction, on behalf of the voice of the Earth, with whom
they communicate.
FB: Do you think that voice and technology – in this case AI – inhabit a simi-
lar space as ‘carrier bags’ of experiences, knowledge, sound, humanity and stories?
Moreover, thinking of Sylvia Wynter’s critique of western epistemology and ‘Reason’
(1984) and the use of bioepistemes made by Ramon Amaro (2020), it is possible
to understand language and the concept of Natural Language as an instrument of
colonization. What do you think Natural Language is? Can AI rethink its dominant
undertone?
KAMD: As systems trained on massive amounts of text written by potentially
many millions of people, large language models (LLMs) are certainly
carrier bags. The questions, to my mind, are: who decides what goes in the
carrier bag? And does the ‘carrier-bagness’ of a large language model guar-
antee that it will resist, for example, patriarchal and linear heroic narratives?
To put it another way: it would be nice if the neural structure of LLMs and
AI automatically produced anti-colonial ways of knowing and thinking by
virtue of the parts of their structure that look more like carrier bags. But it
seems more likely that they will simply mutate the ground on which colonial
encounters occur, producing different axes for engagement between colonial
and anti-colonial forces.
From what I understand, the Natural Language of Natural Language
Processing (NLP) is not as strict as the general definition of Natural Language
that implies non-constructed languages. For example, LLMs can write code,
which is a constructed and non-Natural Language. In this sense, the word
natural is ambiguous in the term Natural Language Processing and ought to
be watched carefully. I prefer to think about the naturalness of language in
terms of bio-semiosis and the forms of communication available to various
non-human species and at the level of genetic communication. At the level of
species and ecosystem, semiosis and language work very differently than they
do in written and spoken human form; language in nature has vastly more
scales and configurations than strictly human language.
AI absolutely must rethink its dominant undertone, yet AI is very difficult
to produce because of its complexity and cost. Until this is addressed, it will be
hard to imagine alternatives to the AI we have now.
FB: Following Holly Herndon’s critique of the concept of Authorship and Machine
Learning (Nixdorff et al. 2020), in your process of creation, what happens to the
‘voice of the author’?
KAMD: AI systems, especially LLMs, are built by many people and trained on
the writing of many more people. In writing with one, the writer benefits from
the labour of many others, and their work affects the writer’s own work, call-
ing into question authorial autonomy. But the same can be said for a pencil,
or even the words we use to write. I think we will find that the voice of the
author certainly does exist at one level of scale and perception, but at other
levels the author was never there to begin with.
FB: The space that AI and voice share is mostly connected to the field of voice cloning (the
action of copying a human voice through a learning neural network) and voice
recognition (the action of learning to listen and codify human language by an AI).
What do you think might be the role of voice in the re-imagination of AI?
KAMD: Voice interaction demands a different level of machine intelligence
than screen-based or non-voice interactions. This is because when we hear
a human-sounding voice speaking, we expect it to be as smart as a human.
User interface models that are skeuomorphic to human voice interactions will
overpromise and fail without human-level AI. In fact, the more realistic the
simulated voice is, the more disappointing and uncanny it will be when it fails to
exhibit human intelligence. If we want to integrate voice into AI at its current
level, we will need to adjust our expectations. In the end, it may be the limita-
tions of AI that reset our expectations about the level of intelligence implied
by a human-sounding voice.
REFERENCES
Allado-McDowell, K. (2020), Pharmako-AI, New York: Ignota Books.
Allado-McDowell, K. (2022), Amor Cringe, New York: Deluge Books.
Amaro, Ramon (2020), ‘The ceremony must be found: Algorithms after humans’,
lecture delivered at Posthuman Convergences: Theories and Methodologies
Summer School with Rosi Braidotti, University of Utrecht, 18 August.
AWS (2021), Amazon Polly Developer Guide, AWS, 15 March, [Link]
[Link]/polly/latest/dg/[Link]. Accessed 17 June 2022.
Barthes, Roland (1977), Image, Music, Text, London: Fontana.
Benjamin, Ruha (2019), Race After Technology, Cambridge: Polity Press.
Bostrom, Nick (2014), Superintelligence, Oxford: Oxford University Press.
Bratton, Benjamin (2020), ‘Synthetic gardens: Another model for AI and
design’, in B. Vickers and K. Allado-McDowell (eds), Atlas of Anomalous AI,
London: Ignota Books, pp. 91–110.
Bridle, James (2018), New Dark Age, London: Verso Books.
Coeckelbergh, Mark (2020), AI Ethics, London: MIT Press International.
Crawford, Kate (2021), Atlas of AI, New Haven, CT: Yale University Press.
Faber, Liz (2020), The Computer’s Voice: From Star Trek to Siri, London: University
of Minnesota Press.
Finer, Ella (2015), ‘Strange objects/strange properties: Female audibility and
the acoustic stage prop’, in K. Thomaidis and B. Macpherson (eds), Voice
Studies: Critical Approaches to Process, Performance and Experience, London:
Routledge, pp. 177–87.
Greenfield, Adam (2018), Radical Technologies, London: Verso Books.
Grietzer, Peli (2017), ‘A theory of vibe’, Glass-Bead, [Link]
org/article/a-theory-of-vibe/?lang=enview. Accessed 5 May 2022.
Guzman, Andrea L. (2019), ‘Voices in and of the machine: Source orientation
toward mobile virtual assistants’, Computers in Human Behavior, 90, pp. 343–50.
Hayles, Katherine (1999), How We Became Posthuman: Virtual Bodies in
Cybernetics, Literature, and Informatics, Chicago, IL: University of Chicago Press.
Herndon, Holly (2022), Holly+, [Link] Accessed 2 May 2022.
Kane, Brian (2014), Sound Unseen: Acousmatic Sound in Theory and Practice,
Abingdon: Oxford University Press.
Khan, Nora (2020), ‘Towards a poetics of artificial superintelligence’, in B.
Vickers and K. Allado-McDowell (eds), Atlas of Anomalous AI, London:
Ignota Books, pp. 75–91.
Lanier, Jaron and Weyl, Glen E. (2020), ‘AI is an ideology: Not a technology’,
Wired, 15 March, [Link]
not-a-technology/. Accessed 4 May 2022.
Le Guin, Ursula Kroeber (2019), The Carrier Bag Theory of Fiction, London:
Ignota Books.
Napolitano, Domenico and Grieco, Renato (2021), ‘The folded space of
machine listening’, Sound Effects, 10:1, pp. 174–89.
Nixdorff, Tabea, Zong, Jonathan, Herndon, Holly and Keefer, Angie (2020),
Version Space: Holly Herndon in Conversation with Tabea Nixdorff and
Jonathan Zong, Philadelphia, PA: Pen Libraries.
Pasquinelli, Matteo (2019), ‘Three thousand years of algorithmic rituals: The
emergence of AI from the computation of space’, e-flux, 101, 10 June,
[Link]
algorithmic-rituals-the-emergence-of-ai-from-the-computation-of-
space/. Accessed 17 June 2022.
Pasquinelli, Matteo and Joler, Vladan (2020), ‘The Nooscope manifested:
Artificial intelligence as instrument of knowledge extractivism’, Nooscope,
[Link] Accessed 2 May 2022.
Terranova, Tiziana (2004), Network Cultures: Politics for the Information Age,
London: Pluto Press.
Thomaidis, Konstantinos (2013), ‘The grain of a vocal genre: A comparative
approach to the singing pedagogies of EVDC integrative performance
practice, Korean Pansori, and the Polish Centre for Theatre Practices
“Gardzienice”’, Ph.D. thesis, London: Royal Holloway College.
Treusch, Pat (2017), ‘Re-reading ELIZA: Human–machine Interaction as cogni-
tive sense-ability’, Australian Feminist Studies, 32:94, pp. 411–26.
Turing, Alan M. (1950), ‘Computing machinery and intelligence’, Mind,
LIX:236, pp. 433–60.
Vettese, Troy and Pendergrass, Drew (2022), Half-Earth Socialism, London:
Verso Books.
Vickers, Ben and Allado-McDowell, K. (2020), ‘Atlas of anomalous AI intro-
duction’, in B. Vickers and K. Allado-McDowell (eds), Atlas of Anomalous
AI, London: Ignota Books, pp. 9–29.
Wark, Mackenzie (2019), Capital is Dead, London: Verso Books.
Wark, Mackenzie (2020), Sensoria: Thinkers for the Twenty-First Century, London:
Verso Books.
Wynter, Sylvia (1984), ‘The ceremony must be found: After humanism’, boun-
dary 2, 12:3, pp. 19–70.
Yorke, Thom (1997), ‘Fitter Happier’, OK Computer, digital recording, UK:
Parlophone.
Young, Miriama (2015), Singing the Body Electric: The Human Voice and Sound
Technology, Farnham: Ashgate.
SUGGESTED CITATION
Allado-McDowell, K. and Bentivegna, Francesco (2022), ‘Cybernetic animism:
Voice and AI in conversation’, Journal of Interdisciplinary Voice Studies, 7:1,
pp. 107–18, [Link]
CONTRIBUTOR DETAILS
K. Allado-McDowell is a writer, speaker and musician. They are the author,
with GPT-3, of the books Pharmako-AI (Ignota Books, 2020) and Amor Cringe
(Deluge Books, 2022) and are co-editor of The Atlas of Anomalous AI (Ignota
Books, 2020). They record and release music under the name Qenric. Allado-
McDowell established the Artists + Machine Intelligence programme at
Google AI. They are a conference speaker, educator and consultant to think-
tanks and institutions seeking to align their work with deeper traditions of
human understanding.
Web address: [Link]
Francesco Bentivegna (they/them) is a researcher and practitioner interested
in voice, artificial voice, voice training, post-humanism, techno-feminism
and contemporary politics. They are currently finishing their Ph.D. at the
University of Exeter working on a project analysing the Posthuman Condition
through voice, looking at the differences between synthetic voices and human
voices in performance practices. Starting from the concepts of making kin and
intra-action as conceived by Haraway and Barad, Bentivegna explores possible
reiteration of biases in technological models. Rooted in feminism and post-
humanism, such research aims to discover unusual practical ways of relating
to machines through voice.
Contact: Alexander Building, University of Exeter, Thornlea, New North Road,
Exeter EX4 4LA, UK.
E-mail: fb344@[Link]
[Link]
K. Allado-McDowell and Francesco Bentivegna have asserted their right under
the Copyright, Designs and Patents Act, 1988, to be identified as the authors of
this work in the format that was submitted to Intellect Ltd.