I
Lev Manovich
The Language of New Media
II
To Norman Klein / Peter Lunenfeld / Vivian Sobchack
III
Table of Contents
Prologue: Vertov’s Dataset.................................................................................VI
Acknowledgments........................................................................................XXVII
Introduction.........................................................................................................30
A Personal Chronology...........................................................................30
Theory of the Present..............................................................................32
Mapping New Media: the Method..........................................................34
Mapping New Media: Organization.......................................................36
The Terms: Language, Object, Representation......................................38
I. What is New Media?........................................................................................43
Principles of New Media..............................................................................49
1. Numerical Representation...................................................................49
2. Modularity..........................................................................................51
3. Automation.........................................................................................52
4. Variability...........................................................................................55
5. Transcoding........................................................................................63
What New Media is Not...............................................................................66
Cinema as New Media............................................................................66
The Myth of the Digital..........................................................................68
The Myth of Interactivity........................................................................70
II. The Interface...................................................................................................75
The Language of Cultural Interfaces.........................................................80
Cultural Interfaces...................................................................................80
Printed Word...........................................................................................83
Cinema....................................................................................................87
HCI: Representation versus Control.......................................................94
The Screen and the User..............................................................................99
A Screen's Genealogy.............................................................................99
The Screen and the Body......................................................................105
Representation versus Simulation.........................................................111
III. The Operations...........................................................................................115
Menus, Filters, Plug-ins.............................................................................120
The Logic of Selection..........................................................................120
“Postmodernism” and Photoshop.........................................................124
From Object to Signal...........................................................................126
Compositing................................................................................................130
From Image Streams to Modular Media...............................................130
The Resistance to Montage...................................................................134
Archeology of Compositing: Cinema...................................................138
Archeology of Compositing: Video......................................................141
Digital Compositing..............................................................................143
Compositing and New Types of Montage...........................................145
Teleaction....................................................................................................150
Representation versus Communication................................................150
Telepresence: Illusion versus Action....................................................152
Image-Instruments................................................................................155
Telecommunication..............................................................................156
Distance and Aura.................................................................................158
IV. The Illusions................................................................................................162
Synthetic Realism and its Discontents......................................................168
Technology and Style in Cinema..........................................................168
Technology and Style in Computer Animation....................................171
The icons of mimesis............................................................................177
Synthetic Image and its Subject................................................................180
Georges Méliès, the father of computer graphics.................................180
Jurassic Park and Socialist Realism......................................................181
Illusion, Narrative and Interactivity........................................................185
V. The Forms.....................................................................................................190
Database......................................................................................................194
The Database Logic..............................................................................194
Data and Algorithm..............................................................................196
Database and Narrative.........................................................................199
Paradigm and Syntagm.........................................................................202
A Database Complex............................................................................205
Database Cinema: Greenaway and Vertov...........................................207
Navigable space..........................................................................................213
Doom and Myst....................................................................................213
Computer Space....................................................................................219
The Poetics of Navigation.....................................................................223
The Navigator and the Explorer............................................................231
Kino-Eye and Simulators......................................................................234
EVE and Place......................................................................................240
VI. What is Cinema?.........................................................................................244
Digital Cinema and the History of a Moving Image...............................249
Cinema, the Art of the Index................................................................249
A Brief Archeology of Moving Pictures...............................................251
From Animation to Cinema..................................................................252
Cinema Redefined.................................................................................253
From Kino-Eye to Kino-Brush.............................................................259
New Language of Cinema..........................................................................260
Cinematic and Graphic: Cinegratography............................................260
New Temporality: Loop as a Narrative Engine....................................264
Spatial Montage....................................................................................269
Cinema as an Information Space..........................................................273
Cinema as a Code.................................................................................276
NOTES...............................................................................................................279
VI
Prologue: Vertov’s Dataset
The avant-garde masterpiece A Man With a Movie Camera completed by Russian
director Dziga Vertov in 1929 will serve as our guide to the language of new
media. This prologue consists of a number of stills from the film. Each still is
accompanied by a quote from the text summarizing a particular principle of new
media. The number in brackets indicates a page from which the quote is taken.
The prologue thus acts as a visual index to some of the book's ideas.
VII
1.
[figure 1]
(87) “A hundred years after cinema's birth, cinematic ways of seeing the world, of
structuring time, of narrating a story, of linking one experience to the next, are
being extended to become the basic ways in which computer users access and
interact with all cultural data. In this way, the computer fulfills the promise of
cinema as a visual Esperanto which pre-occupied many film artists and critics in
the 1920s, from Griffith to Vertov. Indeed, millions of computer users
communicate with each other through the same computer interface. And, in
contrast to cinema where most of its ‘users’ were able to ‘understand’ cinematic
language but not ‘speak’ it (i.e., make films), all computer users can ‘speak’ the
language of the interface. They are active users of the interface, employing it to
perform many tasks: send email, organize their files, run various applications, and
so on.”
VIII
2.
[figure 2] [figure 3] [figure 4] [figure 5]
(91) “The incorporation of virtual camera controls into the very hardware of
game consoles is truly a historical event. Directing the virtual camera becomes as
important as controlling the hero's actions… the computer games are returning to
"The New Vision" movement of the 1920s (Moholy-Nagy, Rodchenko, Vertov
and others), which foregrounded the new mobility of the photo and film camera, and
made unconventional points of view the key part of their poetics.”
IX
3.
[figure 6] [figure 7] [figure 8] [figure 9]
(140) “Editing, or montage, is the key twentieth-century technology for creating fake
realities. Theoreticians of cinema have distinguished between many kinds of
montage but, for the purposes of sketching the archeology of the technologies of
simulation leading to digital compositing, I will distinguish between two basic
techniques. The first technique is temporal montage: separate realities form
consecutive moments in time. The second technique is montage within a shot. It is
the opposite of the first: separate realities form contingent parts of a single
image… examples [of montage within a shot] include the superimposition of a
few images and multiple screens used by the avant-garde filmmakers in the
1920s (for instance, superimposed images in Vertov’s Man with a Movie Camera
and a three-part screen in Abel Gance’s 1927 Napoléon).”
X
4.
[figure 10] [figure 11] [figure 12]
(140) “As theorized by Vertov, through [temporal] montage, film can overcome
its indexical nature, presenting a viewer with objects which never existed in
reality.”
XI
5.
[figure 13] [figure 14]
(147) “While the dominant use of digital compositing is to create a seamless
virtual space, it does not have to be subordinated to this goal. The borders
between different worlds do not have to be erased; the different spaces do not
have to be matched in perspective, scale and lighting; the individual layers can
retain their separate identity rather than being merged into a single space; the
different worlds can clash semantically rather than form a single universe.”
XII
6.
[figure 15] [figure 16] [figure 17] [figure 18] [figure 19]
(158) “The cameraman, whom Benjamin compares to a surgeon, ‘penetrates
deeply into its [reality] web’; his camera zooms in order to ‘pry an object from
its shell.’ With its new mobility, glorified in such films as A Man with a Movie
Camera, the camera can be anywhere, and, with its superhuman vision, it can
obtain a close-up of any object... Along with disregarding the scale, the unique
locations of the objects are discarded as well, as their photographs are brought
together within a single picture magazine or a film newsreel, the forms which fit
in with the demand of mass democratic society for ‘the universal equality of
things.’”
XIII
7.
[figure 20] [figure 21]
(160) “Modernization is accompanied by the process of disruption of physical
space and matter, the process which privileges interchangeable and mobile signs
over the original objects and relations… The concept of modernization fits equally
well Benjamin's account of film and Virilio's account of telecommunication, the
latter just being a more advanced stage in this continual process of turning objects
into mobile signs. Before, different physical locations met within a single
magazine spread or a film newsreel; now, they meet within a single electronic
screen.”
XIV
8.
[figure 22] [figure 23]
(183) “Whose vision is it? It is the vision of a computer, a cyborg, an automatic
missile. It is a realistic representation of human vision in the future when it will be
augmented by computer graphics and cleansed from noise. It is the vision of a
digital grid. Synthetic computer-generated image is not an inferior representation
of our reality, but a realistic representation of a different reality.”
XV
9.
[figure 24]
(209) “Along with Greenaway, Dziga Vertov can be thought of as a major
‘database filmmaker’ of the twentieth century. Man with a Movie Camera is
perhaps the most important example of database imagination in modern media
art.”
XVI
10.
[figure 25] [figure 26] [figure 27]
(210) “Just as new media objects contain a hierarchy of levels (interface —
content; operating system — application; Web page — HTML code; high-level
programming language — assembly language — machine language), Vertov's
film consists of at least three levels. One level is the story of a cameraman filming
material for the film. The second level is the shots of an audience watching the
finished film in a movie theater. The third level is this film, which consists of
footage recorded in Moscow, Kiev and Riga and is arranged according to a
progression of one day: waking up — work — leisure activities. If this third level
is a text, the other two can be thought of as its meta-texts.”
XVII
11.
[figure 28] [figure 29] [figure 30] [figure 31] [figure 32] [figure 33] [figure 34]
(211) “If a ‘normal’ avant-garde film still proposes a coherent language different
from the language of mainstream cinema, i.e. a small set of techniques which are
repeated, Man with a Movie Camera never arrives at anything like a well-defined
language. Rather, it proposes an untamed, and apparently endless unwinding of
cinematic techniques, or, to use contemporary language, ‘effects’, as cinema's
new way of speaking.”
XVIII
12.
[figure 35] [figure 36]
(212) “And this is why Vertov’s film has a particular relevance to new media. It
proves that it is possible to turn ‘effects’ into a meaningful artistic language. Why,
in the case of Whitney's computer films and music videos, are the effects just
effects, while in the hands of Vertov they acquire meaning? Because in Vertov's
film they are motivated by a particular argument, this being that the new
techniques to obtain images and manipulate them, summed up by Vertov in his
term "kino-eye," can be used to decode the world. As the film progresses,
"straight" footage gives way to manipulated footage; newer techniques appear one
after another, reaching a roller coaster intensity by the film's end, a true orgy of
cinematography. It is as though Vertov re-stages his discovery of the kino-eye for
us. Along with Vertov, we gradually realize the full range of possibilities offered
by the camera. Vertov's goal is to seduce us into his way of seeing and thinking,
to make us share his excitement, his gradual process of discovery of film's new
language. This process of discovery is the film's main narrative and it is told through
a catalog of discoveries being made. Thus, in the hands of Vertov, a database, this
normally static and "objective" form, becomes dynamic and subjective. More
importantly, Vertov is able to achieve something which new media designers and
artists still have to learn: how to merge database and narrative into a new
form.”
XIX
13.
[figure 37] [figure 38] [figure 39]
(226) “If modern visual culture exemplified by MTV can be thought of as a
Mannerist stage of cinema, its perfected techniques of cinematography,
mise-en-scene and editing self-consciously displayed and paraded for their own sake,
Waliczky's film presents an alternative response to cinema’s classical age, which
is now behind us. In this meta-film, the camera, part of cinema’s apparatus,
becomes the main character (in this we may connect The Forest to another
meta-film, A Man with a Movie Camera).”
XX
14.
[figure 40] [figure 41] [figure 42] [figure 43]
(236) “Vertov stands half-way between Baudelaire's flâneur and computer user:
no longer just a pedestrian walking through a street, but not yet Gibson’s data
cowboy who zooms through pure data armed with data mining algorithms. In his
research on what can be called “kino-eye interface,” Vertov systematically tried
different ways to overcome what he thought were the limits of human vision. He
mounted cameras on the roof of a building and a moving automobile; he slowed
down and sped up film speed; he superimposed a number of images together in time
and space (temporal montage and montage within a shot). A Man with a Movie
Camera is not only a database of city life in the 1920s, a database of film
techniques, and a database of new operations of visual epistemology, but it is also
a database of new interface operations which together aim to go beyond a simple
human navigation through a physical space.”
XXI
15.
[figure 44] [figure 45]
(258) “Avant-garde aesthetic strategies became embedded in the commands and
interface metaphors of computer software. The avant-garde became materialized
in a computer. Digital cinema technology is a case in point. The avant-garde
strategy of collage reemerged as a "cut and paste" command, the most basic
operation one can perform on digital data. The idea of painting on film became
embedded in paint functions of film editing software. The avant-garde move to
combine animation, printed texts and live action footage is repeated in the
convergence of animation, title generation, paint, compositing and editing systems
into single all-in-one packages.”
XXII
16.
[figure 46] [figure 47]
(265) “Cinema's birth from a loop form was reenacted at least once during its
history. In one of the sequences of A Man with a Movie Camera, Vertov shows
us a cameraman standing in the back of a moving automobile. As he is being
carried forward by an automobile, he cranks the handle of his camera. A loop, a
repetition, created by the circular movement of the handle, gives birth to a
progression of events -- a very basic narrative which is also quintessentially
modern: a camera moving through space recording whatever is in its way.”
XXIII
17.
[figure 48]
(266) “Can the loop be a new narrative form appropriate for the computer age? It
is relevant to recall that the loop gave birth not only to cinema but also to
computer programming. Programming involves altering the linear flow of data
through control structures, such as ‘if/then’ and ‘repeat/while’; the loop is the
most elementary of these control structures…. As the practice of computer
programming illustrates, the loop and the sequential progression do not have to be
thought of as being mutually exclusive. A computer program progresses from start to
end by executing a series of loops.”
XXIV
18.
[figure 49] [figure 50] [figure 51]
(270) “Spatial montage represents an alternative to traditional cinematic temporal
montage, replacing its traditional sequential mode with a spatial one. Ford's
assembly line relied on the separation of the production process into a set of
repetitive, sequential, and simple activities. The same principle made computer
programming possible: a computer program breaks a task into a series of
elemental operations to be executed one at a time. Cinema followed this logic of
industrial production as well. It replaced all other modes of narration with a
sequential narrative, an assembly line of shots which appear on the screen one at a
time. A sequential narrative turned out to be particularly incompatible with a
spatial narrative which played a prominent role in European visual culture for
centuries.”
XXV
19.
[figure 52]
(271) “Since the Xerox PARC Alto workstation, the GUI has used multiple windows. It
would be logical to expect that cultural forms based on moving images will
eventually adopt similar conventions… We may expect that computer-based
cinema will eventually have to follow the same direction — especially when the
limitations of communication bandwidth will disappear, while the resolution of
displays will significantly increase, from the typical 1-2K in 2000 to 4K, 8K or
beyond. I believe that the next generation of cinema — broadband cinema — will
add multiple windows to its language.”
XXVI
20.
[figure 53]
[figure 54]
[figure 55]
(273) “If HCI is an interface to computer data, and a book is an interface to text,
cinema can be thought of as an interface to events taking place in 3D space. Just as
painting before it, cinema presented us with familiar images of visible reality —
interiors, landscapes, human characters — arranged within a rectangular frame.
The aesthetics of these arrangements ranges from extreme scarcity to extreme
density… It would be only a small leap to relate this density of “pictorial
displays” to the density of contemporary information displays such as Web
portals which may contain a few dozen hyperlinked elements; or the interfaces of
popular software packages which similarly present the user with dozens of
commands at once.”
XXVII
Acknowledgments
Special thanks: Doug Sery, my editor at MIT whose support and continuous
encouragement made this book possible; Mark Tribe, who read the manuscript in
its entirety, offered numerous suggestions and helped me with the last stage of
manuscript preparation; Rochelle Feinstein, for everything.
This book would not exist without all the friends, colleagues and
institutions committed to new media art and theory. I am grateful to all of them
for ongoing exchange, and intellectual and emotional support.
For providing inspiring places to work: Mondrian Hotel (West Hollywood,
Los Angeles), The Standard (West Hollywood, Los Angeles), Fred Segal (West
Hollywood, Los Angeles), Del Mar Plaza (Del Mar, CA), Gitano (Nolita, NYC),
Space Untitled (Soho, New York), The Royal Library (Stockholm), De Jaren
(Amsterdam).
Administrative support: Department of Visual Arts, University of
California, San Diego; Department of Cinema Studies, Stockholm University;
Center for User-centered Interface Design, Royal Institute of Technology,
Stockholm.
Word processor: Microsoft Word.
Web browser: Netscape Navigator, Internet Explorer.
Favorite search engine: www.hotbot.com
Favorite moving image format: QuickTime
HTML editor: Netscape Communicator, Macromedia Dreamweaver.
OS: Windows 98.
Hardware: SONY PCG505FX laptop.
Mobile phone: Nokia.
The principal editing of this book was done between July 1998 and
November 1999 in La Jolla and Del Mar, California; Los Angeles; New York;
Stockholm, Helsinki, and Amsterdam.
While significant parts of this book have been written anew, it has drawn
on material from a number of previously published articles. Sometimes only a
part of an article made it into the final manuscript; in other cases, its parts ended
up in different chapters of the book; in yet other cases, a whole article became the
basis for one of the sections. In the following I list the articles which were used as
material for the book. Many of them were reprinted and translated into other
languages; here I list the first instance of publication in English. Also, it has been
my practice for a number of years to post any new writing I do to Nettime1 and
Rhizome2, the two important Internet email lists devoted to discussions of new
media art, criticism and politics. This helped me to receive immediate feedback
on my work and also provided me with a sense of community interested in my
work. Therefore, most of the articles appeared on these two email lists before
being published in more traditional print venues such as journals and anthologies
or in Internet journals.
"Assembling Reality: Myths of Computer Graphics." In Afterimage 20,
no. 2 (September 1992): 12-14.
"Paradoxes of Digital Photography." In Photography After Photography,
edited by v. Amelunxen, Stefan Iglhaut, Florian Rötzer, 58-66 (München: Verlag
der Kunst, 1995).
"To Lie and to Act: Potemkin's Villages, Cinema and Telepresence." In
Mythos Information -- Welcome to the Wired World. Ars Electronica 95, edited
by Karl Gerbel and Peter Weibel, 343-353 (Vienna and New York:
Springer-Verlag, 1995).
"Reading Media Art." (In German translation) in Mediagramm 20 (ZKM /
Zentrum für Kunst und Medientechnologie Karlsruhe, 1995): 4-5.
"Archeology of a Computer Screen." In NewMediaLogia (Moscow: Soros
Center for the Contemporary Art, 1996).
"Distance and Aura." In _SPEED_: Technology, Media, Society 1.4
(http://www.arts.ucsb.edu/~speed/1.4/), 1996.
"Cinema and Digital Media." In Perspektiven der Medienkunst /
Perspectives of Media Art, edited by Jeffrey Shaw and Hans Peter Schwarz
(Ostfildern: Cantz Verlag, 1996).
"What is Digital Cinema?" In Telepolis (www.ix.de/tp) (Munich: Verlag
Heinz Heise, 1996).
"The Aesthetics of Virtual Worlds: Report from Los Angeles." In
Telepolis (www.ix.de/tp) (Munich: Verlag Heinz Heise, 1996).
"On Totalitarian Interactivity." In RHIZOME (http://www.rhizome.com),
1996.
"Behind the Screen / Russian New Media." In art / text 58 (August -
October 1997): 40-43.
"Cinema as a Cultural Interface." In W3LAB
(http://gsa.rutgers.edu/maldoror/techne/w3lab-entry.html), 1998.
"Database as a Symbolic Form." In RHIZOME (www.rhizome.com),
1998.
“Navigable Space.” (In German translation) in
ONSCREEN/OFFSCREEN - Grenzen, Übergänge und Wandel des filmischen
Raumes, eds. Hans Beller, Martin Emele u. Michael Schuster (Cantz Verlag
Stuttgart, 1999).
"Cinema by Numbers: ASCII Films by Vuk Cosic." In Vuk Cosic:
Contemporary ASCII (Ljubljana, Slovenija: forthcoming).
(http://www.vuk.org/ascii/)
"New Media: a User's Guide" in NET.CONDITION (ZKM / Zentrum für
Kunst und Medientechnologie Karlsruhe and The MIT Press, forthcoming).
30
Introduction
A Personal Chronology
Moscow, 1975. Although my ambition is to become a painter, I enroll in the so-
called “mathematical” (“matematicheskaya”) high school which in addition to a
regular curriculum has courses in calculus and computer programming. The
programming course lasts two years during which we never see a computer. Our
teacher uses a black board to explain the concepts of computer programming.
First we learn a computer language invented in the Soviet Union in the late 1950s.
The language has a wonderful Cold War name: Peace-1 (Mir-1). Later we learn a
more standard high-level language: ALGOL-60. For two years, we write
computer programs in our notebooks. Our teacher grades them and returns them
with corrections: missed end of the loop statement; undeclared variable;
forgotten semi-colon. At the end of the two-year course we are taken—just
once—to a data processing center, which normally requires clearance to enter. I
enter my program into a computer but it does not run: since I have never seen a
computer keyboard before, I use capital O whenever I need to input zero.
The same year, 1975, I start taking private lessons in classical drawing,
which also last two years. Moscow Architectural Institute entrance exams include
a test in which the applicants have to complete a drawing of an antique cast in
eight hours. To get the top grade one has to produce a drawing which not only
looks like the cast and has perfect perspective but also has perfect shading, which
means that all shadows and surfaces are defined completely through shading, so
all lines originally used to define them disappear. Hundreds of hours spent in front
of a drawing board pay off: I get an A on the exam, even though out of eight
possible casts I get the most difficult one: Venera. It is more difficult because, in
contrast to casts of male heads such as Socrates, it does not have well-defined
facets; the surfaces join smoothly together as though constructed using a spline
modeling program. (Later I learn that, during the 1970s, computer scientists were
working on the same problem: how to produce smoothly shaded images of 3D
objects in a computer. The standard rendering algorithm still used today was
invented at the University of Utah in 1975—the same year I started my drawing
lessons.)3
New York, 1985. It is early morning and I am sitting in front of a
Tektronix terminal in Midtown Manhattan. I have just finished my night shift at
Digital Effects, one of the first companies in the world devoted to producing 3D
computer animation for film and television. The company worked on Tron and
produced computer animation for all of the major television networks; my job was
to operate the Harris-500 mainframe, used to compute animations, and also the
PDP-11, which controlled the Dicomed film recorder, used to output animation on
35mm film. After a few months I am able to figure out the company’s proprietary
computer graphics software written in APL, and am now working on my first
images. I would like to produce a synthetic image of an antique cast, but it turns
out to be impossible. The software is only able to create 3D objects out of
primitive geometric forms such as cubes, cylinders and spheres; a decade would
have to pass before one could go on the Internet and download tens of thousands
of ready-made 3D models of all kinds of objects. So I settle for a composition
made out of these primitive forms. The Tektronix is a vector rather than a raster
terminal, which means that it does not update its screen in real time. Each time I
make a change in my program or simply change a point of view, I hit the enter
key and wait while the computer redraws the lines, one by one. I wonder why I
had to spend years learning to draw images in perspective when a computer could
do it in seconds. A few of the images I create are exhibited in shows of computer
art in New York. But this is the heyday of post-modernism, the art market is hot,
paintings by young New York artists are selling for hundreds of thousands of
dollars, and the art world has little interest in computer animation or even
computer art.
Linz, Austria, 1995. I am at Ars Electronica, the world’s most prestigious
annual computer art festival. This year it drops the “Computer Graphics”
category, replacing it with the new “net art” category, signaling a new stage in the
evolution of modern culture and media. The computer, which since the early 60s
was used as a production tool, has become a universal media machine: a tool used
not only for production, but also for storage, distribution and playback. The
World Wide Web crystallized this new condition; on the level of language, it was
recognized already around 1990 when the term “digital media” came to be used
along with “computer graphics.” At the same time, along with existing cultural
forms, during the 1990s computers came to host an array of new forms: Web sites
and computer games, hypermedia CD-ROMs and interactive installations—in
short, “new media.” And if in 1985 I had to write a long computer program in a
specialized computer language just to put a picture of a shaded cube on a
computer screen, ten years later I can choose from a number of inexpensive,
menu-based 3D software tools which run on ordinary PCs and which come with
numerous ready-made 3D models, including detailed human figures and heads.
What else can be said about 1995? The Soviet Union, where I was born,
no longer exists. With its demise, the tensions which animated creative
imaginations both in the East and the West—between freedom and confinement,
between interactivity and predetermination, between consumerism in the West
and something which thinkers and artists in the East called “spirituality”— had
disappeared. What came in their place? A triumph of consumerism, commercial
culture (based on stereotypes and limited clichés), mega-corporations which laid
claims on such basic categories as space, time and the future (“Where do You
Want to Go Today?” ads by Microsoft; Internet Time by Swatch which breaks 24
hours into 1,000 Swatch ‘beats’; “You will” ads by AT&T), and something which
thinkers and artists call “globalization” (a term at least as elusive as
“spirituality”).
When I visited St. Petersburg in 1995 to participate in a small computer art
festival called “In Search of Third Reality,” I saw a curious performance, which
may be a good parable of globalization. Like the rest of the festival, the
performance took place in the Planetarium. The Director of the Planetarium,
forced like everybody else to make his own living in the new Russian economic
order (or lack thereof), rented the Planetarium to conference organizers. Under the
black semi-spherical ceiling with mandatory models of planets and stars, a young
artist was methodically painting an abstract painting. Probably trained in the same
classical style as I was earlier, he was no Pollock; cautiously and systematically,
he made careful brushstrokes on the canvas in front of him. On his hand he wore a
Nintendo Dataglove, which in 1995 was a common media object in the West but a
rare sight in St. Petersburg. The Dataglove was transmitting the movements of his
hand to a small electronic synthesizer, assembled in the laboratory of some
Moscow institute. The music coming out of the synthesizer served as an
accompaniment to two dancers, a male and a female. Dressed in Isadora Duncan-like
clothing, they improvised a “modern dance” in front of the older and,
apparently, completely puzzled audience. Classical art, abstraction and a Nintendo
Dataglove; electronic music and early twentieth century modernism; discussions
of virtual reality (VR) in a Planetarium located in this classical city which, like
Venice, is obsessed with its past—what for me, coming from the West, were
incompatible historical and conceptual layers were here composited together, with
the Nintendo Dataglove being just one layer in this mix.
What had also come by 1995 was the Internet—the most material and visible
sign of globalization. And, by the end of the decade, it has also become clear that
the gradual computerization of culture will eventually transform all of it. So, to
invoke the old Marxist model of base and superstructure, if the economic base of
modern society from the 1950s onward started to shift toward a service and
information economy, becoming by the 1970s a so-called “post-industrial society”
(Daniel Bell), and then later a “network society” (Manuel Castells), by the 1990s
the superstructure started to feel the full impact of this change.4 If the “post-
modernism” of the 1980s was the first, preliminary echo of this shift still to
come—still weak, still possible to ignore—the 1990s’ rapid transformation of
culture into e-culture, of computers into universal culture carriers, of media into
new media, demanded that we rethink our categories and models.
The year is 2005…
Theory of the Present
I wish that somebody, in 1895, 1897 or at least in 1903, had realized the
fundamental significance of cinema's emergence and produced a comprehensive
record of the new medium's emergence: interviews with the audiences; a
systematic account of the narrative strategies, scenography and camera positions
as they developed year by year; an analysis of the connections between the
emerging language of cinema and different forms of popular entertainment which
coexisted with it. Unfortunately, such records do not exist. Instead, we are left
with newspaper reports, diaries of cinema's inventors, programs of film showings
and other bits and pieces—a set of random and unevenly distributed historical
samples.
Today we are witnessing the emergence of a new medium—the meta-
medium of the digital computer. In contrast to a hundred years ago, when cinema
was coming into being, we are fully aware of the significance of this new media
revolution. And yet I am afraid that future theorists and historians of computer
media will be left with not much more than the equivalents of newspaper reports
and film programs left from cinema's first decades. They will find that the
analytical texts from our era are fully aware of the significance of the computer's
takeover of culture yet, by and large, mostly contain speculations about the future
rather than a record and a theory of the present. Future researchers will wonder
why the theoreticians, who already had plenty of experience analyzing older
cultural forms, did not try to describe computer media's semiotic codes, modes of
address, and audience reception patterns. Having painstakingly reconstructed how
cinema emerged out of preceding cultural forms (panorama, optical toys, peep
shows), why didn't they attempt to construct a similar genealogy for the language
of computer media at the moment when it was just coming into being, while the
elements of previous cultural forms going into its making were still clearly
visible, still recognizable before melting into a new unity? Where were the
theoreticians at the moment when the icons and the buttons of multimedia
interfaces were like wet paint on a just-completed painting, before they became
universal conventions and thus slipped into invisibility? Where were they at the
moment when the designers of Myst were debugging their code, converting
graphics to 8-bit and massaging QuickTime clips? Or at the historical moment
when a young 20-something programmer at Netscape took the chewing gum out
of his mouth, sipped warm Coke out of the can—he was at a computer for 16
hours straight, trying to meet a marketing deadline—and, finally satisfied with its
small file size, saved a short animation of stars moving across the night sky? This
animation was to appear in the upper right corner of Netscape Navigator, thus
becoming the most widely seen moving image sequence ever until the next
release of the software.
The following is an attempt at both a record, and a theory, of the present.
Just as film historians traced the development of film language during cinema's
first decades, I aim to describe and understand the logic driving the development
of the language of new media. (I am not claiming that there is a single language of
new media; rather, I use it as an umbrella term to refer to a number of various
conventions used by designers of new media objects to organize data and
structure the user’s experience.) It is tempting to extend this parallel a little further
and to speculate whether today this new language is already getting closer to
acquiring its final and stable form, just as film language acquired its "classical"
form during the 1910s. Or it may be that the 1990s are more like the 1890s,
because the computer media language of the future will be entirely different from
the one used today.
Does it make sense to theorize the present when it seems to be changing so
fast? It is a hedged bet. If subsequent developments prove my theoretical
projections correct, I win. But even if the language of computer media develops in
a different direction than the one suggested by the present analysis, the analysis
presented here will become a record of possibilities which were heretofore
unrealized, of a horizon which was visible to us today but later became
unimaginable.
We no longer think of the history of cinema as a linear march towards a
single possible language, or as a progression towards perfect verisimilitude. On
the contrary, we have come to see its history as a succession of distinct and
equally expressive languages, each with its own aesthetic variables, each new
language closing off some of the possibilities of the previous one (a cultural logic
not dissimilar to Thomas Kuhn's analysis of scientific paradigms.)5 Similarly,
every stage in the history of computer media offers its own aesthetic
opportunities, as well as its own imagination of the future: in short, its own
"research paradigm." Each paradigm is modified or even abandoned at the next
stage. In this book I wanted to record the "research paradigm" of new media
during its first decade, before it slips into invisibility.
Mapping New Media: the Method
In this book I analyze the language of new media by placing it within the history
of modern visual and media cultures. What are the ways in which new media
relies on older cultural forms and languages and what are the ways in which it
breaks with them? What is unique about how new media objects create the
illusion of reality, address the viewer, and represent space and time? How do
conventions and techniques of old media—such as the rectangular frame, mobile
viewpoint and montage—operate in new media? If we are to construct an
archeology which will connect new computer-based techniques of media creation
with previous techniques of representation and simulation, where should we
locate the essential historical breaks?
To answer these questions, I look at several areas of new media: Web
sites, virtual worlds6, VR, multimedia, computer games, interactive installations,
computer animation, digital video, cinema, and human-computer interfaces.
While the book's main emphasis is on theoretical and historical arguments, I also
analyze many key new media objects created during the field’s history, from such
American commercial classics as Myst and Doom, Jurassic Park and Titanic, to
the works of international new media artists and collectives such as ART+COM,
antirom, jodi.org, George Legrady, Olga Lialina, Jeffrey Shaw, and Tamas
Waliczky.
The computerization of culture not only leads to the emergence of new
cultural forms such as computer games and virtual worlds; it redefines existing
ones such as photography and cinema. I therefore also investigate the effects of
the computer revolution on visual culture at large. How does the shift to
computer-based media redefine the nature of static and moving images? What is
the effect of computerization on the visual languages used by our culture? What
are the new aesthetic possibilities which become available to us?
In answering these questions, I draw upon the histories of art,
photography, video, telecommunication, design and, last but not least, the key
cultural form of the twentieth century—cinema. The theory and history of cinema
serve as the key conceptual “lens” through which I look at new media. The book
explores the following topics:
• the parallels between cinema history and the history of new media;
• the identity of digital cinema;
• the relations between the language of multimedia and nineteenth century pre-
cinematic cultural forms;
• the functions of screen, mobile camera and montage in new media as
compared to cinema;
• the historical ties between new media and avant-garde film.
Along with film theory, this book draws its theoretical frameworks from both the
humanities and the sciences, utilizing art history, literary theory, media studies,
social theory, and computer science. Its overall method could be called "digital
materialism." Rather than imposing some a priori theory from above, I build a
theory of new media from the ground up. I scrutinize the principles of computer
hardware and software, and the operations involved in creating cultural objects on
a computer, in order to uncover a new cultural logic at work.
Most writings on new media are full of speculation about the future. This
book analyzes new media as it has actually developed up until this point, at the
same time pointing to directions for new media artists and designers which have
not yet been explored. It is my hope that the theory of new media developed here
can act not only as an aid to help understand the present, but also as a grid for
practical experimentation. For example, the “Theory of Cultural Interfaces” section
analyzes how the interfaces of new media objects are being shaped by three
cultural traditions: print, cinema and human-computer interface. By describing the
elements of these traditions which are already used in new media, this analysis
points towards other elements and their combinations which are still waiting to be
experimented with. The “Compositing” section provides another set of directions for
experiments by outlining a number of new types of montage. Yet another
direction is discussed in “Database,” where I suggest that new media narratives can
explore the new compositional and aesthetic possibilities offered by a computer
database.
While this book does not speculate about the future, it does contain an
implicit theory of how new media will develop. This is the advantage of placing
new media within a larger historical perspective. We begin to see the long
trajectories which lead to new media in its present state; and we can extrapolate
these trajectories into the future. The section “Principles of New Media” describes
four key trends which, in my view, are shaping the development of new media
over time: modularity, automation, variability and transcoding.
Of course we don’t have to blindly accept these trends. Understanding the
logic which is shaping the evolution of new media language allows us to develop
different alternatives. Just as avant-garde filmmakers throughout cinema's
existence offered alternatives to its particular narrative audio-visual regime, the
task of avant-garde new media artists today is to offer alternatives to the existing
language of computer media. This can be better accomplished if we have a theory
of how "mainstream" language is structured now and how it is evolving over time.
Mapping New Media: Organization
This book aims to contribute to the emerging field of new media studies (other
names which have been already used to describe it are “digital studies” and
“digital culture”) by providing one potential map of what the field can be. If a
textbook of literary theory may, for instance, have chapters on narrative and
voice, and a textbook of film studies may discuss cinematography and editing,
this book proposes that new media theory requires the definition and refinement
of separate categories such as interface and operations.
I’ve divided the book into a number of chapters, each chapter covering
one key concept or problem. My overall argument—that we should approach new
media in relation to other visual cultural forms and put it in historical
perspective—affects my approach to each problem, but it does not drive the
overall structure of the book. Instead, concepts developed in earlier chapters
become building blocks for analyses in later chapters. In ordering the chapters, I
also considered textbooks in other established fields relevant to new media, such
as film studies, narratology and art history; much as a textbook on film may begin
with film technology and end up with film genres, this book progresses from the
material foundations of new media to its forms.
One could also draw an analogy between the “bottom-up” approach I use
here and the organization of computer software. A computer program written by a
programmer undergoes a series of translations: high-level computer language is
compiled into assembly code, which is then converted by an assembler into
binary code. In this book I follow this order in reverse, advancing from the level
of binary code to the level of a computer program, and then move further to
consider the logic of new media objects driven by these programs:
I. “What is New Media”: the digital “medium” itself, its material and logical
organization.
II. “The Interface”: the human-computer interface; the operating system
(OS).
III. “The Operations”: software applications which run on top of the OS, their
interfaces and typical operations.
IV. “The Illusions”: appearance, and the new logic of digital images created
and accessed using software applications.
V. “The Forms”: the commonly used conventions for organizing a new media
object as a whole.
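The series of translations mentioned above, from a high-level language down to lower-level code, can be loosely illustrated in Python. This is only an analogy under stated assumptions: Python translates source text into bytecode for its own virtual machine rather than into machine code via an assembler, and the expression and variable names (`width`, `height`) are hypothetical.

```python
import dis

# A one-line "high-level" program (hypothetical example).
source = "width * height"

# Step 1: translate the source text into a code object.
code = compile(source, "<example>", "eval")

# Step 2: list the lower-level instructions the interpreter will execute.
instructions = [ins.opname for ins in dis.get_instructions(code)]

# Step 3: the same instructions stored as raw bytes.
raw_bytes = code.co_code

# Running the translated code still yields the result the source describes.
result = eval(code, {"width": 4, "height": 3})
```

Reading `source`, then `instructions`, then `raw_bytes` follows the translation downward; the book's chapters move through the corresponding levels in the opposite direction.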
The last chapter “What is Cinema?” mirrors the book’s beginning. Chapter I
points out that many of its allegedly unique principles can be already found in
cinema. The subsequent chapters continue this perspective of using film history
and theory to analyze new media. Having discussed different levels of new media
— the interface, the operations, the illusion, and the forms — in this chapter I turn
my conceptual “lens” around to look at how computerization changes cinema
itself. I first analyze the identity of digital cinema by placing it within a history of
a moving image and then discuss how computerization offers new opportunities
for the development of film language.
At the same time, the last chapter continues the “bottom-up” trajectory of the
book as a whole. If Chapter V looks at the organization of new cultural objects, such
as Web sites, hypermedia CD-ROMs and virtual worlds, which are all the
“children” of a computer, Chapter VI considers the effects of computerization on
an older cultural form which exists, so to speak, “outside” of computer culture
proper — cinema.
Each chapter begins with a short introduction which discusses its concept
(or “level”) and summarizes the arguments developed in individual sections. For
example, Chapter II, "The Interface," begins with a general discussion of the
importance of the concept of the interface in new media. The two sections of
Chapter II then look at different aspects of new media interfaces: their reliance on
the conventions of other media and the relationship between the body of the user
and the interface.
The Terms: Language, Object, Representation
In putting the word “language” into the title of the book, I did not want to suggest
that there is some “single” language of new media or that we need to return to the
structuralist phase of semiotics in understanding new media. However, given that
most studies of new media and cyber culture focus on its sociological, economic
and political dimensions, I felt justified in using the word “language” to signal the
different focus of this work: the emergent conventions, the recurrent design
patterns, and the key forms of new media. I considered using the words
“aesthetics” and “poetics” instead of “language,” eventually deciding against
them. Aesthetics implies a set of oppositions which I would like to avoid—
between art and mass culture, between the beautiful and the ugly, between the
valuable and the unimportant. Poetics also brings with it undesirable connotations.
Continuing the project of Russian formalists of the 1910s, poetics was defined in
the 1920s as a study of specific properties of particular arts, such as narrative
literature. In Introduction to Poetics (1968) literary scholar Tzvetan Todorov
writes:
In contradistinction to the interpretation of particular works, it [poetics]
does not seek to name meaning, but aims at a knowledge of the general
laws that preside over the birth of each work. But in contradistinction to
such sciences as psychology, sociology, etc., it seeks these laws within
literature itself. Poetics is therefore an approach to literature at once
“abstract” and “internal.”7
In contrast to such an “internal” approach, in describing conventions, elements and
forms of new media, I neither claim that they are unique to new media, nor do I
consider it useful to look at it in isolation from other areas of culture. On the
contrary, this book aims to situate new media in relation to a number of other
areas of culture, both past and present:
• other arts and media traditions: their visual languages, their strategies for
organizing information and the viewer’s experience;
• the material properties of the computer; the ways in which it is used in modern
society; the structure of its interface and key software applications;
• contemporary visual culture: the internal organization, iconography,
iconology and viewer experience of various visual sites in our culture: fashion
and advertising, supermarkets and fine art objects, television programs and
publicity banners, offices and techno clubs;
• contemporary information culture.
The concept “information culture,” which is my term, can be thought of as a
parallel to another, already familiar concept—visual culture. It includes the ways
in which different cultural sites and objects present information: airport and train
station displays; road signs; television on-screen menus; graphic layouts of
television news; the layouts of books, newspapers and magazines; the interior
designs of banks, hotels and other commercial and leisure spaces; the interfaces of
planes and cars and, last but not least, the interfaces of computer operating
systems (Windows, Mac OS, UNIX) and software applications (Word, Excel,
PowerPoint, Eudora, Navigator, RealPlayer, Filemaker, Photoshop, etc.).
Extending the parallels with visual culture, information culture also includes
historical methods for organizing and retrieving information (analogs of
iconography) as well as patterns of user interaction with information objects and
displays.
Another word which needs to be commented on is “object.” Throughout
the book, I use the term “new media object,” rather than “product,” “artwork,”
“interactive media,” or other possible terms. A new media object may be a still
digital image, a digitally composited film, a virtual 3D environment, a computer
game, a self-contained hypermedia DVD, a hypermedia Web site, or the Web as a
whole. The term thus fits with my aim of describing the general principles of new
media which would hold true across all media types, all forms of organization and
all scales. I also use “object” to emphasize that my concern is with the culture at
large rather than with new media art alone. Moreover, “object” is a standard term
in computer science and the computer industry, used to emphasize the modular nature of
object-oriented programming languages such as C++ and Java, object-oriented
databases and the OLE technology used in Microsoft Office products. Thus it also
serves my purpose to adopt the terms and paradigms of computer science for a
theory of computerized culture. (See “Principles of New Media” for an
elaboration of this idea).
In addition, I hope to activate connotations which accompanied the use of
the word “object” by the Russian avant-garde artists of the 1920s. Russian
Constructivists and Productivists referred to their creations as objects (“vesh,”
“construktsia,” “predmet”) rather than works of art. Like their Bauhaus
counterparts, they wanted to take on the roles of industrial designers, graphic
designers, architects, clothing designers and so on, rather than remain fine artists
producing one-of-a-kind works for museums or private collections. The word
pointed toward the model of industrial mass production rather than the traditional
artist’s studio, and it implied the ideals of rational organization of labor and
engineering efficiency which artists wanted to bring into their own work.
In the case of “new media objects,” all these connotations are worth invoking. In
the world of new media, the boundary between art and design is fuzzy at best. On
the one hand, many artists make their living as commercial designers; on the other
hand, professional designers are typically the ones who really push the language
of new media forward by being engaged in systematic experimentation and also
by creating new standards and conventions. The second connotation, that of
industrial production, also holds true for new media. Many new media projects
are put together by large teams of people (although, in contrast to the studio
system of the classical Hollywood era, single producers or small teams of just a
few people are also common). Many new media objects, such as popular games or
software applications, sell millions of copies. Yet another feature of the new
media field which unites it with big industry is the strict adherence to various
hardware and software standards.8
Finally, and most importantly, I use the word object to activate the concept
of laboratory experimentation practiced by the avant-garde of the 1920s. Today,
as more and more artists are turning to new media, few are willing to undertake
systematic, laboratory-like research into its elements, and basic compositional,
expressive and generative strategies. Yet this is exactly the kind of research
which was undertaken by Russian and German avant-garde artists of the 1920s in
places like Vkhutemas9 and Bauhaus in relation to the new media of their time:
photography, film, new print technologies, telephony. Today, those few who are
able to resist the temptation to immediately create an “interactive CD-ROM,” or
to make a feature length “digital film,” and instead are able to focus on
determining the new media equivalent of a shot, a sentence, a word, or even a
letter, are rewarded with amazing findings.
A third term which is used throughout the book and which needs to be
commented upon is “representation.” In using this term I wanted to invoke the
complex and nuanced understanding of the functioning of cultural objects
developed in the humanities over the last decades. New media objects are no different
in that respect. Thus, any new media object — a Web site, a computer game, a
digital image, and so on — represents, as well as helps to construct, some outside
referent: a physically existing object, historical information presented in other
documents, a system of categories currently employed by culture as a whole or by
some social groups or interests. As is the case with all cultural representations,
new media representations are also always biased. They represent / construct
some features of physical reality at the expense of others, one world view among
many, one possible system of categories among numerous others possible.
In this book I suggest that not only individual new media objects, but also
the interfaces, both of an operating system and of commonly used software
applications, act as representations. That is, by organizing data in particular
ways and by making it possible to access it in particular ways, they privilege
particular models of the world and of the human subject. For instance, the two key
ways to organize computer data commonly used today — a hierarchical file
system (Graphical User Interface from 1984 Macintosh onward) and a “flat,” non-
hierarchical network of hyperlinks (1990s World Wide Web) — represent the
world in two fundamentally different and, in fact, opposing ways. A hierarchical
file system assumes that the world can be reduced to a logical and hierarchical
order, where every object has a distinct and well defined place. The World Wide Web
assumes that every object has the same importance as any other, and that
everything is, or can be, connected to everything else. Interfaces also privilege
particular modes of data access traditionally associated with particular arts and
media technologies. For instance, the World Wide Web of the 1990s
foregrounded the page as a basic unit of data organization (regardless of which media
types it contains), while Acrobat software applies the metaphor of “video
playback” to text-based documents. Thus interfaces act as “representations” of
older cultural forms and media, privileging some at the expense of others. This
idea will be developed further in the “Cultural Interfaces” section, where I will
analyze the role of cinematic and print conventions in new media.
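The contrast drawn above between a hierarchical file system and a “flat” network of hyperlinks can be sketched as two data structures. This is a minimal illustration, not a description of any real file system or of the Web's implementation; the file names, page names and helper functions (`paths_to`, `reachable`) are hypothetical.

```python
# Hierarchical file system: nested dicts; every object sits at one path.
file_tree = {
    "home": {
        "photos": {"cat.jpg": None},
        "texts": {"notes.txt": None},
    }
}

# Flat network of hyperlinks: any page may point to any other page.
hyperlinks = {
    "a.html": {"b.html", "c.html"},
    "b.html": {"a.html"},
    "c.html": {"b.html"},
}


def paths_to(tree, name, prefix=""):
    """Return every path in the tree that ends at `name`."""
    found = []
    for key, child in tree.items():
        path = f"{prefix}/{key}"
        if key == name:
            found.append(path)
        if isinstance(child, dict):
            found.extend(paths_to(child, name, path))
    return found


def reachable(links, start):
    """Return every page reachable from `start` by following links."""
    seen, stack = set(), [start]
    while stack:
        page = stack.pop()
        if page not in seen:
            seen.add(page)
            stack.extend(links.get(page, ()))
    return seen
```

In the tree, `paths_to` finds a single location for each file; in the link network no page has a privileged place, and `reachable` shows that any page can lead to the others.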
In describing the language of new media I found it useful to use the term
“representation” in opposition to other terms. Depending on which term it is
opposed to, the meaning of “representation” changes. Since these oppositions are
introduced in different sections of the book, here I summarize all of them:
(1) Representation — simulation (“Screen” section). Here representation
refers to various screen technologies such as post-Renaissance painting, film,
radar and television. I define screen as a rectangular surface which frames a
virtual world and which exists within the physical world of a viewer without
completely blocking her visual field. Simulation refers to technologies which aim
to completely “immerse” the viewer within the virtual universe: Baroque Jesuit
churches, nineteenth century panorama, twentieth century movie theaters.
(2) Representation — control (“Cultural Interfaces” section). Here I
oppose an image as a representation of an illusionary fictional universe and an
image as a simulation of a control panel (for instance, GUI with its different icons
and menus) which allows the user to control a computer. This new type of image
can be called image-interface. The opposition representation — control
corresponds to an opposition between depth and surface: a computer screen as a
window into an illusionistic space versus computer screen as a flat control panel.
(3) Representation — action (“Teleaction” section.) This is the opposition
between technologies for creating illusions (fashion, realist paintings, dioramas,
military decoys, film montage, digital compositing) and representational
technologies used to enable action, i.e. to allow the viewer to manipulate reality
through representations (maps, architectural drawings, x-ray, telepresence). I refer
to images produced by the latter technologies as image-instruments.
(4) Representation — communication (“Teleaction” section.) This is the
opposition between representational technologies (film, audio and video magnetic
tape, digital storage formats) and real-time communication technologies, i.e.
everything which begins with “tele” (telegraph, telephone, telex, television,
telepresence). Representational technologies allow for creation of traditional
aesthetic objects, i.e. something which is fixed in space or time and which refers
to some referent(s) outside itself. By foregrounding the importance of person-to-
person telecommunication, and “tele”-cultural forms in general which do not
produce any objects, new media forces us to reconsider the traditional equation
between culture and objects.
(5) Visual illusionism — simulation (introduction to “Illusions” chapter).
Illusionism here refers both to representation and simulation as these terms are
used in the “Screen” section. Thus, illusionism combines traditional techniques and
technologies which aim to create visual resemblance of reality: perspectival
painting, cinema, panorama, etc. Simulation refers to various computer methods
for modeling other aspects of reality beyond its visual appearance: movement of
physical objects, shape changes over time in natural phenomena (water surface,
smoke), motivations, behavior, speech and language comprehension in human
beings.
(6) Representation — information (introduction to “Forms” chapter). This
opposition refers to two opposing goals of new media design: (1) immersing users
in an imaginary fictional universe similar to traditional fiction; and (2) giving users
efficient access to a body of information (for instance, a search engine Web site
or an online encyclopedia).
I. What is New Media?
What is new media? We may begin answering this question by listing the
categories which are commonly discussed under this topic in the popular press:
Internet, Web sites, computer multimedia, computer games, CD-ROMs and DVD,
virtual reality. Is this all new media is? For instance, what about television
programs which are shot on digital video and edited on computer workstations?
Or what about feature films which use 3D animation and digital compositing?
Shall we count these as new media? In this case, what about all images and text-
image compositions — photographs, illustrations, layouts, ads — which are also
created on computers and then printed on paper? Where shall we stop?
As can be seen from these examples, the popular definition of new media
identifies it with the use of a computer for distribution and exhibition, rather than
with production. Therefore, texts distributed on a computer (Web sites and
electronic books) are considered to be new media; texts distributed on paper are
not. Similarly, photographs which are put on a CD-ROM and require a computer
to view them are considered new media; the same photographs printed as a book
are not.
Shall we accept this definition? If we want to understand the effects of
computerization on culture as a whole, I think it is too limiting. There is no
reason to privilege the computer in the role of media exhibition and distribution
machine over the computer used as a tool for media production or as a media
storage device. All have the same potential to change existing cultural languages.
And all have the same potential to leave culture as it is.
The last scenario is unlikely, however. What is more likely is that just as
the printing press in the fifteenth century and photography in the nineteenth
century had a revolutionary impact on the development of modern society and
culture, today we are in the middle of a new media revolution -- the shift of all of
our culture to computer-mediated forms of production, distribution and
communication. This new revolution is arguably more profound than the previous
ones and we are just beginning to sense its initial effects. Indeed, the introduction
of the printing press affected only one stage of cultural communication -- the
distribution of media. In the case of photography, its introduction affected only
one type of cultural communication -- still images. In contrast, the computer media
revolution affects all stages of communication, including acquisition,
manipulation, storage and distribution; it also affects all types of media -- text,
still images, moving images, sound, and spatial constructions.
How shall we begin to map out the effects of this fundamental shift? What
are the ways in which the use of computers to record, store, create and distribute
media makes it “new”?
In section “Media and Computation” I show that new media represents a
convergence of two separate historical trajectories: computing and media
technologies. Both begin in the 1830s with Babbage's Analytical Engine and
Daguerre's daguerreotype. Eventually, in the middle of the twentieth century, a
modern digital computer is developed to perform calculations on numerical data
more efficiently; it takes over from numerous mechanical tabulators and
calculators already widely employed by companies and governments since the
turn of the century. In parallel, we witness the rise of modern media technologies
which allow the storage of images, image sequences, sounds and text using
different material forms: a photographic plate, a film stock, a gramophone record,
etc. The synthesis of these two histories? The translation of all existing media into
numerical data accessible for computers. The result is new media: graphics,
moving images, sounds, shapes, spaces and text which become computable, i.e.
simply another set of computer data. In “Principles of New Media” I look at the
key consequences of this new status of media. Rather than focusing on familiar
categories such as interactivity or hypermedia, I suggest a different list. This list
reduces all principles of new media to five: numerical representation, modularity,
automation, variability and cultural transcoding. In the last section, “What New
Media is Not,” I address other principles which are often attributed to new media.
I show that these principles can already be found at work in older cultural forms
and media technologies such as cinema, and therefore by themselves they are
not sufficient to distinguish new media from the old.
How Media Became New
On August 19, 1839, the Palace of the Institute in Paris was completely full with
curious Parisians who came to hear the formal description of the new
reproduction process invented by Louis Daguerre. Daguerre, already well-known
for his Diorama, called the new process daguerreotype. According to a
contemporary, "a few days later, opticians' shops were crowded with amateurs
panting for daguerreotype apparatus, and everywhere cameras were trained on
buildings. Everyone wanted to record the view from his window, and he was
lucky who at first trial got a silhouette of roof tops against the sky."10 The media
frenzy had begun. Within five months more than thirty different descriptions of
the techniques were published all around the world: Barcelona, Edinburgh, Halle,
Naples, Philadelphia, Saint Petersburg, Stockholm. At first, daguerreotypes of
architecture and landscapes dominated the public's imagination; two years later,
after various technical improvements to the process, portrait galleries were
opened everywhere — and everybody rushed in to have their picture taken by a
new media machine.11
In 1833 Charles Babbage started the design for a device he called the
Analytical Engine. The Engine contained most of the key features of the modern
digital computer. The punch cards were used to enter both data and instructions.
This information was stored in the Engine's memory. A processing unit, which
Babbage referred to as a "mill," performed operations on the data and wrote the
results to memory; final results were to be printed out on a printer. The Engine
was designed to be capable of doing any mathematical operation; not only would
it follow the program fed into it by cards, but it would also decide which
instructions to execute next, based upon intermediate results. However, in contrast
to the daguerreotype, not even a single copy of the Engine was completed. So
while the invention of this modern media tool for the reproduction of reality
impacted society right away, the impact of the computer was yet to be measured.
Interestingly, Babbage borrowed the idea of using punch cards to store
information from an earlier programmed machine. Around 1800, J.M. Jacquard
invented a loom which was automatically controlled by punched paper cards. The
loom was used to weave intricate figurative images, including Jacquard's portrait.
This specialized graphics computer, so to speak, inspired Babbage in his work on
the Analytical Engine, a general computer for numerical calculations. As Ada
Augusta, Babbage's supporter and the first computer programmer, put it, "the
Analytical Engine weaves algebraical patterns just as the Jacquard loom weaves
flowers and leaves."12 Thus, a programmed machine was already synthesizing
images even before it was put to process numbers. The connection between the
Jacquard loom and the Analytical Engine is not something historians of
computers make much of, since for them computer image synthesis represents just
one application of the modern digital computer among thousands of others; but
for a historian of new media it is full of significance.
We should not be surprised that both trajectories — the development of
modern media, and the development of computers — begin around the same time.
Both media machines and computing machines were absolutely necessary for the
functioning of modern mass societies. The ability to disseminate the same texts,
images and sounds to millions of citizens thus assuring that they will have the
same ideological beliefs was as essential as the ability to keep track of their birth
records, employment records, medical records, and police records. Photography,
film, the offset printing press, radio and television made the former possible while
computers made possible the latter. Mass media and data processing are the
complementary technologies of a modern mass society; they appear together and
develop side by side, making this society possible.
For a long time the two trajectories ran in parallel without ever crossing
paths. Throughout the nineteenth and the early twentieth century, numerous
mechanical and electrical tabulators and calculators were developed; they were
gradually getting faster and their use became more widespread. In parallel,
we witness the rise of modern media which allow the storage of images, image
sequences, sounds and text in different material forms: a photographic plate, film
stock, a gramophone record, etc.
Let us continue tracing this joint history. In the 1890s modern media took
another step forward as still photographs were put in motion. In January of 1893,
the first movie studio, Edison's "Black Maria," started producing
twenty-second shorts which were shown in special Kinetoscope parlors. Two years later
the Lumière brothers showed their new Cinématographe camera/projection
hybrid first to a scientific audience, and, later, in December of 1895, to the paying
public. Within a year, the audiences in Johannesburg, Bombay, Rio de Janeiro,
Melbourne, Mexico City, and Osaka were subjected to the new media machine,
and they found it irresistible.13 Gradually the scenes grew longer, the staging of
reality before the camera and the subsequent editing of its samples became more
intricate, and the copies multiplied. They would be sent to Chicago and Calcutta,
to London and St. Petersburg, to Tokyo and Berlin and thousands and thousands
of smaller places. Film images would soothe movie audiences, who were too
eager to escape the reality outside, the reality which no longer could be
adequately handled by their own sampling and data processing systems (i.e., their
brains). Periodic trips into the dark relaxation chambers of movie theaters became
a routine survival technique for the subjects of modern society.
The 1890s was the crucial decade, not only for the development of media,
but also for computing. If individuals' brains were overwhelmed by the amounts
of information they had to process, the same was true of corporations and of
government. In 1887, the U.S. Census office was still interpreting the figures from
the 1880 census. For the next 1890 census, the Census Office adopted electric
tabulating machines designed by Herman Hollerith. The data collected for every
person was punched into cards; 46,804 enumerators completed forms for a total
population of 62,979,766. The Hollerith tabulator opened the door for the
adoption of calculating machines by business; during the next decade electric
tabulators became standard equipment in insurance companies, public utilities
companies, railroads and accounting departments. In 1911, Hollerith's Tabulating
Machine Company was merged with three other companies to form the
Computing-Tabulating-Recording Company; in 1914 Thomas J. Watson was
chosen as its head. Ten years later its business tripled and Watson renamed the
company the International Business Machines Corporation, or IBM.14
We are now in the new century. The year is 1936. This year the British
mathematician Alan Turing wrote a seminal paper entitled "On Computable
Numbers." In it he provided a theoretical description of a general-purpose
computer, later named after its inventor the Universal Turing Machine. Even
though it was only capable of four operations, the machine could perform any
calculation which can be done by a human and could also imitate any other
computing machine. The machine operated by reading and writing numbers on an
endless tape. At every step the tape would be advanced to retrieve the next
command, to read the data or to write the result. Its diagram looks suspiciously
like a film projector. Is this a coincidence?
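
The tape-based operation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Turing's own formulation: the `run` helper, the state names and the `INVERT` rule table are all hypothetical, chosen to show a head that reads a symbol, writes a symbol and advances along a tape that grows as needed.

```python
from collections import defaultdict

def run(rules, tape, state="start", halt="halt", max_steps=1000):
    """Run a one-tape machine; rules maps (state, symbol) -> (write, move, next_state)."""
    cells = defaultdict(lambda: "_", enumerate(tape))  # "_" stands for a blank cell
    pos = 0
    for _ in range(max_steps):
        if state == halt:
            break
        write, move, state = rules[(state, cells[pos])]
        cells[pos] = write                 # write the result onto the tape
        pos += 1 if move == "R" else -1    # advance the tape by one cell
    lo, hi = min(cells), max(cells)
    return "".join(cells[i] for i in range(lo, hi + 1)).strip("_")

# Hypothetical rule table: flip every bit, then halt at the first blank cell.
INVERT = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

print(run(INVERT, "10110"))
```

The same `run` function executes any rule table handed to it, which is the sense in which one such machine can imitate another.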
If we believe the word cinematograph, which means "writing movement,"
the essence of cinema is recording and storing visible data in a material form. A
film camera records data on film; a film projector reads it off. This cinematic
apparatus is similar to a computer in one key respect: a computer's program and
data also have to be stored in some medium. This is why the Universal Turing
Machine looks like a film projector. It is a kind of film camera and film projector
at once: reading instructions and data stored on endless tape and writing them in
other locations on this tape. In fact, the development of a suitable storage medium
and a method for coding data represent important parts of both cinema and
computer pre-histories. As we know, the inventors of cinema eventually settled on
using discrete images recorded on a strip of celluloid; the inventors of the computer,
which needed much greater speed of access as well as the ability to quickly
read and write data, came to store it electronically in a binary code.
In the same year, 1936, the two trajectories came even closer together.
Starting that year, and continuing into the Second World War, German engineer
Konrad Zuse was building a computer in the living room of his parents'
apartment in Berlin. Zuse's computer was the first working digital computer. One
of his innovations was program control by punched tape. The tape Zuse used was
actually discarded 35 mm movie film.15
One of the surviving pieces of this film shows binary code punched over
the original frames of an interior shot. A typical movie scene — two people in a
room involved in some action — becomes a support for a set of computer
commands. Whatever meaning and emotion was contained in this movie scene
has been wiped out by its new function as a data carrier. The pretense of modern
media to create a simulation of sensible reality is similarly canceled; media is
reduced to its original condition as information carrier, nothing else, nothing
more. In a technological remake of the Oedipal complex, a son murders his father.
The iconic code of cinema is discarded in favor of the more efficient binary one.
Cinema becomes a slave to the computer.
But this is not yet the end of the story. Our story has a new twist — a
happy one. Zuse's film, with its strange superimposition of the binary code over
the iconic code, anticipates the convergence which gets underway half a century
later. The two separate historical trajectories finally meet. Media and computer --
Daguerre's daguerreotype and Babbage's Analytical Engine, the Lumière
Cinématographe and Hollerith's tabulator -- merge into one. All existing media
are translated into numerical data accessible to computers. The result:
graphics, moving images, sounds, shapes, spaces and text become computable,
i.e. simply another set of computer data. In short, media becomes new media.
This meeting changes both the identity of media and of the computer
itself. No longer just a calculator, a control mechanism or a communication
device, a computer becomes a media processor. Before, the computer could read a
row of numbers, outputting a statistical result or a gun trajectory. Now it can read
pixel values, blurring the image, adjusting its contrast or checking whether it
contains an outline of an object. Building upon these lower-level operations, it can
also perform more ambitious ones: searching image databases for images similar
in composition or content to an input image; detecting shot changes in a movie; or
synthesizing the movie shot itself, complete with the setting and the actors. In a
historical loop, the computer returned to its origins. No longer just an Analytical
Engine, suitable only to crunch numbers, the computer became Jacquard's loom --
a media synthesizer and manipulator.
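
One of the operations mentioned above, detecting shot changes in a movie, can be built directly on reading pixel values. The sketch below is a deliberately naive illustration rather than a description of any production system: frames are assumed to be flat lists of greyscale values, and the function names and the threshold of 50 are hypothetical.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average absolute difference between two equally sized greyscale frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)

def shot_changes(frames, threshold=50):
    """Return indices of frames that differ sharply from the frame before them."""
    return [i for i in range(1, len(frames))
            if mean_abs_diff(frames[i - 1], frames[i]) > threshold]

# Hypothetical four-pixel frames: a dark shot, a bright shot, then dark again.
dark = [10, 12, 11, 9]
bright = [200, 210, 205, 198]
movie = [dark, dark, bright, bright, dark]
print(shot_changes(movie))
```

A large jump in average pixel difference is taken as a cut; real footage with fades or fast motion would presumably need a more careful measure.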
Principles of New Media
The identity of media has changed even more dramatically. Below I summarize
some of the key differences between old and new media. In compiling this list of
differences I tried to arrange them in a logical order. That is, principles 3-5
depend on principles 1-2. This is not dissimilar to axiomatic logic, where
certain axioms are taken as starting points and further theorems are proved on their
basis.
Not every new media object obeys these principles. They should be
considered not as some absolute laws but rather as general tendencies of a culture
undergoing computerization. As the computerization affects deeper and deeper
layers of culture, these tendencies will manifest themselves more and more.
1. Numerical Representation
All new media objects, whether they are created from scratch on computers or
converted from analog media sources, are composed of digital code; they are
numerical representations. This has two key consequences:
1.1. A new media object can be described formally (mathematically). For
instance, an image or a shape can be described using a mathematical function.
1.2. A new media object is subject to algorithmic manipulation. For
instance, by applying appropriate algorithms, we can automatically remove
"noise" from a photograph, improve its contrast, locate the edges of the shapes, or
change its proportions. In short, media becomes programmable.
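
Two of the manipulations named in 1.2 can be sketched as follows. This is a minimal illustration under simplifying assumptions: the image is a single row of greyscale values held in a Python list, and `stretch_contrast` and `median3` are hypothetical helper names, not functions of any particular image editor.

```python
def stretch_contrast(pixels, lo=0, hi=255):
    """Linearly rescale grey values so the darkest maps to lo and the brightest to hi."""
    p_min, p_max = min(pixels), max(pixels)
    if p_min == p_max:          # flat image: nothing to stretch
        return list(pixels)
    scale = (hi - lo) / (p_max - p_min)
    return [round(lo + (p - p_min) * scale) for p in pixels]

def median3(pixels):
    """Replace each interior value by the median of itself and its two neighbours."""
    out = list(pixels)
    for i in range(1, len(pixels) - 1):
        out[i] = sorted(pixels[i - 1:i + 2])[1]
    return out

row = [10, 200, 12, 11, 13]     # one isolated bright value standing in for "noise"
print(median3(row))
print(stretch_contrast([100, 120, 140]))
```

Because the photograph is only a list of numbers, both operations are ordinary arithmetic, which is what it means for media to become programmable.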
When new media objects are created on computers, they originate in numerical
form. But many new media objects are converted from various forms of old
media. Although most readers understand the difference between analog and
digital media, a few notes should be added on the terminology and the conversion
process itself. This process assumes that data is originally continuous, i.e. “the axis
or dimension that is measured has no apparent indivisible unit from which it is
composed.”16 Converting continuous data into a numerical representation is called
digitization. Digitization consists of two steps: sampling and quantization.
First, data is sampled, most often at regular intervals, such as the grid of pixels
used to represent a digital image. Technically, a sample is defined as “a
measurement made at a particular instant in space and time, according to a
specified procedure.” The frequency of sampling is referred to as resolution.
Sampling turns continuous data into discrete data. This is data occurring in distinct
units: people, pages of a book, pixels. Second, each sample is quantified, i.e.
assigned a numerical value drawn from a defined range (such as 0-255 in the case
of an 8-bit greyscale image).17
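
The two steps of digitization can be made concrete with a short sketch. Everything named here is an assumption made for illustration: the continuous source is a hypothetical `brightness` function, `sample` measures it at regular intervals, and `quantize` assigns each measurement an integer in the 0-255 range of an 8-bit greyscale image.

```python
import math

def sample(signal, start, stop, n):
    """Measure a continuous function at n regularly spaced points in [start, stop]."""
    step = (stop - start) / (n - 1)
    return [signal(start + i * step) for i in range(n)]

def quantize(value, v_min, v_max, levels=256):
    """Map a real value in [v_min, v_max] to an integer in 0..levels-1."""
    ratio = (value - v_min) / (v_max - v_min)
    return min(levels - 1, max(0, int(ratio * levels)))

# Hypothetical continuous source: brightness varying smoothly along one line.
def brightness(x):
    return 0.5 + 0.5 * math.sin(x)

samples = sample(brightness, 0.0, math.pi, 5)        # resolution: 5 samples
digital = [quantize(s, 0.0, 1.0) for s in samples]   # 8-bit values
print(digital)
```

Raising `n` increases the resolution; raising `levels` increases the number of distinct values each sample can take. The two choices are independent of each other.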
While some old media such as photography and sculpture are truly
continuous, most involve the combination of continuous and discrete coding. One
example is motion picture film: each frame is a continuous photograph, but time is
broken into a number of samples (frames). Video goes one step further by
sampling the frame along the vertical dimension (scan lines). Similarly, a
photograph printed using a halftone process combines discrete and continuous
representations. Such a photograph consists of a number of orderly dots (i.e.,
samples); however, the diameters and areas of the dots vary continuously.
As the last example demonstrates, while old media contains level(s) of
discrete representation, the samples were never quantified. This quantification of
samples is the crucial step accomplished by digitization. But why, we may ask,
were modern media technologies often in part discrete? The key assumption of
modern semiotics is that communication requires discrete units. Without discrete
units, there is no language. As Roland Barthes has put it, “language is, as it were,
that which divides reality (for instance the continuous spectrum of the colors is
verbally reduced to a series of discontinuous terms).”18 In postulating this,
semioticians took human language as a prototypical example of a communication
system. A human language is discrete on most scales: we speak in sentences; a
sentence is made from words; a word consists of morphemes, and so on. If we
are to follow the assumption that any form of communication requires discrete
representation, we may expect that media used in cultural communication will
have discrete levels. At first this explanation seems to work. Indeed, a film
samples the continuous time of human existence into discrete frames; a drawing
samples visible reality into discrete lines; and a printed photograph samples it into
discrete dots. This assumption does not universally work, however: photographs,
for instance, do not have any apparent units. (Indeed, in the 1970s semiotics was
criticized for its linguistic bias, and most semioticians came to recognize that
a language-based model of distinct units of meaning can’t be applied to many kinds
of cultural communication.) More importantly, the discrete units of modern media
are usually not the units of meaning, the way morphemes are. Neither film
frames nor halftone dots have any relation to how a film or a photograph affects
the viewer (except in modern art and avant-garde film — think of paintings by
Roy Lichtenstein and films of Paul Sharits — which often make the “material”
units of media into the units of meaning.)
The more likely reason why modern media has discrete levels is that it
emerged during the Industrial Revolution. In the nineteenth century, a new
organization of production known as the factory system gradually replaced artisan
labor. It reached its classical form when Henry Ford installed the first assembly line
in his factory in 1913. The assembly line relied on two principles. The first was
standardization of parts, already employed in the production of military uniforms
in the nineteenth century. The second, newer principle was the separation of the
production process into a set of repetitive, sequential, and simple activities that
could be executed by workers who did not have to master the entire process and
could be easily replaced.
Not surprisingly, modern media follows the factory logic, not only in
terms of division of labor as witnessed in Hollywood film studios, animation
studios or television production, but also on the level of its material organization.
The invention of typesetting machines in the 1880s industrialized publishing
while leading to standardization of both type design and the number and types of
fonts used. In the 1890s cinema combined automatically produced images (via
photography) with a mechanical projector. This required standardization of both
image dimensions (size, frame ratio, contrast) and of sampling rate of time (see
“Digital Cinema” section for more detail). Even earlier, in the 1880s, the first
television systems already involved standardization of sampling both in time and
in space. These modern media systems also followed the factory logic in that once
a new “model” (a film, a photograph, an audio recording) was introduced,
numerous identical media copies would be produced from this master. As I will
show below, new media follows, or actually runs ahead of, a quite different
logic of post-industrial society: that of individual customization rather than
mass standardization.
2. Modularity
This principle can be called the “fractal structure of new media.” Just as a fractal has
the same structure on different scales, a new media object has the same modular
structure throughout. Media elements, be they images, sounds, shapes, or behaviors,
are represented as collections of discrete samples (pixels, polygons, voxels,
characters, scripts). These elements are assembled into larger-scale objects but
they continue to maintain their separate identity. The objects themselves can be
combined into even larger objects -- again, without losing their independence. For
example, a multimedia "movie" authored in the popular Macromedia Director
software may consist of hundreds of still images, QuickTime movies, and
sounds which are all stored separately and are loaded at run time. Because all
elements are stored independently, they can be modified at any time without
having to change the Director movie itself. These movies can be assembled into a
larger "movie," and so on. Another example of modularity is the concept of
“object” used in Microsoft Office applications. When an object is inserted into a
document (for instance, a media clip inserted into a Word document), it continues
to maintain its independence and can always be edited with the program used
originally to create it. Yet another example of modularity is the structure of an
HTML document: with the exception of text, it consists of a number of
separate objects -- GIF and JPEG images, media clips, VRML scenes,
Shockwave and Flash movies -- which are all stored independently locally
and/or on a network. In short, a new media object consists of independent parts
which, in their turn, consist of smaller independent parts, and so on, up to the
level of smallest “atoms” such as pixels, 3D points or characters.
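
The nested, independently stored structure described above can be sketched as follows. The file names, the `library` dictionary and the `load` helper are hypothetical stand-ins and do not reflect how Director or any other authoring tool actually stores its data; the point is only that an object lists its parts by name and resolves them at run time.

```python
# Hypothetical stand-in for separately stored media files.
library = {
    "logo.gif": "pixels-v1",
    "intro.mov": "frames-v1",
    "click.wav": "samples-v1",
}

# Each "movie" lists its parts by name instead of embedding copies of them.
scene = ["logo.gif", "click.wav"]
movie = ["intro.mov", scene]          # a movie may contain other movies

def load(obj):
    """Resolve an object at run time into the current contents of its parts."""
    if isinstance(obj, str):
        return library[obj]
    return [load(part) for part in obj]

before = load(movie)
library["logo.gif"] = "pixels-v2"     # edit one element in isolation
after = load(movie)
print(before, after, sep="\n")
```

The definition of `movie` is never touched, yet what it loads changes, because the element kept its separate identity.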
The World Wide Web as a whole is also completely modular. It consists of
numerous Web pages, each in its turn consisting of separate media elements.
Every element can always be accessed on its own. Normally we think of elements
as belonging to their corresponding Web sites, but this is just a convention,
reinforced by commercial Web browsers. The Netomat browser, which extracts
elements of a particular media type from different Web pages (for instance, only
images) and displays them together without identifying the Web sites they come
from, highlights for us this fundamentally discrete and non-hierarchical
organization of the Web (see introduction to “Interface” chapter for more on this
browser).
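
The kind of extraction just described, pulling elements of one media type out of several pages, can be approximated with Python's standard `html.parser` module. This is not a reconstruction of how Netomat works; the two pages and their file names are invented, and `ImageCollector` is a hypothetical class written for this example.

```python
from html.parser import HTMLParser

class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag fed to the parser."""
    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.sources.extend(value for name, value in attrs if name == "src")

# Two hypothetical pages; the file names are invented for the example.
pages = [
    '<html><body><h1>Site A</h1><img src="a1.gif"><p>text</p></body></html>',
    '<html><body><img src="b1.jpg"><img src="b2.jpg"></body></html>',
]

collector = ImageCollector()
for page in pages:
    collector.feed(page)
print(collector.sources)
```

The resulting list carries no trace of which page each image came from, which is the non-hierarchical organization the paragraph points to.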
In addition to using the metaphor of a fractal, we can also make an
analogy between the modularity of new media and structured computer
programming. Structured computer programming, which became standard in the
1970s, involves writing small and self-sufficient modules (called in different
computer languages subroutines, functions, procedures, scripts) which are
assembled into larger programs. Many new media objects are in fact computer
programs which follow the structured programming style. For example, most
interactive multimedia applications are programs written in Macromedia
Director’s Lingo. A Lingo program defines scripts which control various repeated
actions, such as clicking on a button; these scripts are assembled into larger
scripts. In the case of new media objects which are not computer programs, an
analogy with structured programming can still be made because their parts can be
accessed, modified or substituted without affecting the overall structure of an
object. This analogy, however, has its limits. If a particular module of a computer
program is deleted, the program will not run. In contrast, just as is the case
with traditional media, deleting parts of a new media object does not render it
meaningless. In fact, the modular structure of new media makes such deletion and
substitution of parts particularly easy. For example, since an HTML document
consists of a number of separate objects each represented by a line of HTML
code, it is very easy to delete, substitute or add new objects. Similarly, since in
Photoshop the parts of a digital image are usually placed on separate layers, these
parts can be deleted and substituted with a click of a button.
3. Automation
Numerical coding of media (principle 1) and modular structure of a media object
(principle 2) allow the automation of many operations involved in media creation,
manipulation and access. Thus human intentionality can be removed from the
creative process, at least in part.19
The following are some of the examples of what can be called “low-
level” automation of media creation, in which the computer user modifies or
creates from scratch a media object using templates or simple algorithms. These
techniques are robust enough so that they are included in most commercial
software for image editing, 3D graphics, word processing, graphic layout, and so
on. Image editing programs such as Photoshop can automatically correct scanned
images, improving contrast range and removing noise. They also come with filters
which can automatically modify an image, from creating simple variations of
color to changing the whole image as though it was painted by Van Gogh, Seurat
or another brand-name artist. Other computer programs can automatically generate
3D objects such as trees, landscapes, human figures and detailed ready-to-use
animations of complex natural phenomena such as fire and waterfalls. In
Hollywood films, flocks of birds, ant colonies and crowds of people are
automatically created by AL (artificial life) software. Word processing, page
layout, presentation and Web creation programs come with "agents" which can
automatically create the layout of a document. Writing software helps the user to
create literary narratives using highly formalized genre conventions.
Finally, in what may be the most familiar experience of automation of media
generation to most computer users, many Web sites automatically generate Web
pages on the fly when the user reaches the site. They assemble the information
from the databases and format it using generic templates and scripts.
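
Generating a page on the fly from stored records and a generic template can be sketched with the standard library's `string.Template`. The record fields, the template text and the `render_page` function are assumptions made for this example; an actual site would draw its records from a database query rather than a literal list.

```python
from string import Template

# Hypothetical "database": in practice this would be a query result.
articles = [
    {"title": "Numerical Representation", "views": 120},
    {"title": "Modularity", "views": 85},
]

PAGE = Template("<html><body><h1>$heading</h1><ul>\n$items\n</ul></body></html>")
ITEM = Template("<li>$title ($views views)</li>")

def render_page(heading, records):
    """Fill the generic templates with whatever records the database returns."""
    items = "\n".join(ITEM.substitute(r) for r in records)
    return PAGE.substitute(heading=heading, items=items)

print(render_page("Principles", articles))
```

The page exists only at the moment of the request: change the records and the next visitor receives a different document from the same template.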
Researchers are also working on what can be called “high-level”
automation of media creation which requires a computer to understand, to a
certain degree, the meanings embedded in the objects being generated, i.e. their
semantics. This research can be seen as a part of a larger initiative of artificial
intelligence (AI). As is well known, the AI project has achieved only very limited
success since its beginnings in the 1950s. Correspondingly, work on media
generation which requires understanding of semantics is also in the research stage
and is rarely included in commercial software. Beginning in the 1970s, computers
were often used to generate poetry and fiction. In the 1990s, the users of Internet
chat rooms became familiar with bots -- computer programs which simulate
human conversation. Researchers at New York University showed a “virtual
theater” composed of a few “virtual actors” which adjust their behavior in real
time in response to the user’s actions.20 The MIT Media Lab developed a number of
different projects devoted to “high-level” automation of media creation and use: a
“smart camera” which can automatically follow the action and frame the shots
given a script;21 ALIVE, a virtual environment where the user interacted with
animated characters;22 a new kind of human-computer interface where the
computer presents itself to a user as an animated talking character. The character,
generated by a computer in real-time, communicates with the user using natural
language; it also tries to guess the user’s emotional state and to adjust the style of
interaction accordingly.23
The area of new media where the average computer user encountered AI
in the 1990s was not, however, human-computer interface, but computer games.
Almost every commercial game includes a component called an AI engine. It stands
for part of the game’s computer code which controls its characters: car drivers in a
car race simulation, the enemy forces in a strategy game such as Command and
Conquer, the single enemies which keep attacking the user in first-person shooters
such as Quake. AI engines use a variety of approaches to simulate human
intelligence, from rule-based systems to neural networks. Like AI expert systems,
these characters have expertise in some well-defined but narrow area such as
attacking the user. But because computer games are highly codified and rule-
based, these characters function very effectively. That is, they effectively respond
to whatever few things the user is allowed to ask them to do: run forward, shoot,
pick up an object. They can’t do anything else, but then the game does not
provide the opportunity for the user to test this. For instance, in a martial arts
fighting game, I can’t ask questions of my opponent, nor do I expect him or her to
start a conversation with me. All I can do is to “attack” my opponent by pressing
a few buttons; and within this highly codified situation the computer can “fight”
me back very effectively. In short, computer characters can display intelligence
and skills only because the programs put severe limits on our possible interactions
with them. Put differently, the computers can pretend to be intelligent only by
tricking us into using a very small part of who we are when we communicate with
them. So, to use another example, at the 1997 SIGGRAPH convention I was playing
against both human and computer-controlled characters in a VR simulation of
some non-existent sport game. All my opponents appeared as simple blobs
covering a few pixels of my VR display; at this resolution, it made absolutely no
difference who was human and who was not.
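The narrow, codified repertoire described above can be illustrated with a minimal sketch of a rule-based opponent in a fighting game. The action names and the rule table are hypothetical and are not taken from any actual game engine; the point is only that the character appears competent because the set of inputs it must handle is so small.

```python
# Hypothetical rule table for a fighting-game opponent: each permitted
# player action maps to one scripted response. Nothing else is understood.
RULES = {
    "punch": "block",
    "kick": "dodge",
    "block": "throw",
}


def respond(player_action: str) -> str:
    """Return the opponent's response to one of the few allowed actions."""
    if player_action not in RULES:
        # The game interface never lets the player send anything else,
        # so this limit normally stays invisible.
        raise ValueError(f"unsupported action: {player_action!r}")
    return RULES[player_action]


print(respond("kick"))
```

Anything outside the table, such as asking the opponent a question, simply has no representation in the program.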
Along with “low-level” and “high-level” automation of media creation,
another area of media use which is being subjected to increasing automation is
media access. The switch to computers as a means to store and access enormous
amounts of media material, exemplified by the “media assets” stored in the
databases of stock agencies and global entertainment conglomerates, as well as by
the public “media assets” distributed across numerous Web sites, created the need
to find more efficient ways to classify and search media objects. Word processors
and other text management software for a long time provided the abilities to
search for specific strings of text and automatically index documents. The UNIX
operating system has also always included powerful commands to search and filter
text files. In the 1990s software designers started to provide media users with
similar abilities. Virage introduced the Virage VIR Image Engine, which allows users to
search for visually similar image content among millions of images as well as a
set of video search tools to allow indexing and searching video files.24 By the end
of the 1990s, the key Web search engines already included the options to search
the Internet by specific media such as images, video and audio.
The Internet, which can be thought of as one huge distributed media
database, also crystallized the basic condition of the new information society:
over-abundance of information of all kinds. One response was the popular idea of
software “agents” designed to automate searching for relevant information. Some
agents act as filters which deliver small amounts of information given the user's
criteria. Others allow users to tap into the expertise of other users,
following their selections and choices. For example, the MIT Software Agents Group
developed such agents as BUZZwatch which “distills and tracks trends, themes,
and topics within collections of texts across time” such as Internet discussions and
Web pages; Letizia, “a user interface agent that assists a user browsing the World
Wide Web by… scouting ahead from the user's current position to find Web
pages of possible interest”; and Footprints which “uses information left by other
people to help you find your way around.”25
By the end of the twentieth century, the problem became no longer how to
create a new media object such as an image; the new problem was how to find the
object which already exists somewhere. That is, if you want a particular image,
chances are it already exists -- but it may be easier to create one from scratch
than to find the existing one. Beginning in the nineteenth century, modern
society developed technologies which automated media creation: a photo camera,
a film camera, a tape recorder, a video recorder, etc. These technologies allowed
us, over the course of one hundred and fifty years, to accumulate an
unprecedented amount of media materials: photo archives, film libraries, audio
archives…This led to the next stage in media evolution: the need for new
technologies to store, organize and efficiently access these media materials. These
new technologies are all computer-based: media databases; hypermedia and other
ways of organizing media material such as the hierarchical file system itself; text
management software; programs for content-based search and retrieval. Thus
automation of media access is the next logical stage of the process which was
already put into motion when a first photograph was taken. The emergence of new
media coincides with this second stage of a media society, now concerned as
much with accessing and re-using existing media as with creating new ones.26
(See “Database” section for more on databases).
4. Variability
A new media object is not something fixed once and for all but can exist in
different, potentially infinite, versions. This is another consequence of numerical
coding of media (principle 1) and modular structure of a media object (principle
2). Other terms which are often used in relation to new media and which would be
appropriate instead of “variable” are “mutable” and “liquid.”
Old media involved a human creator who manually assembled textual,
visual and/or audio elements into a particular composition or a sequence. This
sequence was stored in some material, its order determined once and for all.
Numerous copies could be run off from the master, and, in perfect correspondence
with the logic of an industrial society, they were all identical. New media, in
contrast, is characterized by variability. Instead of identical copies a new media
object typically gives rise to many different versions. And rather than being created
completely by a human author, these versions are often in part automatically
assembled by a computer. (The already quoted example of Web pages
automatically generated from databases using the templates created by Web
designers can be invoked here as well.) Thus the principle of variability is closely
connected to automation.
Variability would also not be possible without modularity. Stored
digitally, rather than in some fixed medium, media elements maintain their
separate identity and can be assembled into numerous sequences under program
control. In addition, because the elements themselves are broken into discrete
samples (for instance, an image is represented as an array of pixels), they can be
also created and customized on the fly.
The logic of new media thus corresponds to the post-industrial logic of
"production on demand" and "just in time" delivery which themselves were made
possible by the use of computers and computer networks in all stages of
manufacturing and distribution. Here the "culture industry" (the term was originally
coined by Theodor Adorno and Max Horkheimer in the 1940s) is actually ahead of the rest of
industry. The idea that a customer determines the exact features of her car at the
showroom, the data is then transmitted to the factory, and hours later the new car
is delivered, remains a dream, but in the case of computer media, it is reality.
Since the same machine is used as a showroom and a factory, i.e., the same
computer generates and displays media -- and since the media exists not as a
material object but as data which can be sent through the wires with the speed of
light, the customized version created in response to user’s input is delivered
almost immediately. Thus, to continue with the same example, when you access a
Web site, the server immediately assembles a customized Web page.
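The assembly of a customized page from a database and a designer's template can be sketched in a few lines. This is only an illustration under stated assumptions: the record names, the two selection criteria (language and connection speed) and the template are hypothetical, and a real Web server would do considerably more.

```python
from string import Template

# Hypothetical "database": media elements stored as separate records.
DATABASE = {
    "headline": {"en": "Welcome", "fr": "Bienvenue"},
    "image": {"fast": "hero_large.jpg", "slow": "hero_small.jpg"},
}

# The designer writes the template once; the program fills it per request.
PAGE = Template("<h1>$headline</h1><img src='$image'>")


def assemble_page(language: str, connection: str) -> str:
    """Assemble one version of the page from the database for one visitor."""
    return PAGE.substitute(
        headline=DATABASE["headline"][language],
        image=DATABASE["image"][connection],
    )


print(assemble_page("fr", "slow"))
```

With two languages and two connection types, the same template and database already yield four different versions of the page.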
Here are some particular cases of the variability principle (most of them
will be discussed in more detail in later chapters):
4.1. Media elements are stored in a media database; a variety of end-user
objects which vary in resolution, in form and in content can be generated,
either beforehand, or on demand, from this database. At first, we may think that
this is simply a particular technological implementation of the variability principle,
but, as I will show in the “Database” section, in a computer age the database comes to
function as a cultural form of its own. It offers a particular model of the world and
of the human experience. It also affects how the user conceives of data which it
contains.
4.2. It becomes possible to separate the levels of "content" (data) and
interface. A number of different interfaces can be created to the same data. A new
media object can be defined as one or more interfaces to a multimedia database
(see introduction to “Interface” chapter and “Database” section for more
discussion of this principle).27
4.3. The information about the user can be used by a computer program to
automatically customize the media composition as well as to create the elements
themselves. Examples: Web sites use the information about the type of hardware
and browser or user's network address to automatically customize the site which
the user will see; interactive computer installations use information about the
user's body movements to generate sounds, shapes, and images, or to control
behaviors of artificial creatures.
4.4. A particular case of 4.3 is branching-type interactivity (sometimes
also called menu-based interactivity.) This term refers to programs in which all
the possible objects which the user can visit form a branching tree structure.
When the user reaches a particular object, the program presents her with choices
and lets her pick. Depending on the value chosen, the user advances along a
particular branch of the tree. For instance, in Myst each screen typically contains
a left and a right button; clicking on a button retrieves a new screen, and so on.
In this case the information used by a program is the output of the user's cognitive
process, rather than the network address or body position. (See “Menus, Filters,
Plug-ins” for more discussion of this principle.)
4.5. Hypermedia is another popular new media structure, which
conceptually is close to branching-type interactivity (because quite often the
elements are connected using a branch tree structure). In hypermedia, the
multimedia elements making a document are connected through hyperlinks. Thus
the elements and the structure are independent of each other --rather than hard-
wired together, as in traditional media. World Wide Web is a particular
implementation of hypermedia in which the elements are distributed throughout
the network. Hypertext is a particular case of hypermedia which uses only one
media type -- text. How does the principle of variability work in this case? We
can conceive of all possible paths through a hypermedia document as being
different versions of it. By following the links the user retrieves a particular
version of a document.
4.6. Another way in which different versions of the same media objects
are commonly generated in computer culture is through periodic updates.
Networks allow the content of a new media object to be periodically updated
while keeping its structure intact. For instance, modern software applications can
periodically check for updates on the Internet and then download and install these
updates, sometimes without any actions from the user. Most Web sites are also
periodically updated either manually or automatically, when the data in the
databases which drive the sites changes. A particularly interesting case of this
“updateability” feature is the sites which update some information, such as
stock prices or weather, continuously.
4.7. One of the most basic cases of the variability principle is scalability,
in which different versions of the same media object can be generated at various
sizes or levels of detail. The metaphor of a map is useful in thinking about the
scalability principle. If we equate a new media object with a physical territory,
different versions of this object are like maps of this territory, generated at
different scales. Depending on the scale chosen, a map provides more or less
detail about the territory. Indeed, different versions of a new media object may
vary strictly quantitatively, i.e. in the amount of detail present: for instance, a full
size image and its icon, automatically generated by Photoshop; a full text and its
shorter version, generated by “Autosummarize” command in Microsoft Word 97;
or the different versions which can be created using “Outline” command in Word.
Beginning with version 3 (1997), Apple’s QuickTime format also made it possible
to embed a number of different versions which differ in size within a single
QuickTime movie; when a Web user accesses the movie, a version is
automatically selected depending on connection speed. A conceptually similar
technique called “distancing” or “level of detail” is used in interactive virtual
worlds such as VRML scenes. A designer creates a number of models of the
same object, each with progressively less detail. When the virtual camera is close
to the object, a highly detailed model is used; if the object is far away, a less
detailed version is automatically substituted by a program to save unnecessary
computation of detail which can’t be seen anyway.
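The “level of detail” substitution described in 4.7 can be sketched as a simple lookup. The model names and distance thresholds below are hypothetical; actual virtual world software uses its own formats and criteria, so this is an illustration of the idea rather than of any particular system.

```python
# Hypothetical models of one object, ordered from most to least detailed.
# Each entry: (maximum camera distance at which it is used, model name).
LOD_MODELS = [
    (10.0, "statue_high"),
    (50.0, "statue_medium"),
    (float("inf"), "statue_low"),
]


def select_model(camera_distance: float) -> str:
    """Pick the most detailed model whose distance limit covers the camera."""
    for max_distance, model in LOD_MODELS:
        if camera_distance <= max_distance:
            return model
    raise ValueError("camera_distance must be a comparable number")


for distance in (5, 30, 1000):
    print(distance, select_model(distance))
```

The designer supplies all the versions in advance; the program only chooses among them, which is what makes this a quantitative rather than qualitative kind of variation.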
New media also allows us to create versions of the same object which differ
from each other in more substantial ways. Here the comparison with maps of
different scales no longer works. Examples of commands in commonly used
software packages which allow us to create such qualitatively different versions are
“Variations” and “Adjustment layers” in Photoshop 5 and the “writing style” option
in Word’s “Spelling and Grammar” command. More examples can be found on
the Internet where, beginning in the middle of the 1990s, it became common to
create a few different versions of a Web site. The user with a fast connection can
choose a rich multimedia version while the user with a slow connection can settle
for a more bare-bones version which loads faster.
Among new media artworks, David Blair’s WaxWeb, a Web site which is
an “adaptation” of an hour long video narrative, offers a more radical
implementation of the scalability principle. While interacting with the narrative,
the user at any point can change the scale of representation, going from an image-
based outline of the movie to a complete script or a particular shot, or a VRML
scene based on this shot, and so on.28 Another example of how the use of the scalability
principle can create a dramatically new experience of an old media object is
Stephen Mamber’s database-driven representation of Hitchcock’s The Birds. Mamber’s
software generates a still for every shot of the film; it then automatically
combines all the stills into a rectangular matrix. Every cell in the matrix
corresponds to a particular shot from the film. As a result, time is spatialized,
similar to how it was done in Edison’s early Kinetoscope cylinders (see “The
Myths of New Media.”) Spatializing the film allows us to study its different
temporal structures which would be hard to observe otherwise. As in WaxWeb,
the user can at any point change the scale of representation, going from a
complete film to a particular shot.
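The spatialization of time in such a matrix of stills can be illustrated with a minimal sketch. It is not Mamber's software: shots are reduced here to hypothetical integer identifiers, and the grid width is an arbitrary parameter.

```python
def shots_to_matrix(shot_ids: list[int], columns: int) -> list[list[int]]:
    """Lay out a temporal sequence of shots row by row in a grid."""
    return [shot_ids[i:i + columns] for i in range(0, len(shot_ids), columns)]


def shot_at(matrix: list[list[int]], row: int, column: int) -> int:
    """Go from a position in space (a cell) back to a moment in time (a shot)."""
    return matrix[row][column]


# Seven hypothetical shots arranged three to a row.
matrix = shots_to_matrix([1, 2, 3, 4, 5, 6, 7], columns=3)
for row in matrix:
    print(row)
```

Reading the grid left to right and top to bottom recovers the original order of the shots, while reading it by column or by region exposes patterns that linear viewing hides.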
As can be seen, the principle of variability is useful in allowing us to
connect many important characteristics of new media which at first sight may
appear unrelated. In particular, such popular new media structures as branching
(or menu) interactivity and hypermedia can be seen as particular instances of
variability principle (4.4 and 4.5, respectively). In the case of branching
interactivity, the user plays an active role in determining the order in which the
already generated elements are accessed. This is the simplest kind of interactivity;
more complex kinds are also possible where both the elements and the structure
of the whole object are either modified or generated on the fly in response to
user's interaction with a program. We can refer to such implementations as open
interactivity to distinguish them from the closed interactivity which uses fixed
elements arranged in a fixed branching structure. Open interactivity can be
implemented using a variety of approaches, including procedural and object-
oriented computer programming, AI, AL, and neural networks.
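Closed, branching-type interactivity can be sketched as a fixed tree of screens that the user traverses by making choices. The screen names below are hypothetical and do not describe the actual structure of Myst or of any other title.

```python
# Hypothetical branching structure: every screen lists the choices it offers
# and the screen each choice leads to. Elements and structure are both fixed.
TREE = {
    "dock": {"left": "forest", "right": "library"},
    "forest": {"left": "cabin", "right": "cave"},
    "library": {},
    "cabin": {},
    "cave": {},
}


def follow(start: str, choices: list[str]) -> list[str]:
    """Return the screens visited when the user makes the given choices."""
    path = [start]
    for choice in choices:
        path.append(TREE[path[-1]][choice])
    return path


print(follow("dock", ["left", "right"]))
```

The user decides only the order of access; every element she can reach already exists in the table. In open interactivity, by contrast, entries of the table, or the table itself, would be computed during the interaction.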
As long as there exists some kernel, some structure, some prototype which
remains unchanged throughout the interaction, open interactivity can be thought
of as a subset of the variability principle. Here a useful analogy can be made with
the theory of family resemblance by Wittgenstein, later developed into the influential
theory of prototypes by cognitive psychologist Eleanor Rosch. In a family, a
number of relatives will share some features, although no single family member
may possess all of the features. Similarly, according to the theory of prototypes,
the meanings of many words in a natural language derive not through a logical
definition but through proximity to a certain prototype.
Hypermedia, the other popular structure of new media, can also be seen
as a particular case of the more general principle of variability. According to the
definition by Halasz and Schwartz, hypermedia systems “provide their users with
the ability to create, manipulate and/or examine a network of information-
containing nodes interconnected by relational links.”29 Since in new media the
individual media elements (images, pages of text, etc.) always retain their
individual identity (the principle of modularity), they can be "wired" together into
more than one object. Hyperlinking is a particular way to achieve this wiring. A
hyperlink creates a connection between two elements, for example between two
words in two different pages or a sentence on one page and an image in another,
or two different places within the same page. The elements connected through
hyperlinks can exist on the same computer or on different computers connected
on a network, as in the case of World Wide Web.
If in traditional media the elements are "hardwired" into a unique structure
and no longer maintain their separate identity, in hypermedia the elements and the
structure are separate from each other. The structure of hyperlinks -- typically a
branching tree -- can be specified independently from the contents of a document.
To make an analogy with the grammar of a natural language as described in Noam
Chomsky’s early linguistic theory,30 we can compare a hypermedia structure
which specifies the connections between the nodes with a deep structure of a
sentence; a particular hypermedia text can be then compared with a particular
sentence in a natural language. Another useful analogy is with computer
programming. In programming, there is clear separation between algorithms and
data. An algorithm specifies the sequence of steps to be performed on any data,
just as a hypermedia structure specifies a set of navigation paths (i.e., connections
between the nodes) which potentially can be applied to any set of media objects.
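The independence of the hyperlink structure from the media elements can be made concrete with a short sketch in which one link structure is applied to two unrelated sets of content. All node names and contents are hypothetical.

```python
# Link structure only: node identifiers and connections, no content.
LINKS = {"start": ["a", "b"], "a": ["end"], "b": ["end"], "end": []}

# Two unrelated sets of media elements keyed by the same node identifiers.
TEXT_SET = {"start": "Title page", "a": "Chapter 1", "b": "Chapter 2", "end": "Index"}
IMAGE_SET = {"start": "cover.jpg", "a": "map.jpg", "b": "portrait.jpg", "end": "credits.jpg"}


def all_paths(links: dict, node: str = "start") -> list[list[str]]:
    """Enumerate every path from node to a node with no outgoing links."""
    if not links[node]:
        return [[node]]
    return [[node] + rest for nxt in links[node] for rest in all_paths(links, nxt)]


def render(path: list[str], content: dict) -> list[str]:
    """Turn one path through the structure into one version of the document."""
    return [content[node] for node in path]


for path in all_paths(LINKS):
    print(render(path, TEXT_SET), render(path, IMAGE_SET))
```

Each path corresponds to one version of the document, and the same set of paths yields a different family of versions for each set of media elements it is applied to.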
The principle of variability also exemplifies how, historically, the changes
in media technologies are correlated with social change. If the logic
of old media corresponded to the logic of industrial mass society, the logic of new
media fits the logic of the post-industrial society which values individuality over
conformity. In industrial mass society everybody was supposed to enjoy the same
goods -- and to have the same beliefs. This was also the logic of media
technology. A media object was assembled in a media factory (such as a
Hollywood studio). Millions of identical copies were produced from a master and
distributed to all the citizens. Broadcasting, cinema, print media all followed this
logic.
In a post-industrial society, every citizen can construct her own custom
lifestyle and "select" her ideology from a large (but not infinite) number of
choices. Rather than pushing the same objects/information to a mass audience,
marketing now tries to target each individual separately. The logic of new media
technology reflects this new social logic. Every visitor to a Web site automatically
gets her own custom version of the site created on the fly from a database. The
language of the text, the contents, the ads displayed — all these can be
customized by interpreting the information about where on the network the user is
coming from; or, if the user previously registered with the site, her personal
profile can be used for this customization. According to a report in USA Today
(November 9, 1999), “Unlike ads in magazines or other real-world publications,
‘banner’ ads on Web pages change with every page view. And most of the
companies that place the ads on the Web site track your movements across the
Net, ‘remembering’ which ads you’ve seen, exactly when you saw them, whether
you clicked on them, where you were at the time and the site you have visited just
before.”31
More generally, every hypertext reader gets her own version of the
complete text by selecting a particular path through it. Similarly, every user of an
interactive installation gets her own version of the work. And so on. In this way
new media technology acts as the most perfect realization of the utopia of an ideal
society composed from unique individuals. New media objects assure users that
their choices — and therefore, their underlying thoughts and desires — are
unique, rather than pre-programmed and shared with others. As though trying to
compensate for their earlier role in making us all the same, today the descendants of
Jacquard's loom, the Hollerith tabulator and Zuse's cinema-computer are now
working to convince us that we are all unique.
The principle of variability as it is presented here is not dissimilar to how
the artist and curator Jon Ippolito uses the same concept.32 I believe that we differ
in how we use the concept of variability in two key respects. First, Ippolito uses
variability to describe a characteristic shared by recent conceptual and some
digital art, while I see variability as a basic condition of all new media. Second,
Ippolito follows the tradition of conceptual art where an artist can vary any
dimension of the artwork, even its content; my use of the term aims to reflect the
logic of mainstream culture where versions of the object share some well-defined
“data.” This “data,” which can be a well-known narrative (Psycho), an icon
(Coca-Cola sign), a character (Mickey Mouse) or a famous star (Madonna), is referred to in
the media industry as “property.” Thus all cultural projects produced by Madonna
will be automatically united by her name. Using the theory of prototypes, we can
say that the property acts as a prototype, and different versions are derived from
this prototype. Moreover, when a number of versions are being commercially
released based on some “property”, usually one of these versions is treated as the
source of the “data,” with others positioned as being derived from this source.
Typically the version which is in the same media as the original “property” is
treated as the source. For instance, when a movie studio releases a new film,
along with a computer game based on it, along with product tie-ins, along with
music written for the movie, etc., usually the film is presented as the “base” object
from which other objects are derived. So when George Lucas releases a new Star
Wars movie, it refers back to the original property — the original Star Wars
trilogy. This new movie becomes the “base” object and all other media objects
which are released along with it refer to this object. Conversely, when computer
games such as Tomb Raider are re-made into movies, the original computer game
is presented as the “base” object.
While I deduced the principle of variability from more basic principles of
new media — numerical representation (1) and modularity of information (2) —
it can also be seen as a consequence of the computer’s way of representing data and
modeling the world itself: as variables rather than constants. As new media theorist
and architect Marcos Novak notes, a computer -- and computer culture in its
wake -- substitutes every constant with a variable.33 In designing all functions and
data structures, a computer programmer tries to always use variables rather than
constants. On the level of human-computer interface, this principle means that the
user is given many options to modify the performance of a program or a media
object, be it a computer game, a Web site, a Web browser, or the operating system
itself. The user can change the profile of a game character, modify how the
folders appear on the desktop, how files are displayed, what icons are used, etc. If
we apply this principle to culture at large, it would mean that every choice
responsible for giving a cultural object a unique identity can potentially remain
always open. Size, degree of detail, format, color, shape, interactive trajectory,
trajectory through space, duration, rhythm, point of view, the presence or absence
of particular characters, the development of the plot — to name just a few
dimensions of cultural objects in different media — all these can be defined as
variables, to be freely modified by a user.
Do we want, or need, such freedom? As the pioneer of interactive
filmmaking Grahame Weinbren argued in relation to interactive media, making a
choice involves a moral responsibility.34 By passing these choices to the user, the
author also passes the responsibility to represent the world and the human
condition in it. (This is paralleled by the use of phone or Web-based automated
menu systems by all big companies to handle their customers; while the
companies are doing this in the name of “choice” and “freedom,” one of the
effects of this automation is that the labor to be done is passed from the company’s
employees to the customer. If before a customer would get the information or buy
the product by interacting with a company employee, now she has to spend her
own time and energy in navigating through numerous menus to accomplish the
same result.) The moral anxiety which accompanies the shift from constants to
variables, from tradition to choices in all areas of life in a contemporary society,
and the corresponding anxiety of a writer who has to portray it, is well rendered in
this closing passage of a short story written by a contemporary American writer
Rick Moody (the story is about the death of his sister):35
I should fictionalize it more, I should conceal myself. I should consider the
responsibilities of characterization, I should conflate her two children into one, or
reverse their genders, or otherwise alter them, I should make her boyfriend a
husband, I should explicate all the tributaries of my extended family (its
remarriages, its internecine politics), I should novelize the whole thing, I should
make it multigenerational, I should work in my forefathers (stonemasons and
newspapermen), I should let artifice create an elegant surface, I should make the
events orderly, I should wait and write about it later, I should wait until I’m not
angry, I shouldn’t clutter a narrative with fragments, with mere recollections of
good times, or with regrets, I should make Meredith’s death shapely and
persuasive, not blunt and disjunctive, I shouldn’t have to think the unthinkable, I
shouldn’t have to suffer, I should address her here directly (these are the ways I
miss you), I should write only of affection, I should make our travels in this
earthy landscape safe and secure, I should have a better ending, I shouldn’t say
her life was short and often sad, I shouldn’t say she had demons, as I do too.
5. Transcoding
Beginning with the basic, “material” principles of new media — numeric coding
and modular organization — we moved to more “deep” and far reaching ones —
automation and variability. The last, fifth principle of cultural transcoding aims to
describe what in my view is the most substantial consequence of media’s
computerization. As I have suggested, computerization turns media into computer
data. While from one point of view computerized media still displays structural
organization which makes sense to its human users — images feature
recognizable objects; text files consist of grammatical sentences; virtual spaces
are defined along the familiar Cartesian coordinate system; and so on — from
another point of view, its structure now follows the established conventions of
the computer's organization of data. Examples of these conventions are different
data structures such as lists, records and arrays; the already mentioned substitution
of all constants by variables; the separation between algorithms and data
structures; and modularity.
The structure of a computer image is a case in point. On the level of
representation, it belongs to the side of human culture, automatically entering in
dialog with other images, other cultural “semes” and “mythemes.” But on another
level, it is a computer file which consists of a machine-readable header,
followed by numbers representing RGB values of its pixels. On this level it enters
into a dialog with other computer files. The dimensions of this dialog are not the
image’s content, meanings or formal qualities, but file size, file type, type of
compression used, file format and so on. In short, these dimensions are those of
the computer’s own cosmogony rather than of human culture.
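The “computer layer” of an image can be shown with a minimal sketch that builds and then inspects a tiny image file. The text above does not name a file format; the binary PPM (P6) format is used here purely as an assumption, because it consists of nothing more than a short machine-readable header followed by RGB values. The function names are hypothetical.

```python
def make_ppm(width: int, height: int, pixels: list[tuple[int, int, int]]) -> bytes:
    """Build a binary PPM (P6) file: a text header followed by RGB bytes."""
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    body = bytes(value for pixel in pixels for value in pixel)
    return header + body


def describe(data: bytes) -> dict:
    """Read the file as a computer does: type, dimensions, size, numbers."""
    magic, dims, _maxval, body = data.split(b"\n", 3)
    width, height = map(int, dims.split())
    return {
        "file_type": magic.decode("ascii"),
        "width": width,
        "height": height,
        "file_size": len(data),
        "first_pixel": tuple(body[:3]),
    }


# A two-pixel image: one red pixel and one blue pixel.
image = make_ppm(2, 1, [(255, 0, 0), (0, 0, 255)])
print(describe(image))
```

Nothing in this description refers to what the image depicts; at this level the file is comparable to other files only by type, dimensions and size.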
Similarly, new media in general can be thought of as consisting of two
distinct layers: the “cultural layer” and the “computer layer.” The examples of
categories on the cultural layer are encyclopedia and a short story; story and plot;
composition and point of view; mimesis and catharsis, comedy and tragedy. The
examples of categories on the computer layer are process and packet (as in data
packets transmitted through the network); sorting and matching; function and
variable; a computer language and a data structure.
Since new media is created on computers, distributed via computers,
stored and archived on computers, the logic of a computer can be expected to
significantly influence the traditional cultural logic of media. That is, we may
expect that the computer layer will affect the cultural layer. The ways in which
the computer models the world, represents data and allows us to operate on it; the key
operations behind all computer programs (such as search, match, sort, filter); the
conventions of HCI — in short, what can be called computer’s ontology,
epistemology and pragmatics — influence the cultural layer of new media: its
organization, its emerging genres, its contents.
Of course what I called a computer layer is not itself fixed but is changing
in time. As hardware and software keep evolving and as the computer is used for
new tasks and in new ways, this layer is undergoing continuous transformation.
The new use of the computer as a media machine is a case in point. This use is
having an effect on computer’s hardware and software, especially on the level of
the human-computer interface which looks more and more like the interfaces of
older media machines and cultural technologies: VCR, tape player, photo camera.
In summary, the computer layer and media/culture layer influence each
other. To use another concept from new media, we can say that they are being
composited together. The result of this composite is the new computer culture: a
blend of human and computer meanings, of traditional ways human culture
modeled the world and computer’s own ways to represent it.
Throughout the book, we will encounter many examples of the principle
of transcoding at work. For instance, “The Language of Cultural Interfaces”
section will look at how conventions of printed page, cinema and traditional HCI
interact together in the interfaces of Web sites, CD-ROMs, virtual spaces and
computer games.
“Database” section will discuss how a database, originally a computer technology
to organize and access data, is becoming a new cultural form of its own. But we
can also reinterpret some of the principles of new media already discussed above
as consequences of the transcoding principle. For instance, hypermedia can be
understood as one cultural effect of the separation between an algorithm and a data
structure, essential to computer programming. Just as in programming algorithms
and data structures exist independently of each other, in hypermedia data is
separated from the navigation structure. (For another example of the cultural
effect of algorithm—data structure dichotomy see “Database” section.) Similarly,
the modular structure of new media can be seen as an effect of the modularity in
structural computer programming. Just as a structural computer program consist
from smaller modules which in their turn consist from even smaller modules, a
new media object as a modular structure, as I explained in my discussion of
modularity above.
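The separation of data from navigation structure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not drawn from any actual hypermedia system: the names `pages`, `links`, `follow` and `render`, and the sample content, are all hypothetical. The content lives in one structure and the navigation in another, so either can be changed without touching the other.

```python
# Content (data): page identifiers mapped to their text. Hypothetical sample data.
pages = {
    "home": "Welcome to the archive.",
    "vertov": "Notes on Man with a Movie Camera.",
    "database": "Notes on the database as a cultural form.",
}

# Navigation structure, stored separately from the content: page -> linked pages.
links = {
    "home": ["vertov", "database"],
    "vertov": ["database"],
    "database": [],
}

def follow(path, start="home"):
    """Follow a sequence of link choices (list indices) and return the pages visited."""
    visited = [start]
    current = start
    for choice in path:
        current = links[current][choice]
        visited.append(current)
    return visited

def render(visited):
    """Turn a list of page identifiers into the texts a reader would see."""
    return [pages[name] for name in visited]
```

Rewiring `links` yields a different hypermedia work built from the same `pages`, which is the sense in which the data is independent of the navigation.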
In new media lingo, to “transcode” something is to translate it into another
format. The computerization of culture gradually accomplishes a similar
transcoding in relation to all cultural categories and concepts. That is, cultural
categories and concepts are substituted, on the level of meaning and/or
language, by new ones which derive from the computer’s ontology, epistemology and
pragmatics. New media thus acts as a forerunner of this more general process of
cultural re-conceptualization.
Given the process of “conceptual transfer” from the computer world to culture
at large, and given the new status of media as computer data, what theoretical
framework can we use to understand it? Since on one level new media is
old media which has been digitized, it seems appropriate to look at new media using
the perspective of media studies. We may compare new media and old media,
such as print, photography, or television. We may also ask about the conditions of
distribution and reception and the patterns of use. We may also ask about
similarities and differences in the material properties of each medium and how
these affect their aesthetic possibilities.
This perspective is important, and I am using it frequently in this book; but it is
not sufficient. It can't address the most fundamental new quality of new media
which has no historical precedent — programmability. Comparing new media to
print, photography, or television will never tell us the whole story. For while from
one point of view new media is indeed another medium, from another it is simply a
particular type of computer data, something which is stored in files and databases,
retrieved and sorted, run through algorithms and written to the output device. That
the data represents pixels and that this device happens to be an output screen is
beside the point. The computer may perform perfectly the role of the Jacquard
loom, but underneath it is fundamentally Babbage's Analytical Engine - after all,
this was its identity for one hundred and fifty years. New media may look like
media, but this is only the surface.
New media calls for a new stage in media theory whose beginnings can be
traced back to the revolutionary works of Harold Innis and Marshall McLuhan of
the 1950s. To understand the logic of new media we need to turn to computer
science. It is there that we may expect to find the new terms, categories and
operations which characterize media that have become programmable. From media
studies, we move to something which can be called software studies; from media
theory — to software theory. The principle of transcoding is one way to start
thinking about software theory. Another way which this book experiments with is
using concepts from computer science as categories of new media theory. The
examples here are “interface” and “database.” And, last but not least, I follow the
analysis of “material” and logical principles of computer hardware and software
in this chapter with two chapters on human-computer interface and the interfaces
of software applications used to author and access new media objects.
What New Media is Not
Having proposed a list of the key differences between new and old media, I now
would like to address other potential candidates, which I have omitted. The
following are some of the popularly held notions about the difference between
new and old media which this section will subject to scrutiny:
1. New media is analog media converted to a digital representation. In
contrast to analog media which is continuous, digitally encoded media
is discrete.
2. All digital media (text, still images, visual or audio time data, shapes,
3D spaces) share the same digital code. This allows different
media types to be displayed using one machine, i.e., a computer, which
acts as a multimedia display device.
3. New media allows for random access. In contrast to film or videotape
which store data sequentially, computer storage devices make it possible
to access any data element equally fast.
4. Digitization involves inevitable loss of information. In contrast to an
analog representation, a digitally encoded representation contains a
fixed amount of information.
5. In contrast to analog media where each successive copy loses quality,
digitally encoded media can be copied endlessly without degradation.
6. New media is interactive. In contrast to traditional media where the
order of presentation was fixed, the user can now interact with a media
object. In the process of interaction the user can choose which
elements to display or which paths to follow, thus generating a unique
work. Thus the user becomes the co-author of the work.
Cinema as New Media
If we place new media within a longer historical perspective, we will
see that many of these principles are not unique to new media and can be already
found in older media technologies. I will illustrate this by using the example of
the technology of cinema.
(1). “New media is analog media converted to a digital representation. In contrast
to analog media which is continuous, digitally encoded media is discrete.”
Indeed, any digital representation consists of a limited number of
samples. For example, a digital still image is a matrix of pixels — a 2D sampling
of space. However, as I already noted, cinema was already based on sampling —
the sampling of time. Cinema sampled time twenty four times a second. So we
can say that cinema already prepared us for new media. All that remained was to
take this already discrete representation and to quantify it. But this is simply a
mechanical step; what cinema accomplished was a much more difficult
conceptual break from the continuous to the discrete.
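The two steps distinguished above, sampling and quantification, can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not a model of any actual film or video system: the function names and the `brightness` signal (a smooth stand-in for light varying over time) are hypothetical. Sampling keeps the signal only at regular moments, as cinema did; quantizing then assigns each sample a number from a finite range.

```python
import math

def sample(signal, rate_hz, duration_s):
    """Sampling: record a continuous signal only at regular moments in time."""
    count = int(rate_hz * duration_s)
    return [signal(i / rate_hz) for i in range(count)]

def quantize(values, levels=256):
    """Quantization: map each sampled value in [0, 1] to one of `levels` integers."""
    return [min(levels - 1, int(v * levels)) for v in values]

def brightness(t):
    """A stand-in continuous signal: brightness varying smoothly between 0 and 1."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * t)

# One second of "film": 24 samples of time, each still holding a continuous value.
frames = sample(brightness, rate_hz=24, duration_s=1)

# The additional step taken by digital media: every sample becomes an integer 0..255.
digital = quantize(frames)
```

After `sample` the representation is already discrete in time; only `quantize` makes it numerical.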
Cinema is not the only media technology which, emerging towards the end
of the nineteenth century, employed a discrete representation. If cinema sampled
time, fax transmission of images, starting in 1907, sampled a 2D space; even
earlier, the first television experiments (Carey, 1875; Nipkow, 1884) already involved
sampling of both time and space.36 However, reaching mass popularity much
earlier than these other technologies, cinema was the first to make the principle of a
discrete representation of the visual public knowledge.
(2). “All digital media (text, still images, visual or audio time data, shapes, 3D
spaces) share the same digital code. This allows different media types to
be displayed using one machine, i.e., a computer, which acts as a multimedia
display device.”
Before computer multimedia became commonplace around 1990,
filmmakers had already been combining moving images, sound and text (be it
intertitles of the silent era or the title sequences of the later period) for a whole
century. Cinema thus was the original modern "multimedia." We can also find much
earlier examples of multiple-media displays, such as medieval illuminated
manuscripts which combined text, graphics and representational images.
(3). “New media allows for random access. In contrast to film or videotape which
store data sequentially, computer storage devices make it possible to access any data
element equally fast.”
For example, once a film is digitized and loaded in the computer memory,
any frame can be accessed with equal ease. Therefore, if cinema sampled time but
still preserved its linear ordering (subsequent moments of time become
subsequent frames), new media abandons this "human-centered" representation
altogether — in order to put represented time fully under human control. Time is
mapped onto two-dimensional space, where it can be managed, analyzed and
manipulated more easily.
Such mapping was already widely used in nineteenth-century cinema
machines. The Phenakisticope, the Zootrope, the Zoopraxiscope, the Tachyscope,
and Marey's photographic gun were all based on the same principle -- placing a
number of slightly different images around the perimeter of a circle. Even more
striking is the case of Thomas Edison's first cinema apparatus. In 1887 Edison and
his assistant, William Dickson, began experiments to adapt the already proven
technology of a phonograph record for the recording and display of motion
pictures. Using a special picture-recording camera, tiny pinpoint-size photographs
were placed in spirals on a cylindrical cell similar in size to the phonograph
cylinder. A cylinder was to hold 42,000 images, each so small (1/32 inch wide)
that a viewer would have to look at them through a microscope.37 The storage
capacity of this medium was twenty-eight minutes -- twenty-eight minutes of
continuous time taken apart, flattened on a surface and mapped into a two-
dimensional grid. (In short, time was prepared to be manipulated and re-ordered,
something which was soon to be accomplished by film editors.)
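The figures quoted above imply a frame rate, and the spiral layout implies a simple mapping between linear time and a position on a surface. The following Python sketch works both out. The constant `FRAMES_PER_TURN` is an assumption made purely for illustration, since the account does not say how many images fit in one turn of the spiral; the function names are likewise hypothetical.

```python
IMAGES = 42_000    # images per cylinder, as stated in the text
MINUTES = 28       # stated capacity of the cylinder

# Implied rate: 42,000 images over 28 * 60 = 1,680 seconds.
frames_per_second = IMAGES / (MINUTES * 60)

FRAMES_PER_TURN = 200   # hypothetical: images in one turn of the spiral

def grid_position(frame_index, frames_per_row=FRAMES_PER_TURN):
    """Map a moment in linear time (a frame index) to a (row, column) cell."""
    return divmod(frame_index, frames_per_row)

def frame_at(row, column, frames_per_row=FRAMES_PER_TURN):
    """Inverse mapping: any cell of the grid can be addressed directly, in any order."""
    return row * frames_per_row + column
```

On these figures the cylinder would have run at 25 images per second. Once time is laid out as a grid, reading the frames out of order requires nothing more than choosing a different sequence of cells.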
The Myth of the Digital
Discrete representation, random access, multimedia -- cinema already contained
these principles. So they cannot help us to separate new media from old media.
Let us continue interrogating these principles. If many principles of new media
turn out to be not so new, what about the idea of digital representation? Surely,
this is the one idea which radically redefines media? The answer is not so
straightforward. This idea acts as an umbrella for three unrelated concepts: analog-to-
digital conversion (digitization), a common representational code, and numerical
representation. Whenever we claim that some quality of new media is due to its
digital status, we need to specify which out of these three concepts is at work. For
example, the fact that different media can be combined into a single digital file is
due to the use of a common representational code; whereas the ability to copy
media without introducing degradation is an effect of numerical representation.
Because of this ambiguity, I try to avoid using the word “digital” in this
book. “Principles of New Media” focused on the concept of numerical
representation as being the really crucial one out of these three. Numerical
representation turns media into computer data, thus making it programmable. And
this indeed radically changes what media is.
In contrast, as I will show below, the alleged principles of new media
which are often deduced from the concept of digitization — that analog-to-digital
conversion inevitably results in a loss of information and that digital copies are
identical to the original — turn out not to hold under closer examination. That is,
although these principles are indeed logical consequences of digitization, they do
not apply to concrete computer technologies the way they are currently used.
(4). “Digitization involves inevitable loss of information. In contrast to an analog
representation, a digitally encoded representation contains a fixed amount of
information.”
In his important study of digital photography The Reconfigured Eye,
William Mitchell explains this as follows: "There is an indefinite amount of
information in a continuous-tone photograph, so enlargement usually reveals
more detail but yields a fuzzier and grainier picture... A digital image, on the other
hand, has precisely limited spatial and tonal resolution and contains a fixed
amount of information."38 From a logical point of view, this principle is a correct
deduction from the idea of digital representation. A digital image consists of a
finite number of pixels, each having a distinct color or a tonal value, and this
number determines the amount of detail an image can represent. Yet in reality this
difference does not matter. By the end of the 1990s, even cheap consumer
scanners were capable of scanning images at resolutions of 1200 or 2400 pixels
per inch. So while a digitally stored image is still comprised of a finite number of
pixels, at such resolution it can contain much finer detail than was ever possible
with traditional photography. This nullifies the whole distinction between an
"indefinite amount of information in a continuous-tone photograph" and a fixed
amount of detail in a digital image. The more relevant question is how much
information in an image can be useful to the viewer. By the end of new media's
first decade, technology had already reached the point where a digital image can
easily contain much more information than anybody would ever want.
But even the pixel-based representation, which appears to be the very
essence of digital imaging, cannot be taken for granted. Some computer graphics
software has bypassed the main limitation of the traditional pixel grid -- fixed
resolution. Live Picture, an image editing program, converts a pixel-based image
into a set of mathematical equations. This allows the user to work with an image
of virtually unlimited resolution. Another paint program, Matador, makes it possible to
paint on a tiny image which may consist of just a few pixels as though it were
a high-resolution image (it achieves this by breaking each pixel into a number of
smaller sub-pixels). In both programs, the pixel is no longer a "final frontier"; as
far as the user is concerned, it simply does not exist. Texture mapping algorithms
make the notion of a fixed resolution meaningless in a different way. They often
store the same image at a number of different resolutions. During rendering, a
texture map of arbitrary resolution is produced by interpolating between the two
images which are closest to this resolution. (A similar technique is used by
virtual world software which stores a number of versions of a single object at
different degrees of detail.) Finally, certain compression techniques eliminate
pixel-based representation altogether, instead representing an image via different
mathematical constructs (such as transforms).
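The texture-mapping technique mentioned above, storing one image at several resolutions and interpolating between the two nearest, can be sketched in Python. This is a deliberately simplified sketch and rests on several assumptions of my own: the "texture" is one-dimensional, its length is a power of two, and the blend weight is computed on a logarithmic scale. Actual graphics systems work on two-dimensional images and differ in detail. All names are hypothetical.

```python
import math

def build_pyramid(texture):
    """Store the same texture at successively halved resolutions."""
    levels = [list(texture)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Each coarser value is the average of two neighbouring finer values.
        levels.append([(prev[i] + prev[i + 1]) / 2 for i in range(0, len(prev), 2)])
    return levels

def sample_level(level, u):
    """Nearest-value lookup at a normalized coordinate u in [0, 1)."""
    return level[min(len(level) - 1, int(u * len(level)))]

def sample(levels, u, resolution):
    """Approximate the texture at an arbitrary resolution by blending the two
    stored levels whose resolutions bracket the requested one."""
    full = len(levels[0])
    # Fractional level index: 0 is full resolution, each further step halves it.
    lod = min(max(math.log2(full / resolution), 0.0), len(levels) - 1)
    lower = int(math.floor(lod))
    upper = min(lower + 1, len(levels) - 1)
    weight = lod - lower
    return ((1 - weight) * sample_level(levels[lower], u)
            + weight * sample_level(levels[upper], u))
```

A request for a resolution that was never stored, for example 6 values when only 8 and 4 are stored, is answered by mixing the two neighbouring versions, so from the user's side no single fixed resolution is ever visible.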
(5). “In contrast to analog media where each successive copy loses quality,
digitally encoded media can be copied endlessly without degradation.”
Mitchell summarizes this as follows: "The continuous spatial and tonal
variation of analog pictures is not exactly replicable, so such images cannot be
transmitted or copied without degradation... But discrete states can be replicated
precisely, so a digital image that is a thousand generations away from the original
is indistinguishable in quality from any one of its progenitors."39 Therefore, in
digital culture, "an image file can be copied endlessly, and the copy is
distinguishable from the original by its date since there is no loss of quality."40
This is all true -- in principle. However, in reality, there is actually much more
degradation and loss of information between copies of digital images than
between copies of traditional photographs. A single digital image consists of
millions of pixels. All of this data requires considerable storage space in a
computer; it also takes a long time (in contrast to a text file) to transmit over a
network. Because of this, the software and hardware used to acquire, store,
manipulate, and transmit digital images uniformly rely on lossy compression --
the technique of making image files smaller by deleting some information.
Examples of lossy compression are the JPEG format, used to store still images,
and MPEG, used to store digital video on DVD. The technique involves a
compromise between image quality and file size -- the smaller the size of a
compressed file, the more visible are the visual artifacts introduced in deleting
information. Depending on the level of compression, these artifacts range from
barely noticeable to quite pronounced.
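The compromise between file size and quality can be made concrete with a toy example in Python. It is emphatically not JPEG or MPEG, which rely on transform coding; as a stand-in, this sketch discards information simply by rounding pixel values to coarser steps and then packs the result with the lossless `zlib` module from the standard library. The function names and the random "image" are hypothetical.

```python
import random
import zlib

def lossy_compress(pixels, step):
    """Discard information by rounding each 0..255 value down to a multiple of
    `step`, then pack the coarsened values losslessly with zlib."""
    coarse = bytes((p // step) * step for p in pixels)
    return zlib.compress(coarse)

def decompress(blob):
    """Recover the (already coarsened) pixel values."""
    return list(zlib.decompress(blob))

def mean_error(original, restored):
    """Average absolute difference per pixel between original and restored data."""
    return sum(abs(a - b) for a, b in zip(original, restored)) / len(original)

rng = random.Random(0)   # fixed seed so that runs are repeatable
# Stand-in for a noisy 64 x 64 grayscale image.
image = [rng.randrange(256) for _ in range(4096)]

# Print step, compressed size in bytes, and mean error for three settings.
for step in (1, 8, 32):
    blob = lossy_compress(image, step)
    print(step, len(blob), round(mean_error(image, decompress(blob)), 2))
```

With `step=1` nothing is discarded and the restored image equals the original; larger steps shrink the file at the price of a growing error. Copying a compressed file remains exact, but every further pass through a coarser setting removes information that cannot be recovered.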
One may argue that this situation is temporary and once cheaper computer
storage and faster networks become commonplace, lossy compression will
disappear. However, presently the trend is quite the reverse with lossy
compression becoming more and more the norm for representing visual
information. If a single digital image already contains a lot of data, this amount
increases dramatically if we want to produce and distribute moving images in a
digital form (one second of video, for instance, consists of 30 still images). Digital
television with its hundreds of channels and video on-demand services, the
distribution of full-length films on DVD or over the Internet, fully digital post-
production of feature films -- all of these developments are made possible by
lossy compression. It will be a number of years before advances in storage
media and communication bandwidth eliminate the need to compress audio-
visual data. So rather than being an aberration, a flaw in the otherwise pure and
perfect world of the digital, where even a single bit of information is never lost,
lossy compression is the very foundation of computer culture, at least for now.
Therefore, while in theory computer technology entails the flawless replication of
data, its actual use in contemporary society is characterized by the loss of data,
degradation, and noise -- noise which is often even stronger than that of
traditional analog media.
The Myth of Interactivity
We have only one principle still remaining from the original list: interactivity. As
with “digital,” I avoid using the word “interactive” in this book without qualifying
it, for the same reason -- I find the concept to be too broad to be truly useful.
Used in relation to computer-based media, the concept of interactivity is a
tautology. Modern human-computer interface (HCI) is by its very definition
interactive. In contrast to earlier interfaces such as batch processing, modern HCI
allows the user to control the computer in real-time by manipulating information
displayed on the screen. Once an object is represented in a computer, it
automatically becomes interactive. Therefore, to call computer media interactive
is meaningless -- it simply means stating the most basic fact about computers.
Rather than evoking this concept by itself, in this book I use a number of
other concepts, such as menu-based interactivity, scalability, simulation, image-
interface, and image-instrument, to describe different kinds of interactive
structures and operations. The already used distinction between “closed” and
“open” interactivity is just one example of this approach.
While it is relatively easy to specify different interactive structures used in
new media objects, it is much more difficult to theoretically deal with user
experiences of these structures. This remains one of the most difficult
theoretical questions raised by new media. Without pretending to have a complete
answer, I would like to address some aspects of this question here.
All classical art, and even more so modern art, was already "interactive" in a
number of ways. Ellipses in literary narration, missing details of objects in visual
art and other representational "shortcuts" required the user to fill in the missing
information.41 Theater, painting and cinema also relied on the techniques of
staging, composition and cinematography to orchestrate the viewer's attention over
time, requiring her to focus on different parts of the display. With sculpture and
architecture, the viewer had to move her whole body to experience the spatial
structure.
Modern media and art pushed each of these techniques further, putting
new cognitive and physical demands on the viewer. Beginning in the 1920s new
narrative techniques such as film montage forced audiences to quickly bridge
mental gaps between unrelated images. The new representational style of semi-
abstraction which, along with photography, became the “international style” of
modern visual culture, required the viewer to reconstruct the represented objects
from the bare minimum -- a contour, a few patches of color, shadows cast by
objects not represented directly. Finally, in the 1960s, continuing where Futurism
and Dada left off, new forms of art such as happenings, performance and
installation turned art explicitly participational. This, according to some new
media theorists, prepared the ground for interactive computer installations which
appeared in the 1980s.42
When we use the concept of “interactive media” exclusively in relation to
computer-based media, there is a danger that we interpret "interaction" literally,
equating it with physical interaction between a user and a media object (pressing a
button, choosing a link, moving the body), at the expense of psychological
interaction. The psychological processes of filling-in, hypothesis forming, recall
and identification, which are required for us to comprehend any text or image at
all, are mistakenly identified with an objectively existing structure of interactive
links.43
This mistake is not new; on the contrary, it is a structural feature of the history
of modern media. The literal interpretation of interactivity is just the latest
example of a larger modern trend to externalize mental life, a process in
which media technologies -- photography, film, VR -- have played a key role.44
Beginning in the nineteenth century, we witness recurrent claims by the users and
theorists of new media technologies, from Francis Galton (the inventor of
composite photography in the 1870s) to Hugo Münsterberg, Sergei Eisenstein
and, recently, Jaron Lanier, that these technologies externalize and objectify the
mind. Galton not only claimed that "the ideal faces obtained by the method of
composite portraiture appear to have a great deal in common with...so-called
abstract ideas" but in fact he proposed to rename abstract ideas "cumulative
ideas."45 According to Münsterberg, who was a Professor of Psychology at
Harvard University and the author of one of the earliest theoretical treatments of
cinema entitled The Film: A Psychological Study (1916), the essence of film lies
in its ability to reproduce, or "objectify" various mental functions on the screen:
"The photoplay obeys the laws of the mind rather than those of the outer
world."46 In the 1920s Eisenstein was speculating about how film can be used to
externalize — and control — thinking. As an experiment in this direction, he
boldly conceived a screen adaptation of Marx's Capital. "The content of
CAPITAL (its aim) is now formulated: to teach the worker to think dialectically,"
Eisenstein writes enthusiastically in April of 1928.47 In accordance with the
principles of "Marxist dialectics" as canonized by the official Soviet philosophy,
Eisenstein planned to present the viewer with the visual equivalents of thesis and
anti-thesis so that the viewer can then proceed to arrive at synthesis, i.e. the
correct conclusion, pre-programmed by Eisenstein.
In the 1980s, Jaron Lanier, a California guru of VR, similarly saw VR
technology as capable of completely objectifying, better yet, transparently
merging with mental processes. His descriptions of its capabilities did not
distinguish between internal mental functions, events and processes, and
externally presented images. This is how, according to Lanier, VR can take over
human memory: "You can play back your memory through time and classify your
memories in various ways. You'd be able to run back through the experiential
places you've been in order to be able to find people, tools."48 Lanier also claimed
that VR will lead to the age of "post-symbolic communication," communication
without language or any other symbols. Indeed, why should there be any need for
linguistic symbols, if everybody, rather than being locked into a "prison-house of
language" (Fredric Jameson49), will happily live in the ultimate nightmare of
democracy -- the single mental space which is shared by everybody, and where
every communicative act is always ideal (Jürgen Habermas50). This is Lanier's
example of how post-symbolic communication will function: "you can make a
cup that someone else can pick when there wasn't a cup before, without having to
use a picture of the word "cup."51 Here, as with the earlier technology of film, the
fantasy of objectifying and augmenting consciousness, extending the powers of
reason, goes hand in hand with the desire to see in technology a return to the
primitive happy age of pre-language, pre-misunderstanding. Locked in virtual
reality caves, with language taken away, we will communicate through gestures,
body movements, and grimaces, like our primitive ancestors...
The recurrent claims that new media technologies externalize and
objectify reasoning, and that they can be used to augment or control it, are based
on the assumption of the isomorphism of mental representations and operations
with external visual effects such as dissolves, composite images, and edited
sequences. This assumption is shared not just by modern media inventors, artists
and critics but also by modern psychologists. Modern psychological theories of
the mind, from Freud to cognitive psychology, repeatedly equate mental processes
with external, technologically generated visual forms. Thus Freud in The
Interpretation of Dreams (1900) compared the process of condensation with one
of Francis Galton's procedures which became especially famous: making family
portraits by overlaying a different negative image for each member of the family
and then making a single print.52 Writing in the same decade, the American
psychologist Edward Titchener opened the discussion of the nature of abstract
ideas in his textbook of psychology by noting that "the suggestion has been made
that an abstract idea is a sort of composite photograph, a mental picture which
results from the superimposition of many particular perceptions or ideas, and
which therefore shows the common elements distinct and the individual elements
blurred."53 He then proceeds to consider the pros and cons of this view. We
should not wonder why Titchener, Freud and other psychologists take the
comparison for granted rather than presenting it as a simple metaphor --
contemporary cognitive psychologists also do not question why their models of
the mind are so similar to the computer workstations on which they are
constructed. The linguist George Lakoff asserted that "natural reasoning makes
use of at least some unconscious and automatic image-based processes such as
superimposing images, scanning them, focusing on part of them"54 while the
psychologist Philip Johnson-Laird proposed that logical reasoning is a matter of
scanning visual models.55 Such notions would have been impossible before the
emergence of television and computer graphics. These visual technologies made
operations on images such as scanning, focusing, and superimposition seem
natural.
What to make of this modern desire to externalize the mind? It can be
related to the demand of modern mass society for standardization. The subjects
have to be standardized, and the means by which they are standardized need to be
standardized as well. Hence the objectification of internal, private mental
processes, and their equation with external visual forms which can be easily
manipulated, mass produced, and standardized on their own. The private and
individual is translated into the public and becomes regulated.
What before was a mental process, a uniquely individual state, now
became part of a public sphere. Unobservable and interior processes and
representations were taken out of individual heads and put outside -- as drawings,
photographs and other visual forms. Now they could be discussed in public,
employed in teaching and propaganda, standardized, and mass-distributed. What
was private became public. What was unique became mass-produced. What was
hidden in an individual's mind became shared.
Interactive computer media perfectly fits this trend to externalize and
objectify mind’s operations. The very principle of hyperlinking, which forms the
basis of much of interactive media, objectifies the process of association often
taken to be central to human thinking. Mental processes of reflection, problem
solving, recall and association are externalized, equated with following a link,
moving to a new page, choosing a new image, or a new scene. Before, we would
look at an image and mentally follow our own private associations to other
images. Now interactive computer media asks us instead to click on an image in
order to go to another image. Before, we would read a sentence of a story or a line
of a poem and think of other lines, images, memories. Now interactive media asks
us to click on a highlighted sentence to go to another sentence. In short, we are
asked to follow pre-programmed, objectively existing associations. Put
differently, in what can be read as a new updated version of French philosopher
Louis Althusser's concept of "interpellation," we are asked to mistake the
structure of somebody else's mind for our own.56
This is a new kind of identification appropriate for the information age of
cognitive labor. The cultural technologies of an industrial society -- cinema and
fashion -- asked us to identify with somebody's bodily image. Interactive
media asks us to identify with somebody else's mental structure. If a cinema
viewer, whether male or female, was lusting after and trying to emulate the body of
a movie star, a computer user is asked to follow the mental trajectory of a new
media designer.
II. The Interface
In 1984 Ridley Scott, the director of Blade Runner, was hired to create a
commercial which introduced Apple Computer’s new Macintosh. In retrospect,
this event is full of historical significance. Released within two years of each
other, Blade Runner (1982) and the Macintosh computer (1984) defined the two
aesthetics which, twenty years later, still rule contemporary culture. One was a
futuristic dystopia which combined futurism and decay, computer technology and
fetishism, retro-styling and urbanism, Los Angeles and Tokyo. Since Blade
Runner's release, its techno-noir has been replayed in countless films, computer games,
novels and other cultural objects. And while a number of strong aesthetic systems
have been articulated in the following decades, both by individual artists (Matthew
Barney, Mariko Mori) and by commercial culture at large (the 1980s “post-
modern” pastiche, the 1990s techno-minimalism), none of them has been able to
challenge the hold of Blade Runner on our vision of the future.
In contrast to the dark, decayed, “post-modern” vision of Blade Runner,
the Graphical User Interface (GUI), popularized by the Macintosh, remained true to the
modernist values of clarity and functionality. The user’s screen was ruled by straight
lines and rectangular windows which contained smaller rectangles of individual
files arranged in a grid. The computer communicated with the user via rectangular
boxes containing clean black type rendered against a white background. Subsequent
versions of the GUI added colors and made it possible for users to customize the
appearance of many interface elements, thus somewhat diluting the sterility and
boldness of the original monochrome 1984 version. Yet its original aesthetic
survived in the displays of hand-held communicators such as the Palm Pilot, cellular
telephones, car navigation systems and other consumer electronic products which
use small LCD displays comparable in quality to the 1984 Macintosh screen.
Like Blade Runner, Macintosh’s GUI articulated a vision of the future,
although a very different one. In this vision, the lines between the human and its
technological creations (computers, androids) are clearly drawn and decay is not
tolerated. In a computer, once a file is created, it never disappears except when
explicitly deleted by the user. And even then deleted items can usually be
recovered. Thus if in “meatspace” we have to work to remember, in cyberspace
we have to work to forget. (Of course while they run, OS and applications
constantly create, write to and erase various temporary files, as well as swap data
between RAM and virtual memory files on a hard drive, but most of this activity
remains invisible to the user.)
Like Blade Runner, the GUI vision also came to influence many other
areas of culture. This influence ranges from purely graphical (for instance, use of
GUI elements by print and TV designers) to more conceptual. In the 1990s, as the
Internet progressively grew in popularity, the role of a digital computer shifted
76
from being a particular technology (a calculator, a symbol processor, an image
manipulator, etc.) to being a filter to all culture, a form through which all kinds of
cultural and artistic production is being mediated. As a window of a Web browser
comes to replace cinema and television screen, a wall in art gallery, a library and
a book, all at once, the new situation manifest itself: all culture, past and present,
is being filtered through a computer, with its particular human-computer
interface.57
In semiotic terms, the computer interface acts as a code which carries
cultural messages in a variety of media. When you use the Internet, everything
you access — texts, music, video, navigable spaces — passes through the
interface of the browser and then, in its turn, the interface of the OS. In cultural
communication, a code is rarely simply a neutral transport mechanism; usually it
affects the messages transmitted with its help. For instance, it may make some
messages easy to conceive and render others unthinkable. A code may also
provide its own model of the world, its own logical system, or ideology;
subsequent cultural messages or whole languages created using this code will be
limited by this model, system or ideology. Most modern cultural theories rely on
these notions which I will refer to together as the “non-transparency of the code”
idea. For instance, according to the Whorf-Sapir hypothesis which enjoyed popularity
in the middle of the twentieth century, human thinking is determined by the code
of natural language; the speakers of different natural languages perceive and think
about the world differently.58 The Whorf-Sapir hypothesis is an extreme expression of
the “non-transparency of the code” idea; usually it is formulated in a less extreme
form. But when we think about the case of the human-computer interface, applying a
“strong” version of this idea makes sense. The interface shapes how the computer
user conceives the computer itself. It also determines how users think of any
media object accessed via a computer. Stripping different media of their original
distinctions, the interface imposes its own logic on them. Finally, by organizing
computer data in particular ways, the interface provides distinct models of the
world. For instance, a hierarchical file system assumes that the world can be
organized in a logical multi-level hierarchy. In contrast, a hypertext model of the
World Wide Web models the world as a non-hierarchical system ruled by
metonymy. In short, far from being a transparent window into the data inside a
computer, the interface brings with it strong messages of its own.
As an example of how the interface imposes its own logic on media,
consider the “cut and paste” operation, standard in all software running under a modern
GUI. This operation renders insignificant the traditional distinction between
spatial and temporal media, since the user can cut and paste parts of images,
regions of space and parts of a temporal composition in exactly the same way. It
is also “blind” to traditional distinctions in scale: the user can cut and paste a
single pixel, an image, a whole digital movie in the same way. And last, this
operation also renders insignificant traditional distinctions between media: “cut
and paste” can be applied to texts, still and moving images, sounds and 3D objects
in the same way.
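The indifference of “cut and paste” to medium and scale can be made concrete with a minimal sketch. It assumes, purely for illustration, that a text, a row of pixels and a sequence of movie frames are each stored as an ordered sequence; the function names `cut` and `paste` and the sample data are hypothetical and do not describe any particular GUI's clipboard.

```python
def cut(data, start, end):
    """Remove data[start:end]; return (what remains, the clipboard contents)."""
    return data[:start] + data[end:], data[start:end]


def paste(data, position, clipboard):
    """Insert the clipboard contents into data at the given position."""
    return data[:position] + clipboard + data[position:]


# Three "media", all reduced to sequences (illustrative data only).
text = "new media"                                     # characters
pixels = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]       # one row of an image
frames = ["frame0", "frame1", "frame2", "frame3"]      # a temporal composition

for medium in (text, pixels, frames):
    remaining, clipboard = cut(medium, 1, 3)
    restored = paste(remaining, 1, clipboard)
    # The same two operations work on every medium and are reversible.
    assert restored == medium
```

The two functions never ask what kind of element they are moving, which is the sense in which the operation is “blind” to distinctions between spatial and temporal media.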
The interface comes to play a crucial role in information society in yet
another way. In this society, not only do work and leisure activities increasingly
involve computer use, but they also converge around the same interfaces. Both
“work” applications (word processors, spreadsheet programs, database programs)
and “leisure” applications (computer games, informational DVD) use the same
tools and metaphors of GUI. The best example of this convergence is a Web
browser employed both in the office and at home, both for work and for play. In
this respect information society is quite different from industrial society, with its
clear separation between the field of work and the field of leisure. In the
nineteenth century Karl Marx imagined that a future communist state would
overcome this work-leisure divide as well as the highly specialized and piecemeal
character of modern work itself. Marx's ideal citizen would be cutting wood
in the morning, gardening in the afternoon and composing music in the evening.
Now a subject of information society is engaged in even more activities during a
typical day: inputting and analyzing data, running simulations, searching the
Internet, playing computer games, watching streaming video, listening to music
online, trading stocks, and so on. Yet in performing all these different activities
the user in essence is always using the same few tools and commands: a computer
screen and a mouse; a Web browser; a search engine; cut, paste, copy, delete and
find commands. (In the introduction to the “Forms” chapter I will discuss how the
two key new forms of new media — database and navigable space — can be also
understood in relation to the work-leisure opposition.)
If the human-computer interface becomes a key semiotic code of the
information society as well as its meta-tool, how does this affect the functioning
of cultural objects in general and art objects in particular? As I already noted
(“Principles of New Media,” 4.2), in computer culture it becomes common to
construct a number of different interfaces to the same “content.” For instance,
the same data can be represented as a 2D graph or as an interactive navigable
space. Or, a Web site may guide the user to different versions of the site
depending on the bandwidth of her Internet connection. (I will elaborate on this in
the “Database” section where a new media object will be defined as one or more
interfaces to a multimedia database.) Given these examples, we may be tempted
to think of a new media artwork as also having two separate levels: content and
interface. Thus the old dichotomies content — form and content — medium can
be re-written as content — interface. But postulating such an opposition assumes
that an artwork’s content is independent of its medium (in an art historical sense) or
its code (in a semiotic sense). Situated in some idealized medium-free realm,
content is assumed to exist before its material expression. These assumptions are
correct in the case of visualization of quantified data; they also apply to classical
art with its well-defined iconographic motifs and representational conventions.
But just as modern thinkers, from Whorf to Derrida, insisted on the “non-transparency
of a code” idea, modern artists assumed that content and form can’t
be separated. In fact, from the 1910s “abstraction” to the 1960s “process,” artists
kept inventing concepts and procedures to ensure that they can’t paint some
pre-existent content.
This leaves us with an interesting paradox. Many new media artworks
have what can be called “an informational dimension,” the condition which they
share with all new media objects. Their experience includes retrieving, looking at
and thinking about quantified data. Therefore when we refer to such artworks we
are justified in separating the levels of content and interface. At the same time,
new media artworks have more traditional “experiential” or aesthetic dimensions,
which justifies their status as art rather than as information design. These
dimensions include a particular configuration of space, time, and surface
articulated in the work; a particular sequence of the user’s activities over time to
interact with the work; a particular formal, material and phenomenological user
experience. And it is the work’s interface that creates its unique materiality and
the unique user experience. To change the interface even slightly is to
dramatically change the work. From this perspective, to think of an interface as a
separate level, as something that can be arbitrarily varied, is to eliminate the status
of a new media artwork as art.
There is another way to think about the difference between new media
design and new media art in relation to the content — interface dichotomy. In
contrast to design, in art the connection between content and form (or, in the case
of new media, content and interface) is motivated. That is, the choice of a
particular interface is motivated by the work’s content to such a degree that it can no
longer be thought of as a separate level. Content and interface merge into one
entity, and no longer can be taken apart.
Finally, the idea of content pre-existing the interface is challenged in yet
another way by new media artworks which dynamically generate their data in real
time. While in a menu-based interactive multimedia application or a static Web
site all data already exists before the user accesses it, in dynamic new media
artworks the data is created on the fly, or, to use the new media lingo, at run time.
This can be accomplished in a variety of ways: procedural computer graphics,
formal language systems, Artificial Intelligence (AI) and Artificial Life (AL)
programming. All these methods share the same principle: a programmer sets up
some initial conditions, rules or procedures which control the computer program
generating the data. For the purposes of the present discussion, the most
interesting of these approaches are AL and the evolution paradigm. In the AL
approach, the interaction between a number of simple objects at run time leads to
the emergence of complex global behaviors. These behaviors can only be
obtained in the course of running the computer program; they can’t be predicted
beforehand. The evolution paradigm applies the metaphor of the evolution theory
to the generation of images, shapes, animations and other media data. The initial
data supplied by the programmer acts as a genotype which is expanded into a full
phenotype by a computer. In either case, the content of an artwork is the result of
a collaboration between the artist/programmer and the computer program, or, if
the work is interactive, between the artist, the computer program and the user.
The new media artists who have most systematically explored the AL approach are the team of
Christa Sommerer and Laurent Mignonneau. In their installation “Life Spacies”
virtual organisms appear and evolve in response to the position, movement and
interactions of the visitors. Artist/programmer Karl Sims made the key
contribution to applying the evolution paradigm to media generation. In his
installation “Galapagos” the computer program generates twelve different virtual
organisms at every iteration; the visitors select an organism which will continue to
live, copulate, mutate and reproduce.59 The commercial products which use AL
and evolution approaches are computer games such as the Creatures series
(Mindscape Entertainment) and “virtual pet” toys such as Tamagotchi.
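The principle that data generated at run time emerges from initial conditions plus simple local rules can be illustrated with a small sketch. It uses a textbook one-dimensional cellular automaton (the rule usually numbered 30) as a stand-in; this is an assumption chosen for brevity and is not the method used in “Life Spacies,” “Galapagos” or any of the products named above. The function names `step` and `run` are hypothetical.

```python
def step(cells, rule=30):
    """Compute the next generation; each cell looks only at itself and its two neighbors."""
    n = len(cells)
    next_cells = []
    for i in range(n):
        left = cells[(i - 1) % n]      # edges wrap around
        center = cells[i]
        right = cells[(i + 1) % n]
        index = (left << 2) | (center << 1) | right   # neighborhood as a number 0..7
        next_cells.append((rule >> index) & 1)        # look up that bit of the rule
    return next_cells


def run(width=31, generations=15, rule=30):
    """Initial condition set by the programmer: a single live cell in the middle."""
    cells = [0] * width
    cells[width // 2] = 1
    history = [cells]
    for _ in range(generations):
        cells = step(cells, rule)
        history.append(cells)
    return history


# The global pattern is not stored anywhere; it exists only once the program runs.
for row in run():
    print("".join("#" if cell else "." for cell in row))
```

Here the programmer supplies only the starting row and an eight-entry rule table, while the irregular triangular pattern that is printed is obtained solely by running the program.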
In organizing this book I wanted to highlight the importance of the
interface category by placing its discussion right in the beginning. The two
sections of this chapter present examples of different issues raised by this
category -- but they in no way exhaust it. In “The Language of Cultural Interfaces”
I introduce the term “cultural interfaces” to describe interfaces used by stand-
alone hypermedia (CD-ROM and DVD titles), Web sites, computer games and
other cultural objects distributed via a computer. I think we need such a term
because as the role of a computer is shifting from being a tool to a universal
media machine, we are increasingly "interfacing" to predominantly cultural data:
texts, photographs, films, music, multimedia documents, virtual environments.
Therefore, human-computer interface is being supplemented by human-computer-
culture interface, which I abbreviate as “cultural interface.” The section then
discusses how three cultural forms -- cinema, the printed word, and a
general-purpose human-computer interface — contributed to shaping the
appearance and functionality of cultural interfaces during the 1990s.
The second section “The Screen and the User” discusses the key element
of the modern interface — the computer screen. As in the first section, I am
interested in analyzing continuities between a computer interface and older
cultural forms, languages and conventions. The section positions the computer
screen within a longer historical tradition and it traces different stages in the
development of this tradition: the static illusionistic image of Renaissance
painting; the moving image of the film screen; the real-time image of radar and
television; and the real-time interactive image of a computer screen.
The Language of Cultural Interfaces
Cultural Interfaces
The term human-computer interface (HCI) describes the ways in which the user
interacts with a computer. HCI includes physical input and output devices such as a
monitor, a keyboard, and a mouse. It also consists of metaphors used to
conceptualize the organization of computer data. For instance, the Macintosh
interface introduced by Apple in 1984 uses the metaphor of files and folders
arranged on a desktop. Finally, HCI also includes ways of manipulating this data,
i.e. a grammar of meaningful actions which the user can perform on it. Examples
of actions provided by modern HCI are: copy, rename and delete a file; list
the contents of a directory; start and stop a computer program; set the computer’s date
and time.
The term HCI was coined when the computer was mostly used as a tool for
work. However, during the 1990s, the identity of the computer changed. In the
beginning of the decade, a computer was still largely thought of as a simulation of
a typewriter, a paintbrush or a drafting ruler -- in other words, as a tool used to
produce cultural content which, once created, will be stored and distributed in its
appropriate media: printed page, film, photographic print, electronic recording.
By the end of the decade, as Internet use became commonplace, the computer's
public image was no longer only that of a tool but also that of a universal media machine,
used not only to author, but also to store, distribute and access all media.
As distribution of all forms of culture becomes computer-based, we are
increasingly “interfacing” to predominantly cultural data: texts, photographs,
films, music, virtual environments. In short, we are no longer interfacing to a
computer but to culture encoded in digital form. I will use the term "cultural
interfaces" to describe human-computer-culture interface: the ways in which
computers present and allow us to interact with cultural data. Cultural interfaces
include the interfaces used by the designers of Web sites, CD-ROM and DVD
titles, multimedia encyclopedias, online museums and magazines, computer
games and other new media cultural objects.
If you need to remind yourself what a typical cultural interface looked like in
the second part of the 1990s, say 1997, go back in time and click to a random
Web page. You are likely to see something which graphically resembles a
magazine layout from the same decade. The page is dominated by text: headlines,
hyperlinks, blocks of copy. Within this text are a few media elements: graphics,
photographs, perhaps a QuickTime movie and a VRML scene. The page also
includes radio buttons and a pull-down menu which allows you to choose an item
from the list. Finally there is a search engine: type a word or a phrase, hit the
search button and the computer will scan through a file or a database trying to
match your entry.
For another example of a prototypical cultural interface of the 1990s, you
may load (assuming it would still run on your computer) the most well-known
CD-ROM of the 1990s — Myst (Broderbund, 1993). Its opening clearly recalls a
movie: credits slowly scroll across the screen, accompanied by a movie-like
soundtrack to set the mood. Next, the computer screen shows a book open in the
middle, waiting for your mouse click. Next, an element of a familiar Macintosh
interface makes an appearance, reminding you that along with being a new
movie/book hybrid, Myst is also a computer application: you can adjust sound
volume and graphics quality by selecting from a usual Macintosh-style menu in
the upper part of the screen. Finally, you are taken inside the game, where the
interplay between the printed word and cinema continues. A virtual camera frames
images of an island which dissolve between each other. At the same time, you
keep encountering books and letters, which take over the screen, providing
you with clues on how to progress in the game.
Given that computer media is simply a set of characters and numbers
stored in a computer, there are numerous ways in which it could be presented to a
user. Yet, as always happens with cultural languages, only a few of these
possibilities actually appear viable in a given historical moment. Just as early
fifteenth century Italian painters could only conceive of painting in a very
particular way — quite different from, say, sixteenth century Dutch painters —
today's digital designers and artists use a small set of action grammars and
metaphors out of a much larger set of all possibilities.
Why do cultural interfaces — Web pages, CD-ROM titles, computer
games — look the way they do? Why do designers organize computer data in
certain ways and not in others? Why do they employ some interface metaphors
and not others?
My theory is that the language of cultural interfaces is largely made up
from the elements of other, already familiar cultural forms. In the following I will
explore the contributions of three such forms to this language during its first
decade -- the 1990s. The three forms on which I will focus make their appearance in
the opening sequence of the already discussed prototypical new media object of
the 1990s — Myst. Its opening activates them before our eyes, one by one. The
first form is cinema. The second form is the printed word. The third form is a
general-purpose human-computer interface (HCI).
As should become clear from the following, I use the words "cinema" and
"printed word" as shortcuts. They stand not for particular objects, such as a film
or a novel, but rather for larger cultural traditions (we can also use such words as
cultural forms, mechanisms, languages or media). "Cinema" thus includes mobile
camera, representation of space, editing techniques, narrative conventions,
activity of a spectator -- in short, different elements of cinematic perception,
language and reception. Their presence is not limited to the twentieth-century
institution of fiction films; they can be already found in panoramas, magic lantern
slides, theater and other nineteenth-century cultural forms; similarly, since the
middle of the twentieth century, they are present not only in films but also in
television and video programs. In the case of the "printed word" I am also
referring to a set of conventions which have developed over many centuries (some
even before the invention of print) and which today are shared by numerous forms
of printed matter, from magazines to instruction manuals: a rectangular page
containing one or more columns of text; illustrations or other graphics framed by
the text; pages which follow each other sequentially; a table of contents and index.
Modern human-computer interface has a much shorter history than the
printed word or cinema -- but it is still a history. Its principles such as direct
manipulation of objects on the screen, overlapping windows, iconic
representation, and dynamic menus were gradually developed over a few decades,
from the early 1950s to the early 1980s, when they finally appeared in
commercial systems such as Xerox Star (1981), the Apple Lisa (1983), and most
importantly the Apple Macintosh (1984).60 Since then, they have become an
accepted convention for operating a computer, and a cultural language in their
own right.
Cinema, the printed word and human-computer interface: each of these
traditions has developed its own unique ways of how information is organized,
how it is presented to the user, how space and time are correlated with each other,
how human experience is being structured in the process of accessing
information. Pages of text and a table of contents; 3D spaces framed by a
rectangular frame which can be navigated using a mobile point of view;
hierarchical menus, variables, parameters, copy/paste and search/replace
operations -- these and other elements of these three traditions are shaping cultural
interfaces today. Cinema, the printed word and HCI: they are the three main
reservoirs of metaphors and strategies for organizing information which feed
cultural interfaces.
Bringing cinema, the printed word and HCI together and treating
them as occupying the same conceptual plane has an additional advantage -- a
theoretical bonus. It is only natural to think of them as belonging to two different
kinds of cultural species, so to speak. If HCI is a general purpose tool which can be
used to manipulate any kind of data, both the printed word and cinema are less
general. They offer ways to organize particular types of data: text in the case of
print, audio-visual narrative taking place in a 3D space in the case of cinema. HCI
is a system of controls to operate a machine; the printed word and cinema are
cultural traditions, distinct ways to record human memory and human experience,
mechanisms for cultural and social exchange of information. Bringing HCI, the
printed word and cinema together allows us to see that the three have more in
common than we may anticipate at first. On the one hand, being a part of our
culture now for half a century, HCI already represents a powerful cultural
tradition, a cultural language offering its own ways to represent human memory
and human experience. This language speaks in the form of discrete objects
organized in hierarchies (hierarchical file system), or as catalogs (databases), or as
objects linked together through hyperlinks (hypermedia). On the other hand, we
begin to see that the printed word and cinema also can be thought of as interfaces,
even though historically they have been tied to particular kinds of data. Each has
its own grammar of actions, each comes with its own metaphors, each offers a
particular physical interface. A book or a magazine is a solid object consisting
of separate pages; the actions include going from page to page linearly,
marking individual pages and using the table of contents. In the case of cinema, its
physical interface is a particular architectural arrangement of a movie theater; its
metaphor is a window opening up into a virtual 3D space.
Today, as media is being "liberated" from its traditional physical storage
media — paper, film, stone, glass, magnetic tape — the elements of printed word
interface and cinema interface, which previously were hardwired to the content,
become "liberated" as well. A digital designer can freely mix pages and virtual
cameras, table of contents and screens, bookmarks and points of view. No longer
embedded within particular texts and films, these organizational strategies are
now free floating in our culture, available for use in new contexts. In this respect,
printed word and cinema have indeed become interfaces -- rich sets of metaphors,
ways of navigating through content, ways of accessing and storing data. For a
computer user, both conceptually and psychologically, their elements exist on the
same plane as radio buttons, pull-down menus, command line calls and other
elements of standard human-computer interface.
Let us now discuss some of the elements of these three cultural traditions --
cinema, the printed word and HCI -- to see how they have shaped the language
of cultural interfaces.
Printed Word
In the 1980's, as PCs and word processing software became commonplace, text
became the first cultural medium to be subjected to digitization in a massive way.
But already in the 1960's, two and a half decades before the concept of digital
media was born, researchers were thinking about having the sum total of human
written production -- books, encyclopedias, technical articles, works of fiction and
so on -- available online (Ted Nelson's Xanadu project61).
Text is unique among other media types. It plays a privileged role in
computer culture. On the one hand, it is one media type among others. But, on the
other hand, it is a meta-language of computer media, a code in which all other
media are represented: coordinates of 3D objects, pixel values of digital images,
the formatting of a page in HTML. It is also the primary means of communication
between a computer and a user: one types single line commands or runs computer
programs written in a subset of English; the other responds by displaying error
codes or text messages.62
If a computer uses text as its meta-language, cultural interfaces in their
turn inherit the principles of text organization developed by human civilization
throughout its existence. One of these is a page: a rectangular surface containing a
limited amount of information, designed to be accessed in some order, and having
a particular relationship to other pages. In its modern form, the page is born in the
first centuries of the Christian era when the clay tablets and papyrus rolls are
replaced by a codex — the collection of written pages stitched together on one
side.
Cultural interfaces rely on our familiarity with the "page interface" while
also trying to stretch its definition to include new concepts made possible by a
computer. In 1984, Apple introduced a graphical user interface which presented
information in overlapping windows stacked behind one another — essentially, a
set of book pages. The user was given the ability to go back and forth between
these pages, as well as to scroll through individual pages. In this way, a traditional
page was redefined as a virtual page, a surface which can be much larger than the
limited surface of a computer screen. In 1987, Apple shipped the popular HyperCard
program which extended the page concept in new ways. Now the users were able
to include multimedia elements within the pages, as well as to establish links
between pages regardless of their ordering. A few years later, designers of HTML
stretched the concept of a page even more by enabling the creation of distributed
documents, where different parts of a document are located on different
computers connected through the network. With this development, a long process
of gradual "virtualization" of the page reached a new stage. Messages written on
clay tablets, which were almost indestructible, were replaced by ink on paper. Ink,
in its turn, was replaced by bits of computer memory, making characters on an
electronic screen. Now, with HTML, which allows parts of a single page to be
located on different computers, the page became even more fluid and unstable.
The conceptual development of the page in computer media can also be
read in a different way — not as a further development of a codex form, but as a
return to earlier forms such as the papyrus roll of ancient Egypt, Greece and
Rome. Scrolling through the contents of a computer window or a World Wide
Web page has more in common with unrolling than turning the pages of a modern
book. In the case of the Web of the 1990s, the similarity with a roll is even
stronger because the information is not available all at once, but arrives
sequentially, top to bottom, as though the roll is being unrolled.
A good example of how cultural interfaces stretch the definition of a page
while mixing together its different historical forms is the Web page created in
1997 by the British design collective antirom for HotWired RGB Gallery.63 The
designers have created a large surface containing rectangular blocks of texts in
different font sizes, arranged without any apparent order. The user is invited to
skip from one block to another moving in any direction. Here, the different
directions of reading used in different cultures are combined together in a single
page.
By the mid 1990's, Web pages included a variety of media types — but
they were still essentially traditional pages. Different media elements — graphics,
photographs, digital video, sound and 3D worlds — were embedded within
rectangular surfaces containing text. To that extent a typical Web page was
conceptually similar to a newspaper page which is also dominated by text, with
photographs, drawings, tables and graphs embedded in between, along with links
to other pages of the newspaper. VRML evangelists wanted to overturn this
hierarchy by imagining a future in which the World Wide Web is rendered as a
giant 3D space, with all the other media types, including text, existing within it.64
Given that the history of a page stretches for thousands of years, I think it is
unlikely that it would disappear so quickly.
As the Web page became a new cultural convention of its own, its dominance
was challenged by two Web browsers created by artists — Web Stalker (1997) by
I/O/D collective65 and Netomat (1999) by Maciej Wisniewski.66 Web Stalker
emphasizes the hypertextual nature of the Web. Instead of rendering standard
Web pages, it renders the networks of hyperlinks these pages embody. When a
user enters a URL for a particular page, Web Stalker displays all pages linked to
this page as a line graph. Netomat similarly refuses the page convention of the
Web. The user enters a word or a phrase which is passed to search engines.
Netomat then extracts page titles, images, audio or any other media type, as
specified by the user, from the found pages and floats them across the computer
screen. As can be seen, both browsers refuse the page metaphor, instead
substituting their own metaphors: a graph showing the structure of links in the
case of Web Stalker, a flow of media elements in the case of Netomat.
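The idea of rendering the network of hyperlinks that pages embody, rather than the pages themselves, can be sketched as follows. This is only an illustration of the general technique using Python's standard `html.parser` module; it is not a reconstruction of Web Stalker's actual code, and the three tiny pages in `PAGES`, the class `LinkExtractor` and the function `link_graph` are hypothetical names invented for the example.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Ignore everything on a page except the targets of its hyperlinks."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def link_graph(pages):
    """Map each page address to the list of addresses it links to."""
    graph = {}
    for url, html in pages.items():
        parser = LinkExtractor()
        parser.feed(html)
        graph[url] = parser.links
    return graph


# A made-up miniature "Web" held in memory (no network access is assumed).
PAGES = {
    "index.html": '<h1>Home</h1><a href="art.html">Art</a> <a href="text.html">Text</a>',
    "art.html": '<p>Images</p><a href="index.html">Back</a>',
    "text.html": "<p>No links here</p>",
}

graph = link_graph(PAGES)
for url, targets in graph.items():
    print(url, "->", targets)
```

Headlines, paragraphs and images are discarded; what remains is only the structure of links, which a browser of this kind could then draw as a graph.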
While the 1990's Web browsers and other commercial cultural interfaces
have retained the modern page format, they also have come to rely on a new way
of organizing and accessing texts which has little precedent within book tradition
— hyperlinking. We may be tempted to trace hyperlinking to earlier forms and
practices of non-sequential text organization, such as the Torah's interpretations
and footnotes, but it is actually fundamentally different from them. Both the
Torah's interpretations and footnotes imply a master-slave relationship between
one text and another. But in the case of hyperlinking as implemented by HTML
and earlier by HyperCard, no such relationship of hierarchy is assumed. The two
sources connected through a hyperlink have an equal weight; neither one
dominates the other. Thus the acceptance of hyperlinking in the 1980's can be
correlated with contemporary culture’s suspicion of all hierarchies, and preference
for the aesthetics of collage where radically different sources are brought together
within the singular cultural object ("post-modernism").
Traditionally, texts encoded human knowledge and memory, instructed,
inspired, convinced and seduced their readers to adopt new ideas, new ways of
interpreting the world, new ideologies. In short, the printed word was linked to the
art of rhetoric. While it is probably possible to invent a new rhetoric of
hypermedia, which will use hyperlinking not to distract the reader from the
argument (as is often the case today), but instead to further convince her of
the argument's validity, the sheer existence and popularity of hyperlinking
exemplifies the continuing decline of the field of rhetoric in the modern era.
Ancient and Medieval scholars have classified hundreds of different rhetorical
figures. In the middle of the twentieth century linguist Roman Jakobson, under the
influence of computer's binary logic, information theory and cybernetics to which
he was exposed at MIT where he was teaching, radically reduced rhetoric to just
two figures: metaphor and metonymy.67 Finally, in the 1990's, World Wide
Web hyperlinking has privileged the single figure of metonymy at the expense of
all others.68 The hypertext of the World Wide Web leads the reader from one text
to another, ad infinitum. Contrary to the popular image, in which computer media
collapses all human culture into a single giant library (which implies the existence
of some ordering system), or a single giant book (which implies a narrative
progression), it may be more accurate to think of the new media culture as an
infinite flat surface where individual texts are placed in no particular order, like
the Web page designed by antirom for HotWired. Expanding this comparison
further, we can note that Random Access Memory, the concept behind the group's
name, also implies the lack of hierarchy: any RAM location can be accessed as
quickly as any other. In contrast to the older storage media of book, film, and
magnetic tape, where data is organized sequentially and linearly, thus suggesting
the presence of a narrative or a rhetorical trajectory, RAM "flattens" the data.
Rather than seducing the user through the careful arrangement of arguments and
examples, points and counterpoints, changing rhythms of presentation (i.e., the
rate of data streaming, to use contemporary language), simulated false paths and
dramatically presented conceptual breakthroughs, cultural interfaces, like RAM
itself, bombard the user with all the data at once.69
In the 1980's many critics described one of the key effects of "post-
modernism" as that of spatialization: privileging space over time, flattening
historical time, refusing grand narratives. Computer media, which evolved
during the same decade, accomplished this spatialization quite literally. It
replaced sequential storage with random-access storage; hierarchical organization
of information with a flattened hypertext; psychological movement of narrative in
novel and cinema with physical movement through space, as witnessed by endless
computer animated fly-throughs or computer games such as Myst, Doom and
countless others (see “Navigable Space”). In short, time becomes a flat image or a
landscape, something to look at or navigate through. If there is a new rhetoric or
aesthetic which is possible here, it may have less to do with the ordering of time
by a writer or an orator, and more with spatial wandering. The hypertext reader is
like Robinson Crusoe, walking through the sand and water, picking up a
navigation journal, a rotten fruit, an instrument whose purpose he does not know;
leaving imprints in the sand, which, like computer hyperlinks, follow from one
found object to another.
Cinema
The printed word tradition, which initially dominated the language of cultural
interfaces, is becoming less important, while the part played by cinematic
elements is getting progressively stronger. This is consistent with a general trend
in modern society towards presenting more and more information in the form of
time-based audio-visual moving image sequences, rather than as text. As new
generations of both computer users and computer designers are growing up in a
media-rich environment dominated by television rather than by printed texts, it is
not surprising that they favor cinematic language over the language of print.
A hundred years after cinema's birth, cinematic ways of seeing the world,
of structuring time, of narrating a story, of linking one experience to the next, are
being extended to become the basic ways in which computer users access and
interact with all cultural data. In this way, the computer fulfills the promise of
cinema as a visual Esperanto which preoccupied many film artists and critics in
the 1920s, from Griffith to Vertov. Indeed, millions of computer users
communicate with each other through the same computer interface. And, in
contrast to cinema where most of its "users" were able to "understand" cinematic
language but not "speak" it (i.e., make films), all computer users can "speak" the
language of the interface. They are active users of the interface, employing it to
perform many tasks: send email, organize their files, run various applications, and
so on.
The original Esperanto never became truly popular. But cultural interfaces
are widely used and are easily learned. We have an unprecedented situation in the
history of cultural languages: something which is designed by a rather small
group of people is immediately adopted by millions of computer users. How is it
possible that people around the world adopt today something which a 20-
something programmer in Northern California has hacked together just the night
before? Shall we conclude that we are somehow biologically "wired" to the
interface language, the way we are "wired," according to the original hypothesis
of Noam Chomsky, to different natural languages?
The answer is of course no. Users are able to "acquire" new cultural
languages, be it cinema a hundred years ago, or cultural interfaces today, because
these languages are based on previous and already familiar cultural forms. In the
case of cinema, it was theater, magic lantern shows and other nineteenth century
forms of public entertainment. Cultural interfaces in their turn draw on older
cultural forms such as the printed word and cinema. I have already discussed
some ways in which the printed word tradition structures interface language; now
it is cinema's turn.
I will begin with probably the most important case of cinema's influence
on cultural interfaces — the mobile camera. Originally developed as part of 3D
computer graphics technology for such applications as computer-aided design,
flight simulators and computer movie making, during the 1980's and 1990's the
camera model became as much of an interface convention as scrollable windows
or cut and paste operations. It became an accepted way of interacting with any
data which is represented in three dimensions — which, in a computer culture,
means literally anything and everything: the results of a physical simulation, an
architectural site, design of a new molecule, statistical data, the structure of a
computer network and so on. As computer culture is gradually spatializing all
representations and experiences, they become subjected to the camera's particular
grammar of data access. Zoom, tilt, pan and track: we now use these operations to
interact with data spaces, models, objects and bodies.
Abstracted from its historical temporary "imprisonment" within the
physical body of a movie camera directed at physical reality, a virtualized camera
also becomes an interface to all types of media and information besides 3D space.
As an example, consider the GUI of the leading computer animation software,
PowerAnimator from Alias/Wavefront.70 In this interface, each window,
regardless of whether it displays a 3D model, a graph or even plain text, contains
Dolly, Track and Zoom buttons. It is particularly important that the user is
expected to dolly and pan over text as if it were a 3D scene. In this interface,
cinematic vision triumphed over the print tradition, with the camera subsuming
the page. The Gutenberg galaxy turned out to be just a subset of the Lumières'
universe.
Another feature of cinematic perception which persists in cultural
interfaces is a rectangular framing of represented reality.71 Cinema itself inherited
this framing from Western painting. Since the Renaissance, the frame acted as a
window onto a larger space which was assumed to extend beyond the frame. This
space was cut by the frame's rectangle into two parts: "onscreen space," the part
which is inside the frame, and the part which is outside. In the famous formulation
of Leon-Battista Alberti, the frame acted as a window onto the world. Or, in a
more recent formulation of French film theorist Jacques Aumont and his co-
authors, "The onscreen space is habitually perceived as included within a more
vast scenographic space. Even though the onscreen space is the only visible part,
this larger scenographic part is nonetheless considered to exist around it."72
Just as a rectangular frame of painting and photography presents a part of
a larger space outside it, a window in HCI presents a partial view of a larger
document. But if in painting (and later in photography), the framing chosen by an
artist was final, computer interface benefits from a new invention introduced by
cinema: the mobility of the frame. As a kino-eye moves around the space
revealing its different regions, so can a computer user scroll through a window's
contents.
It is not surprising to see that screen-based interactive 3D environments,
such as VRML worlds, also use cinema's rectangular framing since they rely on
other elements of cinematic vision, specifically a mobile virtual camera. It may be
more surprising to realize that the Virtual Reality (VR) interface, often promoted as
the most "natural" interface of all, utilizes the same framing.73 As in cinema, the
world presented to a VR user is cut by a rectangular frame. As in cinema, this
frame presents a partial view of a larger space.74 As in cinema, the virtual camera
moves around to reveal different parts of this space.
Of course, the camera is now controlled by the user and in fact is
identified with his/her own sight. Yet, it is crucial that in VR one is seeing the
virtual world through a rectangular frame, and that this frame always presents
only a part of a larger whole. This frame creates a distinct subjective experience
which is much closer to cinematic perception than to unmediated sight.
Interactive virtual worlds, whether accessed through a screen-based or a
VR interface, are often discussed as the logical successor to cinema, as potentially
the key cultural form of the twenty-first century, just as cinema was the key
cultural form of the twentieth century. These discussions usually focus on the
issues of interaction and narrative. So, the typical scenario for twenty-first century
cinema involves a user represented as an avatar existing literally "inside" the
narrative space, rendered with photorealistic 3D computer graphics, interacting
with virtual characters and perhaps other users, and affecting the course of
narrative events.
It is an open question whether this and similar scenarios, commonly
invoked in new media discussions of the 1990's, indeed represent an extension of
cinema or if they rather should be thought of as a continuation of some theatrical
traditions, such as improvisational or avant-garde theater. But what undoubtedly
can be observed in the 1990's is how virtual technology's dependence on cinema's
mode of seeing and language is becoming progressively stronger. This coincides
with the move from proprietary and expensive VR systems to more widely
available and standardized technologies, such as VRML (Virtual Reality
Modeling Language). (The following examples refer to a particular VRML
browser — WebSpace Navigator 1.1 from SGI.75 Other VRML browsers have
similar features.)
The creator of a VRML world can define a number of viewpoints which
are loaded with the world.76 These viewpoints automatically appear in a special
menu in a VRML browser which allows the user to step through them, one by
one. Just as in cinema, ontology is coupled with epistemology: the world is
designed to be viewed from particular points of view. The designer of a virtual
world is thus a cinematographer as well as an architect. The user can wander
around the world or she can save time by assuming the familiar position of a
cinema viewer for whom the cinematographer has already chosen the best
viewpoints.
Equally interesting is another option which controls how a VRML browser
moves from one viewpoint to the next. By default, the virtual camera smoothly
travels through space from the current viewpoint to the next as though on a dolly,
its movement automatically calculated by the software. Selecting the "jump cuts"
option makes it cut from one view to the next. Both modes are obviously derived
from cinema. Both are more efficient than trying to explore the world on one's own.
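The difference between the two transition modes can be made concrete with a small sketch. The Python below is an illustration only, not code from WebSpace Navigator or from the VRML specification: the Viewpoint class, the dolly and jump_cut functions, and the choice of straight-line interpolation are all assumptions introduced here. An actual browser would also interpolate the camera's orientation and would probably ease the motion in and out.

```python
from dataclasses import dataclass
from typing import List, Tuple

Position = Tuple[float, float, float]


@dataclass(frozen=True)
class Viewpoint:
    """A hypothetical named camera position defined by the world's author."""
    name: str
    position: Position


def jump_cut(current: Viewpoint, target: Viewpoint) -> List[Position]:
    """Cut: the camera is placed at the target with no frames in between."""
    return [target.position]


def dolly(current: Viewpoint, target: Viewpoint, steps: int = 10) -> List[Position]:
    """Smooth move: the software computes every intermediate camera position.

    Straight-line (linear) interpolation is assumed purely for illustration.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    path: List[Position] = []
    for i in range(1, steps + 1):
        t = i / steps  # fraction of the journey completed, ending at 1.0
        x, y, z = (a + (b - a) * t
                   for a, b in zip(current.position, target.position))
        path.append((x, y, z))
    return path


if __name__ == "__main__":
    entrance = Viewpoint("entrance", (0.0, 0.0, 10.0))
    hall = Viewpoint("hall", (4.0, 0.0, 2.0))
    for position in dolly(entrance, hall, steps=4):
        print(position)
    print(jump_cut(entrance, hall))
```

In both functions the path is computed entirely from the two authored viewpoints; the user supplies only the decision to advance. This is the sense in which the trajectory is chosen by the designer and the software rather than by the viewer.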
With a VRML interface, nature is firmly subsumed under culture. The eye
is subordinated to the kino-eye. The body is subordinated to a virtual body of a
virtual camera. While the user can investigate the world on her own, freely
selecting trajectories and viewpoints, the interface privileges cinematic perception
— cuts, pre-computed dolly-like smooth motions of a virtual camera, and pre-
selected viewpoints.
The area of computer culture where cinematic interface is being
transformed into a cultural interface most aggressively is computer games. By the
1990's, game designers had moved from two to three dimensions and had begun
to incorporate cinematic language in an increasingly systematic fashion. Games
started featuring lavish opening cinematic sequences (called in the game business
"cinematics") to set the mood, establish the setting and introduce the narrative.
Frequently, the whole game would be structured as an oscillation between
interactive fragments requiring the user's input and non-interactive cinematic
sequences, i.e. "cinematics." As the decade progressed, game designers were
creating increasingly complex — and increasingly cinematic — interactive virtual
worlds. Regardless of genre (action/adventure, fighting, flight
simulator, first-person action, racing or simulation), games came to rely on
cinematography techniques borrowed from traditional cinema, including the
expressive use of camera angles and depth of field, and dramatic lighting of 3D
computer generated sets to create mood and atmosphere. In the beginning of the
decade, many games such as The 7th Guest (Trilobyte, 1993) or Voyeur (1994)
used digital video of actors superimposed over 2D or 3D backgrounds, but by its
end they switched to fully synthetic characters rendered in real time.77 This
switch allowed game designers to go beyond the branching-type structure of earlier
games based on digital video, where all the possible scenes had to be taped
beforehand. In contrast, 3D characters animated in real time move arbitrarily
around the space, and the space itself can change during the game. (For instance,
when a player returns to the already visited area, she will find any objects she left
there earlier.) This switch also made virtual worlds more cinematic, as the
characters could be better visually integrated with their environments.78
A particularly important example of how computer games use — and
extend — cinematic language, is their implementation of a dynamic point of view.
In driving and flying simulators and in combat games, such as Tekken 2 (Namco,
1994 -), after a certain event takes place (a car crashes, a fighter is knocked
down), it is automatically replayed from a different point of view. Other games
such as the Doom series (Id Software, 1993 -) and Dungeon Keeper (Bullfrog
Productions, 1997) allow the user to switch between the point of view of the hero
and a top down "bird's eye" view. The designers of online virtual worlds such as
Active Worlds provide their users with similar capabilities. Finally, Nintendo
went even further by dedicating four buttons on their N64 joypad to controlling
the view of the action. While playing Nintendo games such as Super Mario 64
(Nintendo, 1996) the user can continuously adjust the position of the camera.
Some Sony Playstation games such as Tomb Raider (Eidos, 1996) also use the
buttons on the Playstation joypad for changing point of view. Some games such as
Myth: The Fallen Lords (Bungie, 1997) go further, using an AI engine (computer
code which controls the simulated “life” in the game, such as human characters
the player encounters) to automatically control their camera.
The incorporation of virtual camera controls into the very hardware of
game consoles is truly a historic event. Directing the virtual camera becomes as
important as controlling the hero's actions. This is admitted by the game industry
itself. For instance, a package for Dungeon Keeper lists four key features of the
game, out of which the first two concern control over the camera: "switch your
perspective," "rotate your view," "take on your friend," "unveil hidden levels." In
games such as this one, cinematic perception functions as the subject in its own
right.79 Here, the computer games are returning to "The New Vision" movement
of the 1920s (Moholy-Nagy, Rodchenko, Vertov and others), which foregrounded
the new mobility of the photo and film camera, and made unconventional points of
view the key part of their poetics.
The fact that computer games and virtual worlds continue to encode, step
by step, the grammar of a kino-eye in software and in hardware is not an accident.
This encoding is consistent with the overall trajectory driving the computerization
of culture since the 1940's, that being the automation of all cultural operations.
This automation gradually moves from basic to more complex operations: from
image processing and spell checking to software-generated characters, 3D worlds,
and Web Sites. The side effect of this automation is that once particular cultural
codes are implemented in low-level software and hardware, they are no longer
seen as choices but as unquestionable defaults. To take the automation of imaging
as an example, in the early 1960's the newly emerging field of computer graphics
incorporated a linear one-point perspective in 3D software, and later directly in
hardware.80 As a result, linear perspective became the default mode of vision in
computer culture, be it computer animation, computer games, visualization or
VRML worlds. Now we are witnessing the next stage of this process: the
translation of cinematic grammar of points of view into software and hardware.
As Hollywood cinematography is translated into algorithms and computer chips,
its conventions become the default method of interacting with any data subjected
to spatialization, with a narrative, and with other human beings. (At SIGGRAPH
'97 in Los Angeles, one of the presenters called for the incorporation of
Hollywood-style editing in multi-user virtual worlds software. In such an
implementation, user interaction with other avatar(s) will be automatically
rendered using classical Hollywood conventions for filming dialog.81) To use the
terms from the 1996 paper authored by Microsoft researchers and entitled “The
Virtual Cinematographer: A Paradigm for Automatic Real-Time Camera Control
and Directing,” the goal of research is to encode “cinematographic expertise,”
translating “heuristics of filmmaking” into computer software and hardware.82
Element by element, cinema is being poured into a computer: first one-point
linear perspective; next the mobile camera and a rectangular window; next
cinematography and editing conventions, and, of course, digital personas also
based on acting conventions borrowed from cinema, to be followed by make-up,
set design, and the narrative structures themselves. From one cultural language
among others, cinema is becoming the cultural interface, a toolbox for all cultural
communication, overtaking the printed word.
Cinema, the major cultural form of the twentieth century, has found a new
life as the toolbox of a computer user. Cinematic means of perception, of
connecting space and time, of representing human memory, thinking, and
emotions become a way of work and a way of life for millions in the computer
age. Cinema's aesthetic strategies have become basic organizational principles of
computer software. The window in a fictional world of a cinematic narrative has
become a window in a datascape. In short, what was cinema has become human-
computer interface.
I will conclude this section by discussing a few artistic projects which, in
different ways, offer alternatives to this trajectory. To summarize it once again,
the trajectory involves gradual translation of elements and techniques of cinematic
perception and language into a de-contextualized set of tools to be used as an
interface to any data. In the process of this translation, cinematic perception is
divorced from its original material embodiment (camera, film stock), as well as
from the historical contexts of its formation. If in cinema the camera functioned as
a material object, co-existing, spatially and temporally, with the world it was
showing us, it has now become a set of abstract operations. The art projects
described below refuse this separation of cinematic vision from the material
world. They reunite perception and material reality by making the camera and
what it records a part of a virtual world's ontology. They also refuse the
universalization of cinematic vision by computer culture, which (just as post-
modern visual culture in general) treats cinema as a toolbox, a set of "filters"
which can be used to process any input. In contrast, each of these projects
employs a unique cinematic strategy which has a specific relation to the particular
virtual world it reveals to the user.
In The Invisible Shape of Things Past Joachim Sauter and Dirk
Lüsenbrink of the Berlin-based Art+Com collective created a truly innovative
cultural interface for accessing historical data about Berlin's history.83 The
interface de-virtualizes cinema, so to speak, by placing the records of cinematic
vision back into their historical and material context. As the user navigates
through a 3D model of Berlin, he or she comes across elongated shapes lying on
city streets. These shapes, which the authors call "filmobjects", correspond to
documentary footage recorded at the corresponding points in the city. To create
each shape the original footage is digitized and the frames are stacked one after
another in depth, with the original camera parameters determining the exact
shape. The user can view the footage by clicking on the first frame. As the frames
are displayed one after another, the shape gets correspondingly thinner.
In keeping with the already noted general trend of computer culture
towards spatialization of every cultural experience, this cultural interface
spatializes time, representing it as a shape in a 3D space. This shape can be
thought of as a book, with individual frames stacked one after another as book
pages. The trajectory through time and space taken by a camera becomes a book
to be read, page by page. The records of the camera's vision become material objects,
sharing the space with the material reality which gave rise to this vision. Cinema
is solidified. This project, then, can also be understood as a virtual monument to
cinema. The (virtual) shapes situated around the (virtual) city remind us of
the era when cinema was the defining form of cultural expression — as opposed
to a toolbox for data retrieval and use, as it is becoming today in a computer.
Hungarian-born artist Tamás Waliczky openly refuses the default mode of
vision imposed by computer software, that of the one-point linear perspective.
Each of his computer animated films The Garden (1992), The Forest (1993) and
The Way (1994) utilizes a particular perspectival system: a water-drop
perspective in The Garden, a cylindrical perspective in The Forest and a reverse
perspective in The Way. Working with computer programmers, the artist created
custom-made 3D software to implement these perspectival systems. Each of the
systems has an inherent relationship to the subject of a film in which it is used. In
The Garden, the subject is the perspective of a small child, for whom the world
does not yet have an objective existence. In The Forest, the mental trauma of
emigration is transformed into the endless roaming of a camera through the forest
which is actually just a set of transparent cylinders. Finally, in The Way, the self-
sufficiency and isolation of a Western subject are conveyed by the use of a
reverse perspective.
In Waliczky's films the camera and the world are made into a single
whole, whereas in The Invisible Shape of Things Past the records of the camera
are placed back into the world. Rather than simply subjecting his virtual worlds to
different types of perspectival projection, Waliczky modified the spatial structure
of the worlds themselves. In The Garden, a child playing in a garden becomes the
center of the world; as he moves around, the actual geometry of all the objects
around him is transformed, with objects getting bigger as he gets close to them. To
create The Forest, a number of cylinders were placed inside each other, each
cylinder mapped with a picture of a tree, repeated a number of times. In the film,
we see a camera moving through this endless static forest in a complex spatial
trajectory — but this is an illusion. In reality, the camera does move, but the
architecture of the world is constantly changing as well, because each cylinder is
rotating at its own speed. As a result, the world and its perception are fused
together.
HCI: Representation versus Control
The development of the human-computer interface, until recently, had little to do
with the distribution of cultural objects. Following some of the main applications
from the 1940's until the early 1980's, when the current generation of GUI was
developed and reached the mass market together with the rise of the PC (personal
computer), we can list the most significant: real-time control of weapons and
weapon systems; scientific simulation; computer-aided design; finally, office
work with a secretary as a prototypical computer user, filing documents in a
folder, emptying a trash can, creating and editing documents ("word processing").
Today, as the computer is starting to host very different applications for access
and manipulation of cultural data and cultural experiences, their interfaces still
rely on old metaphors and action grammars. Thus, cultural interfaces predictably
use elements of a general-purpose HCI such as scrollable windows containing text
and other data types, hierarchical menus, dialogue boxes, and command-line
input. For instance, a typical "art collection" CD-ROM may try to recreate "the
museum experience" by presenting a navigable 3D rendering of a museum space,
while still resorting to hierarchical menus to allow the user to switch between
different museum collections. Even in the case of The Invisible Shape of Things
Past which uses a unique interface solution of "filmobjects" which is not directly
traceable to either old cultural forms or general-purpose HCI, the designers are
still relying on HCI convention in one case — the use of a pull-down menu to
switch between different maps of Berlin.
In their important study of new media Remediation, Jay David Bolter and
Richard Grusin define medium as “that which remediates.”84 In contrast to a
modernist view that aims to define the essential properties of every medium, Bolter
and Grusin propose that all media work by “remediating,” i.e. translating,
refashioning, and reforming other media, both on the levels of content and form.
If we are to think of human-computer interface as another medium, its history and
present development definitely fit this thesis. The history of human-computer
interface is that of borrowing and reformulating, or, to use new media lingo,
reformatting other media, both past and present: the printed page, film, television.
But along with borrowing conventions of most other media and eclectically
combining them, HCI designers also heavily borrowed “conventions” of the
human-made physical environment, beginning with the Macintosh's use of the desktop
metaphor. And, more than any medium before it, HCI is like a chameleon which
keeps changing its appearance, responding to how computers are used in any
given period. For instance, if in the 1970s the designers at Xerox PARC modeled
the first GUI on the office desk, because they imagined that the computer they were
designing would be used in the office, in the 1990s the primary use of computers as
media access machines led to the borrowing of interfaces of already familiar media
devices, such as VCR or audio CD player controls.
In general, cultural interfaces of the 1990's try to walk an uneasy path
between the richness of control provided in general-purpose HCI and an
"immersive" experience of traditional cultural objects such as books and movies.
Modern general-purpose HCI, be it Mac OS, Windows or UNIX, allows its
users to perform complex and detailed actions on computer data: get information
about an object, copy it, move it to another location, change the way data is
displayed, etc. In contrast, a conventional book or a film positions the user inside
the imaginary universe whose structure is fixed by the author. Cultural interfaces
attempt to mediate between these two fundamentally different and ultimately non-
compatible approaches.
As an example, consider how cultural interfaces conceptualize the
computer screen. If a general-purpose HCI clearly identifies to the user that
certain objects can be acted on while others cannot (icons representing files but
not the desktop itself), cultural interfaces typically hide the hyperlinks within a
continuous representational field. (This technique was already so widely accepted
by the 1990's that the designers of HTML offered it early on to the users by
implementing the "imagemap" feature). The field can be a two-dimensional
collage of different images, a mixture of representational elements and abstract
textures, or a single image of a space such as a city street or a landscape. By trial
and error, clicking all over the field, the user discovers that some parts of this
field are hyperlinks. This concept of a screen combines two distinct pictorial
conventions: the older Western tradition of pictorial illusionism in which a screen
functions as a window into a virtual space, something for the viewer to look into
but not to act upon; and the more recent convention of graphical human-computer
interfaces which, by dividing the computer screen into a set of controls with
clearly delineated functions, essentially treats it as a virtual instrument panel. As a
result, the computer screen becomes a battlefield for a number of incompatible
definitions: depth and surface, opaqueness and transparency, image as an
illusionary space and image as an instrument for action.
The computer screen also functions both as a window into an illusionary
space and as a flat surface carrying text labels and graphical icons. We can relate
this to a similar understanding of a pictorial surface in the Dutch art of the
seventeenth century, as analyzed by art historian Svetlana Alpers in her classic
The Art of Describing. Alpers discusses how a Dutch painting of this period
functioned as a combined map / picture, combining different kinds of information
and knowledge of the world.85
Here is another example of how cultural interfaces try to find a middle
ground between the conventions of general-purpose HCI and the conventions of
traditional cultural forms. Again we encounter tension and struggle — in this
case, between standardization and originality. One of the main principles of
modern HCI is the consistency principle. It dictates that menus, icons, dialogue boxes
and other interface elements should be the same in different applications. The user
knows that every application will contain a "file" menu, or that if she encounters
an icon which looks like a magnifying glass it can be used to zoom in on documents.
In contrast, modern culture (including its "post-modern" stage) stresses
originality: every cultural object is supposed to be different from the rest, and if it
is quoting other objects, these quotes have to be defined as such. Cultural
interfaces try to accommodate both the demand for consistency and the demand
for originality. Most of them contain the same set of interface elements with
standard semantics, such as "home," "forward" and "backward" icons. But
because every Web site and CD-ROM is striving to have its own distinct design,
these elements are always designed differently from one product to the next. For
instance, many games such as War Craft II (Blizzard Entertainment, 1996) and
Dungeon Keeper give their icons a "historical" look consistent with the mood of
an imaginary universe portrayed in the game.
The language of cultural interfaces is a hybrid. It is a strange, often
awkward mix between the conventions of traditional cultural forms and the
conventions of HCI — between an immersive environment and a set of controls;
between standardization and originality. Cultural interfaces try to balance the
concept of a surface in painting, photography, cinema, and the printed page as
something to be looked at, glanced at, read, but always from some distance,
without interfering with it, with the concept of the surface in a computer interface
as a virtual control panel, similar to the control panel on a car, plane or any other
complex machine.86 Finally, on yet another level, the traditions of the printed
word and of cinema also compete between themselves. One pulls the computer
screen towards being a dense and flat information surface, while another wants it to
become a window into a virtual space.
To see that this hybrid language of the cultural interfaces of the 1990s
represents only one historical possibility, consider a very different scenario.
Potentially, cultural interfaces could completely rely on already existing
metaphors and action grammars of a standard HCI, or, at least, rely on them much
more than they actually do. They don't have to "dress up" HCI with custom icons
and buttons, or hide links within images, or organize the information as a series of
pages or a 3D environment. For instance, texts can be presented simply as files
inside a directory, rather than as a set of pages connected by custom-designed
icons. This strategy of using standard HCI to present cultural objects is
encountered quite rarely. In fact, I am aware of only one project which uses it
completely consciously, as a thought-through choice rather than by necessity: a
CD-ROM by Gerald Van Der Kaap entitled BlindRom V.0.9. (Netherlands,
1993). The CD-ROM includes a standard-looking folder named "Blind Letter."
Inside the folder there are a large number of text files. You don't have to learn yet
another cultural interface, search for hyperlinks hidden in images or navigate
through a 3D environment. Reading these files requires simply opening them in
standard Macintosh SimpleText, one by one. This simple technique works very
well. Rather than distracting the user from experiencing the work, the computer
interface becomes part and parcel of the work. Opening these files, I felt that I
was in the presence of a new literary form for a new medium, perhaps the real
medium of a computer — its interface.
As the examples analyzed here illustrate, cultural interfaces try to create
their own language rather than simply using general-purpose HCI. In doing so,
these interfaces try to negotiate between metaphors and ways of controlling a
computer developed in HCI, and the conventions of more traditional cultural
forms. Indeed, neither extreme is ultimately satisfactory by itself. It is one thing to
use a computer to control a weapon or to analyze statistical data, and it is another
to use it to represent cultural memories, values and experiences. The interfaces
developed for a computer in its functions of a calculator, control mechanism or a
communication device are not necessarily suitable for a computer playing the role
of a cultural machine. Conversely, if we simply mimic the existing conventions of
older cultural forms such as the printed word and cinema, we will not take
advantage of all the new capacities offered by a computer: its flexibility in
displaying and manipulating data, interactive control by the user, the ability to run
simulations, etc.
Today the language of cultural interfaces is in its early stage, as was the
language of cinema a hundred years ago. We don't know what the final result will
be, or even if it will ever completely stabilize. Both the printed word and cinema
eventually achieved stable forms which underwent little change for long periods
of time, in part because of the material investments in their means of production
and distribution. Given that computer language is implemented in software,
potentially it can keep on changing forever. But there is one thing we can be sure
of. We are witnessing the emergence of a new cultural meta-language, something
which will be at least as significant as the printed word and cinema before it.
The Screen and the User
Contemporary human-computer interfaces offer radical new possibilities for art
and communication. Virtual reality allows us to travel through non-existent three-
dimensional spaces. A computer monitor connected to a network becomes a
window through which we can be present in a place thousands of miles away.
Finally, with the help of a mouse or a video camera, a computer is transformed
into an intelligent being capable of engaging us in a dialogue.
VR, interactivity and telepresence are made possible by the recent
technology of a digital computer. However, they are made real by a much, much
older technology — the screen. It is by looking at a screen — a flat, rectangular
surface positioned at some distance from the eyes — that the user experiences the
illusion of navigating through virtual spaces, of being physically present
somewhere else or of being hailed by the computer itself. If computers have
become a common presence in our culture only in the last decade, the screen, on
the other hand, has been used to present visual information for centuries — from
Renaissance painting to twentieth-century cinema.
Today, coupled with a computer, the screen is rapidly becoming the main
means of accessing any kind of information, be it still images, moving images or
text. We are already using it to read the daily newspaper, to watch movies, to
communicate with coworkers, relatives and friends, and, most importantly, to
work (the screens of airline agents, data entry clerks, secretaries, engineers,
doctors, pilots, etc.; the screens of ATM machines, supermarket checkouts,
automobile control panels, and, of course, the screens of computers.) We may
debate whether our society is a society of spectacle or of simulation, but,
undoubtedly, it is the society of a screen. What are the different stages of the
screen's history? What are the relationships between the physical space where the
viewer is located, his/her body, and the screen space? What are the ways in which
computer displays both continue and challenge the tradition of a screen?87
A Screen's Genealogy
Let us start with the definition of a screen. Visual culture of the modern period,
from painting to cinema, is characterized by an intriguing phenomenon: the
existence of another virtual space, another three-dimensional world enclosed by a
frame and situated inside our normal space. The frame separates two absolutely
different spaces that somehow coexist. This phenomenon is what defines the
screen in the most general sense, or, as I will call it, the "classical screen."
What are the properties of a classical screen? It is a flat, rectangular
surface. It is intended for frontal viewing — as opposed to, for instance, a
panorama. It exists in our normal space, the space of our body, and acts as a
window into another space. This other space, the space of representation, typically
has a different scale from the scale of our normal space. Defined in this way, a
screen describes equally well a Renaissance painting (recall Alberti’s formulation
referred to above) and a modern computer display. Even proportions have not
changed in five centuries; they are similar for a typical fifteenth-century painting,
a film screen and a computer screen. In this respect it is not accidental that the
very names of the two main formats of computer displays point to two genres of
painting: a horizontal format is referred to as "landscape mode" while the vertical
format is referred to as "portrait mode."
A hundred years ago a new type of screen became popular, which I will
call the "dynamic screen." This new type retains all the properties of a classical
screen while adding something new: it can display an image changing over time.
This is the screen of cinema, television, video. The dynamic screen also brings
with it a certain relationship between the image and the spectator — a certain
viewing regime, so to speak. This relationship is already implicit in the classical
screen but now it fully surfaces. A screen's image strives for complete illusion and
visual plenitude while the viewer is asked to suspend disbelief and to identify
with the image. Although the screen in reality is only a window of limited
dimensions positioned inside the physical space of the viewer, the latter is
supposed to completely concentrate on what is seen in this window, focusing
attention on the representation and disregarding the physical space outside. This
viewing regime is made possible by the fact that, be it a painting, movie screen or
television screen, the singular image completely fills the screen. This is why we
are so annoyed in a movie theater when the projected image does not precisely
coincide with the screen's boundaries: it disrupts the illusion, making us conscious
of what exists outside the representation.88
Rather than being a neutral medium of presenting information, the screen
is aggressive. It functions to filter, to screen out, to take over, rendering non-
existent whatever is outside its frame. Of course, the degree of this filtering varies
between cinema viewing and television viewing. In cinema viewing, the viewer is
asked to completely merge with the screen's space. In television viewing, the
screen is smaller, lights are on, conversation between viewers is allowed, and the
act of viewing is often integrated with other daily activities. Still, overall this
viewing regime remains stable — until recently.
This stability has been challenged by the arrival of the computer screen.
On the one hand, rather than showing a single image, a computer screen typically
displays a number of coexisting windows. Indeed, the coexistence of a number of
overlapping windows is a fundamental principle of modern GUI. No single
window completely dominates the viewer's attention. In this sense the possibility
of simultaneously observing a few images which coexist within one screen can be
compared with the phenomenon of zapping — the quick switching of television
channels that allows the viewer to follow more than one program.89 In both instances,
the viewer no longer concentrates on a single image. (Some television sets enable
a second channel to be watched within a smaller window positioned in a corner of
the main screen. Perhaps future TV sets will adopt the window metaphor of a
computer.) A window interface has more to do with modern graphic design,
which treats a page as a collection of different but equally important blocks of
data such as text, images, and graphic elements, than with the cinematic screen.
On the other hand, with VR, the screen disappears altogether. VR typically
uses a head-mounted display whose images completely fill the viewer's visual field.
No longer is the viewer looking forward at a rectangular, flat surface located at a
certain distance and which acts as a window into another space. Now she is fully
situated within this other space. Or, more precisely, we can say that the two
spaces, the real, physical space and the virtual simulated space, coincide. The
virtual space, previously confined to a painting or a movie screen, now
completely encompasses the real space. Frontality, rectangular surface, difference
in scale are all gone. The screen has vanished.
Both situations — window interface and VR — disrupt the viewing
regime which characterizes the historical period of the dynamic screen. This
regime, based on the identification of the viewer with a screen image, reaches its
culmination in the cinema which goes to the extreme to enable this identification
(the bigness of the screen, the darkness of the surrounding space) while still
relying on a screen — a rectangular flat surface.
Thus, the era of the dynamic screen which began with cinema is now
ending. And it is this disappearance of the screen — its splitting into many
windows in window interface, its complete take over of the visual field in VR —
that allows us today to recognize it as a cultural category and begin to trace its
history.
The origins of the cinema's screen are well known. We can trace its
emergence to the popular spectacles and entertainment of the eighteenth and
nineteenth centuries: magic lantern shows, phantasmagoria, eidophusikon,
panorama, diorama, zoopraxiscope shows, and so on. The public was ready for
cinema and when it finally appeared it was a huge public event. Not by accident
the "invention" of cinema was claimed by at least a dozen individuals from a
half-dozen countries.90
The origin of the computer screen is a different story. It appears in the
middle of this century but it does not become a public presence until much later;
and its history has not yet been written. Both of these facts are related to the
context in which it emerged: as with all the other elements of modern human-
computer interface, the computer screen was developed for military use. Its
history has to do not with public entertainment but with military surveillance.
The history of modern surveillance technologies begins at least with
photography. From the advent of photography there existed an interest in using it
for aerial surveillance. Félix Tournachon Nadar, one of the most eminent
photographers of the nineteenth century, succeeded in exposing a photographic
plate at 262 feet over Bièvre, France in 1858. He was soon approached by the
French Army to attempt photo reconnaissance but rejected the offer. In 1882,
unmanned photo balloons were already in the air; a little later, they were joined
by photo rockets both in France and in Germany. The only innovation of World
War I was to combine aerial cameras with a superior flying platform — the
airplane.91
Radar became the next major surveillance technology. Massively
employed in World War II, it provided important advantages over photography.
Previously, military commanders had to wait until the pilots returned from
surveillance missions and the film was developed. The inevitable delay between
the time of the surveillance and the delivery of the finished image limited its
usefulness because by the time the photograph was produced, enemy positions
could have changed. However, with radar, as imaging became instantaneous, this
delay was eliminated. The effectiveness of radar had to do with a new means of
displaying an image — a new type of screen.
Consider the imaging technologies of photography and film. The
photographic image is a permanent imprint corresponding to a single referent —
whatever was in front of the lens when the photograph was taken. It also
corresponds to a limited time of observation (the time of exposure). Film is
based on the same principles. A film sequence, composed of a number of still
images, represents the sum of referents and the sum of exposure times of these
individual images. In either case, the image is fixed once and for all. Therefore
the screen can only show past events.
With radar, we see for the first time the mass employment (television is
founded on the same principle but its mass employment comes later) of a
fundamentally new type of screen, the screen which gradually comes to dominate
modern visual culture — video monitor, computer screen, instrument display.
What is new about such a screen is that its image can change in real time,
reflecting changes in the referent, be it the position of an object in space (radar),
any alteration in visible reality (live video) or changing data in the computer's
memory (computer screen). The image can be continually updated in real time.
This is the third, after classic and dynamic, type of a screen — the screen of real
time.
The radar screen changes, tracking the referent. But while it appears that
the element of time delay, always present in the technologies of military
surveillance, is eliminated, in fact, time enters the real-time screen in a new way.
In older, photographic technologies, all parts of an image are exposed
simultaneously. Whereas now the image is produced through sequential scanning:
circular in the case of radar, horizontal in the case of television. Therefore, the
different parts of the image correspond to different moments in time. In this
respect, a radar image is more similar to an audio record since consecutive
moments in time become circular tracks on a surface.92
What this means is that the image, in a traditional sense, no longer exists!
And it is only by habit that we still refer to what we see on the real-time screen as
"images." It is only because the scanning is fast enough and because, sometimes,
the referent remains static, that we see what looks like a static image. Yet, such an
image is no longer the norm, but the exception of a more general, new kind of
representation for which we don't have a term yet.
The principles and technology of radar were worked out independently by
scientists in the United States, England, France and Germany during the 1930s.
But, after the beginning of the War only the U.S. had the necessary resources to
continue radar development. In 1940, at MIT, a team of scientists was gathered to
work in the Radiation Laboratory or the "Rad Lab," as it came to be called. The
purpose of the lab was radar research and production. By 1943, the "Rad Lab"
occupied 115 acres of floor space; it had the largest telephone switchboard in
Cambridge and employed 4,000 people.93
Next to photography, radar provided a superior way to gather information
about enemy locations. In fact, it provided too much information, more
information than one person could deal with. Historical footage from the early
days of the war shows a central command room with a large, table-size map of
Britain.94 Small pieces of cardboard in the form of planes are positioned on the
map to show the locations of actual German bombers. A few senior officers
scrutinize the map. Meanwhile, women in army uniforms constantly change the
location of the cardboard pieces by moving them with long sticks as information
is transmitted from dozens of radar stations.95
Was there a more effective way to process and display information
gathered by radar? The computer screen, as well as most other key principles and
technologies of modern human-computer interface — interactive control,
algorithms for 3D wireframe graphics, bit-mapped graphics — were developed as
a way of solving this problem.
The research again took place at MIT. The Radiation Laboratory was
dismantled after the end of the War, but soon the Air Force created another secret
laboratory in its place — Lincoln Laboratory. The purpose of Lincoln Laboratory
was to work on human factors and new display technologies for SAGE — "Semi-
Automatic Ground Environment," a command center to control the U.S. air
defenses established in the mid-1950s.96 Historian of technology Paul Edwards
writes that SAGE's job "was to link together radar installations around the USA's
perimeter, analyze and interpret their signals, and direct manned interceptor jets
toward the incoming bee. It was to be a total system, one whose ‘human
components' were fully integrated into the mechanized circuit of detection,
decision and response."97
The creation of SAGE and the development of interactive human-
computer interface was largely a result of a particular military doctrine. In the
1950s the American military thought that when the Soviet Union attacked the
U.S., it would send a large number of bombers simultaneously. Therefore, it
seemed necessary to create a center which could receive information from all U.S.
radar stations, track the large number of enemy bombers and coordinate the
counterattack. A computer screen and the other components of the modern
human-computer interface owe their existence to this particular military idea. (As
somebody who was born in the Soviet Union and now works on the history of new
media in the U.S., I find this bit of history truly fascinating.)
The earlier version of the center was called the Cape Cod network since it
received information from the radars situated along the coast of New England.
The center was operating right out of the Barta Building situated on the MIT
campus. Each of 82 Air Force officers monitored his own computer display which
showed the outlines of the New England Coast and the locations of key radars.
Whenever an officer noticed a dot indicating a moving plane, he would tell the
computer to follow the plane. To do this the officer simply had to touch the dot
with the special "light pen."98
Thus, the SAGE system contained all of the main elements of the modern
human-computer interface. The light pen, designed in 1949, can be considered a
precursor of the contemporary mouse. More importantly, at SAGE the screen
came to be used not only to display information in real time, as it was in radar
and television, but also to give commands to the computer. Rather than acting
solely as a means to display an image of reality, the screen became the vehicle for
directly affecting reality.
Using the technology developed for SAGE, Lincoln researchers created a
number of computer graphics programs that relied on the screen as a means to
input and output information from a computer. They included programs to display
brain waves (1957), simulate planet and gravitational activity (1960), as well as to
create 2D drawings (1958).99 The single most well known of these became a
program called Sketchpad. Designed in 1962 by Ivan Sutherland, a graduate
student supervised by Claude Shannon, it widely publicized the idea of interactive
computer graphics. With Sketchpad, a human operator could create graphics
directly on the computer screen by touching the screen with a light pen. Sketchpad
exemplified a new paradigm of interacting with computers: by changing
something on the screen, the operator changed something in the computer's
memory. The real-time screen became interactive.
This, in short, is the history of the birth of the computer screen. But even
before a computer screen became widely used, a new paradigm emerged — the
simulation of an interactive three-dimensional environment without a screen. In
1966, Ivan Sutherland and his colleagues began research on the prototype of VR.
The work was cosponsored by ARPA (Advanced Research Projects Agency) and
the Office of Naval Research.100
"The fundamental idea behind the three-dimensional display is to present
the user with a perspective image which changes as he moves," wrote Sutherland
in 1968.101 The computer tracked the position of the viewer's head and adjusted
the perspective of the computer graphic image accordingly. The display itself
consisted of two six-inch-long monitors which were mounted next to the temples.
They projected an image which appeared superimposed over the viewer's field of
vision.
The screen disappeared. It completely took over the visual field.
The Screen and the Body
I have presented one possible genealogy of the modern computer screen. In my
genealogy, the computer screen represents an interactive type, a subtype of the
real-time type, which is a subtype of the dynamic type, which is a subtype of the
classical type. The discussion of these types relied on two ideas. First, the idea of
temporality: the classical screen displays a static, permanent image; the dynamic
screen displays a moving image of the past and finally, the real-time screen shows
the present. Second, the relationship between the space of the viewer and the
space of the representation (I defined the screen as a window into the space of
representation which itself exists in our normal space).
Let us now look at the screen's history from another angle — the
relationship between the screen and the body of the viewer. This is how Roland
Barthes described the screen in "Diderot, Brecht, Eisenstein," written in 1973:
Representation is not defined directly by imitation: even if one gets rid of
notions of the "real," of the "vraisemblable," of the "copy," there will still
be representation for as long as a subject (author, reader, spectator or
voyeur) casts his gaze towards a horizon on which he cuts out a base of a
triangle, his eye (or his mind) forming the apex. The "Organon of
Representation" (which is today becoming possible to write because there
are intimations of something else) will have as its dual foundation the
sovereignty of the act of cutting out [découpage] and the unity of the
subject of action... The scene, the picture, the shot, the cut-out rectangle,
here we have the very condition that allows us to conceive theater,
painting, cinema, literature, all those arts, that is, other than music and
which could be called dioptric arts.102
For Barthes, the screen becomes the all-encompassing concept which covers the
functioning of even non-visual representation (literature), although he does make
an appeal to a particular visual model of linear perspective. At any rate, his
concept encompasses all types of representational apparatuses I have discussed:
painting, film, television, radar and computer display. In each of these, reality is
cut by the rectangle of a screen: "a pure cut-out segment with clearly defined
edges, irreversible and incorruptible; everything that surrounds it is banished into
nothingness, remains unnamed, while everything that it admits within its field is
promoted into essence, into light, into view."103 This act of cutting reality into a
sign and nothingness simultaneously doubles the viewing subject who now exists
in two spaces: the familiar physical space of his/her real body and the virtual
space of an image within the screen. This split comes to the surface with VR, but
it already exists with painting and other dioptric arts.
What is the price the subject pays for the mastery of the world, focused
and unified by the screen?
The Draughtsman's Contract, a 1982 film by Peter Greenaway, concerns an
architectural draftsman hired to produce a set of drawings of a country house. The
draughtsman employs a simple drawing tool consisting of a square grid.
Throughout the film, we repeatedly see the draughtsman's face through the grid
which looks like prison bars. It is as if the subject who attempts to catch the
world, to immobilize it, to fix it within the representational apparatus (here,
perspectival drawing), is trapped by this apparatus himself. The subject is
imprisoned.
I take this image as a metaphor for what appears to be a general tendency
of the Western screen-based representational apparatus. In this tradition, the body
must be fixed in space if the viewer is to see the image at all. From Renaissance
monocular perspective to modern cinema, from Kepler's camera obscura to
nineteenth century camera lucida, the body had to remain still.104
The imprisonment of the body takes place on both the conceptual and
literal levels; both kinds of imprisonment already appear with the first screen
apparatus, Alberti's perspectival window. According to many interpreters of linear
perspective, it presents the world as seen by a singular eye, static, unblinking and
fixated. As described by Norman Bryson, perspective "followed the logic of the
Gaze rather than the Glance, thus producing a visual take that was eternalized,
reduced to one 'point of view' and disembodied."105 Bryson argues that "the gaze
of the painter arrests the flux of phenomena, contemplates the visual field from a
vantage-point outside the mobility of duration, in an eternal moment of disclosed
presence."106 Correspondingly, the world, as seen by this immobile, static and
atemporal Gaze, which belongs more to a statue than to a living body, becomes
equally immobile, reified, fixated, cold and dead. Writing about Dürer's famous
print of a draftsman drawing a nude through a screen of perspectival threads,
Martin Jay notes that "a reifying male look" turns "its targets into stone";
consequently, "the marmoreal nude is drained of its capacity to arouse desire."107
Similarly, John Berger compares Alberti's window to "a safe let into a wall, a safe
into which the visible has been deposited."108 And in The Draughtsman's
Contract, time and again, the draughtsman tries to eliminate all motion, any sign
of life from the scenes he is rendering.
With perspectival machines, the imprisonment of the subject also happens
in a literal sense. From the onset of the adaptation of perspective, artists and
draftsmen have attempted to aid the laborious manual process of creating
perspectival images and, between the sixteenth and nineteenth centuries, various
"perspectival machines" were constructed.109 By the first decades of the sixteenth
century, Dürer described a number of such machines.110 Many varieties were
invented, but regardless of the type, the artist had to remain immobile throughout
the process of drawing.
Along with perspectival machines, a whole range of optical apparatuses
was in use, particularly for depicting landscapes and conducting topographical
surveys. The most popular optical apparatus was camera obscura.111 Camera
obscura literally means "dark chamber." It was founded on the premise that if the
rays of light from an object or a scene pass through a small aperture, they will
cross and re-emerge on the other side to form an image on a screen. In order for
the image to become visible, however, "it is necessary that the screen be placed in
a chamber in which light levels are considerably lower than those around the
object."112 Thus, in one of the earliest depictions of the camera obscura, in
Kircher's Ars magna Lucis et umbrae (Rome, 1649), we see the subject enjoying
the image inside a tiny room, oblivious to the fact that he had to imprison himself
inside this "dark chamber" in order to see the image on the screen.
Later, smaller tent-type camera obscura became popular — a movable
prison, so to speak. It consisted of a small tent mounted on a tripod, with a
revolving reflector and lens at its apex. Having positioned himself inside the tent
which provided the necessary darkness, the draftsman would then spend hours
meticulously tracing the image projected by the lens.
Early photography continued the trend toward the imprisonment of the
subject and the object of representation. During photography's first decades, the
exposure times were quite long. The daguerreotype process, for instance, required
exposures of four to seven minutes in the sun and from 12 to 60 minutes in
diffused light. So, similar to the drawings produced with the help of camera
obscura, which depicted reality as static and immobile, early photographs
represented the world as stable, eternal, unshakable. And when photography
ventured to represent the living, such as the human subject, s/he had to be
immobilized. Thus, portrait studios universally employed various clamps to
assure the steadiness of the sitter throughout the lengthy time of exposure.
Reminiscent of the torture instruments, the iron clamps firmly held the subject in
place, the subject who voluntarily became the prisoner of the machine in order to
see her/his own image.113
Toward the end of the nineteenth century, the petrified world of the
photographic image was shattered by the dynamic screen of the cinema. In "The
Work of Art in the Age of Mechanical Reproduction," Walter Benjamin
expressed his fascination with the new mobility of the visible: “Our taverns and
our metropolitan streets, our offices and furnished rooms, our railroad stations and
our factories appeared to have us locked up hopelessly. Then came the film and
burst this prison-world asunder by the dynamite of the tenth of a second, so that
now, in the midst of its far-flung ruins and debris, we calmly and adventurously
go traveling.”114
The cinema screen enabled audiences to take a journey through different
spaces without leaving their seats; in the words of film historian Anne Friedberg,
it created "a mobilized virtual gaze."115 However, the cost of this virtual mobility
was a new, institutionalized immobility of the spectator. All around the world
large prisons were constructed which could hold hundreds of prisoners — movie
houses. The prisoners could neither talk to each other nor move from seat to
seat. While they were taken on virtual journeys, their bodies had to remain still in
the darkness of the collective camera obscuras.
The formation of this viewing regime took place in parallel with the shift
from what film theorists call "primitive" to "classical" film language.116 An
important part of the shift, which took place in the 1910s, was the new
functioning of the virtual space represented on the screen. During the "primitive"
period, the space of the film theater and the screen space were clearly separated
much like those of theater or vaudeville. The viewers were free to interact, come
and go, and maintain a psychological distance from the virtual world of the
cinematic narrative. In contrast, classical film addressed each viewer as a separate
individual and positioned her/him inside its virtual world narrative. As noted by a
contemporary in 1913, "they [spectators] should be put in the position of being a
'knot hole in the fence' at every stage in the play."117 If "primitive cinema keeps
the spectator looking across a void in a separate space,"118 now the spectator is
placed at the best viewpoint of each shot, inside the virtual space.
This situation is usually conceptualized in terms of the spectator's
identification with the camera eye. The body of the spectator remains in the seat
while her/his eye is coupled with a mobile camera. However, it is also possible to
conceptualize this differently. We can imagine that the camera does not, in fact,
move at all, that it remains stationary, coinciding with the spectator's eyes.
Instead, it is the virtual space as a whole that changes its position with each shot.
Using the contemporary vocabulary of computer graphics, we can say that this
virtual space is rotated, scaled and zoomed to always give the spectator the best
viewpoint. Like a striptease, the space slowly disrobes itself, turning, presenting
itself from different sides, teasing, stepping forward and retracting, always
leaving something covered, so the spectator will wait for the next shot ... the
seductive dance which begins all over with the new scene. All the spectator has to do
is remain immobile.
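In computer graphics terms the two descriptions are interchangeable: turning and moving the camera gives the same view as keeping the camera fixed and applying the opposite movement to the whole scene. The following is only a minimal sketch of that equivalence, assuming a flat two-dimensional scene; the function names are hypothetical and do not come from any particular graphics package.

```python
import math

def view_from_camera(point, cam_pos, cam_angle):
    """Describe a scene point in the coordinates of a camera that has moved.

    The camera sits at cam_pos and is turned by cam_angle (radians); the
    point is measured along the camera's own 'right' and 'up' axes.
    """
    dx, dy = point[0] - cam_pos[0], point[1] - cam_pos[1]
    right = (math.cos(cam_angle), math.sin(cam_angle))
    up = (-math.sin(cam_angle), math.cos(cam_angle))
    return (dx * right[0] + dy * right[1], dx * up[0] + dy * up[1])

def view_by_moving_world(point, cam_pos, cam_angle):
    """Keep the camera fixed at the origin and move the scene instead.

    The scene is shifted by -cam_pos and then rotated by -cam_angle.
    """
    x, y = point[0] - cam_pos[0], point[1] - cam_pos[1]
    c, s = math.cos(-cam_angle), math.sin(-cam_angle)
    return (c * x - s * y, s * x + c * y)

# Hypothetical scene point and camera placement, chosen only for illustration.
seen_by_moving_camera = view_from_camera((2.0, 1.0), (1.0, 0.0), math.pi / 2)
seen_by_moving_world = view_by_moving_world((2.0, 1.0), (1.0, 0.0), math.pi / 2)
```

Both calls describe the same point on the spectator's side of the screen, which is why the choice between "the camera moves" and "the space moves" is a matter of description rather than of the image itself.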
Film theorists have taken this immobility to be the essential feature of the
institution of cinema. Anne Friedberg wrote: "As everyone from Baudry (who
compares cinematic spectation to the prisoners in Plato's cave) to Musser points
out, the cinema relies on the immobility of the spectator, seated in an
auditorium."119 Film theoretician Jean-Louis Baudry has probably more than
anyone put the emphasis on immobility as the foundation of cinematic illusion.
Baudry quoted Plato: "In this underground chamber they have been from
childhood, chained by the leg and also by the neck, so that they cannot move and
can only see what is in front of them, because the chains will not let them turn
their heads."120 This immobility and confinement, according to Baudry, enables
prisoners/spectators to mistake representations for their perceptions, regressing
back to childhood when the two were indistinguishable. Thus, rather than a
historical accident, according to Baudry's psychoanalytic explanation, the
immobility of the spectator is the essential condition of cinematic pleasure.
Alberti's window, Dürer's perspectival machines, camera obscura,
photography, cinema — in all of these screen-based apparatuses, the subject had
to remain immobile. In fact, as Friedberg perceptively points out, the progressive
mobilization of the image in modernity was accompanied by the progressive
imprisonment of the viewer: "as the 'mobility' of the gaze became more 'virtual'
— as techniques were developed to paint (and then to photograph) realistic
images, as mobility was implied by changes in lighting (and then
cinematography) — the observer became more immobile, passive, ready to
receive the constructions of a virtual reality placed in front of his or her unmoving
body."121
What happens to this tradition with the arrival of a screen-less
representational apparatus — VR? On the one hand, VR does constitute a
fundamental break with this tradition. It establishes a radically new type of
relationship between the body of a viewer and an image. In contrast to cinema,
where the mobile camera moves independent of the immobile spectator, now the
spectator has to actually move around the physical space in order to experience
the movement in virtual space. The effect is as though the camera is mounted on
the user's head. So, in order to look up in virtual space, one has to look up in physical
space; in order to "virtually" step forward one has to actually step forward and so
on.122 The spectator is no longer chained, immobilized, anesthetized by the
apparatus which serves him the ready-made images; now s/he has to work, to
speak, in order to see.
At the same time, VR imprisons the body to an unprecedented extent.
This can be seen clearly with the earliest VR system designed by
Sutherland and his colleagues in the 1960s which I already mentioned above.
According to Howard Rheingold's history of VR, "Sutherland was the first to
propose mounting small computer screens in binocular glasses — far from an
easy hardware task in the early 1960s — and thus immerse the user's point of
view inside the computer graphic world."123 Rheingold further wrote:
In order to change the appearance of the computer-generated graphics
when the user moves, some kind of gaze-tracking tool is needed. Because
the direction of the user's gaze was most economically and accurately
measured at that time by means of a mechanical apparatus, and because
the HMD [head-mounted display] itself was so heavy, the users of
Sutherland's early HMD systems found their head locked into machinery
suspended from the ceiling. The user put his or her head into a metal
contraption that was known as the 'Sword of Damocles' display.124
A pair of tubes connected the display to tracks in the ceiling, "thus making
the user a captive of the machine in a physical sense."125 The user was able to
turn around and rotate her/his head in any direction but s/he could not move away
from the machine more than a few steps. Like today's computer mouse, the body
was tied to the computer. In fact, the body was reduced to nothing else, and
nothing more, than a giant mouse, or, more precisely, a giant joystick. Instead
of moving a mouse, the user had to turn her/his own body. Another comparison
which comes to mind is the apparatus built in the late nineteenth century by
Etienne-Jules Marey to measure the frequency of the wing movements of a bird.
The bird was connected to the measuring equipment by wires which were long
enough to enable it to flap its wings in midair but not fly anywhere.126
The paradox of VR, which requires the viewer to physically move in order to
see an image (as opposed to remaining immobile) and at the same time physically
ties her/him to a machine, is interestingly dramatized in a "cybersex" scene in the
movie Lawnmower Man (Brett Leonard, 1992). In the scene, the heroes, a man
and a woman, are situated in the same room, each fastened to a separate circular
frame which allows the body to freely rotate 360 degrees in all directions. During
"cybersex" the camera cuts back and forth between the virtual space (i.e., what the
heroes see and experience) and the physical space. In the virtual world
represented with psychedelic computer graphics, their bodies melt and morph
together disregarding all the laws of physics, while in the real world each of them
simply rotates within his/her own frame.
The paradox reaches its extreme in one of the longest-standing VR
projects — the Super Cockpit developed by the U.S. Air Force in the 1980s.127
Instead of using his own eyes to follow both the terrain outside of his plane and
the dozens of instrument panels inside the cockpit, the pilot wears a head-
mounted display that presents both kinds of information in a more efficient way.
What follows is a description of the system from Air & Space magazine:
When he climbed into his F16C, the young fighter jock of 1998 simply
plugged in his helmet and flipped down his visor to activate his Super
Cockpit system. The virtual world he saw exactly mimicked the world
outside. Salient terrain features were outlined and rendered in three
dimensions by the two tiny cathode ray tubes focused at his personal
viewing distance...His compass heading was displayed as a large band of
numbers on the horizon line, his projected flight path a shimmering
highway leading out toward infinity.128
If in most screen-based representations (painting, cinema, video) as well as in
typical VR applications the physical and the virtual worlds have nothing to do
with each other, here the virtual world is precisely synchronized to the physical
one. The pilot positions himself in the virtual world in order to move through the
physical one at a supersonic speed with his representational apparatus which is
securely fastened to his body, more securely than ever before in the history of the
screen.
Representation versus Simulation
In summary, VR continues the screen's tradition of viewer immobility by
fastening the body to a machine, while at the same time it creates an
unprecedented new condition, requiring the viewer to move. We may ask whether
this new condition is without an historical precedent or whether it fits within some
other alternative representational tradition which encourages the movement of the
viewer?
I began my discussion of the screen by emphasizing that a screen's frame
separates two spaces, the physical and the virtual, which have different scales.
Although this condition does not necessarily lead to the immobilization of the
spectator, it does discourage any movement on her part: why move when she can't
enter the represented virtual space anyway? This was very well dramatized in
Alice in Wonderland when Alice struggles to become just the right size in order to
enter the other world.
The alternative tradition of which VR is a part can be found whenever the
scale of a representation is the same as the scale of our human world so that the
two spaces are continuous. This is the tradition of simulation rather than that of
representation bound up to a screen. The simulation tradition aims to blend virtual
and physical spaces, rather than to separate them. Therefore, the two spaces have
the same scale; their boundary is de-emphasized (rather than being marked by a
rectangular frame, as in the representation tradition); and the spectator is free to move
around the physical space.
To further analyze the different logic of the simulation and the
representation traditions we may compare their typical representatives: frescoes
and mosaics, on the one hand, and the Renaissance painting, on the other. The
former create an illusionary space that starts behind the surface of an image.
Importantly, frescoes, mosaics, and also wall paintings are inseparable from the
architecture. In other words, they cannot be moved anywhere. In contrast, a modern painting,
which first makes its appearance during the Renaissance, is essentially mobile.
Separate from a wall, it can be transported anywhere. (It is tempting to connect
this new mobility of a representation with the tendency of capitalism to make all
signs as mobile as possible. I will come back to this idea in the “Teleaction” section
of the next chapter.)
But, at the same time, an interesting reversal takes place. The interaction
with a fresco or a mosaic, which itself can't be moved, does not assume
immobility on the part of the spectator, while the mobile Renaissance painting
does presuppose such immobility. It is as though the imprisonment of the
spectator is the price for the new mobility of the image. This reversal is consistent
with the different logic of the representation and simulation traditions. Since a fresco
or a mosaic is “hardwired” to its architectural setting, the artist can
create continuity between the virtual and the physical space. In contrast, the
painting can be put in an arbitrary setting and therefore such continuity can no
longer be guaranteed. Responding to this new condition, a painting presents a
virtual space which is clearly distinct from the physical space where the painting
and the spectator are located. At the same time, it imprisons the spectator
through the perspective model or other techniques, so she and the painting form one
system. Therefore if in the simulation tradition the spectator exists in a single
coherent space — the physical space and the virtual space which continues it —
in the representational tradition the spectator has a double identity. She
simultaneously exists in the physical space and in the space of the representation.
This split of the subject is the tradeoff for the new mobility of an image as well as for
the newly available possibility to represent any arbitrary space, rather than having
to simulate the physical space where an image is located.
While the representational tradition comes to dominate post-Renaissance
culture, the simulation tradition does not disappear. In fact, the nineteenth
century, with its obsession with naturalism, pushes simulation to the extreme with
the wax museum and the dioramas of natural history museums. Another example
of the simulation tradition is a sculpture on a human scale, for instance, Auguste
Rodin's "The Burghers of Calais." We think of such sculptures as part of post-
Renaissance humanism which puts the human at the center of the universe, when
in fact, they are aliens, black holes within our world into another parallel universe,
the petrified universe of marble or stone, which exists in parallel to our own
world.
VR continues the tradition of simulation. However, it introduces one
important difference. Previously, the simulation depicted a fake space which was
continuous with and extended from the normal space. For instance, a wall
painting created a pseudo landscape which appeared to begin at the wall. In VR,
either there is no connection between the two spaces (for instance, I am in a
physical room while the virtual space is one of an underwater landscape) or, on
the contrary, the two completely coincide (i.e., the Super Cockpit project). In
either case, the actual physical reality is disregarded, dismissed, abandoned.
In this respect, the nineteenth-century panorama can be thought of as a
transitional form from classical simulations (wall paintings, human size sculpture,
diorama) toward VR. Like VR, panorama creates a 360 degree space. The viewers
are situated in the center of this space and they are encouraged to move around
the central viewing area in order to see different parts of the panorama.129 But in
contrast to wall paintings and mosaics which, after all, acted as decorations of a
real space, the physical space of action, now this physical space is subordinate to
the virtual space. In other words, the central viewing area is conceived as a
continuation of fake space, rather than vice versa as before — and this is why it is
usually empty. It is empty so that we can pretend that it continues the battlefield,
or a view of Paris or whatever else the panorama represents.130 From here we are
one step away from VR where the physical space is totally disregarded and all the
"real" actions take place in virtual space. The screen disappeared because what
was behind it simply took over.
And what about the immobilization of the body in VR which connects it
to the screen tradition? Dramatic as it is, this immobilization probably represents
the last act in the long history of the body's imprisonment. All around us are the
signs of increasing mobility and the miniaturization of communication devices —
mobile telephones and electronic organizers; pagers and laptops; phones and
watches which offer Web surfing; Gameboy and similar hand held game units.
Eventually VR apparatus may be reduced to a chip implanted in a retina and
connected by wireless transmission to the Net. From that moment on, we will
carry our prisons with us — not in order to blissfully confuse representations and
perceptions (as in cinema), but to always "be in touch," always connected, always
"plugged-in." The retina and the screen will merge.
This futuristic scenario may never become a reality. For now, we clearly
live in the society of a screen. The screens are everywhere: the screens of airline
agents, data entry clerks, secretaries, engineers, doctors, pilots, etc.; the screens of
ATM machines, supermarket checkouts, automobile control panels, and, of
course, the screens of computers. Rather than disappearing, the screen threatens to
take over our offices, cities and homes. Both computer and television monitors are
getting bigger and flatter; eventually to become wall-size. Architects such as Rem
Koolhaas design “Blade Runner”-like buildings where the whole façade is turned
into a giant screen.131
Dynamic, real-time and interactive, a screen is still a screen. Interactivity,
simulation, and telepresence: like centuries ago, we are still looking at a flat
rectangular surface, existing in the space of our body and acting as a window into
another space. Whatever new era we may be entering today, we still have not left
the era of a screen.
III. The Operations
Just as there is no “innocent eye,” there is no “pure computer.” A traditional artist
perceives the world through the filters of already existing cultural codes,
languages and representational schemes. Similarly, a new media designer or a
user approaches the computer through a number of cultural filters. The preceding
chapter discussed some of these filters. Human-computer interface models the
world in distinct ways; it also imposes its own logic on digital data. Existing
cultural forms such as printed word and cinema bring their own powerful
conventions of organizing information. These forms further interact with the
human-computer interface conventions to create what I called cultural interfaces
— new sets of conventions used to organize cultural data. Finally, such constructs
as screen (and the corresponding representation tradition along with its
counterpart, the simulation tradition) contribute an additional layer of conventions.
The metaphor of a series of filters assumes that at each stage, from bare-
bones digital data to particular media applications, the creative possibilities are
being further restricted. It is important therefore to note that each of these stages
can be also seen as progressively more enabling. That is, although the
programmer who would directly deal with binary values stored in memory would
be as “close to the machine” as possible, it would also take forever to get the
computer to do anything. Indeed, the history of software is one of increasing
abstraction. By removing the programmer and the user further from the machine,
software allows them to accomplish more faster - or, to use the early slogan of
Apple, Inc., “the power to be your best.” From machine language programmers
moved to Assembler, from there — to high level languages such as COBOL,
FORTRAN and C, as well as very high level languages designed for
programming in a particular area, such as Macromedia Director’s LINGO and
HTML. The use of computers to author media developed along similar lines. If
the few artists working with computers in the 1960s and 1970s had to write their
own programs in high-level computer languages, beginning with the Macintosh
most artists, designers and occasional users came to use menu-based software
applications: image editors, paint and layout programs, Web editors. And while
each of these programs comes with its built-in commands, default values,
metaphors and interface conventions which strongly influence what gets produced with
their help, the evolution of software towards higher and higher levels of
abstraction is fully compatible with the overall trajectory which governs
computers’ development and use: automation.
In this chapter I will take the next step in describing the language of new
media. I started by analyzing the properties of computer data (Chapter 1), and
then looked at the human-computer interface (Chapter 2). Continuing this bottom-up
movement, this chapter takes up the layer of technology which runs on top of
the interface: application software. Software programs enable new media
designers and artists to create new media objects — and at the same time they act
as yet another filter which shapes their imagination of what is possible to do with
a computer. Similarly, software used by end users to access these objects, such as
Web browsers, image viewers or media players, shape their understanding of
what new media is. For example, digital media players such as Windows 98 Media
Player or RealPlayer emulate the interfaces of linear media machines such as a
VCR. They provide such commands as play, stop, eject, rewind and fast forward.
In this way, they make new media simulate old media, hiding its new properties
such as random access.
Rather than analyzing particular software programs, I will address more
general techniques, or commands, which are common to many of them.
Regardless of whether a new media designer is working with quantitative data,
text, images, video, 3D space or their combinations, she employs the same
techniques: copy, cut, paste, search, composite, transform, filter. The existence of
such techniques which are not media specific is another consequence of media’s
status as computer data. I will call these typical techniques of working with
computer media operations. This chapter will discuss three examples of
operations: selection, compositing, and teleaction.
While the operations are embedded in software, they are not tied to it.
They are employed not only within a computer but also in the social world outside
of it. They are not only ways of working with
computer data but also general ways of working, ways of thinking, and ways of
existing in a computer age.
The communication between the larger social world and software use and
design is a two way process. As we work with software and use the operations
embedded in it, these operations become part of how we understand ourselves,
others and the world. The strategies of working with computer data become our
general cognitive strategies. At the same time, the design of software and the
human-computer interface reflects a larger social logic, ideology, and imaginary
of the contemporary society. So if we find particular operations dominating
software programs, we may also expect to find them at work in culture at large. In
discussing the three operations of selecting, compositing and teleaction in the
sections of this chapter I will illustrate this general thesis with particular
examples. Other examples of operations which are embedded in software and
hardware and also can be found at work in contemporary culture at large are
sampling and morphing.132
As I already noted in the “Interface” chapter, one of the differences between
industrial and information society is that in the latter both work and leisure often
involve the use of the same computer interfaces. This new, closer
relationship between work and leisure is complemented by a closer
relationship between authors and readers (or, more generally, between producers of
cultural objects and their users). This does not mean that new media completely
collapses the difference between producers and users, or that every new media
text exemplifies Roland Barthes’ concept of “readerly text.” Rather, as we shift
from industrial society to information society, from old media to new media, the
overlapping between producers and users becomes much larger. This holds for
the software the two groups use, their respective skills and expertise, the structure of
typical media objects, and the operations they perform on computer data.
While some software products are aimed at either professional producers or
end users, other software is used by both groups: Web browsers and search
engines, word processors, media editing applications such as Photoshop (the latter
routinely employed in post-production of Hollywood feature films) or
Dreamweaver. Further, the differences in functionality and pricing between
professional and amateur software are quite small (a few hundred dollars or less)
compared to the real gap between equipment and formats used by professionals
and amateurs before new media. For instance, the differences between 35mm and
8mm film equipment and cost of production, or between professional video
(formats such as D-1 and Beta SP; editing decks, switchers, DVE, and other
editing hardware) and amateur video (VHS) were in the hundreds of thousands of
dollars. Similarly, the gap in skills between professionals and amateurs also got
smaller. For instance, while employing Java or DHTML for Web design in the
late 1990s was the domain of professionals, many Web users were also able to
create a basic Web page using such programs as FrontPage, HomePage or Word.
At the same time, new media does not change the nature of the
professional-amateur relationship. The gap became much smaller but it still exists. And it will
always exist, systematically maintained by the professional producers themselves
in order to survive. With photography, film and video, this gap involved three key
areas: technology, skills, and aesthetics.133 With new media, a new area has
emerged. As the “professional” technology becomes accessible to amateurs, the
new media professionals create new standards, formats and design expectations to
maintain their status. Thus, the continuous introduction of new Web design
“features” along with the techniques to create them following the public debut of
HTML around 1993 (rollover buttons and pull-down menus, DHTML and
XML, JavaScript scripts and Java applets) can be in part explained as the
strategy employed by the professionals to keep themselves ahead of home users.
On the level of new media products, the overlapping between the
producers and the users can be illustrated by computer games. As I will discuss in
more detail in the “Navigable Space” section, game companies often release so-called
“level editors,” the special software to allow the players to create their own game
environments for the game they purchased. Other software to add or modify
games is released by third parties or written by game fans themselves. This
phenomenon is referred to as “game patching.” As described by the writer and
curator Anne-Marie Schleiner, “game patches, (or game add-ons, mods, levels,
maps or wads), refer to the alterations of preexisting game source code in terms of
graphics, game characters, architecture, sound and game play. Game patching in
the 1990s has evolved into a kind of popular hacker art form with numerous
shareware editors available on the Internet for modifying most games.”134
Every commercial game is also expected to have an extensive “options”
area where the player can customize various aspects of the game. Thus, a game
player becomes somewhat of a game designer, although her creativity involves
not making something from scratch but selecting combinations of different
options. I will discuss this concept of creativity as selection in more detail in
the “Menus, Filters, Plug-ins” section.
While some operations are the domain of new media professionals, and
other operations are the domain of end users, the two groups also employ some of
the same operations. The examples are copy, cut and paste, sort, search, filter,
transcode, rip. The operations discussed in this chapter exemplify these three
kinds. “Selection” is the operation employed by both professional designers and
end users. “Compositing” is used exclusively by
designers. The third operation, “teleaction,” is an example of an operation typically
used by users.
Although this chapter focuses on new media operations, the concept of an
operation can be used in relation to other technologically-based cultural practices.
We can connect it to other more familiar terms such as “procedure,” “practice” or
“method.” At the same time, it would be a mistake to reduce the concept of an
operation to such concepts as “tool” or “medium.” In fact, one of the assumptions
underlying this book is that these traditional concepts do not work very well in
relation to new media, and that we need new concepts such as an interface and
operations. On the one hand, operations are usually in part automated, the way
traditional tools were not. On the other hand, like computer algorithms, they can
be written down as series of steps, i.e. they exist as concepts before being
materialized in hardware and software. In fact, most new media operations,
from morphing to texture mapping, from searching and matching to hyperlinking,
begin as algorithms published in computer science papers; eventually these
algorithms become commands of standard software applications. So, for instance,
when the user applies a particular Photoshop filter to an image, the main
Photoshop program invokes a separate program which corresponds to this filter.
The program reads in the pixel values, performs some actions on them, and writes
modified values to the screen.
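The read, modify, write cycle described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Photoshop's actual plug-in mechanism: the image is assumed to be a small grid of 8-bit gray values, and the names apply_filter and invert are hypothetical.

```python
def invert(value):
    """One possible action on a single pixel: invert an 8-bit gray value."""
    return 255 - value

def apply_filter(pixels, action):
    """Read every pixel value, apply the action to it, and return the new values.

    The original grid is left untouched; the caller decides where the
    modified values are written (the screen, a file, another filter).
    """
    return [[action(value) for value in row] for row in pixels]

# A hypothetical 2x2 grayscale image used only for illustration.
image = [[0, 64], [128, 255]]
filtered = apply_filter(image, invert)
```

Because the action is passed in as a separate function, the same apply_filter routine can run any number of different filters, which is the sense in which a filter corresponds to a separate program.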
Thus, operations should be seen as another case of the more general
principle of new media — transcoding. Encoded in algorithms and implemented
as software commands, operations exist independently from the media data to
which they can be applied. The separation between algorithms and data in
programming becomes the separation between operations and media data.
As an example of the operations in other areas of culture, consider
the architectural practice of Peter Eisenman. His projects use different operations
provided by CAD programs as the basis of the design of a building’s exterior and/or
interior form. Eisenman systematically utilized the full range of computer
operations available: extrusion, twisting, extension, displacement, morphing,
warping, shifting, scaling, rotation, and so on.135
Another example is provided by the clothing designs of Issey Miyake. Each of
his designs is a result of a particular conceptual procedure, translated into a
technological process.136 For instance, Just Before (Spring/Summer 1998
collection) is a gigantic roll of identical dresses with suggested lines of
demarcation already incorporated into the fabric. An individual dress can be cut
out from the roll in a variety of possible ways. Dunes (Spring/Summer 1998
collection) is based on the operation of shrinking. A model is cut two times larger
than its final size; next patches and pieces of tape are fitted in the key places;
finally it is shrunk down to size by dipping it into a special solution. This creates a
particular wrinkled texture except in the places protected by patches and tapes.
Dunes exemplifies an important feature of operations: they can be
combined together in a sequence. The (new media) designer can manipulate the
resulting script, removing and adding new operations. This script exists separately
from the data to which it can be applied. Thus, the script of Dunes consists of
cutting the model; applying patches and tapes to key areas; and shrinking. It can
be applied to different designs and fabrics. With new media software, designers and
users have even more flexibility. New filters can be “plugged into” the program,
extending the range of operations available. The script can be edited using special
scripting languages. It can be also saved and later applied to a different object.
The designers and users can automatically apply the script to a number of objects
and even instruct the computer to automatically invoke the script at a particular
time or if particular condition as occurred. The example of the former are backup
or disk defragmenter programs often designated to start at a particular time at
night. The example of the later is filtering email messages in email programs such
as Eudora or Microsoft Outlook. While retrieving new email messages from the
server, the program can move email messages into a particular folder (or delete
them, or raise their priority, etc.) if the message header or address contain a
particular string.
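The separation between a script and the data it acts on can be illustrated with a minimal Python sketch. It treats a script as an ordered list of operations that can be extended and applied to any number of objects, and adds a conditional rule loosely modeled on the email filter described above. All names here (apply_script, route_message, the sample strings and folder names) are hypothetical and chosen only for illustration; they do not come from any program mentioned in the text.

```python
from typing import Callable, List

# An operation is any function that takes an object and returns a modified one.
Operation = Callable[[str], str]

def apply_script(script: List[Operation], obj: str) -> str:
    """Apply each operation of the script to the object, in order."""
    for operation in script:
        obj = operation(obj)
    return obj

# The script exists separately from the data it will be applied to.
script: List[Operation] = [str.strip, str.lower]

# A new operation can be "plugged in" without touching the existing ones.
script.append(lambda text: text.replace(" ", "_"))

# The same script is applied automatically to a number of objects.
results = [apply_script(script, item) for item in ["  Dunes ", " Just Before"]]

def route_message(header: str, needle: str, folder: str,
                  default: str = "Inbox") -> str:
    """Conditional invocation: choose `folder` if the header contains `needle`."""
    return folder if needle in header else default
```

Because the list of operations is itself ordinary data, it can be saved, edited, or reused on a different object without changing any single operation.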
Menus, Filters, Plug-ins
The Logic of Selection
Viewpoint Datalabs International sells thousands of 3D geometric models
widely used by computer animators and designers. Its catalog describes the
models as follows: "VP4370: Man, Extra Low Resolution. VP4369: Man, Low
Resolution. VP4752: Man, Muscular in Shorts and Tennis Shoe. VP5200. Man,
w/Beard, Boxer Shorts..."137 Adobe Photoshop 5.0 comes with more than 100
filters which allow the user to modify an image in numerous ways; After Effects
4.0, the standard for compositing moving images, is shipped with 80 effects plug-
ins; thousands more are available from third parties.138 Macromedia Director 7
comes with an extensive library of “behaviors” — ready-to-use pieces of
computer code.139 Softimage|3D (v3.8), the leading 3D modeling and animation
software, is shipped with over 400 textures which can be applied to 3D
objects.140 QuickTime 4 from Apple, a format for digital video, comes with 15
built-in filters and 13 built-in video transitions.141 The Geocities Web site, which
pioneered the concept of hosting users’ Web sites for free in exchange for adding
ad banners into users’ pages, gives users access to a collection of over 40,000
clip art images for customizing their sites.142 Index Stock Imagery offers 375,000
stock photos available for use in Web banner ads.143 The Microsoft Word 97 Web
Page Wizard lets the user create a simple Web page by selecting from eight
pre-determined styles described by such terms as “Elegant,” “Festive” and
“Professional.” Microsoft Chat 2.1 asks the user to specify her avatar by choosing
among twelve built-in cartoon characters. During the online session, the user can
further customize the selected character by interpolating between eight values
which represent eight fundamental emotions as defined by Microsoft
programmers.
These examples illustrate a new logic of computer culture. New media
objects are rarely created completely from scratch; usually they are assembled
from ready-made parts. Put differently, in computer culture authentic creation has
been replaced by selection from a menu. In the process of creating a new media
object, the designer selects from libraries of 3D models and texture maps, sounds
and behaviors, background images and buttons, filters and transitions. Every
authoring and editing program comes with such libraries. In addition, both
software manufacturers and third parties sell separate collections which work as
“plug-ins,” i.e. they appear as additional commands and ready-to-use media
elements under the software’s menus. The Web provides a further source of plug-ins
and media elements, with numerous collections available for free.
New media users are similarly asked to select from pre-defined menus of
choices when using software to create documents or access various Internet
services. Here are a few examples: selecting one of the pre-defined styles when
creating a Web page in Microsoft Word or a similar program; selecting one of the
“AutoLayouts” when creating a slide in PowerPoint; selecting one of the
pre-determined avatars on entering a multi-user virtual world such as Palace; selecting
one of the pre-determined viewpoints when navigating a VRML world. (An avatar is
a character or a graphic icon representing a user in a virtual world.)
All in all, selecting from a library or menu of pre-defined elements or
choices is one of the key operations for both professional producers of new media
and for the end users. This operation makes the production process more efficient for
the professionals; and it makes end users feel that they are not just consumers but
“authors” creating a new media object or experience. What are the historical
origins of this new cultural logic? How can we describe theoretically the
particular dynamics of standardization and invention that come with it? Is the
model of authorship it puts forward specific to new media, or can we already find
it at work in old media?
Art historian Ernst Gombrich and Roland Barthes, among others, critiqued
the romantic ideal of the artist creating totally from scratch, pulling images
directly from his imagination, or inventing new ways to see the world all
alone.144 According to Gombrich, the realist artist can only represent nature by
relying on already established “representational schemes”; the history of illusion
in art involves slow and subtle modifications of these schemes over many
generations of artists. In his famous essay “The Death of the Author,” Barthes
offered an even more radical criticism of the idea of an author as a solitary inventor
alone responsible for a work’s content. As Barthes puts it, "the Text is a tissue of
quotations drawn from the innumerable centers of culture."145 Yet, even though a
modern artist may be only reproducing, or, at best, combining in new ways
preexisting texts, idioms and schemas, the actual material process of art making
supports the romantic ideal. An artist operates like God creating the Universe —
she starts with an empty canvas or a blank page. Gradually filling in the details,
she brings a new world into existence.
Such a process of art making, manual and painstakingly slow, was
appropriate for the age of pre-industrial artisan culture. In the twentieth century,
as the rest of the culture moved to mass production and automation, literally
becoming a "culture industry" (the term of Theodor Adorno), the fine arts continued
to insist on their artisan model. Only in the 1910s, when some artists began to
assemble collages and montages from already existing cultural "parts," did the
industrial method of production enter the realm of art. Photomontage became
the most “pure” expression of this new method. By the early 1920s,
photomontage practitioners had already created (or rather, constructed) some of the
most remarkable images of modern art such as Cut with the Cake-Knife (Hannah
Höch, 1919), Metropolis (Paul Citroën, 1923), The Electrification of the Whole
Country (Gustav Klutsis, 1920), and Tatlin at Home (Raoul Hausmann, 1920), to
mention just a few examples. Yet, although photomontage became an established
practice of Dadaists, Surrealists, and Constructivists in the 1920s, and Pop artists
in the 1960s, the creation from scratch, as exemplified by painting and drawing,
remained the main operation of modern art.
In contrast, electronic art from its very beginning was based on a new
principle: modification of an already existing signal. The first electronic
instrument designed in 1920 by the Russian scientist and musician Leon
Theremin contained a generator producing a sine wave; the performer simply
modified its frequency and amplitude.146 In the 1960s video artists began to build
video synthesizers based on the same principle. The artist was no longer a
romantic genius generating a new world purely out of his imagination; he became
a technician turning a knob here, pressing a switch there, an accessory to the
machine.
Substitute a simple sine wave with a more complex signal (sounds, rhythms,
melodies); add a whole bank of signal generators and you have arrived at a
modern music synthesizer, the first instrument which embodies the logic of all
new media: selection from a menu of choices.
The first music synthesizers appeared in the 1950s, followed by video
synthesizers in the 1960s, followed by DVE (Digital Video Effects) in the late
1970s — the banks of effects used by video editors; followed by computer
software such as MacDraw (1984) that came with a repertoire of basic shapes. The
process of art making has finally caught up with modern times. It has become
synchronized with the rest of modern society where everything is assembled from
ready-made parts, from objects to people's identities. The modern subject
proceeds through life by selecting from numerous menus and catalogs of items —
be it assembling an outfit, decorating an apartment, choosing dishes from a
restaurant menu, or choosing which interest groups to join. With electronic and
digital media, art making similarly entails choosing from ready-made elements:
textures and icons supplied by a paint program; 3D models which come with a 3D
modeling program; melodies and rhythms built into a music synthesis program.
While previously the great text of culture from which the artist created
her or his own unique "tissue of quotations" was bubbling and shimmering
somewhere below consciousness, now it has become externalized (and greatly
reduced in the process) — 2D objects, 3D models, textures, transitions, effects
which are available as soon as the artist turns on the computer. The World Wide
Web takes this process to the next level: it encourages the creation of texts that
completely consist of pointers to other texts that are already on the Web. One
does not have to add any original writing; it is enough to select from what already
exists. Put differently, now anybody can become a creator by simply providing a
new menu, i.e. by making a new selection from the total corpus available.
The same logic applies to branching-type interactive new media objects.
In a branching-type interactive program, when the user reaches a particular
object, she can select which branch to follow next by clicking a button or a
part of an image, or by choosing from a menu. The visual result of making a
choice is that either the whole screen or part(s) of it change. A typical interactive
program of the 1980s and early 1990s was self-contained, i.e. it ran on a computer
which was not networked. In contrast to surfing the Web where it is very easy to
move from one site to another, the designers of self-contained programs could
expect undivided attention from a user. Therefore it was safe to change the whole
screen after a user made a selection. The effect was similar to turning pages in a
book. This book metaphor was promoted by the first popular hypermedia authoring
software — Apple’s HyperCard (1987); a good example of its use can be found in
the game Myst (Broderbund, 1993). Myst presents the player with still images
which fill the screen. When the player clicks on the left or right part of an
image, it is replaced by another image. (For more on navigation in Myst, see
“Digital Cinema” and “Navigable Space” sections below.) In the second part of
the 1990s, as most interactive documents migrated to the Web and simultaneously
became more complex, it became important to give all pages of the site a common
identity and also visually display the page’s position in relation to the site’s
branching-tree structure. Consequently, with the help of technologies such as
HTML Frames, Dynamic HTML and Flash, interactive designers established a
different convention. Now parts of the screen, which typically contain the company
logo, top-level menus, and the page’s path, remain constant while other parts change
dynamically. (Microsoft and Macromedia sites provide good examples of this
new convention.147) But regardless of whether making a selection leads the user
to a whole new screen or only changes part(s) of it, the user still navigates through
a branching structure consisting of pre-defined objects. While more complex
types of interactivity can be created via a computer program which controls
and modifies the media object at run time, the majority of interactive media uses
fixed branching tree structures.
It is often claimed that a user of a branching interactive program becomes
its co-author: by choosing a unique path through the elements of a work, she
supposedly creates a new work. But it is also possible to see the same process in a
different way. If a complete work is a sum of all possible paths through its
elements, then the user following a particular path only accesses a part of this
whole. In other words, the user is only activating a part of the total work that
already exists. Just as with the example of Web pages which consist of nothing
but links to other pages, here the user does not add new objects to a corpus,
but only selects a subset of it. This is a new type of authorship which corresponds
neither to the pre-modern (before Romanticism) idea of providing minor modifications
to the tradition nor to the modern idea (nineteenth and first part of the twentieth
centuries) of a creator-genius revolting against it. It does, however, fit perfectly
with the logic of advanced industrial and post-industrial societies, where almost
every practical act involves choosing from some menu, catalog, or database. In
fact, as I already noted when discussing interactivity in the “Principles of New
Media” section, new media is the best available expression of the logic of identity
in these societies: choosing values from a number of pre-defined menus.
How can a modern subject escape from this logic? In a society saturated
with brands and labels, people respond by adopting minimalist aesthetics and
hard-to-identify clothing styles. Writing about an empty loft as an expression of
the minimalist ideal, architecture critic Herbert Muschamp points out that people
“reject exposing the subjectivity when one piece of stuff is preferred to another.”
The opposition between the individualized inner world and the objective, shared,
neutral world outside becomes reversed:
The private living space has taken on the guise of objectivity: neutral,
value-free, as if this were a found space, not an impeccably designed one.
The world outside, meanwhile, has become subjectified, rendered into a
changing collage of personal whims and fancies. This is to be expected in a
culture dominated by the distribution system. That system exists, after all,
not to make things but to sell them, to appeal to individual impulses, tastes,
desires. As a result, the public realm has become a collective repository of
dreams and designs from which the self requires refuge.148
How can one accomplish a similar escape in new media? It can only be
accomplished by refusing all options and customization, and ultimately refusing
all forms of interactivity. Paradoxically, by following an interactive path one does
not construct a unique self but instead adopts already pre-established identities.
Similarly, choosing values from a menu or customizing one’s desktop or an
application automatically makes one participate in the “changing collage of personal
whims and fancies” mapped out and coded into software by the companies. Thus,
short of using the command-line interface of UNIX, which can be thought of as the
equivalent of a minimalist loft in the realm of computing, I would prefer using
Microsoft Windows exactly the way it was installed at the factory.
“Postmodernism” and Photoshop
As I noted in this chapter’s introduction, computer operations encode existing
cultural norms in their design. "The logic of selection" is a good example of this.
But what was a set of social and economic practices and conventions has now become
encoded in the software itself. The result is a new form of control, soft but
powerful. Although software does not directly prevent its users from creating
from scratch, its design on every level makes it "natural" to follow a different
logic: that of selection.
While computer software “naturalizes” the model of authorship as
selection from libraries of pre-defined objects, we can already find this model at
work with old media, such as magic lantern slide shows.149 As film historian
Charles Musser points out, in contrast to modern cinema, where authorship
extends from pre-production to post-production but does not cover exhibition
(i.e., the theatrical presentation of a film is completely standardized and does not
involve making creative decisions), in magic lantern slide shows the exhibition
was a highly creative act. The magic lantern exhibitor was in fact an artist who
skillfully arranged a presentation of slides which he bought from the distributors.
This is a perfect example of authorship as selection: an author puts together an
object from the elements which she herself did not create. The creative energy of
the author goes into selection and sequencing of elements, rather than into their
original design.
Although not all modern media arts follow this authorship model, the
technological logic of analog media strongly supports it. Stored using industrially
manufactured materials such as film stock or magnetic tape, media elements can
be more easily copied, isolated and assembled in new combinations. In addition,
various media manipulation machines, such as the tape recorder and the film splicer,
make the operations of selection and combination easier to perform. In parallel,
we witness the development of archives of various media which enable the
authors to draw on already existing media elements rather than always having to
record new elements themselves. For instance, in the 1930s German
photojournalist Dr. Otto Bettmann started what later became known as the Bettmann
Archive; at the time of its acquisition by Bill Gates’s Corbis Corporation in 1995
it contained 16 million photographs, including some of the most frequently used
images of this century. Similar archives were created for film and audio media.
Using “stock” photographs, movie clips and audio recordings became the standard
practice of modern media production.
To summarize: the practice of putting together a media object from
already existing and commercially distributed media elements already existed
with old media, but new media technology further standardizes it and makes it
much easier to perform. What before involved scissors and glue now involves
simply clicking on “cut” and “paste.” And, by encoding the operations of selection
and combination into the very interfaces of authoring and editing software, new
media “legitimizes” them. Pulling elements from databases and libraries becomes
the default; creating them from scratch becomes an exception. The Web acts as a
perfect materialization of this logic. It is one gigantic library of graphics,
photographs, video, audio, design layouts, software code and texts; and each and
every element is free since it can be saved to the user’s computer with a single mouse
click.
It is not accidental that the development of the GUI, which legitimized “cut
and paste” logic, as well as media manipulation software such as Photoshop,
which popularized plug-in architecture, took place during the 1980s — the same
decade when contemporary culture became “post-modern.” In evoking this term I
follow Fredric Jameson’s usage of post-modernism as “a periodizing concept whose
function is to correlate the emergence of new formal features in culture with the
emergence of a new type of social life and a new economic order.”150 As
became apparent by the early 1980s to critics such as Jameson, culture no longer
tried to “make it new.” Rather, endless recycling and quoting of past media
content, artistic styles and forms became the new “international style” and the
new cultural logic of modern society. Rather than assembling more media
recordings of reality, culture is now busy re-working, recombining and analyzing
the already accumulated media material. Invoking the metaphor of Plato’s cave,
Jameson writes that post-modern cultural production “can no longer look directly
out of its eyes at the real world but must, as in Plato’s cave, trace its mental images
of the world on its confining walls.”151 In my view, this new cultural condition
found its perfect reflection in the emerging computer software of the 1980s which
privileged the selection from already existing media elements over creating them
from scratch. And at the same time, to a large extent it is this software which made
post-modernism possible. The shift of all cultural production to first electronic
tools such as switchers and DVEs (1980s) and then to computer-based tools
(1990s) greatly eased the practice of relying on old media content in creating new
productions. It also made the media universe much more self-referential, because
when all media objects are designed, stored and distributed using a single
machine — computer — it becomes much easier to borrow elements from already
existing objects. Here again the Web became the perfect expression of this logic,
since new Web pages are routinely created by copying and modifying already
existing Web pages. This applies both to home users creating their home pages
and for professional Web, hypermedia, and game development companies.
From Object to Signal
Selecting ready-made elements which will become part of the content of a new
media object is only one aspect of the “logic of selection.” While working on the
object, the designer also typically selects and applies various filters and “effects.”
All these filters, be it manipulating image appearance, creating a transition
between moving images, or applying a filter to a piece of music, involve the
same principle: algorithmically modifying the existing media object or its parts.
Since computer media consist of samples which are represented in a computer
as numbers, a computer program can access every sample in turn and modify its
value according to some algorithm (see “Principles of New Media,” (2) and (3)).
Most image filters work in this way. For instance, to add noise to an image, a
program such as Photoshop reads in the image file pixel by pixel, adds a
randomly generated number to the value of each pixel, and writes out a new
image file. Programs can also work on more than one media object at once. For
instance, to blend two images together, a program reads in values of
corresponding pixels from the two images; it then calculates a new pixel value
based on the percentages of existing pixel values; this process is repeated for all
the pixels.
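The two procedures just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the implementation used by Photoshop or any other program: an image is assumed to be a grayscale grid stored as a list of rows of integers from 0 to 255, and the names add_noise, blend, amount and alpha are hypothetical.

```python
import random

def add_noise(image, amount, rng):
    """Visit every pixel in turn and add a random number to its value.

    The result is clamped to the 0..255 range and returned as a new image.
    """
    return [
        [min(255, max(0, pixel + rng.randint(-amount, amount))) for pixel in row]
        for row in image
    ]

def blend(image_a, image_b, alpha):
    """Combine corresponding pixels of two same-sized images.

    `alpha` is the share taken from image_a; the rest comes from image_b.
    """
    return [
        [round(alpha * a + (1 - alpha) * b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]

# A tiny 2 x 3 "image" is enough to show the principle.
original = [[0, 128, 255], [10, 20, 30]]
noisy = add_noise(original, 10, random.Random(0))
mixed = blend([[0, 100]], [[200, 100]], 0.25)
```

In both cases the filter is nothing more than a loop over numbers: the same loop works for any image, and changing the formula inside it yields a different filter.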
Although we can also find precursors to filter operations in old media (for
instance, hand colorization of silent film), they really come into their own with
the electronic media technologies. All electronic media technologies of the
nineteenth and twentieth centuries are based on modifying a signal by passing it
through various filters. These include technologies for real-time communication
such as the telephone; broadcasting technologies used for mass distribution of media
products such as radio and television; and technologies to synthesize media, such
as video and audio synthesizers which originate with the instrument designed by
Theremin in 1920.
In retrospect, the shift from a material object to a signal accomplished by
electronic technologies represents a fundamental conceptual step towards
computer media. In contrast to a permanent imprint in some material, a signal can
be modified in real time by passing it through some filter(s). Moreover, in
contrast to manual modifications of a material object, an electronic filter can
modify the signal all at once. Finally, and most importantly, all machines for
electronic media synthesis, recording, transmission and reception include controls
for signal modification. As a result, an electronic signal does not have a singular
identity — a particular state which is qualitatively different from all other possible
states. Consider, for example, the loudness control of a radio receiver or the brightness
control of an analog television set. They don’t have any privileged values. In
contrast to a material object, an electronic signal is essentially mutable.
This mutability of electronic media is just one step away from
the “variability” of new media (see the “Principles of New Media” section). As already
discussed, a new media object can exist in numerous versions. For instance, in
the case of a digital image, we can change its contrast and color, blur or sharpen
it, turn it into a 3D shape, use its values to control sound, and so on. But, to a
significant extent, an electronic signal is already characterized by similar
variability, because it can exist in numerous states. For example, in the case of a
sine wave, we can modify its amplitude or frequency; each modification produces
a new version of the original signal without affecting its structure. Therefore, in
essence, a television or radio signal is already new media. Put differently, in the
progression from a material object to an electronic signal to computer media the
first shift is more radical than the second. All that happens when we move from
analog electronics to digital computers is that the range of variations is greatly
expanded. This happens because, firstly, modern digital computers separate
hardware and software, and, secondly, because an object is now represented as
numbers, i.e. it becomes computer data which can be modified by software. In
short, a media object becomes “soft” — with all the implications contained in this
metaphor.
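The point about the sine wave can be made concrete with a short Python sketch. It assumes a digitally sampled wave (the analog signal itself is continuous), and the function name sine_wave and its parameters are hypothetical. Changing the amplitude or the frequency argument produces a new version of the signal, while the structure of the wave, a single formula, stays the same.

```python
import math

def sine_wave(amplitude, frequency, sample_rate=8000, n_samples=8):
    """Return n_samples values of a sine wave with the given parameters."""
    return [
        amplitude * math.sin(2 * math.pi * frequency * n / sample_rate)
        for n in range(n_samples)
    ]

# Three versions of the "same" signal, differing only in two numbers.
base = sine_wave(1.0, 1000.0)
louder = sine_wave(2.0, 1000.0)   # amplitude doubled
higher = sine_wave(1.0, 2000.0)   # frequency doubled
```

Here the two "knobs" of the electronic instrument have become two numeric arguments, which is the sense in which software greatly expands the range of possible variations.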
The experimental filmmaker Hollis Frampton, whose reputation rests on
his remarkable structural films and who, towards the end of his life, came to be
interested in computer media, seems to have already understood the fundamental
importance of the shift from a material object to an electronic signal.152 He wrote
in one of his essays:
Since the New Stone Age, all the arts have tended, through accident or
design, toward a certain fixity in their object. If Romanticism deferred
stabilizing the artifact, it nonetheless placed its trust, finally, in a
specialized dream of stasis: the 'assembly line' of the Industrial Revolution
was at first understood as responsive to copious imagination.
If the television assembly line has by now run riot (half a billion
people can watch a wedding as consequential as mine or yours) it has also
confuted itself in its own malleability.
We're all familiar with the parameters of expression: Hue,
Saturation, Brightness, Contrast. For the adventurous, there remain the
twin deities Vertical Hold and Horizontal Hold…and, for those aspiring to
the pinnacles, Fine Tuning.153
What Frampton calls the “malleability” of the television signal becomes the “variability” of
new media. While the analog television set allowed the viewer to modify the
signal on just a few dimensions such as brightness and hue, new media
technologies give the user much more control. A new media object can be
modified on numerous dimensions, and these modifications can be expressed
numerically. For instance, the user of a Web browser can instruct the
browser to skip all multimedia elements, tell it to enlarge the font size while
displaying a page, or completely substitute the original font with a different one.
The user can also re-shape the browser window to any size and proportions as
well as change the spatial and color resolution of the display itself. Further, a
designer can specify that different versions of the same Web site will be displayed
depending upon the bandwidth of the user’s connection and the resolution of her
display. For instance, a user accessing the site via a high-speed connection and a
high resolution screen will get a rich multimedia version while the user accessing
the same site via the small LCD display of a hand-held electronic device will receive just a
few lines of text. More radically, a number of completely different interfaces can
be constructed to the same data, from a database to a virtual environment. In
short, the new media object is something which can exist in numerous versions
and numerous incarnations.
To conclude this discussion of the selection operation, I would like to invoke
a particular cultural figure — a new kind of author for whom this operation is the
key. This author is the DJ, who creates music in real time by mixing already
existing music tracks and who is dependent on various electronic hardware
devices. In the 1990s the DJ acquired a new cultural prestige, becoming a required
presence at art openings and book release parties, in hip restaurants and hotels, in
the pages of Art Forum and Wired. The rise of this figure can be directly
correlated to the rise of computer culture. The DJ best demonstrates its new logic:
selection and combination of pre-existent elements. The DJ also demonstrates the true
potential of this logic to create new artistic forms. Finally, the example of the DJ
also makes it clear that selection by itself is not sufficient. The essence of the DJ’s
art is the ability to mix the selected elements together in rich and sophisticated ways.
In contrast to the “cut and paste” metaphor of the modern GUI, which suggests that selected
elements can be simply, almost mechanically combined, the practice of live
electronic music demonstrates that true art lies in the “mix.”
Compositing
From Image Streams to Modular Media
The movie Wag the Dog (Barry Levinson, 1997) contains a scene in which a
Washington spin doctor and a Hollywood producer are editing fake news
footage designed to win public support for a non-existent war. The footage
shows a girl, a cat in her arms, running through a destroyed village. If a few
decades earlier creating such a shot required staging and then filming the
whole thing on location, computer tools now make it possible to create it in real
time. Now, the only live element is the girl, played by a professional actress. The
actress is videotaped against a blue screen. The other two elements in the shot, the
destroyed village and the cat, come from a database of stock footage. Scanning
through the database, the producers try different versions of these elements; a
computer updates the composite scene in real time.
The logic of this shot is typical of the new media production process,
regardless of whether the object being put together is a video or film shot, as in
the Wag the Dog example; a 2D still image; a sound track; a 3D virtual environment;
or a computer game scene. In the course of production, some
elements are created specifically for the project; others are selected from
databases of stock material. Once all the elements are ready, they are composited
together into a single object. That is, they are fitted together and adjusted in
such a way that their separate identities become invisible. The fact that they come
from diverse sources and were created by different people at different times is hidden.
The result is a single seamless image, sound, space or scene.
As used in the new media field, the term digital compositing has a particular
and well-defined meaning. It refers to the process of combining a number of
moving image sequences and possibly stills into a single sequence with the help
of special compositing software such as After Effects (Adobe), Compositor
(Alias|Wavefront), or Cineon (Kodak). Compositing was formally defined in a
paper published in 1984 by two scientists working for Lucasfilm. In describing
compositing they make a significant analogy with computer programming:
Experience has taught us to break down large bodies of source code into
separate modules in order to save compilation time. An error in one
routine forces only the recompilation of its module and the relatively
quick reloading of the entire program. Similarly, small errors in coloration
or design in one object should not force “recompilation” of the entire
image.
Separating the image into elements which can be independently rendered saves
enormous time. Each element has an associated matte, coverage information
which designates the shape of the element. The compositing of those elements
makes use of the mattes to accumulate the final image.154
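The role of the matte in accumulating a final image can be illustrated with a simplified Python sketch. It is not the formulation given in the 1984 paper; it assumes grayscale elements stored as lists of rows of values between 0.0 and 1.0, a matte of the same size giving the coverage of each pixel (0.0 for empty, 1.0 for fully covered), and layers supplied from back to front. The names composite, layers and background are hypothetical.

```python
def composite(layers, width, height, background=0.0):
    """Accumulate separately rendered elements into one image.

    `layers` is a list of (color, matte) pairs ordered from back to front.
    Where the matte is 1.0 the element replaces what lies behind it;
    where it is 0.0 the accumulated image shows through unchanged.
    """
    result = [[background] * width for _ in range(height)]
    for color, matte in layers:
        for y in range(height):
            for x in range(width):
                coverage = matte[y][x]
                result[y][x] = (coverage * color[y][x]
                                + (1 - coverage) * result[y][x])
    return result

# A 1 x 3 example: a "village" plate fills the frame, while the "girl"
# element fully covers the first pixel and half covers the second.
village = ([[0.2, 0.2, 0.2]], [[1.0, 1.0, 1.0]])
girl = ([[0.9, 0.9, 0.9]], [[1.0, 0.5, 0.0]])
final = composite([village, girl], width=3, height=1)
```

Because each element keeps its own matte, a change to one element only requires repeating the accumulation step, not re-rendering the others, which is the analogy with recompiling a single module.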
Most often the composited sequence simulates a traditional film shot. That
it, it looks like something which took place in real physical space and was filmed
by a real film camera. To achieve this, all elements which comprise the finished
composite — for example, footage shot on location, referred to in the industry as a
“live plate,” footage of actors shot in front of a blue screen, and 3D computer-
generated elements — are aligned in perspective, and modified so they have the same
contrast and color saturation. To simulate the depth of field effect, some
elements are blurred while others are sharpened. Once all the elements are
assembled, a virtual camera move through the simulated space may be added to
increase its “reality effect.” Finally, such artifacts as film grain or video noise can
be added. (See “Illusion” chapter for more detailed discussion of how 3D
computer graphics is used in the service of traditional cinematic realism.) In
summary, digital compositing can be broken into three conceptual steps:
1. Construction of a seamless 3D virtual space from different elements.
2. Simulation of a camera move(s) through this space (optional).
3. Simulation of the artifacts of a particular medium (optional).
If 3D computer animation is used to create a virtual space from scratch,
compositing typically uses existing film or video footage. Therefore I need to
explain why I claim the result of a composite is a virtual space. Let us consider
two different examples of compositing. A compositor may use a number of
moving and still images to create a totally new 3D space and then generate a
camera move through it. For example, in Cliffhanger (Renny Harlin, 1993) the
shot of the main hero, played by Sylvester Stallone, which was filmed in the studio
against a blue screen, was composited with the shot of a mountain landscape. The
resulting shot shows Stallone high in the mountains hanging over an abyss. In
other cases, new elements will be added to (or removed from) a live action sequence
without changing either its perspective or the camera move. For example, a 3D
computer generated creature can be added to a live action shot of an outdoor
location, such as in many dinosaur shots in Jurassic Park (Steven Spielberg,
special effects by Industrial Light & Magic, 1993). In the first example it is
immediately clear that the composited shot represents something which never took
place in reality. In other words, the result of the composite is a virtual space. In
the second example, it may appear at first that the existing physical space is
preserved. However, here as well, the final result is a virtual world which never
really existed. Put differently, what existed was a field of grass with trees without
dinosaurs.
Digital compositing is routinely used to put together TV commercials and
music videos, computer games scenes, shots in feature films and most other
moving images in computer culture. Throughout the 1990s, Hollywood directors
increasingly came to rely on compositing to assemble larger and larger parts of a
film. In 1999 George Lucas released Star Wars: Episode 1; according to
Lucas, 95% of the film was assembled on a computer. As I will discuss below,
digital compositing as a technique to create moving images goes back to video
keying and optical printing in cinema; but what before was a rather special
operation has now become the norm for creating moving imagery. Digital compositing
also greatly expanded the range of this technique, making it possible to control the
transparency of individual layers and to combine a potentially unlimited number of
layers. For instance, a typical special effects shot from a Hollywood film may
consist of a few hundred, or even thousands, of layers. Although in some
situations a few layers can be combined in real time automatically (virtual sets
technology), in general compositing is a time-consuming and difficult operation.
This is one aspect of compositing the scene from Wag the Dog misrepresented; to
create the composite shown in this scene would require many hours.
Digital compositing exemplifies a more general operation of computer
culture: assembling together a number of elements to create a single seamless
object. Thus we can distinguish between compositing in a wider sense (i.e., the
general operation) and compositing in a narrow sense (assembling moving image
elements to create a photorealistic shot). The latter meaning corresponds to the
accepted usage of the term compositing. For me, compositing in a narrow sense is
a particular case of a more general operation of compositing — a typical
operation in assembling any new media object.
As a general operation, compositing is a counterpart of selection. Since a
typical new media object is put together from elements which come from different
sources, these elements need to be coordinated and adjusted to fit together.
Although the logic of these two operations — selection and compositing — may
suggest that they always follow one another (first select, then composite), in
practice their relationship is more interactive. Once an object is partially
assembled, new elements may need to be added; existing elements may need to be
re-worked. This interactivity is made possible by the modular organization of a new
media object on different scales (see “Principles of New Media,” (2)). Throughout
the production process, the elements retain their separate identity and therefore
they can be easily modified, substituted or deleted. When the object is complete, it
can be “output” as a single “stream” in which separate elements are no longer
accessible. An example of an operation which “collapses” all elements together
is the “flatten image” command in Adobe Photoshop 5.0. Another example of
“collapsing” elements into a single stream is recording a digitally composited
moving image sequence on film, which was a typical procedure in Hollywood
film production in the 1980s and 1990s.
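The difference between an object whose elements remain separately editable and one that has been "collapsed" can be sketched as follows. This is an illustrative model only: the `Layer` class, the `flatten` function and the sample values are hypothetical, and the blend rule (a single opacity per layer) is an assumption, not a description of how Photoshop implements its command.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Layer:
    name: str
    pixels: List[float]   # grey values in 0..1
    opacity: float = 1.0  # one opacity for the whole layer (an assumption)

def flatten(layers: List[Layer]) -> List[float]:
    """Blend layers bottom to top into a single list of pixel values."""
    result = [0.0] * len(layers[0].pixels)
    for layer in layers:
        result = [
            layer.opacity * new + (1.0 - layer.opacity) * old
            for new, old in zip(layer.pixels, result)
        ]
    return result

document = [
    Layer("background", [0.2, 0.2, 0.2]),
    Layer("title", [1.0, 1.0, 1.0], opacity=0.5),
]

# While the document is modular, one element can be changed on its own.
document[1].opacity = 0.0
hidden = flatten(document)

document[1].opacity = 0.5
flattened = flatten(document)
```

The list returned by `flatten` holds only pixel values. Nothing in it records which value came from the background and which from the title, so the elements can no longer be modified, substituted or deleted individually.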
Alternatively, the completed object may retain the modular structure when
it is distributed. For instance, in computer games the player can interactively
control characters, moving them in space. In some games, the user moves 2D
images of characters, called sprites, over the background image; in others,
everything is represented as 3D objects, including the characters. In either case,
during production the elements are adjusted to form a single whole, stylistically,
spatially and semantically; during play the user can move the elements within
the programmed limits.
In general, 3D computer graphics representation is more “progressive”
than a 2D image because it allows true independence of elements; therefore it
may gradually replace image streams, still used by our culture: photographs, 2D
drawings, films, video. In other words, 3D computer graphics representation is
more modular than a 2D still image or a 2D moving image stream. This modularity
makes it easier for a designer to modify the scene at any time. It also gives the
scene additional functionality. For instance, the user may “control” the character,
moving him or her around the 3D space. Scene elements can be later reused for
new productions. Finally, modularity also allows for a more efficient storage and
transmission of a media object. For example, to transmit a video clip over a
network all pixels which make up this clip have to be sent; but to transmit a
3D scene only requires sending the coordinates of the objects in it. This is how
online virtual worlds, online computer games and networked military simulators
work: first the copies of all objects making up a world are downloaded to a user's
computer, and after this the server only has to keep sending their new 3D
coordinates.
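A rough calculation shows the scale of the difference. The sketch below assumes an uncompressed 640 by 480 frame at 3 bytes per pixel and a scene update that carries three 32-bit floating point coordinates per object; the function names and the figures are hypothetical, and real systems also use video compression and send more than positions (orientation, state, and so on).

```python
import struct
from typing import List, Tuple

def video_frame_bytes(width: int, height: int, bytes_per_pixel: int = 3) -> int:
    """Size of one uncompressed video frame in bytes."""
    return width * height * bytes_per_pixel

def scene_update(positions: List[Tuple[float, float, float]]) -> bytes:
    """Pack the new (x, y, z) position of every object as 32-bit floats."""
    return b"".join(struct.pack("<3f", *p) for p in positions)

# One frame of video versus one update for a scene with two objects.
frame_size = video_frame_bytes(640, 480)
update = scene_update([(0.0, 1.5, -2.0), (4.0, 0.0, 3.25)])
update_size = len(update)
```

Under these assumptions a single frame takes 921,600 bytes, while the update for two objects takes 24 bytes. The saving depends on the objects themselves having been downloaded beforehand, as described above.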
If the general trajectory of computer culture is from 2D images towards
3D computer graphics representations, digital compositing represents an
intermediary historical step between the two. A composited space which consists
of a number of moving image layers is more modular than a single shot of a
physical space. The layers can be repositioned against each other and adjusted
separately. Yet such a representation is not as modular as a true 3D virtual space,
because each of the layers retains its own perspective. (In “Digital Cinema”
section below I will discuss the newer post-production method in which digitized
film or video sequences are positioned in a virtual computer generated space.)
When and where moving image “streams” will be replaced by 100% 3D computer
generated scenes will depend not only on cultural acceptance of the computer scene's
look but also on economics. A 3D scene is much more functional than a film or
video shot of the same scene but, if it is to contain a similar level of detail, it may
be much more expensive to generate.
The general evolution of all media types towards becoming more and
more modular, and the particular evolution of a moving image in the same
direction, can be traced through the history of popular media file formats.
QuickTime developers early on specified that a single QuickTime movie may
consist of a number of separate tracks, just as a still Photoshop image consists
of a number of layers. The QuickTime 4 format (1999) included 11 different track
types, including video track, sound track, text track and sprite track (graphic
objects which can be moved independently of video).155 By placing different
media on different tracks which can be edited and exported independently,
QuickTime encourages the designers to think in modular terms. In addition, a
movie may contain a number of video tracks which can act as layers in a digital
composite. By using alpha channels (masks saved with video tracks) and different
modes of track interaction (such as partial transparency), a QuickTime user can
create complex compositing effects within a single QuickTime movie, without
having to resort to any special compositing software. In effect, QuickTime
architects embedded the practice of digital compositing in the media format itself.
What previously required special software now can be done by simply using the
features of the QuickTime format itself.
Another example of a media format evolving towards more and more data
modularity is MPEG.156 An early version of the format, MPEG-1 (1992),
was defined as “the standard for storage and retrieval of moving pictures and
audio on storage media.” The format specified a compression scheme for video
and/or audio data conceptualized in a traditional way. In contrast, MPEG-7 (to be
approved in 2001) is defined as “the content representation standard for
multimedia information search, filtering, management and processing.” It is
based on a different concept of a media composition, which consists of a number
of media objects of various types, from video and audio to 3D models and facial
expressions, and the information on how these objects are combined. MPEG-7
provides an abstract language to describe such a scene. The evolution of MPEG
thus allows us to trace the conceptual evolution in how we understand new media
— from a traditional “stream” to a modular composition, more similar in its logic
to a structural computer program than a traditional image or a film.
The Resistance to Montage
The connection between aesthetics of post-modernism and the operation of
selection also applies to compositing. Together, these two operations reflect and
at the same time enable the “post-modern” practice of pastiche and quotation. They
work in tandem: one operation is used to select elements and styles from the
“database of culture”; the other is used to put them together into new objects. Thus,
along with selection, compositing is the key operation of post-modern, or
computer-based authorship.
At the same time, we should think of the aesthetic and the technological
as aligned but ultimately separate layers, to use the metaphor of digital technology
itself. The logic of the 1980s post-modern aesthetics and the logic of the 1990s
computer-based compositing are not the same. In the 1980s post-modern
aesthetics, historical references and media quotes were kept as distinct elements;
the boundaries between elements were well-defined (think of David Salle’s
paintings, Barbara Kruger’s montages and various music videos.) Interestingly,
this aesthetics corresponds to electronic and early digital tools of the period, such
as video switchers, keyers, DVE (digital video effect devices), and computer
graphics cards which had limited color resolution. These tools enabled hard-edge
“copy and paste” operations but not smooth, multi-layer composites. (A lot can be
made out of the fact that one of the key post-modern artists of the 1980s, Richard
Prince, who became well-known for his “appropriation” photographs, was
operating one of the earliest computer-based photo editing systems in the late
1970s as a part of his commercial job, before he started making “appropriation”
photographs.) The 1990s compositing supported a different aesthetics
characterized by smoothness and continuity. The elements were now blended
together, and the boundaries were erased, rather than emphasized. This aesthetics
of continuity can be best observed in television spots and special effects
sequences of feature films which are actually put together through digital compositing
(i.e., compositing in the narrow, technical sense). For instance, the computer-
generated dinosaurs in Jurassic Park are made to perfectly blend with the
landscape, just as the live actors, 3D virtual actors and computer-rendered ship are
made to blend together in Titanic (James Cameron, special effects by Digital
Domain, 1997). But the aesthetics of continuity can also be found in other areas of
new media. Computer-generated morphs allow for a continuous transition between
two images which before would be accomplished through a dissolve or a cut.157
Many computer games also obey the aesthetics of continuity in that, in cinematic
terms, they are single-takes. They have no cuts. From beginning to end, they
present a single continuous trajectory through a 3D space. This is particularly true
for first-person shooters such as Quake. The lack of montage in these games fits
in with the first-person point of view they employ. These games simulate the
continuity of a human experience, guaranteed by the laws of physics. While
modern telecommunication, from telegraph, telephone and television to
telepresence and the World Wide Web, allows us to suspend these laws, moving
almost instantly from one virtual location to another with a toggle of a switch or a
press of a button, in RL (real life) we still obey physics: in order to move from
one point to another we have to pass through every point in between. (I will
investigate navigation through space as a key form of computer culture in
“Navigable Space” section below.)
All these examples — smooth composites, morphing, uninterrupted
navigation in games — have one thing in common: where old media relied on
montage, new media substitutes the aesthetics of continuity. A film cut is replaced
by a digital morph or by a digital composite. Similarly, the instant changes in time
and space characteristic of modern narrative, both in literature and in cinema, are
replaced by a continuous non-interrupted first-person narrative of games and VR.
Computer multimedia also does not use any montage. The desire to correlate
between different senses, or, to use new media lingo, different media tracks,
which preoccupied many artists throughout the twentieth century such as
Kandinsky, Skriabin, Eisenstein, and Godard, to mention just a few well-known
names, is foreign to multimedia. Instead it follows the principle of simple
addition. The elements in different media are placed next to each other without
any attempt to establish contrast, complementarity or dissonance between them.
This is best illustrated by Web sites of the 1990s which typically contain JPEG
images, QuickTime clips, audio files, and other media elements, side by side.
We can also find strong anti-montage tendencies in the modern GUI. In the
middle of the 1980s Apple published guidelines for interface design of all
Macintosh application software. According to these guidelines, an interface
should communicate the same messages through more than one sense. For
instance, an alert box appearing on the screen should be accompanied by a sound.
This alignment of different senses can be compared to naturalistic use of different
media in traditional film language, which was attacked by Eisenstein and other
montage activists. Another example of the anti-montage tendency in the GUI is the
peaceful co-existence of multiple information objects on the computer screen,
exemplified by a number of simultaneously opened windows. Just as with media
elements in a Web site, the user can add more and more windows without establishing any
conceptual tension between them.
The aesthetics of continuity can’t be fully deduced from compositing
technology, although in many cases it would not be possible without it. Similarly,
montage aesthetics which dominated much of modern art and media should not be
thought of as a simple result of the available tools; but at the same time these
tools, with their possibilities and limitations, contributed to its development. For
instance, a film camera allows one to shoot film footage of only a certain limited length;
to create a longer film the separate pieces have to be put together. This is typically
done in editing, where the pieces are trimmed and then glued together. Not surprisingly,
the modern film language is built on discontinuities: short shots replace one
another; the point of view changes from shot to shot. The Russian montage school
pushes such discontinuities to the extreme but, with very few exceptions such as
Andy Warhol’s early films and Wavelength by Michael Snow, all film systems
are based on them.
In computer culture, montage is no longer the dominant aesthetics, as it was
throughout the twentieth century, from the avant-garde of the 1920s up until post-
modernism of the 1980s. Digital compositing in which different spaces are
combined into a single seamless virtual space is a good example of the alternative
aesthetics of continuity; however, compositing in general can be understood as a
counterpart of montage aesthetics. Montage aims to create visual, stylistic,
semantic, and emotional dissonance between different elements. In contrast,
compositing aims to blend them into a seamless whole, a single gestalt. Since I
already evoked the DJ as somebody who exemplifies “authoring by selection,” I will
use this figure once again as an example of how the anti-montage aesthetics of
continuity cuts across culture and is not limited to the creation of computer still
and moving images and spaces. A DJ’s art is measured by his ability to seamlessly
go from one track to another. A great DJ is thus a compositor and anti-montage
artist par excellence. He is able to create a perfect temporal transition between very
different musical layers; and he can do this in real time, in front of the dancing
crowd.
In discussing selection from a menu, I pointed out that this operation is
typical of both new media and culture at large. Similarly, the operation of
compositing is not limited to new media. Consider, for instance, the frequent use
of one or more layers of semi-transparent materials in contemporary packaging
and architecture. The result is a visual composite, since a viewer can see both
what is in front and what is behind the layer. It is interesting that one architectural
project which explicitly refers to computer culture — The Digital House (Hariri &
Hariri, project, 1988) — systematically employs such semi-transparent layers
throughout.158 If in the famous glass house of Mies van der Rohe the inhabitant
was looking outside through glass walls, the more complex plan of The Digital
House creates the possibility of seeing through a number of interior spaces at
once. Thus the inhabitant of the house is constantly faced with complex visual
composites.
Having discussed compositing as a general operation of new media and as
a counterpart of selection, I will now focus on its particular case — compositing
in the narrow sense, i.e. creation of a single moving image sequence from a
number of separate sequences and (optionally) stills using special compositing
software. Today digital compositing is responsible for an increasing number of
moving images: all special effects in cinema, computer games, virtual worlds,
most television visuals and even television news (see discussion of virtual sets
below). Most often the moving image constructed through compositing presents a
fake 3D world. I say “fake” because regardless of whether a compositor creates a
totally new 3D space from different elements (Cliffhanger example), or only adds
some elements to a live action footage (Jurassic Park example), the resulting
moving image shows something which did not exist in reality. Digital
compositing thus belongs together with other simulation techniques. These are the
techniques used to create fake realities and thus, ultimately, to deceive the viewer:
fashion and make up, realist painting, dioramas, military decoys and VR. Why has
digital compositing acquired such prominence? If we are to create an archeology
which will connect digital compositing with previous techniques of visual
simulation, where should we locate the essential historical breaks? Or, to ask this
question differently: what is the historical logic which drives the evolution of
these techniques? Shall we indeed expect computer culture to gradually abandon
pure lens-based imaging (still photography, film, video) replacing it instead with
composited images and ultimately with 3D computer generated simulations?
Archeology of Compositing: Cinema
I will start my archeology of compositing with Potemkin's Villages. According to
the historical myth, at the end of the eighteenth century, Russian ruler Catherine
the Great decided to travel around Russia in order to observe first-hand how the
peasants lived. The first minister and Catherine's lover, Potemkin, had ordered the
construction of special fake villages along her projected route. Each village
consisted of a row of pretty facades. The facades faced the road; at the same time,
to conceal their artifice, they were positioned at a considerable distance. Since
Catherine the Great never left her carriage, she returned from her journey
convinced that all peasants lived in happiness and prosperity.
This extraordinary arrangement can be seen as a metaphor for life in the
Soviet Union where I grew up in the 1970s. There, the experience of all citizens
was split between the ugly reality of their lives and the official shining facades of
ideological pretense. However, the split took place not only on a metaphorical but
also on a literal level, particularly in Moscow — the showcase Communist city.
When prestigious foreign guests visited Moscow, they, like Catherine the Great,
were taken around in limousines which always followed a few special routes. Along
these routes, every building was freshly painted, the shop windows displayed
consumer goods, the drunks were removed, having been picked up by the militia
early in the morning. The monochrome, rusty, half-broken, amorphous Soviet
reality was carefully hidden from the view of the passengers.
In turning selected streets into fake facades, Soviet rulers adopted an
eighteenth-century technique of creating fake reality. But, of course, the twentieth
century brought with it a much more effective technology: cinema. By
substituting a window of a carriage or a car with a screen showing projected
images, cinema opened up new possibilities for simulation.
Fictional cinema, as we know it, is based upon lying to a viewer. A perfect
example is the construction of a cinematic space. Traditional fiction film
transports us into a space: a room, a house, a city. Usually, none of these exist in
reality. What exists are the few fragments carefully constructed in a studio. Out of
these disjointed fragments, a film synthesizes the illusion of a coherent space.
The development of the techniques to accomplish this synthesis coincides
with the shift in American cinema between approximately 1907 and 1917 from a
so-called "primitive" to a "classical" film style. Before the classical period, the
space of the film theater and the screen space were clearly separated much like in
theater or vaudeville. The viewers were free to interact, come and go, and
maintain a psychological distance from the cinematic narrative. Correspondingly,
the early cinema's system of representation was presentational: actors played to
the audience, and the style was strictly frontal.159 The composition of the shots
also emphasized frontality.
In contrast, classical Hollywood film positions each viewer inside the
fictional space of the narrative. The viewer is asked to identify with the characters
and to experience the story from their points of view. Accordingly, the space no
longer acts as a theatrical backdrop. Instead, through new compositional
principles, staging, set design, deep focus cinematography, lighting and camera
movement, the viewer is situated at the optimum viewpoint of each shot. The
viewer is "present" inside a space which does not really exist. A fake space.
In general, Hollywood cinema was always careful to hide the artificial
nature of its space, but there is one exception: rear screen projection shots which
were introduced in the 1930s. A typical shot shows actors sitting inside a
stationary vehicle; a film of a moving landscape is projected on the screen behind
the car's windows. The artificiality of rear screen projection shots stands in striking
contrast against the smooth fabric of Hollywood cinematic style in general.
The synthesis of a coherent space out of distinct fragments is only one
example of how fictional cinema fakes reality. A film in general is composed
of separate image sequences. These sequences can come from different
physical locations. Two consecutive shots of what looks like one room may
correspond to two places inside one studio. They can also correspond to the
locations in Moscow and Berlin, or Berlin and New York. The viewer will never
know.
This is the key advantage of cinema over older fake reality technologies,
be it eighteenth century Potemkin's Villages or nineteenth century Panoramas and
Dioramas. Before cinema, the simulation was limited to the construction of a fake
space inside a real space visible to the viewer. Examples include theater
decorations and military decoys. In the nineteenth century, Panorama offered a
small improvement: by enclosing a viewer within a 360-degree view, the area of
fake space was expanded. Louis-Jacques Daguerre introduced another innovation
by having viewers move from one set to another in his London Diorama. As
described by the historian Paul Johnson, its "amphitheater, seating 200, pivoted
through a 73-degree arc, from one 'picture' to another. Each picture was seen
through a 2,800-square-foot-window."160 But, already in the eighteenth century,
Potemkin had pushed this technique to its limit: he created a giant facade — a
Diorama stretching for hundreds of miles — along which the viewer (Catherine the
Great) passed. In cinema a viewer remains stationary: what is moving is the film
itself.
Therefore, if the older simulation technologies were limited by the
materiality of a viewer's body, existing in a particular point in space and time,
film overcomes these spatial and temporal limitations. It achieves this by
substituting recorded images for unmediated human sight and by editing these
images together. Through editing, images that could have been shot in different
geographic locations or at different times create an illusion of a contiguous space
and time.
Editing, or montage, is the key twentieth-century technology for creating fake
realities. Theoreticians of cinema have distinguished between many kinds of
montage but, for the purposes of sketching the archeology of the technologies of
simulation leading to digital compositing, I will distinguish between two basic
techniques. The first technique is temporal montage: separate realities form
consecutive moments in time. The second technique is montage within a shot. It is
the opposite of the first: separate realities form contingent parts of a single image.
The first technique of temporal montage is much more common; this is what we
usually mean by montage in film. It defines the cinematic language as we know it.
In contrast, the montage within a shot is used more rarely throughout film history.
An example of this technique is the dream sequence in The Life of an American
Fireman by Edwin Porter in 1903, in which an image of a dream appears over a
man's sleeping head. Other examples include the split screens beginning in 1908
which show the different interlocutors of a telephone conversation;
superimposition of a few images and multiple screens used by the avant-garde
filmmakers in the 1920s (for instance, superimposed images in Vertov’s Man
with a Movie Camera and a three-part screen in Abel Gance’s 1927 Napoléon);
rear screen projection shots; and the use of deep focus and particular
compositional strategies to juxtapose close and far away scenes (for instance,
a character looking through a window, such as in Citizen Kane, Ivan the Terrible
and Rear Window.)161
In a fiction film temporal montage serves a number of functions. As I
already pointed out, it creates a sense of presence in a virtual space. It is also
utilized to change the meanings of individual shots (recall Kuleshov's effect), or,
more precisely, to construct a meaning from separate pieces of pro-filmic reality.
However, the use of temporal montage extends beyond the construction of an
artistic fiction. Montage also becomes a key technology for ideological
manipulation, through its employment in propaganda films, documentaries, news,
commercials and so on. The pioneer of this ideological montage is once again
Vertov. In 1923 Vertov analyzed how he put together episodes of his news
program Kino-Pravda (Cinema-Truth) out of shots filmed at different locations
and at different times. This is one example of his montage: "the bodies of people's
heroes are being lowered into the graves (filmed in Astrakhan' in 1918); the
graves are being covered with earth (Kronshtad, 1921); gun salute (Petrograd,
1920); eternal memory, people take down their hats (Moscow, 1922)." Here is
another example: "montage of the greetings by the crowd and montage of the
greetings by the machines to the comrade Lenin, filmed at different times."162 As
theorized by Vertov, through montage, film can overcome its indexical nature,
presenting a viewer with objects which never existed in reality.
Archeology of Compositing: Video
Outside of cinema, montage within a shot becomes a standard technique of
modern photography and design (photomontages of Alexander Rodchenko, El
Lissitzky, Hannah Höch, John Heartfield and countless other lesser-known
twentieth century designers). However, in the realm of a moving image, temporal
montage dominates. Temporal montage is cinema's main operation for creating
fake realities.
After World War II a gradual shift takes place from film-based to
electronic image recording and editing. This shift brings with it a new technique:
keying. One of the most basic techniques used today in any video and television
production, keying refers to combining two different image sources together. Any
area of uniform color in one video image can be cut out and substituted with
another source. Significantly, this new source can be a live video camera
positioned somewhere, a pre-recorded tape, or computer generated graphics. The
possibilities for creating fake realities are multiplied once again.
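The basic logic of keying can be sketched in a few lines. The example assumes images stored as rows of (R, G, B) tuples and a fixed per-channel tolerance around the key color; the function name `key`, the tolerance value and the sample images are hypothetical, and actual video keyers work on electronic signals and handle soft edges, which this sketch does not.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
Image = List[List[RGB]]

def key(foreground: Image, background: Image,
        key_color: RGB, tolerance: int = 30) -> Image:
    """Replace every foreground pixel close to key_color with the background."""
    def is_key(pixel: RGB) -> bool:
        return all(abs(c - k) <= tolerance for c, k in zip(pixel, key_color))

    return [
        [bg if is_key(fg) else fg for fg, bg in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(foreground, background)
    ]

BLUE = (0, 0, 255)
# Hypothetical 2 x 2 images: a studio shot with a blue backdrop and a second source.
studio = [[BLUE, (200, 150, 120)],
          [(10, 10, 250), BLUE]]
second_source = [[(0, 128, 0), (0, 128, 0)],
                 [(255, 255, 0), (255, 255, 0)]]
keyed = key(studio, second_source, BLUE)
```

The second source is just another array of pixels, so the same function applies whether those pixels come from a live camera, a tape or computer graphics.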
With electronic keying becoming a part of standard television practice in
the 1970s, the construction of not only still but also moving images finally
began to routinely rely on montage within a shot. In fact, rear projection and other
special effects shots, which had occupied a marginal place in classical film,
became the norm: a weather man in front of a weather map, an announcer in front
of footage of a news event, a singer in front of an animation in a music video.
An image created through keying presents a hybrid reality, composed of
two different spaces. Television normally relates these spaces semantically, but
not visually. To take a typical example, we may be shown an image of an
announcer sitting in a studio; behind her, in a cutout, we see news footage of a
city street. The two spaces are connected through their meanings (the announcer
discusses events shown in the cutout), but visually they are disjoint, as they
share neither the same scale nor the same perspective. If classical cinematic montage
creates an illusion of a coherent space and hides its own work, electronic montage
openly presents the viewer with an apparent visual clash of different spaces.
What will happen if the two spaces seamlessly merge? This operation
forms the basis of a remarkable video Steps directed by Polish-born filmmaker
Zbigniew Rybczynski in 1987. Steps is shot on video tape and uses keying; it also
utilizes film footage and makes an inadvertent reference to virtual reality. In this
way, Rybczynski connects three generations of fake reality technologies: analog,
electronic and digital. He also reminds us that it was the 1920s Soviet filmmakers
who first fully realized the possibilities of montage which continue to be explored
and expanded by electronic and digital media.
In the video, a group of American tourists is invited into a sophisticated
video studio to participate in a kind of virtual reality / time machine experiment.
The group is positioned in front of a blue screen. Next, the tourists find
themselves literally inside the famous Odessa steps sequence from Sergei
Eisenstein's Potemkin (1925). Rybczynski skillfully keys the shots of the people
in the studio into the shots from Potemkin creating a single coherent space. At the
same time, he emphasizes the artificiality of this space by contrasting the color
video images of the tourists with Eisenstein's original grainy black-and-white
footage. The tourists walk up and down the steps, snap pictures of the attacking
soldiers, play with a baby in a crib. Gradually, the two realities begin to interact
and mix together: some Americans fall down the steps after being shot by the
soldiers from Eisenstein's sequence; a tourist drops an apple which is picked up
by a soldier.
The Odessa steps sequence, already a famous example of cinematic
montage, becomes just one element in a new ironic re-mix by Rybczynski. The
original shots which were already edited by Eisenstein are now edited again with
video images of the tourists, using both temporal montage and montage within a
shot, the latter done through video keying. A "film look" is juxtaposed with
"video look," color is juxtaposed with black and white, the "presentness" of video
is juxtaposed with the "always already" of film.
In "Steps" Eisenstein's sequence becomes a generator for numerous kinds
of juxtapositions, super-impositions, mixes and re-mixes. But Rybczynski treats
this sequence not only as a single element of his own montage but also as a
singular, physically existing space. In other words, the Odessa steps sequence is
read as a single shot corresponding to a real space, a space which could be visited
like any other tourist attraction.
Along with Rybczynski, another filmmaker who systematically
experimented with the possibilities of electronic montage within a shot is Jean-Luc
Godard. While in the 1960s Godard was actively exploring new possibilities
of temporal montage such as the jump cut, in later video works such as Scénario du
film ‘Passion’ (1982) and Histoire(s) du cinéma (1989-) he developed a unique
aesthetics of continuity which relies on electronically mixing a number of images
together within a single shot. If Rybczynski’s aesthetics is based on the operation
of video keying, Godard’s aesthetics similarly relies on a single operation
available to any video editor: mixing. Godard uses the electronic mixer to create
very slow cross-dissolves between images, cross-dissolves which seem never to
resolve into a single image, ultimately becoming the film itself. In Histoire(s) du
cinéma, Godard mixes together two, three or more images; images gradually fade
in and out, but never disappear completely, staying on the screen for a few
minutes at a time. This technique can be interpreted as representation of ideas or
mental images floating around in our minds, coming in and out of the mental
focus. Another variation of the same technique used by Godard is to move from
one image to another by oscillating between the two. The images flicker back and
forth over and over, until the second image finally replaces the first. This
technique can also be interpreted as an attempt to represent the mind's movement
from one concept, mental image or memory to another — the attempt, in other
words, to represent what, according to Locke and other associationist
philosophers, is the basis of our mental life — forming associations.
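The mixing operation behind these slow cross-dissolves is a weighted average of two images. The sketch below is an illustration only, assuming tiny grayscale images stored as lists of rows; `mix` and `slow_dissolve` are hypothetical names, not the controls of any real video mixer.

```python
# Illustrative sketch only: a grayscale image is a list of rows of numbers.

def mix(image_a, image_b, weight):
    """Blend two images; weight 0.0 shows only image_a, 1.0 only image_b."""
    return [
        [(1 - weight) * a + weight * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]

def slow_dissolve(image_a, image_b, steps):
    """Return the intermediate frames of a dissolve from image_a to image_b."""
    return [mix(image_a, image_b, i / (steps - 1)) for i in range(steps)]

first = [[0, 100]]
second = [[100, 0]]
frames = slow_dissolve(first, second, steps=5)
```

With many steps, most frames are neither the first image nor the second but a mixture of both, which is the state in which the images described above spend most of their time on screen.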
Godard wrote: "There are no more simple images... The whole world is
too much for an image. You need several of them, a chain of images..." 163
Accordingly, Godard always uses multiple images, images cross-dissolved together,
coming together and separating. The electronic mixing that replaces both
temporal montage and montage within the shot becomes for Godard an
appropriate technique to visualize this “vague and complicated system that the
whole world is continually entering and watching.”164
Digital Compositing
The next generation in simulation technologies is digital compositing. At first
glance, computers do not bring any conceptually new techniques for creating fake
realities. They simply expand the possibilities of joining together different images
within one shot. Rather than keying together images from two video sources, we
can now composite an unlimited number of image layers. A shot may consist of
dozens, hundreds, or thousands of image layers. These images may all have
different origins: film shot on location (“live plates”), computer-generated sets or
virtual actors, digital matte paintings, archival footage, and so on. Following the
success of Terminator 2 and Jurassic Park, most Hollywood films came to utilize
digital compositing to create at least some of their shots.
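The step from two keyed sources to an arbitrary stack of layers can be sketched as a loop that blends each layer over the result so far. This is a simplified illustration, assuming each layer pixel carries its own opacity (alpha) between 0 and 1; the function names `over` and `composite` are hypothetical and do not refer to any particular compositing package.

```python
# Illustrative sketch only: a layer is a list of rows of (r, g, b, alpha)
# tuples, and the base image is a list of rows of opaque (r, g, b) tuples.

def over(top, bottom):
    """Blend one RGBA pixel over an opaque RGB pixel."""
    r, g, b, alpha = top
    return tuple(
        round(alpha * t + (1 - alpha) * u) for t, u in zip((r, g, b), bottom)
    )

def composite(layers, base):
    """Flatten any number of layers, listed bottom to top, onto the base."""
    result = [list(row) for row in base]
    for layer in layers:
        for y, row in enumerate(layer):
            for x, pixel in enumerate(row):
                result[y][x] = over(pixel, result[y][x])
    return result

base = [[(0, 0, 0), (0, 0, 0)]]                     # e.g. a live plate
set_layer = [[(200, 0, 0, 1.0), (0, 0, 0, 0.0)]]    # opaque left, empty right
actor_layer = [[(0, 0, 100, 0.5), (0, 200, 0, 1.0)]]
shot = composite([set_layer, actor_layer], base)
```

Nothing in the loop limits the number of layers, which is the sense in which the stack can grow from two sources to dozens or thousands.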
Thus historically, a digitally composed image, like an electronically keyed
image, can be seen as a continuation of montage within a shot. But while
electronic keying creates disjoined spaces reminding us of the avant-garde
collages of Rodchenko or Moholy-Nagy from the 1920s, digital composing brings
back the nineteenth century techniques of creating smooth "combination prints"
like those of Henry Peach Robinson and Oscar G. Rejlander.
But this historical continuity is deceiving. Digital compositing does
represent a qualitatively new step in the history of visual simulation because it
allows the creation of moving images of non-existent worlds. Computer generated
characters can move within real landscapes; conversely, real actors can move and
act within synthetic environments. In contrast to nineteenth century "combination
prints" which emulated academic painting, digital composites simulate the
established language of cinema and television. Regardless of the particular
combination of live action elements and computer-generated elements which
make up the composited shot, the camera can pan, zoom, and dolly through it.
The interactions of the elements of the virtual world over time between
themselves (for instance, the dinosaur attacking the car) along with the ability to
look at it from different viewpoints become the guarantee of its authenticity.
These new abilities to create a virtual world which moves, and to be
able to move through it, come at a price. Although in the scene from Wag the
Dog compositing the fake news footage took place in real time, in reality aligning
together numerous elements to create a convincing composite is a time-
consuming task. For instance, a 40 second sequence from Titanic in which the
camera flies over the computer-generated ship populated by computer-generated
characters took many months to produce and its total cost was 1.1 million
dollars.165 In contrast, although images of such complexity were out of reach for
video keying, it was possible to use it to combine three image sources in real-
time. (This trade-off between image construction time and its complexity is
similar to another trade-off I already noted above: between image construction
time and its functionality. That is, images created with 3D computer graphics are
more functional than image streams recorded by film or video cameras, but in
most cases they are much more time consuming to generate.)
If a compositor restricts the composite to just a few images, as was done
with electronic keying, compositing can also be done in real time. The resulting
illusion of a seamless space is stronger than what was possible with electronic
keying. An example of real-time compositing is Virtual Sets technology, which
was first introduced in the early 1990s and since then has been making its way
into television studios around the world. This technology allows video images
and computer-generated three-dimensional elements to be composited on the fly.
(Actually, because the generation of computer-generated elements is computationally
intensive, the final image transmitted to the audience may be seconds behind the
original image picked up by the television camera.) The typical application of Virtual Sets
involves composing an image of an actor over a computer-generated set. The
computer reads the position of the video camera and uses this information to
render the set in proper perspective. The illusion is made more convincing by
generating shadows and/or reflections of the actor and integrating them into the
composite. Because of the relatively low resolution of analog television, the
resulting effect is quite convincing. A particularly interesting application of
Virtual Sets is replacement and insertion of arena-tied advertising messages
during live TV broadcasts of sports and entertainment events. Computer-
synthesized advertising messages can be inserted onto the playing field or other
empty areas in the arena in the proper perspective, as though they were actually
present in physical reality.166
Digital compositing represents a fundamental break with previous
techniques for visual deception for yet another reason. Throughout the history of
representation, artists and designers focused on the problem of creating a
convincing illusion within a single image, be it a painting, a film frame or a view
seen by Catherine the Great through the window of her carriage. Set making, one-
point perspective, chiaroscuro, trick photography and other cinematography
techniques were all developed to solve this problem. Film montage introduced a
new paradigm: creating an effect of presence in a virtual world by joining
different images over time. Temporal montage became the dominant paradigm for
visual simulation of non-existent spaces.
As the examples of digital composing for film and Virtual Sets applications for
television demonstrate, the computer era introduces a different paradigm. This
paradigm is concerned not with time but with space. It can be seen as the next
step in the development of techniques for creating a single convincing image of
non-existent spaces: painting, photography, cinematography. Having mastered
this task, the culture came to focus on how to seamlessly join a number of such
images into one coherent whole (electronic keying, digital compositing). Whether
it is composing a live video of a newscaster with a 3D computer generated set or
composing thousands of elements to create images of "Titanic," the main problem
is no longer how to generate convincing individual images but how to
blend them together. Consequently, what is important now is what happens on the
edges where different images are joined. The borders where different realities
come together are the new arena where the Potemkins of our era try to outdo one
another.
Compositing and New Types of Montage
In the beginning of this section I pointed out that the use of digital compositing to
create continuous spaces out of different elements can be seen as an example of
the larger anti-montage aesthetics of computer culture. Indeed, if in the beginning of
the twentieth century cinema discovered that it can simulate a single space
through temporal montage (a time-based mosaic of different shots), at the
end of the century it came up with a technique to accomplish a similar result
without montage. In digital compositing, the elements are not juxtaposed but
blended, with their boundaries erased rather than foregrounded.
At the same time, by relating digital compositing to theory and practice of
film montage, we can better understand how this new key technique of
assembling moving images redefines our concepts of a moving image (in “Digital
Cinema” I will offer another conceptualization of a digital moving image, arrived
at through a different historical trajectory.) While traditional film montage
privileges temporal montage over montage within a shot (because technically
the latter was much more difficult to achieve), compositing makes them equal.
More precisely, it erases the strict conceptual and technical separation between
the two. Consider, for instance, the interface layout typical of many programs for
computer-based editing and digital compositing, such as Adobe Premiere 4.2, a
popular editing program, and Alias|Wavefront Composer 4.0, a professional
compositing program. In this interface, the horizontal dimension represents time,
while the vertical dimension represents spatial order of different image layers
making up each image. A moving image sequence appears as a number of blocks
staggered vertically, with each block standing for a particular image layer. Thus if
Pudovkin, one of the Russian film montage theorists and practitioners of the 1920s,
conceived of montage as a one-dimensional line of bricks, now it becomes a 2D
brick wall. This interface makes montage in time and montage within a shot equal
in importance.
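The "brick wall" layout described above can be modeled as a small data structure in which every block has both a position in time and a position in the layer stack. The sketch is hypothetical: `Clip`, `stack_at` and the field names are illustrative assumptions, not the internal format of Premiere or Composer.

```python
from dataclasses import dataclass

# Illustrative sketch only: each block on the timeline is one Clip.

@dataclass
class Clip:
    name: str
    track: int  # vertical axis: higher tracks sit in front of lower ones
    start: int  # horizontal axis: first frame, inclusive
    end: int    # horizontal axis: last frame, exclusive

def stack_at(timeline, frame):
    """Return the names of the clips visible at a frame, back to front."""
    visible = [clip for clip in timeline if clip.start <= frame < clip.end]
    return [clip.name for clip in sorted(visible, key=lambda c: c.track)]

timeline = [
    Clip("landscape", track=0, start=0, end=100),
    Clip("actor", track=1, start=20, end=100),
    Clip("title", track=2, start=0, end=30),
    Clip("interior", track=0, start=100, end=150),
]
```

Reading the structure along the frame axis (from "landscape" to "interior" on track 0) gives temporal montage; reading it along the track axis at a single frame gives montage within a shot. Both are queries on the same list of blocks.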
If the Premiere interface conceptualizes editing as an operation in two
dimensions, the interface of one of the most popular compositing programs, After
Effects 3.0, adds a third dimension. Following the conventions of traditional film
and video editing, Premiere assumes that all image sequences are the same size
and proportions; in fact, it makes working with images which do not conform to
the standard 3 by 4 frame ratio rather difficult. In contrast, the user of After
Effects places image sequences of arbitrary sizes and proportions within the larger
frame. Breaking with the conventions of old moving image media, the interface of
After Effects assumes that the individual elements making up a moving image can
freely move, rotate and change proportions over time.
Sergei Eisenstein already used the metaphor of many-dimensional space in
his writings on montage, naming one of his articles “The Filmic Fourth
Dimension” (“Kino chetyrekh izmerenii”).167 However, his theories of montage
ultimately focused on one dimension — time. Eisenstein formulated a number of
principles, such as counterpoint, which can be used to coordinate the changes in
different visual dimensions over time. The examples of visual dimensions he
considered are graphic directions, volumes, masses, space, and contrast.168 When
the sound film became a possibility, Eisenstein extended these principles to
handle what, in computer language, can be called synchronization of visual and
audio tracks; and later he added the dimension of color.169 Eisenstein also
developed a different set of principles (“methods of montage”) according to
which different shots can be edited together to form a longer sequence. The
examples of “methods of montage” include metric montage which uses absolute
lengths of shots to establish a “beat,” and rhythmic montage based on the pattern of
movement within the shots. These methods can be used by themselves to structure
a sequence of shots, but they also can be combined within a single sequence.
The new logic of a digital moving image contained in the operation of
compositing runs against Eisenstein's aesthetics with its focus on time. Digital
compositing makes the dimensions of space (3D fake space being created by a
composite and 2½ D space of all the layers being composited) and frame (separate
images moving in 2D within the frame) as important as time. In addition, the
possibility of embedding hyperlinks within a moving sequence introduced in
QuickTime 3 and other digital formats adds yet another spatial dimension.170 The
typical use of hyperlinking in digital movies is to link elements of a movie with
information displayed outside of it. For instance, when a particular frame is
displayed, a specific Web page can be loaded in another window. This practice
“spatializes” the moving image: no longer completely filling a screen, it is now just
one window among others.
In summary, if film technology, film practice and film theory privileged
the temporal development of a moving image, computer technology spatializes
the moving image, making time just one dimension among a number of others. The
new spatial dimensions can be defined as follows:
1. Spatial order of layers in a composite (2½D space).
2. Virtual space constructed through compositing (3D space).
3. 2D movement of layers in relation to the image frame (2D space).
4. The relationship between the moving image and the linked information in the
adjacent windows (2D space).
These dimensions should be added to the list of visual and sound dimensions of
the moving image, elaborated by Eisenstein and other filmmakers. Their use
opens new possibilities for cinema as well as a new challenge for film theory. No
longer just a subset of audio-visual culture, the digital moving image becomes a part
of audio-visual-spatial culture.
Of course, the simple use of these dimensions by itself does not result in
montage. Most images and spaces of contemporary culture are juxtapositions of
different elements; calling any such juxtaposition montage will render the term
meaningless. Media critic and historian Erkki Huhtamo suggested that we should
reserve the use of the term for “strong” cases, and I will follow his suggestion
here.171 Thus, in order to “qualify” as an example of montage, a new media
object should fulfill two conditions: the juxtapositions of elements should follow
a particular system; and these juxtapositions should play a key role in how the work
establishes its meaning, emotional and aesthetic effect. These conditions would
also apply to the particular case of new spatial dimensions of a digital moving
image. By establishing a logic which controls the changes and the correlation of
values on these dimensions, digital filmmakers can create what I will call spatial
montage. In the section “New Language of Cinema” below I will continue the
discussion of spatial montage by analysing two concrete examples: a CD-ROM
and a Web site.
While the dominant use of digital compositing is to create a seamless
virtual space, it does not have to be subordinated to this goal. The borders
between different worlds do not have to be erased; the different spaces do not
have to be matched in perspective, scale and lighting; the individual layers can
retain their separate identity rather than being merged into a single space; the
different worlds can clash semantically rather than form a single universe. I will
conclude this section by invoking a few more works, which, together with videos
by Rybczynski and Godard, point at the new aesthetic possibilities of digital
compositing if it is not used in the service of simulation. Although all these works
were created before digital compositing became available, they explore its
aesthetic logic — for compositing is not just a technological but first of all a
conceptual operation. I will use these works to introduce two other montage
methods based on compositing: ontological montage and stylistic montage.
Rybczynski’s film Tango (1982), made when he was still living in Poland,
uses layering as a metaphor for the particular overcrowding characteristic of
Socialist countries in the second half of the twentieth century, and for human
cohabitation in general. A number of people perform various actions moving in
loops through the same small room, apparently unaware of each other.
Rybczynski offsets the loops in such a way that even though his characters keep
moving through the same points in space, they never run into one another.
Compositing, achieved in Tango through optical printing, allows the filmmaker to
superimpose a number of elements, or whole worlds, within a single space. (In this
film each person moving through the room can be said to form a separate world.)
As in Steps, these worlds are matched in perspective and scale— and yet the
viewer knows that the scene being shown either could not occur in normal human
experience at all given the laws of physics, or is highly unlikely to occur given the
conventions of human life. In the case of Tango, while the depicted scene could
have occurred physically, the probability of it actually occurring is close to zero.
Works such as Tango and Steps develop what I will call an ontological montage:
the co-existence of ontologically incompatible elements within the same time and
space.
The films of Czech filmmaker Karel Zeman exemplify another montage
method based on compositing which I will call stylistic montage. In a career
which spanned from the 1940s to the 1980s Zeman used a variety of special effect
techniques to create juxtapositions of stylistically diverse images in different
media. Zeman juxtaposes different media both in time, cutting from a live action
shot to a shot of a model or documentary footage, and within the same shot. For
example, a shot may combine filmed human figures, an old engraving used for
background, and a model. Of course, such artists as Picasso, Braque, Picabia and
Max Ernst were creating similar juxtaposition of elements in different media in
still images before World War II. However, in the realm of a moving
image stylistic montage only came to the surface in the 1990s when the computer
became the meeting ground for different generations of media formats used in the
twentieth century — 35 mm and 8 mm film, amateur and professional video, and
early digital film formats. While previously filmmakers usually worked with a
single format throughout the whole film, the accelerated replacement of different
analog and digital formats since the 1970s made the co-existence of stylistically
diverse elements the norm rather than the exception for new media objects.
Compositing can be used to hide this diversity — or it can be used to foreground
it, as well as to create it artificially. For instance, the film Forrest Gump strongly
emphasizes stylistic differences between various shots; this simulation of different
film and video artifacts is an important aspect of its narrative system.
In Zeman's films such as Baron Prášil (“Baron Munchausen,” 1961) and
Na kometě (“On the Comet,” 1970), live action footage, etchings, miniatures and
other elements are layered together in a self-conscious and ironic way. Like
Rybczynski, Zeman keeps the coherent perspectival space in his films while
making us aware that it is constructed. One of his devices is to superimpose
filmed actors over an old etching used as a background. In Zeman’s aesthetics
neither the graphic nor the cinematographic dominates; the two are blended together in
equal proportion creating a unique visual style. At the same time, Zeman
subordinates the logic of feature filmmaking to the logic of animation. That is, the
shots in his films which combine live action footage with graphic elements
position all elements on parallel planes; the elements move parallel to the screen.
This is the logic of an animation stand where the stack of images is arranged
parallel to each other, rather than of live action cinema where the camera typically
moves through 3D space. As we will see in the “Digital Cinema” section, this
subordination of live action to animation is the logic of digital cinema in general.
Young St. Petersburg artist Olga Tobreluts, who does use digital
compositing, also respects the illusion of a coherent perspectival space, while
continuously playing tricks with it. In "Gore ot Uma"(1994; directed by Olga
Komarova), a video work based on a famous play written by the nineteenth
century Russian writer Aleksandr Griboedov, Tobreluts overlays images
representing radically different realities (a close-up of plants; animals in the Zoo)
on the windows and walls of various interior spaces. In one shot, two characters
converse in front of a window behind which we see a flock of soaring birds taken
from Alfred Hitchcock's "The Birds"; in another, a delicate computer-rendered
design keeps morphing on the wall behind a dancing couple. In these and similar
shots Tobreluts aligns the two realities in perspective but not in scale. The result
is an ontological montage — and also a new kind of montage within a shot.
Which is to say, if the 1920s avant-garde, and MTV in its wake, juxtaposed
radically different realities within a single image, and if Hollywood digital artists
use computer compositing to glue different images into a seamless illusionistic
space (for instance, synthetic dinosaurs composited against filmed landscape in
"Jurassic Park"), Zeman, Rybczynski and Tobreluts explore the creative space
between these two extremes. This space in between modernist collage and
Hollywood cinematic realism is a new direction for cinema ready to be further
explored with the help of digital compositing.
Teleaction
Representation versus Communication
Teleaction, the third operation which I will discuss in this chapter, may appear to
be qualitatively different from the first two, selecting and compositing. It is not
employed to create new media, but only to access it. Therefore we may at
first think that teleaction does not have a direct effect on the language of new
media.
Of course, this operation is made possible by the designers of computer
hardware and software. For instance, numerous Web cameras allow the users to
observe remote locations; most Web sites also include hyperlinks which allow the
user to “teleport” from one remote server to another. At the same time, in the case
of many commercial sites, the designer aims to prevent a user from leaving
the site for as long as possible. To use the industry lingo (circa 1999), a designer
wants to make each user “hardcore” (making the user stay on the site); the goal of
commercial Web design is to create “stickiness” (a measure of how long an
individual user stays on a particular Web site), and to increase “eyeball hang
time” (Web-site loyalty). So while it is the end user who is employing the
operation of teleaction, it is the designer who makes it (im)possible. Still, no new
media objects are being generated when the user follows a hyperlink to another
Web site, or uses telepresence to observe or act in a remote location, or
communicates in real time with other users using Internet chat, or just makes a
plain old-fashioned telephone call. In short, once we begin dealing with verbs and
nouns which start with “tele,” we no longer deal with the traditional cultural
domain of representation. Instead, we enter a new conceptual space which this
book has not explored so far — telecommunication. How can we start navigating
it?
When we think of the decade of the 1890s, we think of the birth of
cinema. In the preceding decades, and the one which immediately followed the
1890s, most other modern media technologies were developed, enabling the
recording of still images of visible reality (photography) and sound (the
phonograph), as well as real-time transmission of images, sounds, and text
(telegraph, television, the fax, telephone and radio). Yet, more than any of these
other inventions, it was the introduction of cinema which impressed itself most
strongly on public memory. The year which we remember and celebrate is 1895;
it is not 1875 (first television experiments of Carey) or 1907 (the introduction of
the fax). Clearly, we are more impressed (or at least, we have been until the
Internet) with modern media's ability to record aspects of reality and then use
these recordings to simulate it for our senses, than with its real-time
communication aspect. If we had a choice to be among the Lumières' first
audience or to be among the first users of the telephone, we would choose the
former. Why? The reason is that the new recording technologies led to the
development of new arts in the way that real-time communication did not. The
fact that aspects of sensible reality can be recorded and that these recordings can
be later combined, re-shaped and manipulated — in short, edited — made
possible the new media-based arts which were soon to dominate the twentieth
century: fiction films, radio concerts and music programs, television serials and
news programs. Despite persistent experiments of the avant-garde artists with
modern technologies of real-time communication — radio in the 1920s, video in
the 1970s, Internet in the 1990s — the ability to communicate over a physical
distance in real-time by itself did not seem to inspire fundamentally new aesthetic
principles the way film or tape recording did.
Since their beginning in the nineteenth century, modern media
technologies have developed along two distinct trajectories. The first is
representational technologies: film, audio and video magnetic tape, various digital
storage formats. The second is real-time communication technologies, i.e.
everything which begins with “tele”: telegraph, telephone, telex, television,
telepresence. Such new twentieth century cultural forms as radio and later
television emerge at the intersections of these two trajectories. In this meeting, the
technologies of real-time communication became subordinated to technologies of
representation. Telecommunication was used for distribution, as with
broadcasting which enabled a twentieth century radio listener or television viewer
to receive a transmission in real time. But a typical program being broadcast, be it
a film, a play or a musical performance, was a traditional aesthetic object, i.e. a
construction which utilizes elements of familiar reality and which was created by
professionals before the transmission. For instance, although following the
adoption of video tape recorders television retained some live programs such as
news and talk shows, the majority of programming came to be pre-recorded.
The attempts of some artists from the 1960s onward to substitute a
traditionally defined aesthetic object by other concepts such as “process,”
“practice,” and “concept” only highlight the strong hold of the traditional concept
on our cultural imagination. The concept of an aesthetic object as an object, i.e. as
a self-contained structure limited in space and/or time, is fundamental to all
modern thinking about aesthetics. For instance, in his Languages of Art (1976),
one of the most influential aesthetic theories of the last decades, philosopher
Nelson Goodman names the following four symptoms of the aesthetic: syntactic
density, semantic density, syntactic repleteness and the ability to exemplify.172
These characteristics assume a finite object in space and/or time: a literary text, a
musical or dance performance, a painting, a work of architecture. For another
example of how modern aesthetic theory relies on the concept of a fixed object we
can look at the very influential article “From Work to Text” by Roland Barthes. In
this article Barthes establishes an opposition between a traditional notion of a
“work” and a new notion of “text,” about which he advances seven
“propositions.”173 As can be seen from these propositions, Barthes’s notion of a
“text” is an attempt to go beyond the traditional aesthetic object understood as
something clearly delineated from other objects semantically and physically —
and yet ultimately Barthes retains the traditional concept. Proposition (1) states:
“The work can be held in hand, the text is held in language, only exists in the
movement of discourse.” “Text” is ruled by metonymy (3) (think of
hyperlinking); it aims at dissemination of meanings and is fundamentally
intertextual (4) (recall another quote from Barthes already cited in the “Selection”
section); it does not have a single Author (5); it “requires that one try to abolish
(or at the very least to diminish) the distance between writing and reading” (6),
the distance which, as Barthes notes, is a recent historical invention. Like a post-
serial musical score which makes a performer into its co-author, “text” “asks of
the reader a practical collaboration” (6). Given this last proposition in particular,
many interactive new media objects qualify as “texts” in Barthes’s definition. Yet
his notion of a “text” still assumes a reader “reading,” in the most general sense,
something which was previously “written.” In short, while a “text” is interactive,
hypertextual, distributed, and dynamic (to translate Barthes’s propositions into
new media terms), it is still a finite object.
By foregrounding telecommunication, both real-time and asynchronous, as
a fundamental cultural activity, the Internet asks us to reconsider the very paradigm
of what an aesthetic object is. Is it necessary for the concept of aesthetics to
assume representation? Does art necessarily involve a finite object? Can
telecommunication between users by itself be a subject of an aesthetic? Similarly,
can the user’s search for information be understood aesthetically? In short, if a
user accessing information and a user telecommunicating with other(s) are as
common in computer culture as a user interacting with a representation, can we
expand our aesthetic theories to include these two new situations?
I find these to be hard questions; but as a way to begin approaching them,
this section will offer an analysis of different kinds of “tele” operations which I
sum up with my own term “teleaction.”
Telepresence: Illusion versus Action
In the opening sequence of the movie Titanic (James Cameron, 1997), we see an
operator sitting at the controls. The operator is wearing a head-mounted display
which shows an image transmitted from a remote location. This allows him to
remotely control a small vehicle, and with its help, to explore the insides of the
“Titanic” lying on the bottom of the ocean. In short, the operator becomes
"telepresent."
With the rise of the Web, telepresence, which until recently was restricted
to a few specialized industrial and military applications, became more of a familiar
experience. A search on Yahoo! for "interesting devices connected to the Net"
returns links to a variety of Net-based telepresence applications: coffee machines,
robots, an interactive model railroad, audio devices and, of course, the ever-popular
web cams.174 Some of these devices, such as most web cams, do not allow for
true telepresence — you get images from a remote location but you can't act.
Others, however, are true telepresence links, which allow the user to perform
actions remotely.
Remote video cameras and remotely navigated devices such as the one
shown in Titanic exemplify the notion of being “present” in a physically remote
location. At the same time, the experience of daily navigating the Web also
involves telepresence on a more basic level. By following hyperlinks, the user
“teleports” from one server to another, from one physical location to the next. So
if we are still fetishising video-based telepresence as portrayed in “Titanic,” this is
only because we are slow to accept the primacy of information space over
physical space in computer culture. But in fact the ability to instantly “teleport”
from one server to another, to be able to explore a multitude of documents located
on computers around the world, all from one location, is much more important
than being able to perform physical actions in one remote location.
Following my strategy in the “Compositing” section, where I focused on
digital compositing of moving images as an example of the general operation of
compositing, this section will discuss telepresence in its accepted, more narrow
meaning: the ability to see and act at a distance. And just as I constructed one
possible archeology of digital compositing, here I would like to construct one
possible historical trajectory leading to computer-based telepresence. If digital
compositing can be placed along with other technologies for creating fake reality
such as fashion and makeup, realist paintings, dioramas, military decoys and VR,
telepresence can be thought of as one example of representational technologies
used to enable action, i.e. to allow the viewer to manipulate reality through
representations. Other examples of these action-enabling technologies are maps,
architectural drawings, and x-rays. All of them allow their user to act over
distance. Given this, what are the new possibilities for action offered by
telepresence in contrast to these older technologies? This question will guide my
discussion of telepresence here.
If we look at the word itself, the meaning of the term telepresence is
presence over distance. But presence where? Interactive media designer and
theorist Brenda Laurel defines telepresence as "a medium that allows you to take
your body with you into some other environment... you get to take some subset of
your senses with you into another environment. And that environment may be a
computer-generated environment, it may be a camera-originated environment, or
it may be a combination of the two."175 In this definition, telepresence
encompasses two different situations: being "present" in a synthetic computer-
generated environment (what is commonly referred to as virtual reality) and being
"present" in a real remote physical location via a live video image. Scott Fisher,
one of the developers of NASA Ames Virtual Environment Workstation — the
first modern VR system — similarly does not distinguish between being "present"
in a computer-generated or a real remote physical location. Describing the Ames
system, he writes: "Virtual environments at the Ames system are synthesized
with 3D computer-generated imagery, or are remotely sensed by user-controlled,
stereoscopic video camera configurations."176 Fisher uses "virtual environments"
as an all-encompassing term, reserving "telepresence" for the second situation:
"presence" in a remote physical location.177 I will follow his usage here.
Both popular media and the critics have downplayed the concept of
telepresence in favor of virtual reality. The photographs of the Ames system, for
instance, have been often featured to illustrate the idea of an escape from any
physical space into a computer-generated world. The fact that a head-mounted
display can also show a televised image of a remote physical location was hardly
ever mentioned.
And yet, from the point of view of the history of the technologies of
action, telepresence is a much more radical technology than virtual reality, or
computer simulations in general. Let us consider the difference between the two.
Like fake reality technologies which preceded it, virtual reality provides
the subject with the illusion of being present in a simulated world. Virtual reality
adds a new capability: it allows the subject to actively change this world. In other
words, the subject is given control over a fake reality. For instance, an architect
can modify an architectural model, a chemist can try different molecule
configurations, a tank driver can shoot at a model of a tank, and so on. But what is
modified in each case is nothing but data stored in a computer's memory! The
user of any computer simulation has power over the virtual world which only
exists inside a computer.
Telepresence allows the subject to control not just the simulation but
reality itself. Telepresence provides the ability to remotely manipulate physical
reality in real time through its image. The body of a teleoperator is transmitted, in
real time, to another location where it can act on the subject's behalf: repairing a
space station, doing underwater excavation or bombing a military base in
Baghdad or Yugoslavia.
Thus, the essence of telepresence is that it is anti-presence. I don't have to
be physically present in a location to affect reality at this location. A better term
would be teleaction. Acting over distance. In real time.
Catherine the Great was fooled into mistaking painted facades for real
villages (see “Compositing”). Today, from thousands of miles away, as was
demonstrated during the Gulf War, we can send a missile equipped with a
television camera close enough to tell the difference between a target and a decoy.
We can direct the flight of the missile using the image transmitted back by its
camera, we can carefully fly towards the target. And, using the same image, we
blow the target away. All that is needed is to position the computer cursor over
the right place in the image and to press a button.
Image-Instruments
How new is this use of images?178 Does it originate with telepresence? Since we
are accustomed to consider the history of visual representations in the West in
terms of illusion, it may seem that to use images to enable action is a completely
new phenomenon. However, French philosopher and sociologist Bruno Latour
proposes that certain kinds of images have always functioned as instruments of
control and power, power being defined as the ability to mobilize and manipulate
resources across space and time.
One example of such image-instruments analyzed by Latour is the
perspectival image. Perspective establishes the precise and reciprocal
relationship between objects and their signs. We can go from objects to signs
(two-dimensional representations); but we can also go from such signs to three-
dimensional objects. This reciprocal relationship allows us not only to represent
reality but also to control it.179 For instance, we cannot measure the sun in space
directly, but we only need a small ruler to measure it on a photograph (the
perspectival image par excellence).180 And even if we could fly around the sun,
we would still be better off studying the sun through its representations which we
can bring back from the trip — because now we have unlimited time to measure,
analyze, and catalog them. We can move objects from one place to another by
simply moving their representations: "You can see a church in Rome, and carry it
with you in London in such a way as to reconstruct it in London, or you can go
back to Rome and amend the picture." Finally, we can also represent absent things
and plan our movement through space by working on representations: "One
cannot smell or hear or touch Sakhalin Island, but you can look at the map and
determine at which bearing you will see the land when you send the next
fleet."181 All in all, perspective is more than just a sign system, reflecting reality
— it makes possible the manipulation of reality through the manipulation of its
signs.
Perspective is only one example of image-instruments. Any representation
which systematically captures some features of reality can be used as an
instrument. In fact, most types of representations which do not fit into the history
of illusionism (which includes both representation and simulation traditions as
outlined in “Screen” section) — diagrams and charts, maps and x-rays, infrared
and radar images — belong to the second history: that of representations as
instruments for action.
Telecommunication
Given that images have always been used to affect reality, does telepresence bring
anything new? A map, for instance, already allows for a kind of teleaction: it can
be used to predict the future and therefore to change it. To quote Latour again,
"one cannot smell or hear or touch Sakhalin Island, but you can look at the map
and determine at which bearing you will see the land when you send the next
fleet."
In my view, there are two fundamental differences. Because telepresence
involves electronic transmission of video images, the construction of
representations takes place instantaneously. Making a perspectival drawing or a
chart, taking a photograph or shooting film takes time. Now I can use a remote
video camera which captures images in real time, sending these images back to me
without any delay. This allows me to monitor any visible changes in a remote
location (weather conditions, movements of troops, and so on), adjusting my
actions accordingly. Depending upon what information I need, radar can be used
instead of a video camera as well. In either case, an image-instrument displayed
by a real-time screen (see “Screen” section) is formed in real time.
The second difference is directly related to the first. The ability to receive
visual information about a remote place in real time allows us to manipulate
physical reality in this place, also in real-time. If power, according to Latour,
includes the ability to manipulate resources at a distance, then teleaction provides
a new and unique kind of power: real-time remote control. I can drive a toy
vehicle, repair a space station, do underwater excavation, operate on a patient or
kill — all from a distance.
What technology is responsible for this new power? Since a teleoperator
typically acts with the help of a live video image (for instance, when remotely
operating a moving vehicle such as in the opening sequence of "Titanic"), we may
think at first that it is the technology of video, or, more precisely, of television.
The original nineteenth century meaning of television was "vision over distance."
Only after the 1920s, when television was equated with broadcasting, did this
meaning fade away. However, during the preceding half a century (television
research begins in the 1870s), television engineers were mostly concerned with
the problem of how to transmit consecutive images of a remote location to enable
"remote seeing."
If images are transmitted at regular intervals, if these intervals are short
enough, and if the images have sufficient detail, the viewer will have enough
reliable information about the remote location for teleaction. The early television
systems used slow mechanical scanning and resolutions as low as thirty lines.
In the case of modern television systems, the visible reality is being scanned at the
resolution of a few hundred lines sixty times a second. This provides enough
information for most telepresence tasks.
Now, consider the Telegarden project by Ken Goldberg and his
associates.182 In this Web telerobotics project, the Web users operate a robotic
arm to plant the seeds in a garden. Instead of continuously refreshed video, the
project uses user-driven still images. The image shows the garden from the
viewpoint of the video camera attached to the robotic arm. When the arm is
moved to a new location, a new still image is transmitted. These still images
provide enough information for the particular teleaction in this project — planting
the seeds.
As this example indicates, it is possible to teleact without video. More
generally, we can say that different kinds of teleaction require different temporal
and spatial resolution. If the operator needs immediate feedback on her actions
(the example of remote operation of a vehicle is again appropriate here), frequent
update of images is essential. But in the case of planting a garden using a remote
robot arm, user-triggered still images are sufficient.
Now, consider another example of telepresence. Radar images are
obtained by scanning the surrounding area once every few seconds. The visible
reality is reduced to a single point. A radar image does not contain any indications
about shapes, textures or colors present in a video image — it only records the
position of an object. Yet this information is quite sufficient for the most basic
teleaction: to destroy an object.
In this extreme case of teleaction, the image is so minimal it can hardly be
called an image at all. However, it is still sufficient for real-time remote action.
What is crucial is that the information is transmitted in real time.
If we put the examples of video-based and radar-based telepresence
together, the common denominator turns out to be not video but electronic
transmission of signals. In other words, the technology which makes teleaction in
real time possible is electronic telecommunication. It itself was made possible by
two discoveries of the nineteenth century: electricity and electromagnetism.
Coupled with a computer used for real time control, electronic telecommunication
leads to a new and unprecedented relationship between objects and their signs. It
makes instantaneous not only the process by which objects are turned into signs
but also the reverse process — manipulation of objects through these signs.
Umberto Eco once defined a sign as something which can be used to tell a
lie. This definition correctly describes one function of visual representations — to
deceive. But in the age of electronic telecommunication we need a new definition:
a sign is something which can be used to teleact.
Distance and Aura
Having analyzed the operation of telepresence in its narrower and
conventional meaning as a physical presence in a remote environment, I now
want to come back to a more general sense of telepresence: real-time
communication with a physically remote location. This meaning fits all “tele”
technologies, from television, radio, fax and telephone to Internet hyperlinking
and chat. Again, I want to ask the same question as before: what is different about
more recent telecommunication technology as opposed to older ones?
To address this question I will juxtapose the arguments by two key
theoreticians of old and new media: Walter Benjamin and Paul Virilio. These
arguments come from two essays written half a century apart: Benjamin’s
celebrated "The Work of Art in the Age of Mechanical Reproduction" (1936)183
and Virilio’s "Big Optics" (1992).184 Benjamin’s and Virilio’s essays focus on
the same theme: the disruption caused by a cultural artifact, specifically, new
communication technology (film in the case of Benjamin, telecommunication in
the case of Virilio) in the familiar patterns of human perception; in short,
intervention of technology into human nature. But what is human nature and what
is technology? How does one draw the boundary between the two in the twentieth
century? Both Benjamin and Virilio solve this problem in the same way. They
equate nature with spatial distance between the observer and the observed; and
they see technologies as destroying this distance. As we will see, these two
assumptions lead them to interpret the prominent new technologies of their times
in a very similar way.
Benjamin starts with his now famous concept of aura: the unique presence
of a work of art, of a historical or of a natural object. We may think that an object
has to be close by if we are to experience its aura but, paradoxically, Benjamin
defines aura "as the unique phenomenon of a distance"(224). "If, while resting on
a summer afternoon, you follow with your eyes a mountain range on the horizon
or a branch which casts its shadow over you, you experience the aura of those
mountains, of that branch" (225). Similarly, writes Benjamin, "painter maintains
in his work a natural distance from reality" (235). This respect for distance
common to both natural perception and painting is overturned by the new
technologies of mass reproduction, particularly photography and film. The
cameraman, whom Benjamin compares to a surgeon, "penetrates deeply into its
[reality] web" (237); his camera zooms in order to "pry an object from its shell"
(225). With its new mobility, glorified in such films as "A Man with the Movie
Camera," the camera can be anywhere, and, with its superhuman vision, it can
obtain a close-up of any object. These close-ups, writes Benjamin, satisfy the
desires of the masses "to bring things 'closer' spatially and humanly," "to get hold
of an object at very close range" (225). Along with disregarding scale, the
unique locations of the objects are discarded as well, as their photographs are brought
together within a single picture magazine or a film newsreel, the forms which fit
in with the demand of mass democratic society for "the universal equality of
things."
Writing about telecommunication and telepresence, Virilio also uses the
concept of distance to understand their effect. In Virilio's reading, these
technologies collapse the physical distances, uprooting the familiar patterns of
perception which grounded our culture and politics. Virilio introduces the terms
Small Optics and Big Optics to underline the dramatic nature of this change.
Small Optics is based on geometric perspective and shared by human vision,
painting and film. It involves the distinctions between near and far, between an
object and a horizon against which the object stands out. Big Optics is real-time
electronic transmission of information, "the active optics of time passing at the
speed of light."
As Small Optics is being replaced by Big Optics, the distinctions
characteristic of the Small Optics era are erased. If information from any point can be
transmitted with the same speed, the concepts of near and far, horizon, distance
and space itself no longer have any meaning. So, if for Benjamin the industrial
age displaced every object from its original setting, for Virilio the post-industrial age
eliminates the dimension of space altogether. At least in principle, every point on
Earth is now instantly accessible from any other point on Earth. As a
consequence, Big Optics locks us in a claustrophobic world without any depth or
horizon; the Earth becomes our prison.
Virilio asks us to notice "the progressive derealization of the terrestrial
horizon,...resulting in an impending primacy of real time perspective of
undulatory optics over real space of the linear geometrical optics of the
Quattrocento."185 He mourns the destruction of distance, geographic grandeur,
the vastness of natural space, the vastness which guaranteed time delay between
events and our reactions, giving us time for critical reflection necessary to arrive
at a correct decision. The regime of Big Optics inevitably leads to real time
politics, the politics which requires instant reactions to the events transmitted with
the speed of light, and which ultimately can only be efficiently handled by
computers responding to each other.
Given the surprising similarity of Benjamin's and Virilio's accounts of new
technologies, it is telling how differently they draw the boundaries between
natural and cultural, between what is already assimilated within the human nature
and what is still new and threatening. Writing in 1936, Benjamin uses the real
landscape and a painting as examples of what is natural for human perception.
This natural state is invaded by film which collapses distances, bringing
everything equally close and destroying aura. Virilio, writing half a century later,
draws lines quite differently. If for Benjamin film still represented an alien
presence, for Virilio it has already become part of our human nature, the continuation
of our natural sight. Virilio considers human vision, the Renaissance perspective,
painting and film as all belonging to the Small Optics of geometric perspective in
contrast to the Big Optics of instant electronic transmission.
Virilio postulates a historical break between film and telecommunication,
between Small Optics and Big Optics. It is also possible to read the movement
from the first to the second in terms of continuity — if we are to use the concept
of modernization. Modernization is accompanied by the process of disruption of
physical space and matter, the process which privileges interchangeable and
mobile signs over the original objects and relations. In the words of art
historian Jonathan Crary (who draws on Deleuze and Guattari's Anti-Oedipus and
on Marx's Grundrisse), "Modernization is the process by which capitalism
uproots and makes mobile that which is grounded, clears away or obliterates that
which impedes circulation, and makes exchangeable what is singular."186 The
concept of modernization fits equally well Benjamin's account of film and
Virilio's account of telecommunication, the latter just being a more advanced
stage in this continual process of turning objects into mobile signs. Before,
different physical locations met within a single magazine spread or a film
newsreel; now, they meet within a single electronic screen. Of course, the signs
now themselves exist as digital data which makes their transmission and
manipulation even easier. Also, in contrast to photographs, which remain fixed
once they are printed, computer representation makes every image inherently
mutable — creating signs which are no longer just mobile but also forever
modifiable.187 Yet, significant as they are, these are ultimately quantitative rather
than qualitative differences — with one exception.
As can be seen from my discussion above, in contrast to photography and
film, electronic telecommunication can function as two-way communication. Not
only can the user immediately obtain images of various locations, bringing them
together within a single electronic screen, but, via telepresence, she can also be
"present" in these locations. In other words, she can effect change on material
reality over physical distance in real time.
Film, telecommunication, telepresence. Benjamin's and Virilio's analyses
make it possible for us to understand the historical effect of these technologies in
terms of progressive diminishing and finally complete elimination of something
which both writers see as a fundamental condition of human perception — spatial
distance, the distance between the subject who is seeing and the object being seen.
This reading of distance involved in (perspectival) vision as something positive,
as a necessary ingredient of human culture, provides an important alternative to a
much more dominant tendency in modern thought to read distance negatively.
This negative reading is then used to attack the visual sense as a whole. Distance
becomes responsible for creating the gap between the spectator and spectacle, for
separating subject and object, for putting the first in the position of transcendental
mastery and rendering the second inert. Distance allows the subject to
treat the Other as object; in short, it makes objectification possible. Or, as a
French fisherman summarized these arguments to the young Lacan, who was
looking at a sardine can floating on the surface of the sea, years before he became
a famous psychoanalyst: "You see the can? Do you see it? Well, it doesn't see
you!"188
In Western thought, vision has always been understood and discussed in
opposition to touch; so, inevitably, the denigration of vision (to use Martin Jay's
term189) leads to the elevation of touch. Thus criticism of vision predictably
leads to the new theoretical interest in the idea of the haptic. We may be tempted,
for instance, to read the lack of distance characteristic of the act of touching as
allowing for a different relationship between subject and object. Benjamin and
Virilio block this seemingly logical line of argument since they both stress the
aggression potentially present in touching. Rather than understanding touch as a
respectful and careful contact or as a caress, they present it as unceremonious and
aggressive disruption of matter.
Thus, the standard connotations of vision and touch become reversed. For
Benjamin and Virilio, distance guaranteed by vision preserves the aura of an
object, its position in the world, while the desire "to bring things 'closer'"
destroys objects' relations to each other, ultimately obliterating the material order
altogether and rendering the notions of distance and space meaningless. So even if
we are to disagree with their arguments about new technologies and to question
their equation of natural order and distance, the critique of the vision-touch
opposition is something we should retain. Indeed, in contrast to older
action-enabling representational technologies, real-time image-instruments literally
allow us to touch objects over distance, thus making possible their easy
destruction as well. The potential aggressivity of looking turns out to be rather
more innocent than the actual aggression of electronically-enabled touch.
IV. The Illusions
Zeuxis was a legendary Greek painter who lived in the fifth century BC. The story
of his competition with Parrhasius exemplifies the concern with illusionism which
was to occupy Western art throughout much of its history. According to the story,
Zeuxis painted grapes with such skill that birds began to fly down, trying to
eat from the painted vine.190
RealityEngine is a high-performance graphics computer which was
manufactured by Silicon Graphics Inc. in the last decade of the twentieth century
AD. Optimized to generate real-time interactive photorealistic 3D graphics, it is
used to create computer games and special effects for feature films and TV, to run
scientific visualization models and computer-aided design software. Last but not
least, RealityEngine is routinely employed to run high-end VR environments —
this latest achievement in the West's struggle to outdo Zeuxis.
In terms of the images it can generate RealityEngine may not be superior
to Zeuxis. Yet it can do other tricks, unavailable to the Greek painter. For
instance, it allows the viewer to move around virtual grapes, touch them, lift them
in the palm of a hand. And this ability of a viewer to interact with a representation
may be as important in contributing to the overall reality effect as the images
themselves. Which makes RealityEngine a formidable contender to Zeuxis.
In the twentieth century art has largely rejected the goal of illusionism, the
goal which was so important to it before, and, as a consequence, it lost much of its
popular support. The production of illusionistic representations became the
domain of mass culture and of media technologies — photography, film and
video. The creation of illusions was delegated to optical and electronic machines.
Today, everywhere, these machines are being replaced by new, digital
illusion generators — computers. The production of all illusionistic images is
becoming the sole province of PCs and Macs, Onyxes and RealityEngines.191
This massive replacement is one of the key economic factors which keeps
the new media industries expanding. As a consequence, these industries are
obsessed with visual illusionism. This obsession is particularly strong in the field
of computer imaging and animation. Its annual SIGGRAPH convention is the
competition between Zeuxis and Parrhasius on an industrial scale: about 40,000
people gather on a trade floor around thousands of new hardware and software
displays, all competing with each other to deliver the best illusionistic images.
The industry frames each new technological advance in image acquisition and
display in terms of the ability of computer technologies to catch up with and surpass
the visual fidelity of analog media technologies. On their side, animators and
software engineers are perfecting the techniques for synthesizing photorealistic
images of sets and human actors. The quest for a perfect simulation of reality
drives the whole field of Virtual Reality (VR). In a different sense, the designers
of human-computer interfaces are also concerned with illusion. Many of them
believe that their main goal is to make the computer invisible, i.e. to construct an
interface which is completely “natural.” (In reality, what they usually mean by
“natural” is simply older, already assimilated technologies, such as office
stationery and furniture, a car, VCR controls, or a telephone.)
Continuing our bottom-up trajectory in examining new media, we have
now arrived at the level of appearance. Although the industry’s obsession with
illusionism is not the sole factor responsible for making new media look the way
they do, it is definitely one of the key factors. Focusing on the issue of illusionism, the
sections of this chapter address different questions raised by it. How is the “reality
effect” of a synthetic image different from that of the optical media? Is computer
technology redefining our standards of illusionism as determined by our earlier
experience with photography, film and video? “Synthetic Realism as Bricolage”
and “Synthetic Image and its Subject” provide two possible answers to these
questions. These sections investigate the new “internal” logic of a computer-
generated illusionistic image by comparing lens-based and computer imaging
technologies. The third section, “Illusion, Narrative, and Interactivity,” asks how
visual illusionism and interactivity work together (as well as against each other),
in virtual worlds, computer games, military simulators and other interactive
new media objects and interfaces.
The discussions in these sections do not by any means exhaust the topic of
illusionism in new media. The “Compositing” and “Digital Cinema” sections in the
preceding and last chapters, respectively, deal with this topic from other
perspectives. As an example of other interesting questions which the topic of
illusionism in new media may generate, I will list three below.
1. A parallel can be established between the gradual turn of computer imaging
towards representational and photorealistic images (“photorealistic” is the
industry term for synthetic images which look as though they were created using
traditional photography or cinematography) from the end of the 1970s to the
beginning of the 1980s and the similar turn towards representational painting
and photography in the art world during the same period.192 In the art world we
witness photorealism, neo-expressionism, “post-modern” “simulation” photography.
In the computer world,
during the same period, we may note the rapid development of the key algorithms
for photorealistic 3D image synthesis such as Phong shading, texture mapping,
bump mapping, reflection mapping and cast shadows; also the development of
the first paint programs in the mid-1970s, which allowed manual creation of
representational images and eventually, towards the end of the 1980s, software
such as Photoshop which, for a while, made a manipulated photograph the most
common type of imagery created on a computer. In contrast, from the 1960s until
the late 1970s computer imaging was mostly abstract because it was algorithm-driven
and the technologies for inputting photographs into a computer were not easily
accessible.193 Similarly, the art world
was either dominated by non-representational movements, such as conceptual art,
minimalism and performance, or at least was approaching representation with a
strong sense of irony and distance, in the case of pop art. (Although it is possible
to argue that 1980s “simulation” artists also used “appropriated” images
ironically, in their case the distance between the media and artists’ images
became visually very small or non-existent.)
2. In the twentieth century, a very particular-looking image created by still
photography and cinematography came to dominate modern visual culture. Some
of its qualities are linear perspective, depth of field effect (so only a part of 3D
space is in focus), particular tonal and color range, and motion blur (rapidly
moving objects appear smudged). As I will discuss in the following two sections,
considerable research had to be accomplished before it became possible to
simulate all these visual artifacts with computers. And even armed with special
software, the designer still has to spend significant time manually recreating the
look of photography or film. In other words, computer software does not produce
such images by default. The paradox of digital visual culture is that while all imaging
is shifting towards being computer-based, the dominance of photographic and
cinematic looking images is becoming even stronger. But rather than being a
direct, “natural” result of photo and film technology, these images are constructed
on computers. 3D virtual worlds are subjected to depth of field and motion blur
algorithms; digital video is run through special filters which simulate film
grain; and so on.
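To make the point concrete, here is a minimal sketch, in Python, of how two of these photographic artifacts might be constructed rather than recorded. It is an illustration under stated assumptions, not the algorithm of any particular package: the image is a single row of grayscale pixels, motion blur is approximated by averaging several sub-frame positions of a moving dot, and film grain by adding small random noise. The function names `motion_blur` and `add_grain` are hypothetical.

```python
import random

def motion_blur(positions, width):
    """Average several sub-frame renderings of a moving one-pixel dot."""
    frame = [0.0] * width
    for x in positions:
        # Each sub-frame contributes an equal share of the exposure.
        frame[x] += 1.0 / len(positions)
    return frame

def add_grain(frame, strength, seed=0):
    """Add zero-mean random noise to each pixel, clamped to [0, 1]."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    return [min(1.0, max(0.0, v + rng.uniform(-strength, strength)))
            for v in frame]

# A dot that stays at pixel 2 renders sharp; a dot that crosses
# pixels 2..5 during the exposure is smeared over four pixels.
sharp = motion_blur([2], width=8)
blurred = motion_blur([2, 3, 4, 5], width=8)
grainy = add_grain(blurred, strength=0.05)
```

Neither effect arises by default: the "smudge" and the "grain" exist only because a procedure was written to add them.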
While visually, these computer-generated or filtered images are
indistinguishable from traditional photo and film images, on the level of
“material,” they are quite different as they are made from pixels or represented by
mathematical equations and algorithms. In terms of the kinds of operations which
can be performed on them, they are also quite different from images of
photography and film. These operations, such as “copy and paste,” “add,”
“multiply,” “compress,” “filter,” reflect first of all the logic of computer
algorithms and of the human-computer interface; only secondarily do they refer to the
dimensions inherently meaningful to human perception. (In fact, we can think of
these operations as well as HCI in general as balancing between the two poles of
computer logic and human logic, by which I mean the everyday ways of
perception, cognition, causality and motivation — in short, human everyday
existence.)
Other aspects of the new logic of computer images can be derived from
the general principles of new media (see “Principles of New Media”): many
operations involved in their synthesis and editing are automated; they typically
exist in many versions; they include hyperlinks; they act as interactive interfaces
(thus an image is something we expect to enter rather than to stay on its surface);
and so on. To summarize, the visual culture of a computer age is cinematographic
in its appearance, digital on the level of its material, and computational (i.e.,
software driven) in its logic. What are the interactions between these three levels?
Can we expect that cinematographic images (I use this phrase here to include
images of both traditional analog and computer-simulated cinematography and
photography) will be at some point replaced by some very different images
whose appearance will be more in tune with their underlying computer-based
logic?
My own feeling is that the answer to this question is no. Cinematographic
images are very efficient for cultural communication. Since they share many
qualities with natural perception, they are easily processed by the brain. Their
similarity to “the real thing” allows the designers to provoke emotions in viewers,
as well as effectively visualize non-existent objects and scenes. And since
computer representation turns these images into numerically coded data which is
discrete (pixels) and modular (layers), they become subject to all economically
beneficial effects of computerization: algorithmic manipulation, automation,
variability and so on. A digitally-coded cinematographic image thus has two
identities, so to speak: one satisfies the demands of human communication,
another makes it suitable for computer-based practices of production and
distribution.
3. The available theories and histories of illusion in art and media, from
Gombrich’s Art and Illusion and André Bazin’s “The Myth of Total Cinema” to
Stephen Bann’s The True Vine, only deal with the visual dimensions.194 In my
view, most of these theories have three arguments in common. These arguments
concern three different relationships, respectively: between an image and physical
reality (1); between an image and natural perception (2); between present and past
images (3):
1. Illusionistic images share some features with the represented physical reality
(for instance, the number of an object’s angles).
2. Illusionistic images share some features with human vision (for instance,
linear perspective).
3. Each period offers some new “features” which are perceived by audiences as
an “improvement” over the previous period (for instance, the evolution of
cinema from silent to sound to color).195
Until the arrival of computer media these theories were sufficient since the human
desire to simulate reality indeed focused on its visual appearance (although not
exclusively — think, for instance, of the tradition of automata). Today, while still
useful, the traditional analysis of visual illusionism needs to be supplemented by
new theories. The reason is that the reality effect in many areas of new media
only partially depends on an image’s appearance. Such areas of new media as
computer games, motion simulators, virtual worlds and VR, in particular,
exemplify how computer-based illusionism functions differently. Rather than
utilizing the single dimension of visual fidelity, they construct the reality effect on
a number of dimensions, of which visual fidelity is just one. These new
dimensions include active bodily engagement with a virtual world (for instance,
the user of VR moves the whole body); the involvement of other senses besides
vision (spatialized audio in virtual worlds and games; use of touch in VR;
joysticks with force feedback; special vibrating and moving chairs for computer
game play and motion rides); and the accuracy of the simulation of physical
objects, natural phenomena, anthropomorphic characters and humans.
This last dimension, in particular, calls for an extensive analysis, because
of the variety of methods and subjects of simulation. If the history of illusionism
in art and media largely revolves around the simulation of how things look, for
computer simulation this is one goal among many. Besides visual
appearance, simulation in new media aims to realistically model how objects and
humans act, react, move, grow, evolve, think and feel. Physically-based modeling
is used to simulate the behavior of inanimate objects and their interactions such as
a ball bouncing off the floor or a glass being shattered. Computer games
extensively use physical modeling to simulate collisions between objects and
vehicle behavior — for instance, a car being bounced against the walls of the
racing tracks, or behavior of a plane in a flight simulation. Other methods such as
AL (Artificial Life), formal grammars, fractal geometry and various applications of
complexity theory (popularly referred to as “chaos theory”) are used to simulate
natural phenomena such as waterfalls and ocean waves, and animal
behavior (flocking birds, schools of fish). Yet another important area of simulation
which also relies on many different methods is virtual characters and avatars,
extensively used in movies, games, virtual worlds and human-computer
interfaces. The examples are enemies and monsters in Quake; army units in
WarCraft and similar games; human-like creatures in Creatures and other AL
games and toys; and anthropomorphic interfaces such as Microsoft Office
Assistant in Windows 98 — an animated character which periodically pops out in
a small window offering help and tips. The goal of human simulation in itself can
be further broken into a set of various sub-goals: simulation of human
psychological states, human behavior, motivations, and emotions. (Thus,
ultimately, the fully “realistic” simulation of a human being requires not only
completely fulfilling the vision of the original AI paradigm but also going beyond
it — since original AI was solely aimed at simulating human perception and
thinking processes but not emotions and motivations.) Yet another kind of
simulation involves modeling the dynamic behavior over time of whole systems
composed of organic and/or non-organic elements (for instance, the popular series
of Sim games such as SimCity or SimAnt, which simulate a city and an ant
colony, respectively).
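The simplest of the examples mentioned above, a ball bouncing off the floor, can be sketched in a few lines. This is only an illustrative sketch of physically-based modeling, not the method of any particular game or animation package: it assumes a point mass, constant gravity, a fixed time step, and a single hypothetical `restitution` coefficient that says how much speed survives each impact.

```python
def simulate_bounce(height, dt=0.001, restitution=0.8, gravity=9.81, bounces=3):
    """Return the peak height reached after each of the first `bounces` impacts."""
    y, v = height, 0.0   # position above the floor (m) and vertical velocity (m/s)
    peaks = []
    rising = False
    while len(peaks) < bounces:
        # Step the model forward in time: gravity changes velocity,
        # velocity changes position.
        v -= gravity * dt
        y += v * dt
        if y <= 0.0 and v < 0.0:
            # Impact with the floor: reverse the velocity and lose some speed.
            y = 0.0
            v = -v * restitution
            rising = True
        elif rising and v <= 0.0:
            # The ball has stopped rising: record the top of this bounce.
            peaks.append(y)
            rising = False
    return peaks

peaks = simulate_bounce(1.0)
```

Nothing here describes how the ball looks. The model specifies only how it behaves, and each rebound comes out lower than the last because the rule says so, not because an animator drew it that way.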
And even on the visual dimension — the one dimension which new media
“reality engines” share with the traditional illusionistic techniques — things work
very differently. New media changes our concept of what an image is — because
it turns a viewer into an active user. As a result, an illusionistic image is no longer
something a subject simply looks at, comparing it with her memories of
represented reality in order to judge the reality effect of this image. The new
media image is something the user actively goes into, zooming in or clicking on
individual parts with the assumption that they contain hyperlinks (for instance,
imagemaps in Web sites). Moreover, new media turns most images into image-interfaces
and image-instruments (on the concept of image as interface, see the
“Cultural Interfaces” section; on image-instrument, see the “Teleaction” section).
The image becomes interactive, i.e. it now functions as an interface between a user
and a computer or other devices. The user employs image-interfaces to control a
computer, asking it to zoom into the image or display another one, start a software
application, connect to the Internet, and so on. The user employs image-instruments
to directly affect reality: move a robotic arm in a remote location,
fire a missile, change the speed of a car, set the temperature, and so on. To
evoke the term often used in film theory, new media moves us from identification
to action. What kinds of actions can be performed via an image, how easily they
can be accomplished, their range: all this plays a part in the user’s assessment of the
reality effect of the image.
Synthetic Realism and its Discontents
"Realism" is the concept which inevitably accompanies the development and
assimilation of 3D computer graphics. In media, trade publications and research
papers, the history of technological innovation and research is presented as a
progression toward realism — the ability to simulate any object in such a way that
its computer image is indistinguishable from a photograph. At the same time, it is
constantly pointed out that this realism is qualitatively different from the realism
of optically based image technologies (photography, film), for the simulated
reality is not indexically related to the existing world.
Despite this difference, the ability to generate three-dimensional stills does
not represent a radical break in the history of visual representation of a
magnitude comparable to the achievements of Giotto. A Renaissance painting and
a computer image employ the same technique (a set of consistent depth cues) to
create an illusion of space — existent or imaginary. The real break is the
introduction of a moving synthetic image — interactive 3D computer graphics
and computer animation. With these technologies, a viewer has an experience of
moving around the simulated 3D space — something one can't do with an
illusionistic painting.
In order to better understand the nature of "realism" of the synthetic
moving image it is relevant to consider a contiguous practice of the moving image
— the cinema. I will approach the problem of "realism" in 3D computer
animation starting from the arguments advanced in film theory in regard to
cinematic realism.
This section considers finished 3D computer animations which are created
beforehand and then incorporated in a film, a television program, a Web site or a
computer game. In the case of animations which are being generated by a
computer in real-time, and thus are dependent not only on available software but
also on hardware capabilities, a somewhat different logic applies. An example of a
new media object from the 1990s which uses both types of animation is a typical
computer game. The interactive parts of the game are animated in real time.
Periodically, the game switches to a “full motion video” mode. “Full motion
video” is either a digital video sequence or a 3D animation which was pre-rendered
and therefore has a higher level of detail, and thus “realism,” than the
animations done in real time. The last section of this chapter, “Image, Narrative
and Illusion,” considers how such temporal shifts, which are not limited to games
but are typical of interactive new media objects in general, affect their “realism.”
Technology and Style in Cinema
The idea of cinematic realism is first of all associated with André Bazin, for
whom cinematic technology and style move toward a "total and complete
representation of reality."196 In "The Myth of Total Cinema" Bazin claims that
the idea of cinema existed long before the medium had actually appeared and that
the development of cinema technology "little by little made a reality out of
original 'myth'."197 In this account, the modern technology of cinema is a
realization of an ancient myth of mimesis, just as the development of aviation is a
realization of the myth of Icarus. In another influential essay, "The Evolution of
the Language of Cinema," Bazin reads the history of film style in similar
teleological terms: the introduction of depth of field style at the end of the 1930s and
the subsequent innovations of Italian neorealists in the 1940s gradually bring a
spectator "into a relation with the image closer to that which he enjoys with
reality." The essays differ not only in that the first interprets film technology
while the second concentrates on film style, but also in their distinct approaches to
the problem of realism. In the first essay realism stands for the approximation of
phenomenological qualities of reality, "the reconstruction of a perfect illusion of
the outside world in sound, color and relief."198 In the second essay Bazin
emphasizes that a realistic representation should also approximate the perceptual
and cognitive dynamics of natural vision. For Bazin, this dynamics involves
active exploration of visual reality. Consequently, he interprets the introduction of
depth of field as a step toward realism, because now the viewer can freely explore
the space of film image.199
Against Bazin's "idealist" and evolutionary account, Jean-Louis Comolli
proposes a "materialist" and fundamentally non-linear reading of the history of
cinematic technology and style. The cinema, Comolli tells us, "is born
immediately as a social machine...from the anticipation and confirmation of its
social profitability; economic, ideological and symbolic."200 Comolli thus
proposes to read the history of cinema techniques as an intersection of technical,
aesthetic, social and ideological determinations; however, his analyses clearly
privilege an ideological function of the cinema. For Comolli, this function is
"'objective' duplication of the 'real' itself conceived as specular reflection" (133).
Along with other representational cultural practices, cinema works to endlessly
reduplicate the visible, thus sustaining the illusion that it is the phenomenal forms
(such as the commodity form) which constitute the social "real," rather than
the relations of production, which are "invisible" to the eye. To fulfill its function, cinema must
maintain and constantly update its "realism." Comolli sketches this process using
two alternative figures — addition and substitution.
In terms of technological developments, the history of realism in the
cinema is one of additions. First, additions are necessary to maintain the process
of disavowal, which for Comolli defines the nature of cinematic spectatorship
(132). Each new technological development (sound, panchromatic stock, color)
points out to the viewers just how "un-realistic" the previous image was and also
reminds them that the present image, even though more realistic, will be
superseded in the future — thus constantly sustaining the state of disavowal.
Secondly, since cinema functions in a structure with other visual media, it has to
keep up with their changing level of realism. For instance, by the 1920s the spread of
photography with its finely gradated image made the cinematic image seem harsh by
comparison, and the film industry was forced to change to panchromatic stock to
keep up with the standard of photographic realism (131). This example is a good
illustration of Comolli's reliance on Althusserian structuralist Marxism.
Unprofitable economically for the film industry, this change is "profitable" in
more abstract terms for the social structure as a whole, helping to sustain the
ideology of the real/visible.
In terms of cinematic style, the history of realism in cinema is one of the
substitutions of cinematic techniques. For instance, while the change to
panchromatic stock adds to the image quality, it leads to other losses. If earlier
cinematic realism was maintained through the effects of depth, now
"depth (perspective) loses its importance in the production of 'reality effects' in
favor of shade, range, color" (131). So theorized, realistic effect in the cinema
appears as a constant sum in an equation with a few variables which change
historically and have equal weight: if more shading or color is "put in,"
perspective can be "taken out." Comolli follows the same logic of
substitution/subtraction in sketching the development of cinematic style in its
first two decades: the early cinematographic image announces its realism through
an abundance of moving figures and the use of deep focus; later these devices
fade away and others, such as fictional logic, psychological characters, coherent
space-time of narration, take over (130).
While for Bazin realism functions as an Idea (in a Hegelian sense), for
Comolli it plays an ideological role (in a Marxist sense); for David Bordwell and
Janet Staiger, realism in film is first of all connected with the industrial
organization of cinema. Put differently, Bazin draws the idea of realism from
mythological utopian thinking. For him, realism is found in the space between
reality and a transcendental spectator. Comolli sees it as an effect, produced
between the image and the historical viewer and continuously sustained through
the ideologically determined additions and substitutions of cinematic technologies
and techniques. Bordwell and Staiger locate realism within the institutional
discourses of film industries, implying that it is a rational and pragmatic tool in
industrial competition.201 Emphasizing that cinema is an industry like any other,
Bordwell and Staiger attribute the changes in cinematic technology to the factors
shared by all modern industries — efficiency, product differentiation,
maintenance of a standard of quality (247). One of the advantages of adopting an
industrial model is that it allows the authors to look at specific agents —
manufacturing and supplying firms and professional associations (250). The latter
are particularly important since it is in their discourses (conferences, trade
meetings and publications) that the standards and goals of stylistic and technical
innovations are articulated.
Bordwell and Staiger agree with Comolli that the development of
cinematic technology is not linear; however, they claim that it is not random
either, as the professional discourses articulate goals of the research and set the
limits for permissible innovations (260). According to Bordwell and Staiger,
realism is one of these goals. They believe that such a definition of realism is
specific to Hollywood:
“Showmanship,” realism, invisibility: such canons guided the SMPE
[Society of Motion Picture Engineers] members toward understanding the
acceptable and unacceptable choices in technical innovations, and these
too became teleological. In another industry, the engineer's goal might be
an unbreakable glass or a lighter alloy. In the film industry, the goals were
not only increased efficiency, economy, and flexibility but also spectacle,
concealment of artifice, and what Goldsmith [1934 president of SMPE]
called “the production of an acceptable semblance of reality.” (258)
Bordwell and Staiger are satisfied with Goldsmith's definition of realism as
"the production of an acceptable semblance of reality." However, such a general
and transhistorical definition does not seem to have any specificity for Hollywood
and thus can't really account for the direction of technological innovation.
Moreover, although they claim to have successfully reduced realism to a rational
and a functional notion, in fact they have not managed to eliminate Bazin's
idealism. It reappears in the comparison between the goals of innovation in film
and other industries. "Lighter alloy" is used in the aviation industry, which can be
thought of as the realization of the myth of Icarus; and is there not something
mythical and fairy tale-like about "unbreakable glass"?
Technology and Style in Computer Animation
How can these three influential accounts of cinematic realism be used to approach
the problem of realism in 3D computer animation? Bazin, Comolli, and Bordwell
and Staiger offer us three different strategies, three different starting points. Bazin
builds his argument by comparing the changing quality of the cinematic image
with the phenomenological impression of visual reality. Comolli's analysis
suggests a different strategy: to think of the history of computer graphics
technologies and the changing stylistic conventions as a chain of substitutions
functioning to sustain the reality effect for audiences. Finally, to follow Bordwell
and Staiger's approach is to analyze the relationship between the character of
realism in computer animation and the particular industrial organization of the
computer graphics industry. (For instance, we can ask how this character is
affected by the cost difference between hardware and software development.)
Further, we should pay attention to professional organizations in the field and
their discourses which articulate the goals of research and where we may expect
to find "admonitions about the range and nature of permissible innovations"
(Bordwell and Staiger, 260). I will try the three strategies in turn.
If we follow Bazin's approach and compare images drawn from the history
of 3D computer graphics with the visual perception of natural reality, his
evolutionary narrative appears to be confirmed. During the 1970s and the 1980s,
computer images progressed towards fuller and fuller illusion of reality: from
wireframe displays to smooth shadows, detailed textures, aerial perspective; from
geometric shapes to moving animal and human figures; from Cimabue to Giotto
to Leonardo and beyond. Bazin's idea that deep focus cinematography allowed the
spectator a more active position in relation to film image, thus bringing cinematic
perception closer to real life perception, also finds a recent equivalent in
interactive computer graphics, where the user can freely explore the virtual space
of the display from different points of view. And with such extensions of
computer graphics technology as virtual reality, the promise of Bazin's "total
realism" appears to be closer than ever, literally within arm's reach of the VR user.
The history of the style and technology of computer animation can also be
seen in a different way. Comolli reads the history of realistic media as a constant
trade-off of codes, a chain of substitutions producing the reality effect for
audiences, rather than as an asymptotic movement toward an axis labeled
"reality." His interpretation of the history of film style is first of all supported by
the shift he observes between the cinematic style of the 1900s and the 1920s, the
example I have already mentioned. Early film announces its realism by excessive
representations of deep space achieved through every possible means: deep focus,
moving figures, frame compositions which emphasize the effect of linear
perspective. In the 1920s, with the adoption of panchromatic film stock, "depth
(perspective) loses its importance in the production of 'reality effects' in favor of
shade, range, color" (Comolli, 131). A similar trade-off of codes can be observed
during the short history of commercial 3D computer animation which begins
around 1980. Initially, the animations were schematic, cartoon-like because the
objects could only be rendered in wireframe or facet shaded form. Illusionism was
limited to the indication of objects' volumes. To compensate for this limited
illusionism in the representation of objects, computer animations of the early
1980s ubiquitously showed deep space. This was done by emphasizing linear
perspective (mostly, through the excessive use of grids) and by building
animations around rapid movement in depth in the direction perpendicular to the
screen. These strategies are exemplified by the computer sequences of the Disney movie
Tron, released in 1982. Toward the end of the 1980s, with the commercial availability
of such techniques as smooth shading, texture mapping and cast shadows, the
representation of objects in animations came much closer to the ideal of
photorealism. At this time, the codes by which early animation signaled deep
space started to disappear. In place of rapid in-depth movements and grids,
animations began to feature lateral movements in shallow space.
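Why a grid is such an economical signal of deep space can be seen from the projection rule itself. The following is a minimal sketch of linear perspective, assuming a pinhole camera at the origin looking along the z axis; the function name `project` and the `focal_length` parameter are illustrative choices, not taken from any specific animation system.

```python
def project(x, y, z, focal_length=1.0):
    """Project a 3D point onto the image plane by dividing by its depth."""
    return (focal_length * x / z, focal_length * y / z)

# One line of a floor grid: one unit to the right of the viewer and one unit
# below eye level, sampled at increasing distances from the camera.
rail = [project(1.0, -1.0, z) for z in (1.0, 2.0, 4.0, 8.0)]
# rail == [(1.0, -1.0), (0.5, -0.5), (0.25, -0.25), (0.125, -0.125)]
```

Each doubling of distance halves the point's offset from the center of the image, so parallel grid lines visibly converge toward a vanishing point. Even a wireframe scene with no shading or texture can therefore announce its depth.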
The observed substitution of realistic codes in the history of 3D computer
animation seems to confirm Comolli's argument. The introduction of new
illusionistic techniques dislodges old ones. Comolli explains this process of
sustaining the reality effect from the point of view of audiences. Following Bordwell
and Staiger's approach, we can consider the same phenomenon from the
producers' point of view. For the production companies, the constant substitution
of codes is necessary to stay competitive. As in every industry, the producers of
computer animation stay competitive by differentiating their products. To attract
clients, a company has to be able to offer some novel effects and techniques. But
why do the old techniques disappear? The specificity of industrial organization of
the computer animation field is that it is driven by software innovation. (In this,
the field is closer to the computer industry as a whole than to the film industry or
graphic design.) New algorithms to produce new effects are constantly developed.
To stay competitive, a company has to quickly incorporate the new software into
its offerings. The animations are designed to show off the latest algorithm.
Correspondingly, the effects possible with older algorithms are featured less often
— available to everybody else in the field, they no longer signal "state of the art."
Thus, the trade-off of codes in the history of computer animation can be related to
the competitive pressure to quickly utilize the latest achievements of software
research.
While commercial companies employ programmers capable of adapting
published algorithms for the production environment, the theoretical work of
developing these algorithms mainly takes place in academic computer science
departments and in research groups of top computer companies such as Microsoft
or SGI (formerly Silicon Graphics). To further pursue the question of realism we
need to ask about the direction of this work. Do computer graphics researchers
share a common goal?
In analyzing the same question for the film industry, Bordwell and Staiger
claim that realism "was rationally adopted as an engineering aim" (258). They
attempt to discover the specificity of Hollywood's conception of realism in the
discourses of the professional organizations such as SMPE. For the computer
graphics industry, the major professional organization is SIGGRAPH (Special
Interest Group on Computer Graphics of the Association for Computing
Machinery). Its annual conventions, attended by tens of thousands, combine a
trade show, a festival of computer animation and a scientific conference where the
best new research work is presented. The conferences also serve as the meeting
place for the researchers, engineers and commercial designers. If the research has
a common direction, we can expect to find its articulations in SIGGRAPH
proceedings.
Indeed, a typical research paper includes a reference to realism as the goal
of investigations in the computer graphics field. For example, a 1987 paper presented
by three highly recognized scientists offers this definition of realism:
Reyes is an image rendering system developed at Lucasfilm Ltd. and
currently in use at Pixar. In designing Reyes, our goal was an architecture
optimized for fast high-quality rendering of complex animated scenes. By
fast we mean being able to compute a feature-length film in about a year;
high quality means virtually indistinguishable from live action motion
picture photography; and complex means as visually rich as real
scenes.202
In this definition, achieving synthetic realism means attaining two goals: the
simulation of codes of traditional cinematography and the simulation of the
perceptual properties of real life objects and environments. The first goal, the
simulation of cinematographic codes, was in principle solved early on as these
codes are well-defined and few in number. Every current professional computer
animation system incorporates a virtual camera with variable length lens, depth of
field effect, motion blur and controllable lights which simulate the lights available
to a traditional cinematographer.
The second goal, the simulation of "real scenes," turned out to be more
complex. Creating a time-based computer representation of an object involves
solving three separate problems: the representation of an object's shape, the
effects of light on its surface, and the pattern of movement. To have a general
solution for each problem requires the exact simulation of underlying physical
properties and processes. This is impossible because of the extreme mathematical
complexity. For instance, to fully simulate the shape of a tree would involve
mathematically "growing" every leaf, every branch, every piece of bark; and to
fully simulate the color of a tree's surface a programmer has to consider every
other object in the scene, from grass to clouds to other trees. In practice, computer
graphics researchers have resorted to solving particular local cases, developing a
number of unrelated techniques for simulation of some kinds of shapes, materials,
lighting effects and movements.
The result is a realism which is highly uneven. Of course, one may suggest
that this is not an entirely new development and that it can already be observed in
the history of twentieth century optical and electronic representational
technologies, which allow for more precise rendering of certain features of visual
reality at the expense of others. For instance, both color film and color television
were designed to assure acceptable rendering of human flesh tones at the expense
of other colors. However, the limitations of synthetic realism are qualitatively
different.
In the case of optically-based representation, the camera records already
existing reality. Everything which exists can be photographed. Camera artifacts,
such as depth of field, film grain, and the limited tonal range, affect the image as
a whole.
In the case of 3D computer graphics the situation is quite different. Now
reality itself has to be constructed from scratch before it can be photographed by a
virtual camera. Therefore, the photorealistic simulation of "real scenes" is
practically impossible as techniques available to commercial animators cover only
particular phenomena of visual reality. The animator using a particular
software package can, for instance, easily create the shape of a human face, but not
the hair; the materials such as plastic or metal but not cloth or leather; the flight of
a bird but not the jumps of a frog. The realism of computer animation is highly
uneven, reflecting the range of problems which were addressed and solved.
What determines which particular problems received priority in research?
To a large extent, this was determined by the needs of the early sponsors of this
research — the Pentagon and Hollywood. I am not concerned here to fully trace
the history of these sponsorships. What is important for my argument is that the
requirements of military and entertainment applications led the researchers to
concentrate on simulation of the particular phenomena of visual reality, such as
landscapes and moving figures.
One of the original motivations behind the development of photorealistic
computer graphics was its application for flight simulators and other training
technology.203 And since simulators require synthetic landscapes, a lot of
research went into the techniques to render clouds, rugged terrain, trees, aerial
perspective. Thus, the work which led to the development of the famous
technique to represent natural shapes, such as mountains, using fractal
mathematics was done at Boeing.204 Other well-known algorithms to simulate
natural scenes and clouds were developed by the Grumman Aerospace
Corporation.205 The latter technology was used for flight simulators and also was
applied to pattern recognition research in target tracking by a missile.206
Another major sponsor was the entertainment industry, lured by the
promise of lowering the costs of film and television production. In 1979
Lucasfilm, Ltd., George Lucas's company, organized a computer animation
research division. It hired the best computer scientists in the field to produce
animations for special effects. The research for the effects in such films as Star
Trek II: The Wrath of Khan (Nicholas Meyer, Paramount Pictures, special effects
by Industrial Light & Magic, 1982) and Return of the Jedi (Richard Marquand,
Lucasfilm Ltd., special effects by Industrial Light & Magic, 1983) has led to the
development of important algorithms which became widely used.207
Along with creating particular effects for films such as star fields and
explosions, a lot of research activity has been dedicated to the development of
moving humanoid figures and synthetic actors. This is not surprising since
commercial film and video productions center around human characters.
Significantly, the first time computer animation was used in a feature film
(Looker, Michael Crichton, Warner Brothers, 1981) was to create a three-
dimensional model of an actress. One of the early attempts to simulate human
facial expressions featured synthetic replicas of Marilyn Monroe and Humphrey
Bogart.208 In another early acclaimed 3D animation, produced by Kleiser-
Walczak Construction Company in 1988, a synthetic human figure was
humorously cast as Nestor Sextone, a candidate for the presidency in the
Synthetic Actors Guild.
The task of creating fully synthetic human actors has turned out to be
more complex than was originally anticipated. Researchers continue to work on
this problem. For instance, the 1992 SIGGRAPH conference presented a session
on "Humans and Clothing" which featured such papers as "Dressing Animated
Synthetic Actors with Complex Deformable Clothes"209 and "A Simple Method
for Extracting the Natural Beauty of Hair."210 Meanwhile, Hollywood has
created a new genre of films (Terminator 2, Jurassic Park, Casper, Flubber, etc.)
structured around "the state of the art" in digital actor simulation. In computer
graphics it is still easier to create the fantastic and extraordinary than to simulate
ordinary human beings. Consequently, each of these films is centered around an
unusual character which in fact, consists of a series of special effects — morphing
into different shapes, exploding into particles, and so on.
The preceding analysis applies to the period during which the techniques
of 3D animation were undergoing continuous development: from the middle 1970s
to the middle 1990s. By the end of this period the software tools became
relatively stable; at the same time, the dramatically decreased cost of hardware led
to a significant reduction in the time it takes to render complex animations. Put
differently, the animators were now able to use more complex geometric and
rendering models, thus achieving a stronger reality effect. Titanic (1997) featured
hundreds of computer animated “extras” while 95% of Star Wars: Episode 1
(1999) was constructed on a computer. However, the dynamics which
characterized the early period of pre-rendered computer animation returned in
new areas of new media: computer games and virtual worlds (such as VRML and
Active Worlds scenes), all of which use computer animation generated in real
time. Here the Bazinian evolution towards fuller and fuller realism which
characterized the development of computer animation in the 1970s and the 1980s
was replayed once again at an accelerated speed. As the speed of CPUs and
graphics cards kept increasing, computer games moved from the flat shading of the
original Doom (1993) to the more detailed world of Unreal (Epic Games, 1997)
which featured shadows, reflections and transparency. In the area of virtual
worlds which were designed to run on typical computers without specialized
graphics accelerators, the same evolution proceeded at a much slower pace.
The icons of mimesis
While the privileging of certain areas in research can be attributed to the needs
of the sponsors, other areas received consistent attention for a different reason. To
support the idea of progress of computer graphics toward realism, researchers
privilege particular subjects that culturally connote the mastery of illusionistic
representation.
Historically, the idea of illusionism has been connected with success in
the representation of certain subjects. The original episode in the history of Western
painting, which I already invoked, is the story of the competition of Zeuxis and
Parrhasius. The grapes painted by Zeuxis symbolize his skill to create living
nature out of the inanimate matter of paint. Further examples in the history of art
include the celebration of the mimetic skill of those painters who were able to
simulate another symbol of living nature — the human flesh. Not surprisingly,
throughout the history of computer animation, the simulation of a human figure
served as a yardstick used to measure the progress of the whole field.
While the painting tradition had its own iconography of subjects connoting
mimesis, moving image media relied on a different set of subjects. Steven Neale
describes how early film demonstrated its authenticity by representing moving
nature: "What was lacking [in photographs] was the wind, the very index of real,
natural movement. Hence the obsessive contemporary fascination, not just with
movement, not just with scale, but also with waves and sea spray, with smoke and
spray."211 Computer graphics researchers resort to similar subjects to signify the
realism of animation. The "moving nature" presented at SIGGRAPH conferences has
included animations of smoke, fire, sea waves, and moving grass.212 These
privileged signs of realism overcompensate for the inability of computer graphics
researchers to fully simulate "real scenes."
In summary, the differences between cinematic and synthetic realism
begin on the level of ontology. New realism is partial and uneven, rather than
analog and uniform. The artificial reality which can be simulated with 3D
computer graphics is fundamentally incomplete, full of gaps and white spots.
Who determines what will be filled and what will remain a gap in the
simulated world? As I already noted, the available computer graphics techniques
reflect the particular military and industrial needs which paid for their development.
The ability of certain subjects to connote mastery of illusionism also makes
researchers pay more attention to some areas on the “map” and ignore others. In
addition, as computer graphics techniques migrate from specialized markets
towards mass consumers, they become biased in yet another way.
The amount of labor involved in constructing reality from scratch in a
computer makes it hard to resist the temptation to utilize pre-assembled,
standardized objects, characters and behaviors readily provided by software
manufacturers — fractal landscapes, checkerboard floors, complete characters,
and so on. As discussed in the “Selection” section, every program comes with
libraries of ready-to-use models, effects or even complete animations. For
instance, a user of the Dynamation program (a part of the popular
Alias|Wavefront 3D software) can access complete pre-assembled animations of
moving hair, rain, a comet's tail or smoke, with a single mouse click. If even
professional designers rely on ready-made objects and animations, the end users
of virtual worlds on the Internet, who usually don't have graphic or programming
skills, have no other choice. Not surprisingly, VRML software companies and
Web virtual world providers encourage users to choose from the libraries of 3D
objects and avatars they supply. Worlds Inc., the provider of Worlds software
used to create online virtual 3D chat environments, provides its users with a
library of 100 3D avatars.213 Active Worlds, which offers “3D community
based environments on the Internet,” allows its over one million users (April 1999
data) to choose from over 1000 different worlds, some of which are provided by the
company while others were built by the users themselves.214 As the complexity of
these worlds increases, we can expect a whole market for detailed virtual sets,
characters with programmable behaviors, and even complete environments (a bar
with customers, a city square, a famous historical episode, etc.) from which a user
can put together her or his own "unique" virtual world. And although companies
such as Active Worlds provide end users with software which allows them to
quickly build and customize their virtual dwellings, avatars and whole virtual
universes, each of these constructs has to adhere to standards established by the
company. Thus behind the freedom on the surface lies standardization on a deeper
level. While a hundred years ago the user of a Kodak camera was asked just to
push a button, she still had the freedom to point the camera at anything. Now,
"you push the button, we do the rest" has become "you push the button, we create
your world."
I hope that this section has demonstrated that the accounts of realism
developed in film theory can be usefully employed to talk about realism in new
media. But that does not mean that the question of computer realism is exhausted.
In the twentieth century, new technologies of representation and simulation
replace each other in rapid succession, thereby creating a perpetual lag between
our experience of their effects and our understanding of this experience. The reality
effect of a moving image is a case in point. As film scholars were producing
increasingly detailed studies of cinematic realism, film itself was already being
undermined by 3D computer animation. Indeed, consider the following
chronology.
Bazin's Evolution of the Language of Cinema is a compilation of three
articles written between 1952 and 1955. In 1951 the viewers of the popular
television show "See it Now" for the first time saw a computer graphics display,
generated by MIT's Whirlwind computer, built in 1949. One animation was of a
bouncing ball, another of a rocket's trajectory.215
Comolli's Machines of the Visible was given as a paper at the seminal
conference on the cinematic apparatus in 1978. The same year saw the publication
of a crucial paper for the history of computer graphics research. It presented a
method to simulate bump textures which is still one of the most powerful
techniques of synthetic photorealism.216
Bordwell and Staiger's chapter Technology, Style and Mode of Production
forms a part of the comprehensive The Classical Hollywood Cinema: Film Style
& Mode of Production to 1960, published in 1985. By that year, most of the
fundamental photorealistic techniques had been discovered and turnkey computer
animation systems were already employed by media production companies.
As 3D synthetic imagery is used more and more widely in contemporary
visual culture, the problem of realism has to be studied afresh. And while many
theoretical accounts developed in relation to cinema do hold when applied to
synthetic imaging, we can't assume that any concept or model can be taken for
granted. Redefining the very concepts of representation, illusion and simulation,
new media challenges us to understand in new ways how visual realism functions.
Synthetic Image and its Subject
As we saw, the achievement of photorealism is the main goal of research in the
field of computer graphics. The field defines photorealism as the ability to
simulate any object in such a way that its computer image is indistinguishable
from its photograph. Since this goal was articulated at the end of the 1970s,
significant progress has been made towards it: compare,
for instance, the computer images of Tron (1982) with those of Star Wars:
Episode 1 (1999). Yet the common opinion still holds that synthetic 3D images
generated by computer graphics are not yet (or perhaps will never be) as
“realistic” in rendering visual reality as images obtained through a photographic
lens. In this section I will suggest that this common opinion is mistaken. Such
synthetic photographs are already more “realistic” than traditional photographs. In
fact, they are too real.
This argument, paradoxical at first sight, will become less strange once we
place the current preoccupation with photorealism in a longer historical
framework, considering not only the present and recent past (computer imaging
and analog film, respectively) but also both the more distant past and the future of
visual illusionism. For while the computer graphics field tries desperately to
replicate the particular kind of images created by twentieth century film
technology, these images represent only one episode in a longer history of visual
culture. We should not assume that the history of illusion ends with 35 mm
frames projected on the screen across the movie hall — even if a film camera is
substituted by computer software, a film projector is substituted by a digital
projector and the film reel itself is substituted by data transmitted over a computer
network.
Georges Méliès, the father of computer graphics
When a future historian writes about the computerization of cinema in the
1990s, she will highlight such movies as Terminator 2 and Jurassic Park. Along
with a few others, these films by James Cameron and Steven Spielberg were
responsible for turning Hollywood around: from still being highly skeptical about
computer animation in the early 1990s to fully embracing it by the middle of the
decade. These two movies, along with a host of others which followed, Titanic,
Star Wars: Episode 1 and so on, dramatically demonstrated that total synthetic
realism seemed to be in sight. Yet, they also exemplified the triviality of what at
first may appear to be an outstanding technical achievement — the ability to fake
visual reality. For what is faked is, of course, not reality but photographic reality,
reality as seen by the camera lens. In other words, what computer graphics has
(almost) achieved is not realism, but only photorealism — the ability to fake not
our perceptual and bodily experience of reality but only its photographic
image.217 This image exists outside of our consciousness, on a screen — a
window of limited size which presents a still imprint of a small part of outer
reality, filtered through the lens with its limited depth of field, and then filtered
through film's grain and its limited tonal range. It is only this film-based image
which computer graphics technology has learned to simulate. And the reason we
may think that computer graphics has succeeded in faking reality is that we, over
the course of the last hundred and fifty years, have come to accept the image of
photography and film as reality.
What is faked is only a film-based image. Once we came to accept the
photographic image as reality, the way to its future simulation was open. What
remained were small details: the development of digital computers (1940s)
followed by a perspective-generating algorithm (early 1960s), and then working
out how to make a simulated object solid with shadow, reflection and texture
(1970s), and finally simulating the artifacts of the lens such as motion blur and
depth of field (1980s). So, while the distance from the first computer graphics
images circa 1960 to the synthetic dinosaurs of Jurassic Park in the 1990s is
tremendous, we should not be too impressed. For, conceptually, photorealistic
computer graphics had already appeared with Félix Nadar's photographs in the
1840s and certainly with the first films of Georges Méliès in the 1890s.
Conceptually, these are the inventors of 3D photorealistic computer graphics.
In saying this I do not want to negate the human ingenuity and the
tremendous amount of labor which today goes into creating computer-generated
special effects. Indeed, if our civilization has any equivalent to Medieval
cathedrals, it is special effects Hollywood films. They are truly epic both in their
scale and the attention to detail. Assembled by thousands of highly skilled
craftsmen over the course of years, each such movie is the ultimate display of
collective craftsmanship we have today. But if Medieval masters left behind
the material wonders of stone and glass inspired by religious faith,
today our craftsmen leave just the pixel sets to be projected on movie theater
screens or played on computer monitors. These are immaterial cathedrals made of
light; and appropriately, they often still have religious references, both in the
stories (consider for example Christian references in Star Wars: Episode 1:
Skywalker was conceived without a father, etc.) and in the grandeur and
transcendence of virtual sets.
Jurassic Park and Socialist Realism
Consider one of these immaterial cathedrals: Steven Spielberg’s Jurassic Park. This
triumph of computer simulation took more than two years of work by dozens of
designers, animators, and programmers of Industrial Light and Magic (ILM), one
of the premier companies specializing in the production of computer animation for
feature films in the world today. Because a few seconds of computer animation
often requires months and months of work, only the huge budget of a Hollywood
blockbuster could pay for such extensive and highly detailed computer generated
scenes as seen in Jurassic Park. Most of the 3D computer animation produced
today has a much lower degree of photorealism and this photorealism, as I have shown
in the previous section, is uneven, higher for some kinds of objects and lower for
others. And even for ILM, photorealistic simulation of human beings, the ultimate
goal of computer animation, still remains impossible. (Some scenes in Titanic
(1997) feature hundreds of synthetic human figures, yet they appear for a few
seconds and are quite small, being far away from the camera.)
Typical images produced with 3D computer graphics still appear
unnaturally clean, sharp, and geometric looking. Their limitations especially stand
out when juxtaposed with a normal photograph. Thus one of the landmark
achievements of Jurassic Park was the seamless integration of film footage of real
scenes with computer simulated objects. To achieve this integration, computer-
generated images had to be degraded; their perfection had to be diluted to match
the imperfection of film's graininess.
First, the animators needed to figure out the resolution at which to render
computer graphics elements. If the resolution were too high, the computer image
would have more detail than the film image and its artificiality would become
apparent. Just as Medieval masters guarded their painting secrets, leading
computer graphics companies now carefully guard the resolution of the images they
simulate.
Once computer-generated images are combined with film images,
additional tricks are used to diminish their perfection. With the help of special
algorithms, the straight edges of computer-generated objects are softened. Barely
visible noise is added to the overall image to blend computer and film elements.
Sometimes, as in the final battle between the two protagonists in Terminator 2,
the scene is staged in a particular location (in this example, a smoky factory)
which justifies the addition of smoke or fog to further blend the film and synthetic
elements together.
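The two degradation steps just described, softening edges and adding barely visible noise, can be illustrated with a small sketch. This is not the procedure used at ILM, whose algorithms are not specified here; it is a minimal illustration that assumes a grayscale image stored as rows of values between 0 and 1, a simple box blur standing in for the "special algorithms," and Gaussian noise standing in for film grain. The function names (`soften`, `add_grain`, `degrade`) and all parameter values are hypothetical.

```python
import random


def soften(image, radius=1):
    """Box-blur a grayscale image (rows of floats in [0, 1]) to soften hard edges."""
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            total, count = 0.0, 0
            # Average over the neighbourhood, clipped at the image borders.
            for yy in range(max(0, y - radius), min(height, y + radius + 1)):
                for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += image[yy][xx]
                    count += 1
            row.append(total / count)
        blurred.append(row)
    return blurred


def add_grain(image, strength=0.02, seed=0):
    """Add faint Gaussian noise (an assumed stand-in for film grain), clamped to [0, 1]."""
    rng = random.Random(seed)  # seeded so the result is repeatable
    return [
        [min(1.0, max(0.0, value + rng.gauss(0.0, strength))) for value in row]
        for row in image
    ]


def degrade(image, radius=1, strength=0.02, seed=0):
    """Soften the edges first, then overlay grain on the whole frame."""
    return add_grain(soften(image, radius), strength, seed)


# A synthetic frame with a perfectly hard vertical edge between black and white.
frame = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(4)]
degraded = degrade(frame)
```

In the softened frame the pixels next to the edge take intermediate values rather than jumping from black to white, and the grain is applied to every pixel, echoing the point that noise is added to the overall image rather than to the synthetic elements alone.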
So, while we normally think that synthetic photographs produced with
computer graphics are inferior to real photographs, in fact, they are too perfect.
But beyond that we can also say that paradoxically they are also too real.
The synthetic image is free of the limitations of both human and camera
vision. It can have unlimited resolution and an unlimited level of detail. It is free
of the depth-of-field effect, this inevitable consequence of the lens, so everything
is in focus. It is also free of grain — the layer of noise created by film stock and
by human perception. Its colors are more saturated and its sharp lines follow the
economy of geometry. From the point of view of human vision it is hyperreal.
And yet, it is completely realistic. The synthetic image is the result of a different,
more perfect than human, vision.
Whose vision is it? It is the vision of a computer, a cyborg, an automatic
missile. It is a realistic representation of human vision in the future when it will be
augmented by computer graphics and cleansed of noise. It is the vision of a
digital grid. The synthetic computer-generated image is not an inferior representation
of our reality, but a realistic representation of a different reality.
By the same logic, we should not consider the clean, skinless, too flexible, and
at the same time too jerky, human figures in 3D computer animation as
unrealistic, as imperfect approximations of the real thing — our bodies. They are
a perfectly realistic representation of a cyborg body yet to come, of a world reduced
to geometry, where efficient representation via a geometric model becomes the
basis of reality. The synthetic image simply represents the future. In other words,
if a traditional photograph always points to a past event, a synthetic photograph
points to a future event.
Is this a totally new situation? Was there already an aesthetics which
consistently pointed to the future? In order to help us locate this aesthetics
historically, I will invoke a painting by Russian-born conceptual artists Komar
and Melamid. Called “Bolsheviks Returning Home after a Demonstration”
(1981-1982), it depicts two workers, one carrying a red flag, who come across a
tiny dinosaur, smaller than a human hand, standing in the snow. Part of the “Nostalgic
Socialist Realism” series, this painting was created a few years after the painters
arrived in the United States, well before Hollywood embraced computer-
generated visuals. Yet it seems to comment on such movies as Jurassic Park and
on Hollywood as a whole, connecting its fictions with the fictions of Soviet
history as depicted by Socialist Realism, the official style of Soviet art from the
early 1930s until the late 1950s.
Taking the hint from this painting, we are now in a position to characterize
the aesthetics of Jurassic Park. This aesthetic is one of Soviet Socialist Realism.
Socialist Realism wanted to show the future in the present by projecting the
perfect world of future socialist society on a visual reality familiar to the viewer
— streets, interiors and faces of Russia in the middle of the twentieth century —
tired and underfed, scared and exhausted from fear, unkempt and gray. Socialist
Realism had to retain enough of the everyday reality of the time while showing how that
reality would look in the future when everyone's body will be healthy and
muscular, every street modern, every face transformed by the spirituality of
communist ideology. This is its difference from pure science fiction which does
not have to carry any feature of today's reality into the future. In contrast, Socialist
Realism had to superimpose the future onto the present, projecting the Communist
ideal onto the very different reality familiar to the viewers. Importantly, Socialist
Realism never depicted this future directly: there is not a single Socialist Realist
work of art set in the future. Science fiction as a genre did not exist from the early
1930s until Stalin’s death. The idea was not to make the workers dream about the
perfect future while closing their eyes to imperfect reality, but rather to make them see
the signs of this future in the reality around them. This is one of the meanings
behind Vertov’s notion of “communist decoding of the world.” To decode the
world in such a way means to recognize the future all around you.
The same superimposition of future onto the present happens in Jurassic
Park. It tries to show the future of sight itself — the perfect cyborg vision which
is free of noise and capable of grasping infinite details. This vision is exemplified
by the original computer graphics images before they were blended with film
images. But just as Socialist Realist paintings blended the perfect future with the
imperfect reality, Jurassic Park blends the future super-vision of computer
graphics with the familiar vision of film image. In Jurassic Park, the computer
image bends down before the film image, its perfection is undermined by every
possible means and is also masked by the film's content. As I already described,
computer generated images, originally clean and sharp, free of focus and grain,
are degraded in a variety of ways: resolution is reduced, edges are softened, depth
of field and grain effect are artificially added. Additionally, the very content of
the film — the prehistoric dinosaurs which came to life — can be interpreted as
another way to mask the potentially disturbing reference to our cyborg future. The
dinosaurs are present to tell us that computer images belong safely to the past long
gone — even though we have every reason to believe that they are messengers
from the future still to come.
In that respect Jurassic Park and Terminator 2 are the opposites. If in
Jurassic Park the dinosaurs function to convince us that computer imagery
belongs to the past, the Terminator in Terminator 2 is more “honest.” He himself
is a messenger from the future. Accordingly, he is a cyborg who can take on a
human appearance. His true form is that of a futuristic alloy. In perfect
correspondence with this logic, this form is represented with computer graphics.
While his true body perfectly reflects its surrounding reality, the very nature of
these reflections shows us the future of human and machine sight. The
reflections are extra-sharp and clean, without any blur. This is indeed the look
produced by the reflection mapping algorithm, one of the standard techniques to
achieve photorealism. Thus, to represent the Terminator who came from the
future the designers used the standard computer graphics techniques without
degrading them; in contrast, in Jurassic Park the dinosaurs which came from the
past were created by systematically degrading computer images. What is, of course,
the past in this movie is the film medium itself: its grain, its depth of focus, its
motion blur, its low resolution.
This is, then, the paradox of 3D photorealistic computer animation. Its
images are not inferior to the visual realism of traditional photography. They are
perfectly real — all too real.
Illusion, Narrative and Interactivity
Having analyzed computer illusionism from the points of view of its production
and the longer history of visual illusion, I now want to look at it from a different
perspective. While the existing theories of illusionism assume that the subject acts
strictly as a viewer, new media more often than not turns the subject into a
user. The subject is expected to interact with a representation: click on menus or
the image itself, making selections and decisions. What effect does interactivity
have on the reality effect of an image? Is the fidelity of simulation of physical laws or
human motivation more important for the “realism” of a representation than its purely
visual qualities? For instance, does a racing game which uses a more precise
collision model but poor visuals feel more real than a game which has richer
images but a less precise model? Or do the simulation dimensions and visual
dimensions support each other, adding up to create a total effect?
In this section I will focus on a particular aspect of this more general
question: production of illusionism in interactive computer objects. The aspect
which I will consider has to do with time. Web sites, virtual worlds, computer
games and many other types of hypermedia applications are characterized by a
peculiar temporal dynamic: constant, repetitive shifts between an illusion and its
suspension. These new media objects keep reminding us about their artificiality,
incompleteness, and constructedness. They present us with a perfect illusion only
to reveal the underlying machinery next.
Web surfing in the 1990s provides a perfect example. A typical user may
be spending equal time looking at a page and waiting for the next page to
download. During waiting periods, the act of communication itself — bits
traveling through the network — becomes the message. The user keeps checking
whether the connection is being made, glancing back and forth between the
animated icon and the status bar. Using Roman Jakobson's model of
communication functions, we can say that communication comes to be dominated
by contact, or phatic function — it is centered around the physical channel and the
very act of connection between the addresser and the addressee.218
Jakobson writes about verbal communication between two people who, in
order to check whether the channel works, address each other: "Do you hear
me?," "Do you understand me?" But in Web communication there is no human
addresser, only a machine. So as the user keeps checking whether the information
is coming, she actually addresses the machine itself. Or rather, the machine
addresses the user. The machine reveals itself, it reminds the user of its existence
— not only because the user is forced to wait but also because she is forced to
witness how the message is being constructed over time. A page fills in part by
part, top to bottom; text comes before images; images arrive in low resolution and
are gradually refined. Finally, everything comes together in a smooth sleek image
— the image which will be destroyed with the next click.
Interaction with most 3D virtual worlds is characterized by the same
temporal dynamic. Consider the technique called "distancing" or "level of detail,"
which for years has been used in VR simulations and later was adapted to 3D
games and VRML scenes. The idea is to render the models more crudely when
the user is moving through virtual space; when the user stops, details gradually fill
in. Another variation of the same technique involves creating a number of models
of the same object, each with progressively less detail. When the virtual camera is
close to an object, a highly detailed model is used; if the object is far away, a
less detailed version is substituted to avoid unnecessary computation.
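The "level of detail" technique described above can be sketched in a few lines. This is only an illustrative sketch: the model names, polygon counts and distance thresholds are invented for the example, and a real VR or game engine would make this choice inside its rendering loop.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    polygon_count: int


# Hypothetical versions of one object, ordered from most to least detailed.
# Each entry pairs a maximum camera distance with the model used up to it.
LOD_LEVELS = [
    (10.0, Model("statue_high", 20000)),
    (50.0, Model("statue_medium", 2000)),
    (float("inf"), Model("statue_low", 200)),
]


def select_model(distance: float, moving: bool = False) -> Model:
    """Pick the model to render for a given camera distance.

    While the user is moving, fall back to the crudest model, as in the
    first variation of the technique; once she stops, distance decides.
    """
    if moving:
        return LOD_LEVELS[-1][1]
    for max_distance, model in LOD_LEVELS:
        if distance <= max_distance:
            return model
    return LOD_LEVELS[-1][1]
```

Calling `select_model(5.0)` returns the detailed model, while `select_model(5.0, moving=True)` returns the crude one: the same object changes its appearance depending on what the user is doing.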
A virtual world which incorporates these techniques has a fluid ontology
that is affected by the actions of the user. As the user navigates through space the
objects switch back and forth between pale blueprints and fully fleshed out
illusions. The immobility of a subject guarantees a complete illusion; the slightest
movement destroys it.
Navigating a QuickTime VR movie is characterized by a similar dynamic.
In contrast to the nineteenth century panorama that it closely emulates,
QuickTime VR continuously deconstructs its own illusion. The moment you
begin to pan through the scene, the image becomes jagged. And, if you try to
zoom into the image, all you get are oversized pixels. The representational
machine keeps hiding and revealing itself.
Compare this dynamic to traditional cinema or realist theater which aims
at all costs to maintain the continuity of the illusion for the duration of the
performance. In contrast to such totalizing realism, new media aesthetics has a
surprising affinity to twentieth century leftist avant-garde aesthetics. Playwright
Bertolt Brecht's strategy to reveal the conditions of an illusion's production,
echoed by countless other leftist artists, has become embedded in hardware and
software themselves. Similarly, Walter Benjamin's concept of "perception in the
state of distraction"219 has found a perfect realization. The periodic reappearance
of the machinery, the continuous presence of the communication channel in the
message prevent the subject from falling into the dream world of illusion for very
long, making her alternate between concentration and detachment.
While virtual machinery itself already acts as an avant-garde director, the
designers of interactive media, such as games, DVD titles, interactive cinema, and
interactive television programs, often consciously attempt to structure the
subject's temporal experience as a series of periodic shifts. The subject is forced
to oscillate between the roles of viewer and user, shifting between perceiving and
acting, between following the story and actively participating in it. During one
segment the computer screen presents the viewer with an engaging cinematic
narrative. Suddenly the image freezes, menus and icons appear and the viewer is
forced to act: make choices; click; push buttons. The purest example of such
cyclical organization of the user’s experience is computer games which alternate
between FMV (full motion video) segments and segments which require the
user’s input, such as the Wing Commander series. Moscow media theorist Anatoly
Prokhorov described these shifts in terms of two different identities of a computer
screen: transparent and opaque. The screen keeps shifting from being transparent
to being opaque — from a window into a fictional 3D universe to a solid surface,
full of menus, controls, text and icons.220 Three-dimensional space becomes
surface; a photograph becomes a diagram; a character becomes an icon. To use
the opposition introduced in the “Cultural Interfaces” section, we can say that the
screen keeps alternating between the dimensions of representation and control.
What at one moment was a fictional universe becomes a set of buttons which
demand action.
The effect of these shifts on the subject is hardly one of liberation and
enlightenment. While modernist avant-garde theater and film directors
deliberately highlighted machinery and conventions involved in producing and
keeping the illusion in their works — for instance, having actors directly address
the audience or pulling away the camera to show the crew and the set — the
systematic “auto-deconstruction” performed by computer objects, applications,
interfaces and hardware does not seem to distract the user from giving in to the
reality effect. The cyclical shifts between illusion and its destruction appear to
neither distract from it nor support it. It is tempting to compare these temporal
shifts to shot / counter-shot structure in cinema and to understand them as a new
kind of suturing mechanism. By having periodically to complete the interactive
text through active participation the subject is interpolated in it. Thus, if we adopt
the notion of suture, it would follow that the periodic shifts between illusion and
its suspension are necessary to fully involve the subject in the illusion.221
Yet clearly we are dealing with something which goes beyond old-style
realism of the analog era. We can call this new realism meta-realism since it
incorporates its own critique inside itself. Its emergence can be related to a larger
cultural change. Old realism corresponded to the functioning of ideology during
modernity: totalization of a semiotic field, "false consciousness," complete
illusion. But today ideology functions differently: it continuously and skillfully
deconstructs itself, presenting the subject with countless "scandals" and
"investigations." The leaders of the middle of the twentieth century were
presented as invincible; as being always right, and, in the case of Stalin and
Hitler, as true saints not capable of any human sin. Today we expect to learn
about the scandals involving our leaders, and these scandals do not really
diminish their credibility. Similarly, contemporary television commercials often
make fun of themselves and advertising in general; this does not prevent them
from selling whatever they are designed to sell. Auto-critique, scandal, revelation
of its machinery became a new structural component of modern ideology: witness
the 1998 episode when MTV created an illusion on its Web site that somebody
hacked it. The ideology does not demand that the subject blindly believe it, as it
did early in the twentieth century; rather, it puts the subject in a master position of
somebody who knows very well that she is being fooled, and generously lets her
be fooled. You know, for instance, that creating a unique identity through a
commercially mass produced style is meaningless — but anyway you buy the
expensively styled clothes, choosing from a menu: “military,” “bohemian,”
“flower child,” “inner city,” “clubbing,” and so on. The periodic shifts between
illusion and its suspension in interactive media, described here, can be seen as
another example of the same general phenomenon. Just like classical ideology,
classical realism demanded that the subject completely accept the illusion for as
long as it lasted. In contrast, the new meta-realism is based on oscillation
between illusion and its destruction, between immersing a viewer in illusion and
directly addressing her. In fact, the user is put in an even stronger position of
mastery than she ever is by “auto-deconstructing” commercials, newspaper
reports of “scandals” and other traditional non-interactive media. Once illusion
stops, the user can make choices, re-direct game narrative or get additional
information from other Web sites conveniently linked by the designers. The user
invests into illusion precisely because she is given control over it.
If this analysis is correct, the counter-arguments that this oscillation is
simply an artifact of the current technology and that the advances in hardware will
eliminate it, would not work. The oscillation analyzed here is not an artifact of
computer technology but a structural feature of modern society, present not just in
interactive media but in numerous other social realms and on many different
levels.
This may explain the popularity of this particular temporal dynamic in
interactive media, but it does not address another question: does it work
aesthetically? Can Brecht and Hollywood be married? Is it possible to create a new
temporal aesthetics, even a language, based on cyclical shifts between perception
and action? In my view, the most successful example of such an aesthetics already
in existence is a military simulator, the only mature form of interactive narrative.
It perfectly blends perception and action, cinematic realism and computer menus.
The screen presents the subject with an illusionistic virtual world while
periodically demanding quick actions: shooting at the enemy; changing the
direction of a vehicle; and so on. In this art form, the roles of a viewer and an
actant are blended perfectly — but there is a price to pay. The narrative is
organized around a single and clearly defined goal: staying alive.
The games modeled after simulators — first of all, first person shooters
such as Doom, Quake and Tomb Raider, but also flight and racing simulators —
have also been quite successful. In contrast to interactive narratives such as Wing
Commander, Myst, Riven, or Bad Day on the Midway which are based on
temporal oscillation between two distinct states, non-interactive movie-like
presentation and interactive game play, in these games these two states — which
are also two states of the subject (perception and action) and the two states of a
screen (transparent and opaque) — co-exist. As you run through the
corridors shooting at enemies or control the car on the racetrack, you also keep
your eyes on the readouts which tell about the “health” of your character, the
damage level of your vehicle, the availability of ammunition, and so on.
As a conclusion, I would like to offer a different interpretation of the
temporal oscillation in new media which will relate it not to the social realm
outside of new media but to other similar effects specific to new media itself. The
oscillation between illusionary segments and interactive segments forces the user
to switch between different mental sets — different kinds of cognitive activity.
These switches are typical of modern computer use in general. The user
analyses the quantitative data; next she is using a search engine; next she starts a
new application; next she navigates through space in a computer game; next she
may go back to using a search engine; and so on. In fact, the modern HCI which
allows the user to run a number of programs at the same time and to keep a
number of windows open on the screen at once posits multi-tasking as the social
and cognitive norm. This multi-tasking demands from the user “cognitive multi-
tasking” — rapidly alternating between different kinds of attention, problem
solving and other cognitive skills. All in all, modern computing requires from a
user intellectual problem solving, systematic experimentation and the quick
learning of new tasks. Thus, just as any particular software application is
embedded, both metaphorically and literally, within the larger framework of the
operating system, new media embeds cinema-style illusions within the larger
framework of an interactive control surface. Illusion is subordinated to action;
depth to surface; a window into an imaginary universe to a control panel. From
commanding a dark movie theater, this twentieth century illusion and therapy
machine par excellence, a cinema image becomes just a small window on a
computer screen; one stream among many others coming to us through the
network; one file among numerous others on our hard drives.
V. The Forms
August 5, 1999. I am sitting in the lobby of Razorfish Studios, which was named
by Adweek among the top 10 interactive agencies in the world for 1998.222 The
company’s story is a Silicon Alley legend. It was founded in 1995 by two partners
in their East Village loft; by 1997 it had 45 employees; by 1999 the number had grown
to 600 (this includes a number of companies around the world that Razorfish
acquired). Razorfish projects range from screen savers to the Charles Schwab online
trading Web site. At the time of my visit, the studios are housed on two floors of a
building on Grand Street in Soho, between Broadway and Mercer, a few blocks
from Prada, Hugo Boss and other designer shops. The open space houses loosely
positioned workspaces occupied mostly by 20-somethings (although I notice a
busy programmer who can’t be older than 18). The design of the space functions
(intentionally so) as a metaphor for computer culture’s key themes: interactivity,
lack of hierarchy, modularity. In contrast to traditional office architecture, where
the reception area acts as a gateway between the visitor and the company, here
the desk looks like just another workstation, set aside from the entrance. On
entering the space you can go to the reception desk or you can directly make your
way to any workstation on the floor. Stylishly dressed 20-somethings of both
genders appear and disappear in the elevator at regular intervals. It is pretty quiet,
except for the little noises made by numerous computers as they save and retrieve
files. One of the co-founders, still in his early 30s, gives me a quick tour of the
place. Although Razorfish is the established design leader in the virtual world of
computer screens and networks, our tour is focused on the physical world. He
proudly points out that the workers are scattered around the open space regardless
of their job titles: a programmer next to an interface designer next to a Web designer.
He notes that the reception area, composed of a desk and two semi-circular sofas,
mimics the Razorfish logo. He talks about Razorfish’s plans to venture
into product design. “Our goal is to provide total user experience. Right now a
client thinks that if he needs the design for buttons on the screen, he hires
Razorfish; but if he needs real buttons, he goes to another shop. We want to
change this.”
The original 1970s paradigm of Graphical User Interface (GUI) emulated
familiar physical interfaces: a file cabinet, a desk, a trash can, a control panel.
After leaving Razorfish Studios, I stop at Venus by Patricia Field, a funky store
on West Broadway where I buy an orange and blue wallet which has two plastic
buttons on its cover, an emulation of the forward and reverse buttons of a Web
browser. The buttons do not do anything (yet); they simply signify “computer.”
Over the course of twenty years, the culture came full circle. If, with GUI, the
physical environment migrated into the computer screen, now the conventions of GUI
are migrating back into our physical reality. The same trajectory can be traced in
relation to other conventions, or forms, of computer media. A collection of
documents and a navigable space, these traditional methods to organize both data
and human experience of the world itself, became two of these forms which today
can be found in most areas of new media. The first form is a database, used to
store any kind of data — from financial records to digital movie clips; the second
form is a virtual interactive 3D space, employed in computer games, motion rides,
VR, computer animation, and human-computer interfaces. In migrating to a
computer environment, a collection and navigable space were not left unchanged;
on the contrary, they came to incorporate the computer’s particular techniques for
structuring and accessing data, such as modularity, as well as its fundamental
logic: that of computer programming. So, for instance, a computer database is
quite different from a traditional collection of documents: it allows one to quickly
access, sort and re-organize millions of records; it can contain different media
types, and it assumes multiple indexing of data, since each record besides the data
itself contains a number of fields with user-defined values.
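The difference from a traditional collection can be made concrete with a small sketch. The records and field names below are hypothetical, chosen only to illustrate the point: each record carries the data itself plus a number of fields, and the same collection can be indexed and re-sorted along any of them.

```python
from collections import defaultdict

# Hypothetical records: each holds the data itself (here a file name)
# plus fields with user-defined values.
records = [
    {"data": "clip_01.mov", "type": "video", "year": 1998, "author": "A"},
    {"data": "ledger.txt", "type": "text", "year": 1997, "author": "B"},
    {"data": "photo_07.jpg", "type": "image", "year": 1998, "author": "A"},
]


def build_index(items, field):
    """Map each value of `field` to the list of records carrying it."""
    index = defaultdict(list)
    for item in items:
        index[item[field]].append(item)
    return index


# Multiple indexing: the same records are reachable through several fields.
by_year = build_index(records, "year")
by_author = build_index(records, "author")

# Re-organizing the collection is a single sort on any chosen field.
sorted_by_type = sorted(records, key=lambda record: record["type"])
```

No single ordering is privileged here; `by_year`, `by_author` and `sorted_by_type` are equally valid views of the same three records, which may themselves be of different media types.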
Today, in a perfect illustration of the transcoding principle (see Chapter
1), these two computer-based forms migrate back into culture at large, both
literally and conceptually. A library, a museum, in fact any large collection of
cultural data, is being replaced by a computer database. At the same time, a
computer database becomes a new metaphor which we use to conceptualize
individual and collective cultural memory, a collection of documents or objects,
and other phenomena and experiences. Similarly, computer culture uses 3D
navigable space to visualize any kind of data — molecules and historical records,
files in a computer, the Internet as a whole, and the semantics of human language.
(For instance, the software from plumbdesign renders an English thesaurus as a
structure in 3D space.223) And, with many computer games, the human
experience of being in a world and the narrative itself are being represented as
continuous navigation through space (think, for example, of Tomb Raider). In short,
a computer database and 3D computer-based virtual space became true cultural
forms — general ways used by culture to represent the human experience, the
world, and human existence in this world.
Why does computer culture privilege these forms over other
possibilities?224 We may associate the first genre with work (post-industrial labor
of information processing) and the second with leisure and fun (computer games),
yet this very distinction is no longer valid in computer culture. As I already noted
in the introduction to the “Interface” chapter, increasingly the same metaphors and
interfaces are used at work and at home, for business and for entertainment. For
instance, the user navigates through a virtual space both to work and to play,
whether analyzing scientific data or killing enemies in Quake.
We may arrive at a better explanation if we look at how these two forms
are used in new media design. From one perspective, all new media design can be
reduced to these two approaches. That is, creating works in new media can be
understood as either constructing the right interface to a multimedia database or as
defining navigation methods through spatialized representations. The first
approach is typically used in self-contained hypermedia and Web sites — in short,
whenever the main goal is to provide an interface to data. The second approach is
used in most computer games and virtual worlds. What is the logic here? Web
sites and hypermedia programs usually aim to give the user efficient access to
information, while games and virtual worlds aim to psychologically “submerge”
the user in an imaginary universe. It is appropriate that the database has emerged as
the perfect vehicle for the first goal while navigable space meets the demands of the
second. Navigable space accomplishes the same effects which before were created by literary
and cinematic narrative.
Sometimes either of these two goals, information access and psychological
engagement with an imaginary world, solely shapes the design of a new media
object. An example of the former would be a search engine site; an example of the
latter would be games such as Riven or Unreal. However, in general these two
goals should be thought of as extreme cases of a single conceptual continuum.
Such supposedly “pure” examples of information-oriented objects as Yahoo,
Hotbot or other search sites aim to “immerse” the user in their universe and prevent her
from going to other sites. And such supposedly pure “psychological immersion”
objects as Riven or Unreal have a strong “information processing” dimension.
This dimension makes playing these games more like reading a detective story or
playing chess than being engaged with traditional literary and film fictional
narrative. Gathering clues and treasures; constantly updating a mental map of the
universe of the game, including the positions of pathways, doors, places to avoid
and so on; keeping track of one’s ammunition, health and other levels — all this
aligns playing a computer game with other “information processing” tasks typical
of computer culture, like searching the Internet, scanning through news groups,
pulling records from a database, using a spreadsheet, or data mining large data
stores.
Often, the two goals of information access and psychological engagement
compete within the same new media object. Along with surface versus depth, the
opposition between information and “immersion” can be thought of as a particular
expression of the more general opposition characteristic of new media: between
action and representation. And just as is the case with the surface and depth
opposition, discussed in “Cultural Interfaces” and “Illusion, Narrative and
Interactivity” sections, the results of this competition are often awkward and
uneasy. For instance, an image which embeds within itself a number of hyperlinks
offers neither a true psychological “immersion” nor easy navigation since the user
has to search for hyperlinks. Appropriately, games such as Johnny Mnemonic
(SONY, 1995), which aspired to become true interactive movies, chose to avoid
either embedding hyperlinks or displaying controls on the screen altogether,
instead relying on a keyboard as the sole source of interactive control.
Narratology, the branch of modern literary theory devoted to the theory of
narrative, distinguishes between narration and description. Narration refers to those parts of
the narrative which move the plot forward; description refers to those parts which do not.
Examples of description are passages which describe a landscape, or a city,
or a character’s apartment. In short, to use the language of the information age,
description passages present the reader with descriptive information. As its name
itself implies, narratology paid most attention to narration and hardly any to
description. But in the information age narration and description have changed
roles. If traditional cultures provided people with well-defined narratives (myths,
religion) and little “stand-alone” information, today we have too much
information and too few narratives which can tie it all together. For better or
worse, information access has become a key activity of the computer age. Therefore, we
need something which can be called “info-aesthetics” — a theoretical analysis of
the aesthetics of information access as well as the creation of new media objects which
“aestheticize” information processing. In the age when all design became
“information design,” and, to paraphrase the title of the famous book by the
architectural historian Sigfried Giedion225, “search engine takes command,”
information access is no longer just a key form of work but also a new key
category of culture. Thus it demands that we deal with it theoretically,
aesthetically and symbolically.
Database
The Database Logic
After the novel, and subsequently cinema, privileged narrative as the key form of
cultural expression of the modern age, the computer age introduces its correlate
— database. Many new media objects do not tell stories; they don't have
a beginning or end; in fact, they don't have any development, thematically, formally
or otherwise which would organize their elements into a sequence. Instead, they
are collections of individual items, where every item has the same significance as
any other.
Why does new media favor database form over others? Can we explain its
popularity by analyzing the specificity of the digital medium and of computer
programming? What is the relationship between database and another form,
which has traditionally dominated human culture — narrative? These are the
questions I will address in this section.
Before proceeding I need to comment on my use of the word database. In
computer science a database is defined as a structured collection of data. The data
stored in a database is organized for fast search and retrieval by a computer and
therefore it is anything but a simple collection of items. Different types of
databases — hierarchical, network, relational and object-oriented — use different
models to organize data. For instance, the records in hierarchical databases are
organized in a treelike structure. Object-oriented databases store complex data
structures, called "objects," which are organized into hierarchical classes that may
inherit properties from classes higher in the chain.226 New media objects may or
may not employ these highly structured database models; however, from the point
of view of user's experience a large proportion of them are databases in a more
basic sense. They appear as collections of items on which the user can perform
various operations: view, navigate, search. The user experience of such
computerized collections is therefore quite distinct from reading a narrative or
watching a film or navigating an architectural site. Similarly, literary or cinematic
narrative, an architectural plan and database each present a different model of
what a world is like. It is this sense of database as a cultural form of its own
which I want to address here. Following art historian Erwin Panofsky's analysis of
linear perspective as a "symbolic form" of the modern age, we may even call
database a new symbolic form of a computer age (or, as philosopher Jean-François
Lyotard called it in his famous 1979 book The Postmodern Condition,
"computerized society"),227 a new way to structure our experience of ourselves
and of the world. Indeed, if after the death of God (Nietzsche), the end of the grand
narratives of the Enlightenment (Lyotard) and the arrival of the Web (Tim Berners-Lee)
the world appears to us as an endless and unstructured collection of images,
texts, and other data records, it is only appropriate that we will be moved to model
it as a database. But it is also appropriate that we would want to develop a poetics,
aesthetics, and ethics of this database.
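Two of the database models mentioned in this passage, the hierarchical and the object-oriented, can be illustrated with a minimal sketch. It is not how any particular database system is implemented; the class names `Node`, `MediaObject`, `Image` and `Photograph` are invented for the example.

```python
from dataclasses import dataclass, field


# Hierarchical model: records are organized in a treelike structure.
@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def walk(node, depth=0):
    """Yield (depth, label) pairs for a node and all of its descendants."""
    yield depth, node.label
    for child in node.children:
        yield from walk(child, depth + 1)


# Object-oriented model: "objects" belong to hierarchical classes that
# may inherit properties from classes higher in the chain.
class MediaObject:
    storage = "disk"

    def describe(self):
        return f"{type(self).__name__} stored on {self.storage}"


class Image(MediaObject):
    pass  # inherits both `storage` and `describe` unchanged


class Photograph(Image):
    storage = "CD-ROM"  # overrides a property inherited from MediaObject
```

Walking a small tree such as `Node("archive", [Node("images", [Node("photo")]), Node("texts")])` visits each record together with its depth in the hierarchy, while `Photograph().describe()` shows a property being resolved through the class chain. In both cases the structure belongs to the storage model, not to any story told about the items.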
Let us begin by documenting the dominance of database form in new
media. The most obvious examples of this are popular multimedia encyclopedias,
which are collections by their very definition; as well as other commercial CD-
ROM titles which are collections as well — of recipes, quotations, photographs,
and so on.228 The identity of a CD-ROM as a storage medium is projected onto
another plane, becoming a cultural form of its own. Multimedia works which have
"cultural" content appear to particularly favor the database form. Consider, for
instance, the "virtual museums" genre — CD-ROMs which take the user on a
"tour" through a museum collection. A museum becomes a database of images
representing its holdings, which can be accessed in different ways:
chronologically, by country, or by artist. Although such CD-ROMs often simulate
the traditional museum experience of moving from room to room in a continuous
trajectory, this "narrative" method of access does not have any special status in
comparison to other access methods offered by a CD-ROM. Thus the narrative
becomes just one method of accessing data among others. Another example of a
database form is a multimedia genre which does not have an equivalent in
traditional media — CD-ROMs devoted to a single cultural figure such as a
famous architect, film director or writer. Instead of a narrative biography we are
presented with a database of images, sound recordings, video clips and/or texts
which can be navigated in a variety of ways.
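The "virtual museum" case can be sketched with Python's built-in `sqlite3` module. The table layout and the three placeholder works are assumptions made for illustration and do not describe any actual CD-ROM; the point is only that chronological, by-country and by-artist access are the same query with a different ordering.

```python
import sqlite3

# An in-memory database standing in for a museum's holdings.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE holdings (title TEXT, artist TEXT, country TEXT, year INTEGER)"
)
conn.executemany(
    "INSERT INTO holdings VALUES (?, ?, ?, ?)",
    [
        ("Work A", "Artist Z", "France", 1890),
        ("Work B", "Artist X", "Italy", 1510),
        ("Work C", "Artist Y", "Spain", 1937),
    ],
)


def titles_ordered_by(column):
    """Return all titles ordered by one of the allowed access methods."""
    allowed = {"year", "country", "artist"}
    if column not in allowed:
        raise ValueError(f"unsupported access method: {column}")
    # The column name comes from the fixed whitelist above, so it is
    # safe to place it directly into the query text.
    rows = conn.execute(f"SELECT title FROM holdings ORDER BY {column}")
    return [title for (title,) in rows]
```

A room-by-room "tour" would be one more column to order by, with no special status among the others, which is one way to read the claim that narrative becomes just one method of accessing data.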
CD-ROMs and other digital storage media (floppies, DVD) proved to be
particularly receptive to traditional genres which already had a database-like
structure, such as a photo-album; they also inspired new database genres, like a
database biography. Where the database form really flourished, however, is on the
Internet. As defined by original HTML, a Web page is a sequential list of separate
elements: text blocks, images, digital video clips, and links to other pages. It is
always possible to add a new element to the list — all you have to do is to open a
file and add a new line. As a result, most Web pages are collections of separate
elements: texts, images, links to other pages or sites. A home page is a collection
of personal photographs. A site of a major search engine is a collection of
numerous links to other sites (along with a search function, of course). A site of a
Web-based TV or radio station offers a collection of video or audio programs
along with the option to listen to the current broadcast; but this current program is
just one choice among many other programs stored on the site. Thus the
traditional broadcasting experience, which consisted solely of a real-time
transmission, becomes just one element in a collection of options. Similar to the
CD-ROM medium, the Web offered fertile ground to already existing database
genres (for instance, bibliography) and also inspired the creation of new ones such
as the sites devoted to a person or a phenomenon (Madonna, Civil War, new
media theory, etc.) which, even if they contain original material, inevitably center
around the list of links to other Web pages on the same person or phenomenon.
The open nature of the Web as a medium (Web pages are computer files
which can always be edited) means that Web sites never have to be complete;
and they rarely are. The sites always grow. New links are being added to what is
already there. It is as easy to add new elements to the end of the list as it is to insert
them anywhere in it. All this further contributes to the anti-narrative logic of the
Web. If new elements are being added over time, the result is a collection, not a
story. Indeed, how can one keep a coherent narrative or any other development
trajectory through the material if it keeps changing?
Commercial producers have experimented with ways to explore the
database form inherent to new media, with offerings ranging from multimedia
encyclopedias, to collections of software, to collections of pornographic images.
In contrast, many artists working with new media at first uncritically accepted the
database form as a given. Thus they became blind victims of database logic.
Numerous artists' Web sites are collections of multimedia elements documenting
their works in other media. In the case of many early artists' CD-ROMs as well,
the tendency was to fill all the available storage space with different material: the
main work, documentation, related texts, previous works and so on.
As the 1990s progressed, artists increasingly began to approach database
more critically.229 A few examples of projects investigating database politics and
possible aesthetics are Chris Marker's "IMMEMORY," Olga Lialina's "Anna
Karenina Goes to Paradise,"230 Stephen Mamber's "Digital Hitchcock," and
Fabian Wagmister's "...two, three, many Guevaras." The artist who has explored
the possibilities of a database most systematically is George Legrady. In a series of
interactive multimedia works ("The Anecdoted Archive," 1994; "[the clearing],"
1994; "Slippery Traces," 1996; "Tracing," 1998) he used different types of
databases to create "an information structure where stories/things are organized
according to multiple thematic connections."231
Data and Algorithm
Of course not all new media objects are explicitly databases. Computer games, for
instance, are experienced by their players as narratives. In a game, the player is
given a well-defined task — winning the match, being first in a race, reaching the
last level, or reaching the highest score. It is this task which makes the player
experience the game as a narrative. Everything which happens to her in a game,
all the characters and objects she encounters either take her closer to achieving the
goal or further away from it. Thus, in contrast to the CD-ROM and Web
databases, which always appear arbitrary since the user knows that additional
material could have been added without in any way modifying the logic of the
database, in a game, from a user's point of view, all the elements are motivated
(i.e., their presence is justified).232
Often the narrative shell of a game ("you are the specially trained
commando who has just landed on a Lunar base; your task is to make your way to
the headquarters occupied by the mutant base personnel...") masks a simple
algorithm well-familiar to the player: kill all the enemies on the current level,
while collecting all treasures it contains; go to the next level and so on until you
reach the last level. Other games have different algorithms. Here is an algorithm
of the legendary "Tetris": when a new block appears, rotate it in such a way that it
will complete the top layer of blocks on the bottom of the screen making this
layer disappear. The similarity between the actions expected from the player and
computer algorithms is too uncanny to be dismissed. While computer games do
not follow database logic, they appear to be ruled by another logic — that of an
algorithm. They demand that a player executes an algorithm in order to win.
An algorithm is the key to the game experience in a different sense as
well. As the player proceeds through the game, she gradually discovers the rules
which operate in the universe constructed by this game. She learns its hidden
logic, in short its algorithm. Therefore, in games where the game play departs
from following an algorithm, the player is still engaged with an algorithm, albeit
in another way: she is discovering the algorithm of the game itself. I mean this
both metaphorically and literally: for instance, in a first person shooter, such as
"Quake," the player may eventually notice that under such and such condition the
enemies will appear from the left, i.e. she will literally reconstruct a part of the
algorithm responsible for the game play. Or, in a different formulation of the
legendary author of Sim games Will Wright, "Playing the game is a continuous
loop between the user (viewing the outcomes and inputting decisions) and the
computer (calculating outcomes and displaying them back to the user). The user is
trying to build a mental model of the computer model."233
What we encountered is another example of the general principle of
transcoding discussed in Chapter 1: the projection of the ontology of a computer
onto culture itself. If in physics the world is made of atoms and in genetics it is
made of genes, computer programming encapsulates the world according to its
own logic. The world is reduced to two kinds of software objects which are
complementary to each other: data structures and algorithms. Any process or task
is reduced to an algorithm, a finite sequence of simple operations which a
computer can execute to accomplish a given task. And any object in the world —
be it the population of a city, or the weather over the course of a century, a chair,
a human brain — is modeled as a data structure, i.e. data organized in a particular
way for efficient search and retrieval.234 Examples of data structures are arrays,
linked lists and graphs. Algorithms and data structures have a symbiotic
relationship. The more complex the data structure of a computer program, the
simpler the algorithm needs to be, and vice versa. Together, data structures and
algorithms are two halves of the ontology of the world according to a computer.
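The trade-off between the two halves can be made concrete with a minimal Python sketch. Everything in it is invented for illustration: the placeholder records and the names `find_by_scan` and `build_index` are assumptions, not taken from any particular program. With a flat list, the retrieval algorithm has to do the work of scanning; with a richer structure (an index keyed by name), retrieval shrinks to a single lookup.

```python
# Hypothetical placeholder data: a tiny "database" of city records.
records = [
    {"name": "City A", "population": 1000},
    {"name": "City B", "population": 2500},
    {"name": "City C", "population": 400},
]

def find_by_scan(items, name):
    """Simple data structure (a flat list), more work for the algorithm:
    check every record in turn until one matches."""
    for item in items:
        if item["name"] == name:
            return item
    return None

def build_index(items):
    """Richer data structure: a dictionary keyed by name."""
    return {item["name"]: item for item in items}

index = build_index(records)

# With the index in place, the retrieval "algorithm" is a single lookup.
by_scan = find_by_scan(records, "City C")
by_index = index.get("City C")
print(by_scan is by_index)
```

Both routes return the same record; what differs is where the effort sits, in the steps of the algorithm or in the organization of the data.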
The computerization of culture involves the projection of these two
fundamental parts of computer software — and of the computer's unique ontology
— onto the cultural sphere. If CD-ROMs and Web databases are cultural
manifestations of one half of this ontology — data structures, then computer
games are manifestations of the second half — algorithms. Games (sports, chess,
cards, etc.) are one cultural form which required algorithm-like behavior from the
players; consequently, many traditional games were quickly simulated on
computers. In parallel, new genres of computer games came into existence such as
a first person shooter ("Doom," "Quake"). Thus, as was the case with database
genres, computer games both mimic already existing games and create new game
genres.
It may appear at first sight that data is passive and algorithm is active —
another example of passive-active binary categories so loved by human cultures.
A program reads in data, executes an algorithm, and writes out new data. We may
recall that before "computer science" and "software engineering" became
established names for the computer field, it was called "data processing." This
name remained in use for a few decades during which computers were mainly
associated with performing calculations over data. However, the passive/active
distinction is not quite accurate since data does not just exist — it has to be
generated. Data creators have to collect data and organize it, or create it from
scratch. Texts need to be written, photographs need to be taken, video and audio
need to be recorded. Or they need to be digitized from already existing media. In
the 1990s, when the new role of a computer as a Universal Media Machine
became apparent, already computerized societies went into a digitizing craze. All
existing books and video tapes, photographs and audio recordings started to be
fed into computers at an ever increasing rate. Steven Spielberg created the Shoah
Foundation which videotaped and then digitized numerous interviews with
Holocaust survivors; it would take one person forty years to watch all the
recorded material. The editors of Mediamatic journal, who devoted a whole issue
to the topic of "the storage mania" (Summer 1994) wrote: "A growing number of
organizations are embarking on ambitious projects. Everything is being collected:
culture, asteroids, DNA patterns, credit records, telephone conversations; it
doesn't matter."235 In 1996, financial company T. Rowe Price stored 800
gigabytes of data; by the Fall of 1999 this number rose to 10 terabytes.236
Once it is digitized, the data has to be cleaned up, organized, indexed. The
computer age brought with it a new cultural algorithm: reality -> media -> data ->
database. The rise of the Web, this gigantic and always changing data corpus,
gave millions of people a new hobby or profession: data indexing. There is hardly
a Web site which does not feature at least a dozen links to other sites, therefore
every site is a type of database. And, with the rise of Internet commerce, most
large-scale commercial sites have become real databases, or rather front-ends to
company databases. For instance, in the Fall of 1998, Amazon.com, an online
book store, had 3 million books in its database; and the maker of leading
commercial database Oracle has offered Oracle 8i, fully integrated with the
Internet and featuring unlimited database size, natural-language queries and
support for all multimedia data types.237 Jorge Luis Borges's story about a map
which was equal in size to the territory it represented became re-written as the
story about indexes and the data they index. But now the map has become larger
than the territory. Sometimes, much larger. Porno Web sites exposed the logic of
the Web to its extreme by constantly re-using the same photographs from other
porno Web sites. Only rare sites featured the original content. On any given date,
the same few dozen images would appear on thousands of sites. Thus, the same
data would give rise to more indexes than the number of data elements
themselves.
Database and Narrative
As a cultural form, database represents the world as a list of items and it refuses to
order this list. In contrast, a narrative creates a cause-and-effect trajectory of
seemingly unordered items (events). Therefore, database and narrative are natural
enemies. Competing for the same territory of human culture, each claims an
exclusive right to make meaning out of the world.
In contrast to most games, most narratives do not require algorithm-like
behavior from their readers. However, narratives and games are similar in that the
user, while proceeding through them, must uncover their underlying logic, that is, their
algorithm. Just like a game player, a reader of a novel gradually reconstructs an
algorithm (here I use it metaphorically) which the writer used to create the
settings, the characters, and the events. From this perspective, I can re-write my
earlier equations between the two parts of the computer's ontology and its
corresponding cultural forms. Data structures and algorithms drive different forms
of computer culture. CD-ROMs, Web sites and other new media objects which
are organized as databases correspond to the data structure; while narratives,
including computer games, correspond to the algorithms.
In computer programming, data structures and algorithms need each other;
they are equally important for a program to work. What happens in a cultural
sphere? Do databases and narratives have the same status in computer culture?
Some media objects explicitly follow database logic in their structure
while others do not; but behind the surface practically all of them are databases.
In general, creating a work in new media can be understood as the construction of
an interface to a database. In the simplest case, the interface simply provides the
access to the underlying database. For instance, an image database can be
represented as a page of miniature images; clicking on a miniature will retrieve
the corresponding record. If a database is too large to display all of its records at
once, a search engine can be provided to allow the user to search for particular
records. But the interface can also translate the underlying database into a very
different user experience. The user may be navigating a virtual three-dimensional
city composed from letters, as in Jeffrey Shaw's interactive installation "Legible
City."238 Or she may be traversing a black and white image of a naked body,
activating pieces of text, audio and video embedded in its skin (Harwood's CD-
ROM "Rehearsal of Memory.")239 Or she may be playing with virtual animals
which come closer or run away depending upon her movements (Scott Fisher et
al, VR installation, "Menagerie.")240 Although each of these works engages the
user in a set of behaviors and cognitive activities which are quite distinct from
going through the records of a database, all of them are databases. "Legible City"
is a database of three-dimensional letters which make up the city. "Rehearsal of
Memory" is a database of texts and audio and video clips which are accessed
through the interface of a body. And "Menagerie" is a database of virtual animals,
including their shapes, movements and behaviors.
Database becomes the center of the creative process in the computer age.
Historically, the artist made a unique work within a particular medium. Therefore
the interface and the work were the same; in other words, the level of an interface
did not exist. With new media, the content of the work and the interface become
separate. It is therefore possible to create different interfaces to the same material.
These interfaces may present different versions of the same work, as in David
Blair's WaxWeb.241 Or they may be radically different from each other, as in
Moscow WWWArt Centre.242 This is one of the ways in which the already
discussed principle of variability of new media manifests itself. But now we can
give this principle a new formulation. The new media object consists of one or
more interfaces to a database of multimedia material. If only one interface is
constructed, the result will be similar to a traditional art object; but this is an
exception rather than the norm.
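The separation of content from interface can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records and the two functions `catalog_interface` and `search_interface` are hypothetical and do not describe any of the works named above. The point is that the same stored material supports more than one way in.

```python
# Hypothetical database of multimedia records.
database = [
    {"id": 1, "title": "Harbor", "type": "image"},
    {"id": 2, "title": "Interview", "type": "video"},
    {"id": 3, "title": "Harbor at night", "type": "image"},
]

def catalog_interface(db):
    """Interface 1: expose every record directly, as a plain listing."""
    return [f'{r["id"]}: {r["title"]} ({r["type"]})' for r in db]

def search_interface(db, query):
    """Interface 2: return only the titles that contain the query text."""
    q = query.lower()
    return [r["title"] for r in db if q in r["title"].lower()]

# Two different user experiences built on one unchanged database.
for line in catalog_interface(database):
    print(line)
print(search_interface(database, "harbor"))
```

Neither function alters `database`; adding a third interface would leave the first two, and the material itself, untouched.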
This formulation places the opposition between database and narrative in a
new light, thus redefining our concept of narrative. The "user" of a narrative is
traversing a database, following links between its records as established by the
database's creator. An interactive narrative (which can be also called "hyper-
narrative" in an analogy with hypertext) can then be understood as the sum of
multiple trajectories through a database. A traditional linear narrative is one,
among many other possible trajectories; i.e. a particular choice made within a
hyper-narrative. Just as a traditional cultural object can now be seen as a
particular case of a new media object (i.e., a new media object which only has one
interface), traditional linear narrative can be seen as a particular case of a hyper-
narrative.
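In this purely material sense, the relation between a hyper-narrative and a linear narrative can be sketched in Python. The record contents, the `links` table and the function `trajectories` are all hypothetical, and the sketch assumes the links contain no cycles. It models only the structure (records plus links), not the semantic criteria a sequence must meet to count as a narrative.

```python
# Hypothetical database: each record has an id and some content.
database = {
    "a": "opening scene",
    "b": "first encounter",
    "c": "detour",
    "d": "ending",
}

# Links between records, as established by the database's creator.
links = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["b", "d"],
    "d": [],
}

def trajectories(start, end):
    """Enumerate every path from start to end by depth-first search.
    Assumes the links contain no cycles."""
    if start == end:
        return [[start]]
    paths = []
    for nxt in links[start]:
        for rest in trajectories(nxt, end):
            paths.append([start] + rest)
    return paths

hyper_narrative = trajectories("a", "d")  # the sum of all trajectories
linear_narrative = hyper_narrative[0]     # one particular choice among them

for path in hyper_narrative:
    print(" -> ".join(database[record] for record in path))
```

The records themselves never move; each trajectory is just an ordered list of references into the same database.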
This "technical," or "material" change in the definition of narrative does
not mean that an arbitrary sequence of database records is a narrative. To qualify
as a narrative, a cultural object has to satisfy a number of criteria, which cultural
theorist Mieke Bal, the author of a standard textbook on narrative theory, defines
as follows: it should contain both an actor and a narrator; it also should contain
three distinct levels consisting of the text, the story, and the fabula; and its
"contents" should be "a series of connected events caused or experienced by
actors."243 Obviously, not all cultural objects are narratives. However, in the
world of new media, the word “narrative” is often used as all-inclusive term, to
cover up the fact that we have not yet developed a language to describe these new
strange objects. It is usually paired with another over-used word — interactive.
Thus, a number of database records linked together so that more than one
trajectory is possible, is assumed to constitute "interactive narrative." But to
just create these trajectories is of course not sufficient; the author also has to
control the semantics of the elements and the logic of their connection so that the
resulting object will meet the criteria of narrative as outlined above. Another
erroneous assumption frequently made is that by creating her own path (i.e.,
choosing the records from a database in a particular order) the user constructs her
own unique narrative. However, if the user simply accesses different elements,
one after another, in a usually random order, there is no reason to assume that
these elements will form a narrative at all. Indeed, why should an arbitrary
sequence of database records, constructed by the user, result in "a series of
connected events caused or experienced by actors"?
In summary, database and narrative do not have the same status in
computer culture. In the database / narrative pair, database is the unmarked
term.244 Regardless of whether new media objects present themselves as linear
narratives, interactive narratives, databases, or something else, underneath, on the
level of material organization, they are all databases. In new media, the database
supports a range of cultural forms which range from direct translation (i.e., a
database stays a database) to a form whose logic is the opposite of the logic of the
material form itself — a narrative. More precisely, a database can support
narrative, but there is nothing in the logic of the medium itself which would foster
its generation. It is not surprising, then, that databases occupy a significant, if not
the largest, territory of the new media landscape. What is more surprising is why
the other end of the spectrum — narratives — still exist in new media.
Paradigm and Syntagm
The dynamics which exist between database and narrative are not unique in new
media. The relation between the structure of a digital image and the languages of
contemporary visual culture is characterized by the same dynamics. As defined by
all computer software, a digital image consists of a number of separate layers,
each layer containing particular visual elements (see “Compositing” section for a
discussion of moving image compositing and its use to simulate cinematographic
look). Throughout the production process, artists and designers manipulate each
layer separately; they also delete layers and add new ones. Keeping each element
as a separate layer allows the content and the composition of an image to be
changed at any point: deleting a background, substituting one person for another,
moving two people closer together, blurring an object, and so on. What would a
typical image look like if the layers were merged together? The elements
contained on different layers will become juxtaposed resulting in a montage look.
Montage is the default visual language of composite organization of an image.
However, just as database supports both the database form and its opposite —
narrative, a composite organization of an image on the material level (and
compositing software on the level of operations) support two opposing visual
languages. One is modernist-MTV montage — two-dimensional juxtaposition of
visual elements designed to shock due to its impossibility in reality. The other is
the representation of familiar reality as seen by a photo or film camera (or its
computer simulation, in the case of 3D graphics). During the 1980s and 1990s all
image making technologies became computer-based thus turning all images into
composites. In parallel, a Renaissance of montage took place in visual culture, in
print, broadcast design and new media. This is not unexpected — after all, this is
the visual language dictated by the composite organization. What needs to be
explained is why photorealist images continue to occupy such a significant space
in our computer-based visual culture.
It would be surprising, of course, if photorealist images suddenly
disappeared completely. The history of culture does not contain such sudden
breaks. Similarly, we should not expect that new media would completely
substitute narrative by database. New media does not radically break with the
past; rather, it distributes weight differently between the categories which hold
culture together, foregrounding what was in the background, and vice versa. As
Fredric Jameson writes in his analysis of another shift, from modernism to post-
modernism: "Radical breaks between periods do not generally involve complete
changes but rather the restructuration of a certain number of elements already
given: features that in an earlier period or system were subordinate became
dominant, and features that had been dominant again become secondary."245
Database — narrative opposition is the case in point. To further
understand how computer culture redistributes weight between the two terms of
opposition in computer culture I will bring in a semiological theory of syntagm
and paradigm. According to this model, originally formulated by Ferdinand de
Saussure to describe natural languages such as English and later expanded by
Roland Barthes and others to apply to other sign systems (narrative, fashion, food,
etc.), the elements of a system can be related on two dimensions: syntagmatic and
paradigmatic.246 As defined by Barthes, "the syntagm is a combination of signs,
which has space as a support." To use the example of natural language, the
speaker produces an utterance by stringing together the elements, one after
another, in a linear sequence. This is the syntagmatic dimension. Now, let's look at
the paradigm. To continue with an example of a language user, each new element
is chosen from a set of other related elements. For instance, all nouns form a set;
all synonyms of a particular word form another set. In the original formulation of
Saussure, "the units which have something in common are associated in theory
and thus form groups within which various relationships can be found."247 This
is the paradigmatic dimension.
The elements on a syntagmatic dimension are related in praesentia, while
the elements on a paradigmatic dimension are related in absentia. For instance, in
the case of a written sentence, the words which comprise it materially exist on a
piece of paper, while the paradigmatic sets to which these words belong only exist
in the writer's and reader's minds. Similarly, in the case of a fashion outfit, the
elements which make it, such as a skirt, a blouse, and a jacket, are present in
reality, while pieces of clothing which could have been present instead —
different skirt, different blouse, different jacket — only exist in the viewer's
imagination. Thus, syntagm is explicit and paradigm is implicit; one is real and
the other is imagined.
Literary and cinematic narratives work in the same way. Particular words,
sentences, shots, scenes which make up a narrative have a material existence;
other elements which form an imaginary world of an author or a particular literary
or cinematic style and which could have appeared instead exist only virtually. Put
differently, the database of choices from which narrative is constructed (the
paradigm) is implicit; while the actual narrative (the syntagm) is explicit.
New media reverses this relationship. Database (the paradigm) is given
material existence, while narrative (the syntagm) is de-materialised. Paradigm is
privileged, syntagm is downplayed. Paradigm is real, syntagm is virtual. To see
this, consider the new media design process. The design of any new media object
begins with assembling a database of possible elements to be used. (Macromedia
Director calls this database "cast," Adobe Premiere calls it "project," ProTools
calls it a "session," but the principle is the same.) This database is the center of
the design process. It typically consists of a combination of original and stock
material such as buttons, images, video and audio sequences; 3D
objects; behaviors and so on. Throughout the design process new elements are
added to the database; existing elements are modified. The narrative is
constructed by linking elements of this database in a particular order, i.e.
designing a trajectory leading from one element to another. On the material level,
a narrative is just a set of links; the elements themselves remain stored in the
database. Thus the narrative is more virtual than the database itself. (Since all data
is stored as electronic signals, the word "material" seems to be no longer
appropriate. Instead we should talk about different degrees of virtuality.)
The paradigm is privileged over syntagm in yet another way in interactive
objects presenting the user with a number of choices at the same time — which is
what typical interactive interfaces do. For instance, a screen may contain a few
icons; clicking on each icon leads the user to a different screen. On the level of an
individual screen, these choices form a paradigm of their own which is explicitly
presented to the user. On the level of the whole object, the user is made aware that
she is following one possible trajectory among many others. In other words, she is
selecting one trajectory from the paradigm of all trajectories which are defined.
Other types of interactive interfaces make the paradigm even more explicit
by presenting the user with an explicit menu of all available choices. In such
interfaces, all of the categories are always available, just a mouse click away. The
complete paradigm is present before the user, its elements neatly arranged in a
menu. This is another example of how new media makes explicit the
psychological processes involved in cultural communication. Other examples
include the already discussed shift from creation to selection, which externalizes
and codifies the database of cultural elements existing in the creator's mind; as
well as the very phenomenon of interactive links. As I noted in Chapter 1, new
media takes "interaction" literally, equating it with a strictly physical interaction
between a user and a computer, at the expense of psychological interaction. The
cognitive processes involved in understanding any cultural text are erroneously
equated with an objectively existing structure of interactive links.
Interactive interfaces foreground the paradigmatic dimension and often
make explicit paradigmatic sets. Yet, they are still organized along the
syntagmatic dimension. Although the user is making choices at each new screen,
the end result is a linear sequence of screens which she follows. This is the
classical syntagmatic experience. In fact, it can be compared to constructing a
sentence in a natural language. Just as a language user constructs a sentence by
choosing each successive word from a paradigm of other possible words, a new
media user creates a sequence of screens by clicking on this or that icon at each
screen. Obviously, there are many important differences between these two
situations. For instance, in the case of a typical interactive interface, there is no
grammar and paradigms are much smaller. Yet, the similarity of basic experience
in both cases is quite interesting; in both cases, it unfolds along a syntagmatic
dimension.
Why does new media insist on this language-like sequencing? My
hypothesis is that it follows the dominant semiological order of the twentieth
century — that of cinema. As noted in the previous chapter, cinema replaced all
other modes of narration with a sequential narrative, an assembly line of shots
which appear on the screen one at a time. For centuries, a spatialized narrative
where all images appear simultaneously dominated European visual culture; then
it was relegated to "minor" cultural forms such as comics or technical illustrations.
"Real" culture of the twentieth century came to speak in linear chains, aligning
itself with the assembly line of an industrial society and the Turing machine of a
post-industrial era. New media continues this mode, giving the user information
one screen at a time. At least, this is the case when it tries to become "real"
culture (interactive narratives, games); when it simply functions as an interface to
information, it is not ashamed to present much more information on the screen at
once, be it in the form of tables, normal or pull-down menus, or lists. In particular,
the experience of a user filling in an on-line form can be compared to pre-
cinematic spatialized narrative: in both cases, the user is following a sequence of
elements which are presented simultaneously.
A Database Complex
To what extent is the database form intrinsic to modern storage media? For
instance, a typical music CD is a collection of individual tracks grouped together.
The database impulse also drives much of photography throughout its history,
from William Henry Fox Talbot's "Pencil of Nature" to August Sander's
monumental typology of modern German society "Face of Our Time," to the
Bernd and Hilla Becher's equally obsessive cataloging of water towers. Yet, the
connection between storage media and database forms is not universal. The prime
exception is cinema. Here the storage media supports the narrative
imagination.248 Why then, in the case of photography storage media, does
technology sustain database, while in the case of cinema it gives rise to a modern
narrative form par excellence? Does this have to do with the method of media
access? Shall we conclude that random access media, such as computer storage
formats (hard drives, removable disks, CD-ROMs), favors database, while
sequential access media, such as film, favors narrative? This does not hold either.
For instance, a book, this perfect random-access medium, supports database
forms, such as photo-albums, and narrative forms, such as novels.
Rather than trying to correlate database and narrative forms with modern
media and information technologies, or deduce them from these technologies, I
prefer to think of them as two competing imaginations, two basic creative
impulses, two essential responses to the world. Both have existed long before
modern media. The ancient Greeks produced long narratives, such as Homer's
epic poems The Iliad and The Odyssey; they also produced encyclopedias. The
first fragments of a Greek encyclopedia to have survived were the work of
Speusippus, a nephew of Plato. Diderot wrote novels — and also was in charge of
the monumental Encyclopédie, the largest publishing project of the 18th century.
Competing to make meaning out of the world, database and narrative produce
endless hybrids. It is hard to find a pure encyclopedia without any traces of a
narrative in it and vice versa. For instance, until alphabetical organization became
popular a few centuries ago, most encyclopedias were organized thematically,
with topics covered in a particular order (typically corresponding to the seven liberal
arts). At the same time, many narratives, such as the novels by Cervantes and
Swift, and even Homer's epic poems — the founding narratives of the Western
tradition — traverse an imaginary encyclopedia.
Modern media is the new battlefield for the competition between database
and narrative. It is tempting to read the history of this competition in dramatic
terms. First the medium of visual recording — photography — privileges
catalogs, taxonomies and lists. While the modern novel blossoms, and
academicians continue to produce historical narrative paintings all through the
nineteenth century, in the realm of the new techno-image of photography,
database rules. The next visual recording medium — film — privileges narrative.
Almost all fictional films are narratives, with few exceptions. Magnetic tape used
in video does not bring any substantial changes. The next storage media,
computer-controlled digital storage devices (hard drives, removable drives, CD-ROMs,
DVD), privilege database once again. Multimedia encyclopedias, virtual
museums, pornography, artists' CD-ROMs, library databases, Web indexes, and,
of course, the Web itself: database is more popular than ever before.
The digital computer turns out to be the perfect medium for the database form.
Like a virus, databases infect CD-ROMs and hard drives, servers and Web sites.
Can we say that database is the cultural form most characteristic of a computer?
In her 1978 article "Video: The Aesthetics of Narcissism," probably the single
most well-known article on video art, art historian Rosalind Krauss argued that
video is not a physical medium but a psychological one. In her analysis, "video's
real medium is a psychological situation, the very terms of which are to withdraw
attention from an external object — an Other — and invest it in the Self."249 In
short, video art is a support for the psychological condition of narcissism.250
Does new media similarly function to play out a particular psychological
condition, something which can be called a database complex? In this respect, it is
interesting that database imagination has accompanied computer art from its very
beginning. In the 1960s, artists working with computers wrote programs to
systematically explore the combinations of different visual elements. In part they
were following art world trends such as minimalism. Minimalist artists executed
works of art according to pre-existent plans; they also created series of images or
objects by systematically varying a single parameter. So, when minimalist artist
Sol LeWitt spoke of an artist's idea as "the machine which makes the work," it
was only logical to replace the human executing the idea with a computer.251 At
the same time, since the only way to make pictures with a computer was by
writing a computer program, the logic of computer programming itself pushed
computer artists in the same directions. Thus, for artist Frieder Nake a computer
was a "Universal Picture Generator," capable of producing every possible picture
out of a combination of available picture elements and colors.252 In 1967 he
published a portfolio of 12 drawings which were obtained by successively
multiplying a square matrix by itself. Another early computer artist, Manfred
Mohr, produced numerous images which recorded various transformations of a
basic cube.
Even more remarkable were films by John Whitney, the pioneer of
computer filmmaking. His films such as "Permutations" (1967), "Arabesque"
(1975) and others systematically explored the transformations of geometric forms
obtained by manipulating elementary mathematical functions. Thus they
substituted successive accumulation of visual effects for narrative, figuration or
even formal development. Instead they presented the viewer with databases of
effects. This principle reaches its extreme in Whitney's earlier film which was
made using an analog computer and was called "Catalog." In his important book on
new forms of cinema of the 1960s entitled Expanded Cinema (1970) critic Gene
Youngblood writes about this remarkable film: "The elder Whitney actually
never produced a complete, coherent movie on the analog computer because he
was continually developing and refining the machine while using it for
commercial work... However, Whitney did assemble a visual catalogue of the
effects he had perfected over the years. This film, simply titled Catalog, was
completed in 1961 and proved to be of such overwhelming beauty that many
persons still prefer Whitney's analogue work over his digital computer films."253
One is tempted to read "Catalog" as one of the founding moments of new media.
As discussed in the "Selection" section, today all software for media creation arrives
with endless "plug-ins" — the banks of effects which with a press of a button
generate interesting images from any input whatsoever. In parallel, much of the
aesthetics of computerised visual culture is effects driven, especially when a new
techno-genre (computer animation, multimedia, Web sites) is just getting
established. For instance, countless music videos are variations of Whitney's
"Catalog" — the only difference is that the effects are applied to the images of
human performers. This is yet another example of how the logic of a computer —
in this case, the ability of a computer to produce endless variations of elements
and to act as a filter, transforming its input to yield a new output — becomes the
logic of culture at large.
Database Cinema: Greenaway and Vertov
Although database form may be inherent to new media, countless attempts to
create "interactive narratives" testify to our dissatisfaction with the computer in
the sole role of an encyclopedia or a catalog of effects. We want new media
narratives, and we want these narratives to be different from the narratives we saw
or read before. In fact, regardless of how often we repeat in public that the
modernist notion of medium specificity ("every medium should develop its own
unique language") is obsolete, we do expect computer narratives to showcase new
aesthetic possibilities which did not exist before digital computers. In short, we
want them to be new media specific. Given the dominance of database in
computer software and the key role it plays in the computer-based design process,
perhaps we can arrive at new kinds of narrative by focusing our attention on how
narrative and database can work together. How can a narrative take into account
the fact that its elements are organized in a database? How can our new abilities
to store vast amounts of data, to automatically classify, index, link, search and
instantly retrieve it lead to new kinds of narratives?
Peter Greenaway, one of the very few prominent film directors concerned
with expanding cinema's language, complained that "the linear pursuit — one
story at a time told chronologically — is the standard format of cinema." Pointing
out that cinema lags behind modern literature in experimenting with narrative, he
asked: "Could it not travel on the road where Joyce, Eliot, Borges and Perec have
already arrived?"254 While Greenaway is right to direct filmmakers to more
innovative literary narratives, new media artists working on the
database-narrative problem can learn from cinema "as it is." For cinema already exists right
in the intersection between database and narrative. We can think of all the
material accumulated during shooting forming a database, especially since the
shooting schedule usually does not follow the narrative of the film but is
determined by production logistics. During editing the editor constructs a film
narrative out of this database, creating a unique trajectory through the conceptual
space of all possible films which could have been constructed. From this
perspective, every filmmaker engages with the database-narrative problem in
every film, although only a few have done this self-consciously.
One exception is Greenaway himself. Throughout his career, he has been
working on a problem of how to reconcile database and narrative forms. Many of
his films progress forward by recounting a list of items, a catalog which does not
have any inherent order (for example, different books in Prospero's Books).
Working to undermine a linear narrative, Greenaway uses different systems to
order his films. He wrote about this approach: "If a numerical, alphabetic color-
coding system is employed, it is done deliberately as a device, a construct, to
counteract, dilute, augment or compliment the all-pervading obsessive cinema
interest in plot, in narrative, in the 'I'm now going to tell you a story' school of
film-making."255 His favorite system is numbers. The sequence of numbers acts
as a narrative shell which "convinces" the viewer that she is watching a narrative.
In reality the scenes which follow one another are not connected in any logical
way. By using numbers, Greenaway "wraps" a minimal narrative around a
database. Although Greenaway's database logic was present already in his "avant-
garde" films such as The Falls (1980), it has also structured his "commercial"
films from the beginning. The Draughtsman's Contract (1982) is centered around
twelve drawings being made by the draftsman. They do not form any order;
Greenaway emphasizes this by having the draftsman work on a few drawings at
once. Eventually, Greenaway's desire to take "cinema out of cinema" led to his
work on a series of installations and museum exhibitions in the 1990s. No longer
having to conform to the linear medium of film, the elements of a database are
spatialized within a museum or even the whole city. This move can be read as the
desire to create a database in its purest form: the set of elements not ordered in
any way. If the elements exist in one dimension (time of a film, list on a page),
they will be inevitably ordered. So the only way to create a pure database is to
spatialise it, distributing the elements in space. This is exactly the path which
Greenaway took. Situated in three-dimensional space which does not have an
inherent narrative logic, a 1992 installation "100 Objects to Represent the World"
in its very title proposes that the world should be understood through a catalog
rather than a narrative. At the same time, Greenaway does not abandon narrative;
he continues to investigate how database and narrative can work together. Having
presented "100 Objects" as an installation, Greenaway next turned it into an opera
set. In the opera, the narrator Thrope uses the objects to conduct Adam and Eve
through the whole of human civilization, thus turning the 100 objects into a
sequential narrative.256 In another installation "The Stairs-Munich-Projection"
(1995) Greenaway put up a hundred screens — each for one year in the history of
cinema — throughout Munich. Again, Greenaway presents us with a spatialized
database — but also with a narrative. By walking from one screen to another, one
follows cinema’s history. The project uses Greenaway's favorite principle of
organization by numbers, pushing it to the extreme: the projections on the screens
contain no figuration, just numbers. The screens are numbered from 1895 to 1995,
one screen for each year of cinema's history. Along with numbers, Greenaway
introduces another line of development. Each projection is slightly different in
color.257 The hundred colored squares form an abstract narrative of their own
which runs in parallel to the linear narrative of cinema’s history. Finally,
Greenaway superimposes yet a third narrative by dividing the history of cinema
into five sections, each section staged in a different part of the city. The apparent
triviality of the basic narrative of the project — one hundred numbers, standing
for one hundred years of cinema’s history — "neutralizes" the narrative, forcing
the viewer to focus on the phenomenon of the projected light itself, which is the
actual subject of this project.
Along with Greenaway, Dziga Vertov can be thought of as a major
"database filmmaker" of the twentieth century. Man with a Movie Camera is
perhaps the most important example of database imagination in modern media art.
In one of the key shots, repeated a few times in the film, we see an editing room with
a number of shelves used to keep and organize the shot material. The shelves are
marked "machines," "club," "the movement of a city," "physical exercise," "an
illusionist," and so on. This is the database of the recorded material. The editor —
Vertov's wife, Elizaveta Svilova — is shown working with this database:
retrieving some reels, returning used reels, adding new ones.
Although I pointed out that film editing in general can be compared to
creating a trajectory through a database, in the case of Man with a Movie Camera
this comparison constitutes the very method of the film. Its subject is the
filmmaker's struggle to reveal (social) structure among the multitude of observed
phenomena. Its project is a brave attempt at an empirical epistemology which
only has one tool — perception. The goal is to decode the world purely through
the surfaces visible to the eye (of course, its natural sight enhanced by a movie
camera). This is how the film's co-author Mikhail Kaufman describes it:
An ordinary person finds himself in some sort of environment, gets lost
amidst the zillions of phenomena, and observes these phenomena from a
bad vantage point. He registers one phenomenon very well, registers a
second and a third, but has no idea of where they may lead... But the man
with a movie camera is infused with the particular thought that he is
actually seeing the world for other people. Do you understand? He joins
these phenomena with others, from elsewhere, which may not even have
been filmed by him. Like a kind of scholar he is able to gather empirical
observations in one place and then in another. And that is actually the way
in which the world has come to be understood.258
Therefore, in contrast to standard film editing which consists in selection and
ordering of previously shot material according to a pre-existent script, here the
process of relating shots to each other, ordering and reordering them in order to
discover the hidden order of the world constitutes the film's method. Man with a
Movie Camera traverses its database in a particular order to construct an
argument. Records drawn from a database and arranged in a particular order
become a picture of modern life — but simultaneously an argument about this
life, an interpretation of what these images, which we encounter every day, every
second, actually mean.259
Was this brave attempt successful? The overall structure of the film is
quite complex, and at first glance has little to do with a database. Just as new
media objects contain a hierarchy of levels (interface — content; operating system
— application; web page — HTML code; high-level programming language —
assembly language — machine language), Vertov's film consists of at least three
levels. One level is the story of a cameraman filming material for the film. The
second level is the shots of an audience watching the finished film in a movie
theater. The third level is this film, which consists of footage recorded in
Moscow, Kiev and Riga and is arranged according to a progression of one day:
waking up — work — leisure activities. If this third level is a text, the other two
can be thought of as its meta-texts.260 Vertov goes back and forth between the
three levels, shifting between the text and its meta-texts: between the production
of the film, its reception, and the film itself. But if we focus on the film within
the film (i.e., the level of the text) and disregard the special effects used to create
many of the shots, we discover almost a linear printout, so to speak, of a database:
a number of shots showing machines, followed by a number of shots showing
work activities, followed by different shots of leisure, and so on. The paradigm is
projected onto syntagm. The result is a banal, mechanical catalog of subjects
which one can expect to find in the city of the 1920s: running trams, city beach,
movie theaters, factories...
Of course watching Man with a Movie Camera is anything but a banal
experience. Even after the 1990s during which computer-based image and video-
makers systematically exploited every avant-garde device, the original still looks
striking. What makes it striking is not its subjects and the associations Vertov
tries to establish between them to impose "the communist decoding of the world"
but the most amazing catalog of the film techniques contained within it. Fades and
superimpositions, freeze-frames, acceleration, split screens, various types of
rhythm and intercutting, different montage techniques261 — what film scholar
Annette Michelson called "a summation of the resources and techniques of the
silent cinema"262 — and of course, a multitude of unusual, "constructivist" points
of view are strung together with such density that the film can't be simply
labeled avant-garde. If a "normal" avant-garde film still proposes a coherent
language different from the language of mainstream cinema, i.e. a small set of
techniques which are repeated, Man with a Movie Camera never arrives at
anything like a well-defined language. Rather, it proposes an untamed, and
apparently endless unwinding of cinematic techniques, or, to use contemporary
language, "effects," as cinema's new way of speaking.
Traditionally, a personal artistic language or a style common to a group of
cultural objects or a period requires the stability of paradigms and consistent
expectations as to which elements of paradigmatic sets may appear in a given
situation. For example, in the case of the classical Hollywood style, a viewer may
expect that a new scene will begin with an establishing shot or that a particular
lighting convention such as high key or low key will be used throughout the film.
(David Bordwell defines a Hollywood style in terms of paradigms which are
ranked in terms of probabilities.263)
The endless new possibilities provided by computer software hold the
promise of new cinematic languages, but at the same time they prevent such
languages from coming into being. (I am using the example of film but the same
logic applies to all other areas of computer-based visual culture.) Since every
software package comes with numerous sets of transitions, 2D filters, 3D transformations
and other effects and “plug-ins,” the artist, especially the beginner, is tempted to
use many of them in the same work. In such a case a paradigm becomes the
syntagm. That is, rather than making singular choices from the sets of possible
techniques, or, to use the term of Russian formalists, devices, and then repeating
them throughout the work (for instance, using only cuts, or only cross-dissolves),
the artist ends up using many options in the same work. Ultimately, a digital film
becomes a list of different effects, which appear one after another. Whitney’s
Catalog is the extreme expression of this logic.
The possibility of creating a stable new language is also subverted by the
constant introduction of new techniques over time. Thus the new media
paradigms not only contain many more options than in the old media but they also
keep growing over time. And in culture ruled by the logic of fashion, i.e., the
demand for constant innovation, the artists tend to adopt newly available options
while simultaneously dropping the already familiar ones. Every year, every month
new effects find their way into media works, displacing the previously
prominent ones and destabilizing any stable expectations which viewers could
have begun to form.
And this is why Vertov’s film has a particular relevance to new media. It
proves that it is possible to turn “effects” into a meaningful artistic language. Why
in the case of Whitney's computer films and music videos are the effects just
effects, while in the hands of Vertov they acquire meaning? Because in Vertov's
film they are motivated by a particular argument, this being that the new
techniques to obtain images and manipulate them, summed up by Vertov in his
term "kino-eye," can be used to decode the world. As the film progresses,
"straight" footage gives way to manipulated footage; newer techniques appear one
after another, reaching a roller coaster intensity by the film's end, a true orgy of
cinematography. It is as though Vertov re-stages his discovery of the kino-eye for
us. Along with Vertov, we gradually realize the full range of possibilities offered
by the camera. Vertov's goal is to seduce us into his way of seeing and thinking,
to make us share his excitement, his gradual process of discovery of film's new
language. This process of discovery is the film's main narrative and it is told through
a catalog of discoveries being made. Thus, in the hands of Vertov, a database, this
normally static and "objective" form, becomes dynamic and subjective. More
importantly, Vertov is able to achieve something which new media designers and
artists still have to learn — how to merge database and narrative into a
new form.
Navigable space
Doom and Myst
Looking at the first decade of new media — the 1990s — one can point at a
number of objects which exemplify new media’s potential to give rise to
genuinely original and historically unprecedented aesthetic forms. Among them,
two stand out. Both are computer games. Both were published in the same year,
1993. Each became a phenomenon whose popularity has extended beyond the
hard core gaming community, spilling into sequels, books, TV, films, fashion and
design. Together, they defined the new field and its limits. These games are Doom
(id Software, 1993) and Myst (Cyan, 1993).
In a number of ways, Doom and Myst are completely different. Doom is
fast paced; Myst is slow. In Doom the player runs through the corridors trying to
complete each level as soon as possible, and then moves to the next one. In Myst,
the player is moving through the world literally one step at a time, unraveling the
narrative along the way. Doom is populated with numerous demons lurking
around every corner, waiting to attack; Myst is completely empty. The world of
Doom follows the convention of computer games: it consists of a few dozen
levels. Although Myst also contains four separate worlds, each is more like a self-
contained universe than a traditional computer game level. While the usual levels
are quite similar to each other in structure and the look, the worlds of Myst are
distinctly different.
Another difference lies in the aesthetics of navigation. In Doom’s world,
defined by rectangular volumes, the player is moving in straight lines, abruptly
turning at right angles to enter a new corridor. In Myst, the navigation is more
free-form. The player, or more precisely, the visitor, is slowly exploring the
environment: she may look around for a while, go in circles, return to the same
place over and over, as though performing an elaborate dance.
Finally, the two objects exemplify two different types of cultural
economy. With Doom, id Software pioneered the new economy which the critic of
computer games J.C. Herz summarizes as follows: "It was an idea whose time has
come. Release a free, stripped-down version through shareware channels, the
Internet, and online services. Follow with a spruced-up, registered retail version
of the software." 15 million copies of the original Doom game were downloaded
around the world.264 By releasing detailed descriptions of game file formats and
a game editor, id Software also encouraged the players to expand the game,
creating new levels. Thus, hacking and adding to the game became its essential
part, with new levels widely available on the Internet for anybody to download.
Here was a new cultural economy which transcended the usual relationship
between producers and consumers or between “strategies” and “tactics” (de
Certeau): the producers define the basic structure of an object, and release a few
examples and the tools to allow the consumers to build their own versions, shared
with other consumers. In contrast, the creators of Myst followed an older model
of cultural economy. Thus, Myst is more similar to a traditional artwork than to a
piece of software: something to behold and admire, rather than to take apart and
modify. To use the terms of the software industry, it is a closed, or proprietary
system, something which only the original creators can modify or add to.
Despite all these differences in cosmogony, gameplay, and the underlying
economic model, the two games are similar in one key respect. Both are spatial
journeys. The navigation through 3D space is an essential, if not the key
component, of the gameplay. Doom and Myst present the user with a space to be
traversed, to be mapped out by moving through it. Both begin by dropping the
player somewhere in this space. Before reaching the end of the game narrative,
the player must visit most of it, uncovering its geometry and topology, learning its
logic and its secrets. In Doom and Myst — and in a great many other computer
games — narrative and time itself are equated with the movement through 3D
space, the progression through rooms, levels, or worlds. In contrast to modern
literature, theater, and cinema which are built around the psychological tensions
between the characters and the movement in psychological space, these computer
games return us to the ancient forms of narrative where the plot is driven by the
spatial movement of the main hero, traveling through distant lands to save the
princess, to find the treasure, to defeat the Dragon, and so on. As J.C. Herz writes
about the experience of playing a classical text-based adventure game Zork, "you
gradually unlocked a world in which the story took place, and the receding edge
of this world carried you through to the story's conclusion."265 Stripping away the
representation of inner life, psychology and other modernist nineteenth century
inventions, these are the narratives in the original Ancient Greek sense, for, as
Michel de Certeau reminds us, "In Greek, narration is called 'diegesis': it
establishes an itinerary (it 'guides') and it passes through (it 'transgresses')."266
In the introduction to this chapter I invoked the opposition between
narration and description from narratology. As stated by Mieke Bal, the standard
theoretical premise of narratology was that “descriptions interrupt the line of
fabula.”267 For me this opposition, in which description was defined negatively,
as absence of narration, was always problematic. It automatically privileged
certain types of narrative (myths, fairy tales, detective stories, classical
Hollywood cinema), while making it difficult to think about other forms where
actions of characters do not dominate the narrative (for instance, films by Andrey
Tarkovskiy and Hirokazu Kore-eda, the director of Maborosi and After Life).268
Games structured around first-person navigation through space further challenge
the narration-description opposition.
Instead of narration and description, we may be better off thinking about
games in terms of narrative actions and exploration. Rather than being narrated to,
the player herself has to perform actions to move the narrative forward: talking to
other characters she encounters in the game world, picking up objects, fighting the
enemies, and so on. If the player does not do anything, the narrative stops. From
this perspective, movement through the game world is one of the main narrative
actions. But this movement also serves a self-sufficient goal of exploration.
Exploring the game world, examining its details and enjoying its images is as
important for the success of games such as Myst and its followers, as progressing
through the narrative. Thus while from one point of view game narratives can be
aligned with ancient narratives which also were structured around movement
through space, from another perspective they are the exact opposite. The
movement through space allows the player to progress through the narrative; but
it is also valuable in itself. It is a way for the player to explore the environment.
Narratology’s analysis of description can be a useful start in thinking
about exploration of space in computer games and other new media objects. Bal
states that descriptive passages in fiction are motivated by speaking, looking and
acting. Motivation by looking works as follows: “A character sees an object. The
description is the reproduction of what it sees.” Motivation by acting means that
“The actor carries out an action with an object. The description is then made fully
narrative. The example of this is the scene in Zola’s La Bête humaine in which Jacques
polishes [strokes] every individual component of his beloved locomotive.”269
In contrast to the modern novel, action-oriented games do not have that much
dialog, but looking and acting are indeed the key activities performed by a player.
And if in modern fiction looking and acting are usually separate activities, in
games they more often than not occur together. As the player comes across a
door leading to another level, a new passage, ammunition for his machine gun, an
enemy, or a “health potion” he immediately acts on these objects: opens a door,
picks up ammunition or “health potion,” fires at the enemy. Thus narrative action
and exploration are closely linked together.
The central role of navigation through space, both as a tool of narration
and of exploration, is acknowledged by the games’ designers themselves. Robyn
Miller, one of the two co-designers of Myst, pointed out that "We're creating
environments to just wander around inside of. People have been calling it a game
for lack of anything better, and we've called it a game at times. But that's not what
it really is; it's a world."270 Richard Garriott, the designer of the classic RPG
series Ultima, contrasts game design and fiction writing: "A lot of them [fiction
writers] develop their individual characters in detail, and they say what is their
problem in the beginning, and what they are going to grow to learn in the end.
That's not the method I've used... I have the world. I have the message. And then
the characters are there to support the world and the message."271
Structuring the game as a navigation through space is common to games
across all the game genres. This includes adventure games (for instance, Zork, 7th
Level, The Journeyman Project, Tomb Raider, Myst), strategy games (Command
and Conquer), role-playing games (Diablo, Final Fantasy), flying, driving, and
other simulators (Microsoft Flight Simulator), action games (Hexen, Mario), and,
of course, first person shooters which have followed in Doom’s steps (Quake,
Unreal). These genres follow different conventions. In adventure games, the user
is exploring a universe, gathering resources. In strategy games, the user is
engaged in allocating and moving resources and in risk management. In RPGs
(role playing games), the user is building a character, acquiring the skills; the
narrative is one of self-improvement. The genre conventions by themselves do not
make it necessary for these games to employ a navigable space interface.
Therefore, the fact that they all consistently do use it suggests to me that
navigable space represents a larger cultural form. In other words, it is something
which transcends computer games, and in fact, as we will see later, computer
culture as well. Just like a database, navigable space is a form which already
exists before computers; however, the computer becomes its perfect medium.
Indeed, the use of navigable space is common to all areas of new media.
During the 1980s, numerous 3D computer animations were organized around a
single, uninterrupted camera move through a complex and extensive set. In a
typical animation, a camera would fly over mountain terrain, or move through a
series of rooms, or maneuver past geometric shapes. In contrast to both ancient
myths and computer games, this journey had no goal, no purpose. In short, there
was no narrative. Here was the ultimate "road movie" where the navigation
through the space was sufficient in itself.
In the 1990s, these 3D fly-throughs have come to constitute the new genre
of post-computer cinema and location-based entertainment — the motion
simulator.272 By using the first person point of view and by synchronizing the
movement of the platform housing the audience with the movement of a virtual
camera, motion simulators recreate the experience of traveling in a vehicle.
Thinking about the historical precedents of a motion simulator, we begin to
uncover some places where the form of navigable space already manifested itself.
They include Hale's Tours and Scenes of the World, a popular film-based
attraction which debuted at the St. Louis Fair in 1904; roller-coaster rides; flight,
vehicle and military simulators, which used a moving base since the early 1930s;
and the fly-through sequences in 2001: A Space Odyssey (Kubrick, 1968) and
Star Wars (Lucas, 1977). Among these, A Space Odyssey plays a particularly
important role; Douglas Trumbull, who since the late 1980s produced some of the
most well-known motion simulator attractions and was the key person behind the
rise of the whole motion simulator phenomenon, began his career by creating ride
sequences for this film.
Along with providing a key foundation for new media aesthetics,
navigable space also became a new tool of labor. It is now a common way to
visualize and work with any data. From scientific visualization to walk-throughs
of architectural designs, from models of stock market performance to statistical
datasets, the 3D virtual space combined with a camera model is the accepted way
to visualize all information (see the section "The Language of Cultural
Interfaces"). It is as accepted in computer culture as charts and graphs were in
print culture.273
Since navigable space can be used to represent both physical spaces and
abstract information spaces, it is only logical that it also emerged as an important
paradigm in human-computer interfaces. Indeed, on one level HCI can be seen as
a particular case of data visualization, the data being computer files rather than
molecules, architectural models or stock market figures. Examples of 3D
navigable space interfaces include the Information Visualizer (Xerox PARC), which
replaces a flat desktop with 3D rooms and planes rendered in perspective;274
T_Vision (ART+COM) which uses a navigable 3D representation of the earth as
its interface;275 and The Information Landscape (Silicon Graphics) in which the
user flies over a plane populated by data objects.276
The original (i.e., the 1980s) vision of cyberspace called for a 3D space of
information to be traversed by a human user, or, to use the term of William
Gibson, a "data cowboy."277 Even before Gibson's fictional descriptions of
cyberspace were published, cyberspace was visualized in the film Tron (Disney,
1982). Although Tron takes place inside a single computer rather than a network,
its vision of users zapping through the immaterial space defined by lines of light
is remarkably similar to the one articulated by Gibson in his novels. In an article
which appeared in the 1991 anthology Cyberspace: First Steps, Marcos Novak
still defined cyberspace as "a completely spatialized visualization of all
information in global information processing systems."278 In the first part of the
1990s, this vision survived among the original designers of VRML (The
Virtual Reality Modeling Language). In designing the language, they aimed to
"create a unified conceptualization of space spanning the entire Internet, a spatial
equivalent of WWW."279 They saw VRML as a natural stage in the evolution of
the Net from an abstract data network toward a "'perceptualized' Internet where
the data has been sensualized," i.e., represented in three dimensions.280
The term cyberspace itself is derived from another term: cybernetics. In
his 1948 book Cybernetics, mathematician Norbert Wiener defined it as "the
science of control and communications in the animal and machine." Wiener
conceived of cybernetics during World War II when he was working on problems
concerning gunfire control and automatic missile guidance. He derived the term
cybernetics from the ancient Greek word kybernetikos, which refers to the art of
the steersman and can be translated as “good at steering.” Thus, the idea of
navigable space lies at the very origins of the computer era. The steersman navigating
the ship and the missile traversing space on its way to the target have given rise to
a number of new figures: the heroes of William Gibson, the “data
cowboys” moving through the vast terrains of cyberspace; the "driver" of a
motion simulator; a computer user, navigating through scientific data sets and
computer data structures, molecules and genes, earth's atmosphere and the human
body; and last but not least, the player of Doom, Myst and their endless
imitations.
From one point of view, navigable space can be legitimately seen as a
particular kind of interface to a database, and thus something which does not
deserve a special focus. I would like, however, to also think of it as a cultural
form of its own, not only because of its prominence across the new media
landscape and, as we will see later, its persistence in new media history, but also
because, more so than a database, it is a new form which may be unique to new
media. Of course both the organization of space and its use to represent or
visualize something else have always been a fundamental part of human culture.
Architecture and ancient mnemonics, city planning and diagramming, geometry
and topology are just some of the disciplines and techniques which were developed
to harness space's symbolic and economic capital.281 Spatial constructions in new
media draw on all these existing traditions — but they are also fundamentally
different in one key respect. For the first time, space becomes a media type. Just
like other media types (audio, video, stills, and text), it can now be instantly
transmitted, stored and retrieved, compressed, reformatted, streamed, filtered,
computed, programmed and interacted with. In other words, all operations which
are possible with media as a result of its conversion to computer data can also
now apply to representations of 3D space.
Recent cultural theory has paid increasing attention to the category of
space. Examples include Henri Lefebvre's work on the politics and anthropology
of everyday space; Michel Foucault's analysis of the Panopticon's topology as a
model of modern subjectivity; the writings of Fredric Jameson and David
Harvey on the post-modern space of global capitalism; and Edward Soja’s work on
political geography.282 At the same time, new media theoreticians and
practitioners have come up with many formulations of how cyberspace should be
structured and how computer-based spatial representations can be used in new
ways.283 What has received little attention, however, both in cultural theory and in
new media theory, is the particular category of navigation through space. And yet,
this category characterizes new media as it actually exists; in other words, new
media spaces are always spaces of navigation. At the same time, as we will see
later in this section, this category also fits a number of developments in other
cultural fields such as anthropology and architecture.
To summarize, along with a database, navigable space is another key form
of new media. It is already an accepted way of interacting with any type of data;
an interface of computer games and motion simulators and, potentially, of any
computer in general. Why does computer culture spatialize all representations and
experiences (the library is replaced by cyberspace; narrative is equated with
traveling through space; all kinds of data are rendered in three dimensions through
computer visualization)? Should we try to oppose this spatialization (i.e., what
about time in new media)? And, finally, what are the aesthetics of navigation
through virtual space?
Computer Space
The very first coin-op arcade game was called Computer Space. The game
simulated the dogfight between a spaceship and a flying saucer. Released in 1971,
it was a remake of the first computer game, Spacewar, programmed on a PDP-1 at
MIT in 1962.284 Both of these legendary games included the word space in their
titles; and appropriately, space was one of the main characters in each of them. In
the original Spacewar the players navigated two spaceships around the screen
while shooting torpedoes at one another. The players also had to be careful in
maneuvering the ships to make sure they would not get too close to the star in the
center of the screen, which pulled them towards it. Thus, along with the
spaceships, the player also had to interact with space itself. And although, in
contrast to such films as 2001, Star Wars, or Tron, the space of Spacewar and
Computer Space was not navigable — one could not move through it — the
simulation of gravity made it truly an active presence. Just as the player had to
engage with the spaceships, he had to engage with the space itself.
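The way simulated gravity turns space into an active presence can be made concrete with a minimal sketch. It is not a reconstruction of the original PDP-1 program: the inverse-square pull, the star placed at the origin, and the names Ship, step and strength are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Ship:
    x: float
    y: float
    vx: float = 0.0
    vy: float = 0.0


def step(ship, dt=1.0, star=(0.0, 0.0), strength=500.0):
    """Advance the ship by one time step under the pull of the central star.

    The inverse-square law used here is an assumption, not the historical code.
    """
    dx = star[0] - ship.x
    dy = star[1] - ship.y
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return ship  # the ship sits on the star; no direction to pull in
    accel = strength / (dist * dist)
    # Accelerate toward the star, then move with the updated velocity.
    ship.vx += accel * dx / dist * dt
    ship.vy += accel * dy / dist * dt
    ship.x += ship.vx * dt
    ship.y += ship.vy * dt
    return ship


drifting = step(Ship(x=100.0, y=0.0, vy=1.0))
print(drifting)  # the ship's velocity has acquired a component toward the star
```

A ship left to drift is pulled off its straight path on every step, so the player has to steer against the space itself and not only against the opponent.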
This active treatment of space is an exception rather than the rule in new
media. Although new media objects favor the use of space for representations of
all kinds, most often virtual spaces are not true spaces but collections of separate
objects. Or, to put this in a slogan: there is no space in cyberspace.
To explore this thesis further we can borrow the categories developed by
art historians early in the twentieth century. Alois Riegl, Heinrich Wölfflin, and Erwin
Panofsky, the founders of modern art history, defined their field as the history of
the representation of space. Working within the paradigm of cyclic cultural
development, they related the representation of space in art to the spirit of entire
epochs, civilizations, and races. In his 1901 Die Spätrömische Kunstindustrie
(“The late-Roman art industry”), Riegl characterized mankind’s cultural
development as the oscillation between two ways of understanding space, which
he called haptic and optic. Haptic perception isolates the object in the field as a
discrete entity, while optic perception unifies objects in a spatial continuum.
Riegl’s contemporary, Heinrich Wölfflin, similarly proposed that the
temperament of a period or a nation expresses itself in a particular mode of seeing
and representing space. Wölfflin’s Principles of Art History (1915) plotted the
differences between Renaissance and baroque styles along five axes:
linear/painterly; plane/recession; closed form/open form; multiplicity/unity; and
clearness/unclearness.285 Erwin Panofsky, another founder of modern art history,
contrasted the “aggregate” space of the Greeks with the “systematic” space of the
Italian Renaissance in his famous essay Perspective as Symbolic Form
(1924-25).286 Panofsky established a parallel between the history of spatial
representation and the evolution of abstract thought. The former moves from the
space of individual objects in antiquity, to the representation of space as
continuous and systematic in modernity. Correspondingly, the evolution of
abstract thought progresses from ancient philosophy’s view of the physical
universe as discontinuous and “aggregate”, to the post-Renaissance understanding
of space as infinite, homogeneous, isotropic, and with ontological primacy in
relation to objects — in short, as systematic.
We don’t have to believe in grand evolutionary schemes in order to
usefully retain such categories. What kind of space is virtual space? At first
glance the technology of 3D computer graphics exemplifies Panofsky’s concept
of systematic space, which exists prior to the objects in it. Indeed, the Cartesian
coordinate system is built into computer graphics software and often into the
hardware itself.287 A designer launching a modeling program is typically
presented with an empty space defined by a perspectival grid; the space will be
gradually filled by the objects created. If the built-in message of a music
synthesizer is a sine wave, the built-in world of computer graphics is an empty
Renaissance space: the coordinate system itself.
Yet computer-generated worlds are actually much more haptic and
aggregate than optic and systematic. The most commonly used computer-graphics
technique of creating 3D worlds is polygonal modeling. The virtual world created
with this technique is a vacuum containing separate objects defined by rigid
boundaries. What is missing from computer space is space in the sense of
medium: the environment in which objects are embedded and the effect of these
objects on each other. This is what Russian writers and artists call
prostranstvennaya sreda. Pavel Florensky, a legendary Russian philosopher and
art historian, described it in the early 1920s in the following way: “The space-medium
is objects mapped onto space... We have seen the inseparability of
Things and space, and the impossibility of representing Things and space by
themselves.”288 This understanding of space also characterizes a particular
tradition of modern painting which stretches from Seurat to Giacometti and De
Kooning. These painters tried to eliminate the notions of a distinct object and an
empty space as such. Instead they depicted a dense field that occasionally hardens
into something which we can read as an object. Following the example of Gilles
Deleuze’s analysis of cinema as an activity of articulating new concepts, akin to
philosophy,289 it can be said that the modern painters who belong to this tradition
worked to articulate a particular philosophical concept in their painting: that
of space-medium. This concept is something mainstream computer graphics has
yet to discover.
Another basic technique used in creating virtual worlds also leads to
aggregate space. It involves superimposing animated characters, still images,
digital movies, and other elements over a separate background. Traditionally this
technique was used in video and computer games. Responding to the limitations
of the available computers, the designers of early games would limit animation to
a small part of a screen. 2D animated objects and characters called sprites were
drawn over a static background. For example, in Space Invaders the abstract
shapes representing the invaders would fly over a blank background, while in
Pac-Man the tiny character moved across the picture of a maze. The sprites were
essentially animated 2D cutouts thrown over the background image at game time,
so no real interaction between them and the background took place. In the second
half of the 1990s much faster processors and 3D graphics cards made it possible
for games to switch to real-time 3D rendering. This allowed for modeling of
visual interactions between the objects and the space they are in, such as
reflections and shadows. Consequently, the game space became more of a
coherent, true 3D space, rather than a set of 2D planes unrelated to each other.
However, the limitations of earlier decades returned in another area of new media:
online virtual worlds. Because of the limited bandwidth of the 1990s Internet,
virtual world designers had to deal with constraints similar to, and sometimes
even more severe than, those faced by game designers two decades earlier. In online virtual
worlds, a typical scenario may involve an avatar — a 2D or 3D graphic
representing the user — animated in real time in response to the user’s
commands. The avatar is superimposed on a picture of a room, in the same way as
in video games the sprites were superimposed over the background. The avatar is
controlled by the user; the picture of the room is provided by a virtual-world
operator. Because the elements come from different sources and are put together
in real time, the result is a series of 2D planes rather than a real 3D environment.
Although the image depicts characters in a 3D space, it is an illusion since the
background and the characters do not “know” about each other, and no interaction
between them is possible.
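The superimposition described above can be sketched in a few lines. This is an illustration only, assuming a frame stored as a grid of characters and a hypothetical composite function; it does not reproduce the code of any actual game or virtual world.

```python
def composite(background, sprite, top, left, transparent=None):
    """Return a new frame with the sprite pasted over a copy of the background.

    Sprite cells equal to `transparent` are skipped, like the empty
    areas of a cutout. The background itself is left unmodified.
    """
    frame = [row[:] for row in background]
    for r, sprite_row in enumerate(sprite):
        for c, pixel in enumerate(sprite_row):
            if pixel is transparent:
                continue
            y, x = top + r, left + c
            # Clip anything that falls outside the frame.
            if 0 <= y < len(frame) and 0 <= x < len(frame[y]):
                frame[y][x] = pixel
    return frame


room = [list("....") for _ in range(3)]   # picture supplied by the operator
avatar = [["A", None],
          [None, "A"]]                     # graphic controlled by the user

for row in composite(room, avatar, top=1, left=1):
    print("".join(row))
```

Nothing in composite reads the values of the background pixels: the sprite is pasted on top of whatever happens to be there, which is why the two layers cannot “know” about each other.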
Historically, we can connect the technique of superimposing animated
sprites over the background to traditional cel animation. In order to save labor,
animators similarly divide the image between a static background and animated
characters. In fact the sprites of computer games can be thought of as reincarnated
animation characters. Yet the use of this technique did not prevent Fleischer and
Disney animators from thinking of space as space-medium (to use Florensky's
term), although they created this space-medium in a different way than the
modern painters. (Thus while the masses ran away from the serious and
“difficult” abstract art to enjoy the funny and figurative images of cartoons, what
they saw was not that different from Giacometti’s and De Kooning’s canvases.)
Although all objects in cartoons have hard edges, the total anthropomorphism of
the cartoon universe breaks the distinctions both between subjects and objects and
between objects and space. Everything is subjected to the same laws of stretch and squash,
everything moves and twists in the same way, everything is alive to the same
extent. It is as though everything — the character’s body, chairs, walls, plates,
food, cars and so on — is made from the same bio-material. This monism of the
cartoon worlds stands in opposition to the binary ontology of computer worlds in
which the space and the sprites/characters appear to be made from two
fundamentally different substances.
In summary, although 3D computer-generated virtual worlds are usually
rendered in linear perspective, they are really collections of separate objects,
unrelated to each other. In view of this, the common argument that 3D computer
simulations return us to Renaissance perspective and therefore, from the
viewpoint of twentieth-century abstraction, should be considered regressive, turns
out to be ungrounded. If we are to apply the evolutionary paradigm of Panofsky to
the history of virtual computer space, we must conclude that it has not reached its
Renaissance stage yet. It is still at the level of ancient Greece, which could not
conceive of space as a totality.
Computer space is also aggregate in yet another sense. As I already noted
using the example of Doom, traditionally the world of a computer game is not a
continuous space but a set of discrete levels. In addition, each level is also discrete:
it is a sum of rooms, corridors, and arenas built by the designers. Thus, rather
than conceiving space as a totality, one is dealing with a set of separate places. The
convention of levels is remarkably stable, persisting across genres and numerous
computer platforms.
If the World Wide Web and the original VRML are any indication, we are
not moving any closer toward systematic space; instead, we are embracing
aggregate space as a new norm, both metaphorically and literally. The space of
the Web in principle can’t be thought of as a coherent totality: it is a collection of
numerous files, hyperlinked but without any overall perspective to unite them.
The same holds for actual 3D spaces on the Internet. A 3D scene as defined by a
VRML file is a list of separate objects that may exist anywhere on the Internet,
each created by a different person or a different program. A user can easily add or
delete objects without taking into account the overall structure of the scene.290
Just as, in the case of a database, the narrative is replaced by a list of items, here a
coherent 3D scene becomes a list of separate objects.
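The idea of a scene as a list can be illustrated with a small sketch. It models the idea in Python and does not use VRML syntax; the class SceneObject, the helper functions and the example.org addresses are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    name: str
    source_url: str      # each object may live anywhere on the network
    position: tuple      # (x, y, z) in the scene's coordinate system


def add_object(scene, obj):
    """Append an object; nothing about the rest of the scene is consulted."""
    return scene + [obj]


def remove_object(scene, name):
    """Drop an object by name; the remaining objects are unaffected."""
    return [o for o in scene if o.name != name]


scene = [
    SceneObject("house", "http://example.org/a/house.wrl", (0, 0, 0)),
    SceneObject("tree", "http://example.org/b/tree.wrl", (5, 0, 2)),
]
scene = add_object(scene, SceneObject("car", "http://example.org/c/car.wrl", (1, 0, 9)))
scene = remove_object(scene, "tree")
print([o.name for o in scene])
```

Neither helper needs to know what else the scene contains or how the objects relate to one another, which is the sense in which such a scene is an aggregate and not a systematic space.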
With its metaphors of navigation and homesteading, the Web has been
compared to the American Wild West. The spatialized Web envisioned by VRML
(itself a product of California) reflects the treatment of space in American culture
generally, in its lack of attention to any zone not functionally used. The marginal
areas that exist between privately owned houses, businesses and parks are left to
decay. The VRML universe, as defined by software standards and the default
settings of software tools, pushes this tendency to the limit: it does not contain
space as such but only objects that belong to different individuals. Obviously, the
users can modify the default settings and use the tools to create the opposite of
what the default values suggest. In fact, the actual multi-user spaces built on the
Web can be seen precisely as the reaction against the anti-communal and discrete
nature of American society, the attempt to substitute for the much discussed
disappearance of traditional community by creating virtual ones. (Of course, if we
are to follow the nineteenth-century sociologist Ferdinand Tönnies, the shift from
traditional close-knit small-scale community to modern impersonal society already took
place in the nineteenth century and is an inevitable side-effect as well as a
prerequisite for modernization.291) However, it is important that the ontology of
virtual space as defined by software itself is fundamentally aggregate, a set of
objects without a unifying point of view.
If art historians, literary and film scholars have traditionally analyzed the
structure of cultural objects as reflecting larger cultural patterns (for instance,
Panofsky's reading of perspective), in the case of new media we should look not
only at the finished objects but first of all at the software tools, their organization
and default settings.292 This is particularly important because in new media the
relation between the production tools and the products is one of continuity; in
fact, it is often hard to establish the boundary between them. Thus, we may
connect the American ideology of democracy with its paranoid fear of hierarchy
and centralized control with the flat structure of the Web, where every page exists
on the same level of importance as any other and where any two sources
connected through hyperlinking have equal weight. Similarly, in the case of
virtual 3D spaces on the Web, the lack of a unifying perspective in U.S. culture,
whether in the space of an American city, or in the space of an increasingly
fragmented public discourse, can be correlated with the design of VRML, which
substitutes a collection of objects for a unified space.
The Poetics of Navigation
In order to analyze the computer representations of 3D space, I have used theories
from early art history; but it would not be hard to find other theories which can
work as well. However, navigation through space is a different matter. While art
history, geography, anthropology, sociology and other disciplines have come up
with many approaches to analyze space as a static, objectively existing structure,
we don’t have the same wealth of concepts to help us think about the poetics of
navigation through space. And yet, if I am right to claim that the key feature of
computer space is that it is navigable, we need to be able to address this feature
theoretically.
As a way to begin, we may take a look at some of the classical navigable
computer spaces. The 1978 project Aspen Movie Map, designed at the MIT
Architecture Machine Group, headed by Nicholas Negroponte (which later
expanded into the MIT Media Laboratory), is acknowledged as the first publicly
shown interactive virtual navigable space, and also as the first hypermedia
program. The program allowed the user to "drive" through the city of Aspen,
Colorado. At each intersection the user was able to select a new direction using a
joystick. To construct this program, the MIT team drove through Aspen in a car
taking pictures every three meters. The pictures were then stored on a set of
videodiscs. Responding to the information from the joystick, the appropriate
picture or sequence of pictures was displayed on the screen. Inspired by a mockup
of an airport used by the Israeli commandos to train for the Entebbe
hostage-freeing raid of 1976, Aspen Movie Map was a simulator and therefore its
navigation modeled the real-life experience of moving in a car, with all its
limitations.293 Yet its realism also opened a new set of aesthetic possibilities
which, unfortunately, later designers of navigable spaces did not explore further.
All of them relied on interactive 3D computer graphics to construct their spaces.
In contrast, Aspen Movie Map utilized a set of photographic images; in addition,
because the images were taken every three meters, this resulted in an interesting
sampling of three dimensional space. Although in the 1990s Apple’s QuickTime
VR technology made this technique itself quite accessible, the idea of
constructing a large-scale virtual space from photographs or a video of a real
space was never tried out systematically again, although it opens up unique
aesthetic possibilities not available with 3D computer graphics.
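The sampling logic described above can be sketched as follows. The three-meter spacing comes from the description of the project; the function names, the street labels and the table of intersections are hypothetical, and the sketch makes no claim about how the MIT program was actually implemented.

```python
SAMPLE_SPACING_M = 3.0  # one photograph every three meters, as described above

# Hypothetical street graph: which street each joystick choice leads to.
INTERSECTIONS = {
    "street_a": {"left": "street_b", "right": "street_c"},
}


def frame_index(distance_m, frame_count):
    """Pick the stored photograph nearest behind the traveller's position."""
    index = int(distance_m // SAMPLE_SPACING_M)
    return max(0, min(index, frame_count - 1))


def next_street(current, joystick):
    """At an intersection, follow the joystick; otherwise stay on the street."""
    return INTERSECTIONS.get(current, {}).get(joystick, current)


print(frame_index(7.5, frame_count=10))
print(next_street("street_a", "left"))
```

Continuous movement through the city is thus reduced to a lookup into a discrete set of images: positions between two samples simply do not exist in the stored space.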
Jeffrey Shaw's Legible City (1988-1991), another well-known and
influential computer navigable space, is also based on an existing city.294 As in
Aspen Movie Map, the navigation also simulates a real physical situation, in this
case riding a bicycle. Its virtual space, however, is not tied to the simulation of
physical reality: it is an imaginary city made from 3D letters. In contrast to most
navigable spaces whose parameters are chosen arbitrarily, in Legible City
(Amsterdam and Karlsruhe versions) every value of its virtual space is derived
from the actual existing physical space it replaces. Each 3D letter in the virtual
city corresponds to an actual building in a physical city; the letter’s proportions,
color and location are derived from the building it replaces. By navigating
through the space, the user reads the texts composed by the letters; these texts are
drawn from archival documents describing the city's history. Through this
mapping Jeffrey Shaw foregrounds, or, more precisely, “stages,” one of the
fundamental problematics of new media and the computer age as a whole: the
relation between the virtual and the real. In his other works Shaw systematically
“staged” other key aspects of new media such as the interactive relation between
the viewer and the image, or the discrete quality of all computer-based
representations. Legible City functions not only as a unique
navigable virtual space of its own, but also as a comment on all the other
navigable spaces. It suggests that instead of creating virtual spaces which have
nothing to do with actual physical spaces, or spaces which are closely
modeled after existing physical structures such as towns or shopping malls (this
holds for most commercial virtual worlds and VR works), we may take a middle
road. In Legible City, the memory of the real city is carefully preserved without
succumbing to illusionism; the virtual representation encodes the city’s genetic
code, its deep structure rather than its surface. Through this mapping Shaw
proposes an ethics of the virtual. Shaw suggests that the virtual can at least
preserve the memory of the real it replaces, encoding its structure, if not aura, in a
new form.
While Legible City was a landmark work in that it presented a symbolic
rather than illusionistic space, its visual appearance in many ways reflected the
default real-time graphics capability of SGI workstations on which it was running:
flat-shaded shapes attenuated by a fog. Char Davies and her development team at
SoftImage have consciously addressed the goal of creating a different, more
painterly aesthetic for the navigable space in their interactive VR installation
Osmose (1994-1995).295 From the point of view of history of modern art the
result hardly represented an advancement. Osmose simply replaced the usual
hard-edge polygonal Cézanne-like look of 3D computer graphics with a
softer, more atmospheric, Renoir or late Monet-like environment made of
translucent textures and flowing particles. Yet in the context of other 3D virtual
worlds it was an important advance. The “soft” aesthetic of Osmose is further
supported through the use of slow cinematic dissolves between its dozen or so
worlds. As in Aspen Movie Map and in Legible City, the navigation in Osmose
is modeled on a real-life experience, in this case, of scuba diving. The
“immersant” controls navigation by breathing: breathing in sends the body
upward, while breathing out makes it fall. The resulting experience, according to
the designers, is one of floating, rather than the flying or driving typical of virtual
worlds. Another important aspect of Osmose's navigation is its collective
character. While only one person can be "immersed" at a time, the audience can
witness her or his journey through the virtual worlds as it unfolds on a large
projection screen. At the same time, another translucent screen enables the
audience to observe the body gestures of the “immersant” as a shadow-silhouette.
The "immersant" thus becomes a kind of ship captain, taking the audience along
on a journey; like the captain, she occupies a visible and symbolically marked
position, being responsible for the audience's aesthetic experience.
Tamás Waliczky’s The Forest (1993) liberated the virtual camera from its
typical enslavement to the simulation of humanly possible navigation, be it
walking, driving a car, pedaling a bicycle or scuba diving. In The Forest the
camera slides through the endless black and white forest in a series of complex
and melancholic moves. If modern visual culture exemplified by MTV can be
thought of as a Mannerist stage of cinema, its perfected techniques of
cinematography, mise-en-scene and editing self-consciously displayed and
paraded for its own sake, Waliczky's film presents an alternative response to
cinema’s classical age, which is now behind us. In this meta-film, the camera, part
of cinema’s apparatus, becomes the main character (in this we may connect The
Forest to another meta-film, Man with a Movie Camera). At first glance, the
logic of camera movements can be identified as the quest of a human being trying
to escape from the forest (which, in reality, is just a single picture of a tree
repeated over and over). Yet, just as in some of the Brothers Quay animated films
such as The Street of Crocodiles, the virtual camera of The Forest neither
simulates natural perception nor does it follow the standard grammar of cinema’s
camera; instead, it establishes a distinct system of its own. In The Street of
Crocodiles the camera suddenly takes off, rapidly moving in a straight line
parallel to an image plane, as though mounted on some robotic arm, and just as
suddenly stops to frame a new corner of the space. The logic of these movements
is clearly non-human; this is the vision of some alien creature. In contrast, in The
Forest the camera never stops at all, the whole film being one uninterrupted
camera trajectory. The camera system of The Forest can be read as a comment
on the fundamentally ambiguous nature of computer space. On the one hand, not
indexically tied to physical reality or the human body, computer space is isotropic.
In contrast to human space, in which the verticality of the body and the direction
of the horizon are two dominant directions, computer space does not privilege any
particular axis. In this way it is similar to the space of El Lissitzky's Prouns and
Kazimir Malevich's suprematist compositions — an abstract cosmos,
unencumbered by either Earth’s gravity or the weight of a human body. (Thus the
game Spacewar with its simulated gravity got it wrong!) William Gibson’s term
“matrix,” which he used in his novels to refer to cyberspace, captures this
isotropic quality well. But, on the other hand, computer space is also a space of a
human dweller, something which is used and traversed by a user, who brings her
own anthropological framework of horizontality and verticality. The camera
system of The Forest foregrounds this double character of computer space. While
no human figures or avatars appear in the film and we never get to see either the
ground or the sky, it is centered around the stand-in for the human subject — a
tree. The constant movements of the camera along the vertical dimension
throughout the film — sometimes getting closer to where we imagine the ground
plane is located, sometimes moving towards (but again, never actually showing)
the sky — can be interpreted as an attempt to negotiate between isotropic space
and the space of human anthropology, with its horizontality of the ground plane
and the horizontal and vertical dimension of human bodies. The navigable space
of The Forest thus mediates between human subjectivity and the very different
and ultimately alien logic of a computer — the ultimate and omnipresent Other of
our age.
While the works discussed so far all created virtual navigable spaces,
George Legrady’s interactive computer installation Transitional Spaces (1999)
moves back from the virtual into the physical. Legrady locates an already existing
architectural navigable space (the Siemens headquarters building in Munich) and
makes it into an “engine” which triggers three cinematic projections. As regular
office staff and visitors move through the main entrance section and second-level
exit/entrance passageways, their motions are picked up by cameras and are used to
control the projections. Legrady writes in his installation proposal:
As the speed, location, timing, and number of individuals in the space
control the sequence and timing of projection sequences, the audience will
have the opportunity to “play” the system, that is, engage consciously by
interacting with the camera sensing to control the narrative flow of the
installation.
All three projections will comment on the notion of “transitional
space” and narrative development. Image sequences will represent
transitional states: from noise covered to clear, from empty to full, from
open to close, from dark to light, from out of focus to in-focus.296
Legrady’s installation begins to explore one element in the “vocabulary” of the
navigable space “alphabet”: transition from one state to another. (Other potential
elements, or rather dimensions, include the character of a trajectory; the pattern of
the user’s movement, for instance, rapid geometric movement in Doom versus
wandering in Myst; the possible interactions between user and the space, such
as the character acting as a center of perspective in Waliczky’s The Garden
(1992); and, of course, the architecture of space itself.) While the definition of
narrative by Mieke Bal which I invoked earlier may be too restrictive in relation
to new media, Legrady quotes another, much broader definition by literary theorist
Tzvetan Todorov. According to him, minimal narrative involves the passage from
“one equilibrium to another” (or, in different words, from one state to another).
Legrady’s installation suggests that we can think of a subject’s movement from one
“stable” point in space to another (for instance, moving from a lobby to a
building to an office) as a narrative; by analogy, we may also think of a
transition from one state of a new media object to another (for instance, from a
noisy image to a noise-free image) as a minimal narrative. For me, the second
equation is more problematic than the first, because, in contrast to literary
narrative, it is hard to say what constitutes a “state of equilibrium” in a typical new
media object. Nevertheless, rather than concluding that Legrady’s installation
does not really create narratives, we should recognize it instead as an important
example of a whole trend among new media artists: to explore the minimal
condition of a narrative. In the later section “New Temporality: Loop as a
Narrative Engine” I will discuss these investigations in relation to another new
media convention: the loop.
The computer spaces just discussed, from Aspen Movie Map to The Forest,
each establish a distinct aesthetic of their own. However, the majority of
navigable virtual spaces mimic existing physical reality without proposing any
coherent aesthetic programs. What artistic and theoretical traditions can the
designers of navigable spaces draw upon to make them more interesting? One
obvious candidate is modern architecture. From Melnikov, Le Corbusier and
Frank Lloyd Wright to Archigram and Bernard Tschumi, modern architects
elaborated a variety of schemes for structuring and conceptualizing space to be
navigated by users. Using a few examples from these architects, we can look at
the 1925 USSR Pavilion (Melnikov), Villa Savoye (Le Corbusier), Walking City
(Archigram), and Parc de la Villette (Tschumi).297 Even more relevant is the
tradition of "paper architecture" — the designs which were not intended to be
built and whose authors therefore felt unencumbered by the limitations of
materials, gravity and budgets.298 Another highly relevant tradition is film
architecture.299 As discussed in the "Theory of Cultural Interfaces" section, the
standard interface to computer space is the virtual camera modeled after a film
camera, rather than a simulation of unaided human sight. After all, film
architecture is the architecture designed for navigation and exploration by a film
camera.
Along with different architectural traditions, designers of navigable spaces
can find a wealth of relevant ideas in modern art. They may consider, for instance,
the works of modern artists which exist between art and architecture and which,
like projects of paper architects, display spatial imagination not tied up to the
questions of utility and economy: warped worlds of Jean Dubuffet, mobiles by
Alexander Calder, earth works by Robert Smithson, moving text spaces by Jenny
Holzer. While many modern artists felt compelled to create 3D structures in real
spaces, others were satisfied with painting their virtual worlds: think, for
instance, of melancholic cityscapes by Giorgio de Chirico, biomorphic worlds by
Yves Tanguy, economical wireframe structures by Alberto Giacometti, existential
landscapes by Anselm Kiefer. Besides providing us with many examples of
imaginative spaces, both abstract and figurative, modern painting is relevant to
the design of virtual navigable spaces in two additional ways. First, since new
media is most often experienced, like painting, via a rectangular frame (see “The
Screen and the User”), virtual architects can study how painters organized their
spaces within the constraints of a rectangle. Second, modern painters who belong
to what I call the “space-medium” tradition elaborated the concept of space as a
homogeneous dense field, where everything is made from the same “stuff” — in
contrast to architects, who always have to work with a basic dichotomy between
the built structure and the empty space. And although virtual spaces realized until
now, with the possible exception of Osmose, follow the same dichotomy between
rigid objects and a void between them, on the level of material organization they
are intrinsically related to the monistic ontology of modern painters such as
Matta, Giacometti, or Pollock, for everything in them is also made from the same
material (pixels, on the level of surface; polygons or voxels, on the level of 3D
representation). Thus virtual computer space is structurally closer to modern
painting than to architecture.
Along with painting, a genre of modern art which has a particular
relevance to the design of navigable virtual spaces is installation. Seen in the
context of new media, many installations can be thought of as dense multimedia
information spaces. They combine images, video, texts, graphics and 3D elements
within a spatial layout. While most installations leave it up to the viewer to
determine the order of “information access” to their elements, one of the most
well-known installation artists, Ilya Kabakov, elaborated a system of strategies to
structure the viewer's navigation through his spaces.300 According to Kabakov, in
most installations "the viewer is completely free because the space surrounding
her and the installation remain completely indifferent to the installation it
encloses."301 In contrast, by creating a separate enclosed space with carefully
chosen proportions, colors and lighting within the larger space of a museum or a
gallery, Kabakov aims to completely "immerse" the viewer inside his installation.
He calls this installation type a "total installation."
For Kabakov, "total" installation has a double identity. On the one hand, it
belongs to plastic arts designed to be viewed by an immobile spectator —
painting, sculpture, architecture. On the other hand, it also belongs to time-based
arts such as theater and cinema. We can say the same about virtual navigable
spaces. Another concept of Kabakov’s theory which is directly applicable to
virtual space design is his distinction between the spatial structure of an
installation and its dramaturgy, i.e. the time-space structure created by the
movement of a viewer through an installation.302 Kabakov’s strategies of
dramaturgy include dividing the total space of an installation into two or more
connected spaces and creating a well-defined path through the space which does not
preclude the viewer from wandering on her own, yet prevents her from feeling
lost or bored. To make such a path, Kabakov constructs corridors
and abrupt openings between objects; he also places objects in strange places to
obstruct passage where one expects to discover a clear pathway. Another strategy
of “total installation” is the choice of particular kinds of narratives which lend
themselves to spatialization. These are the narratives which take place around a
main event which becomes the center of an installation: "the beginning [of the
installation] leads to the main event [of the narrative] while the last part exists
after the event took place." Yet another strategy involves the positioning of text
within the space of an installation as a way to orchestrate the attention and
navigation of the viewer. For instance, placing two to three pages of text at a
particular point in the space creates a stop in the navigation rhythm.303
Finally, Kabakov "directs" the viewer to keep alternating between focusing her
attention on particular details and the installation as a whole. He describes these
two kinds of spatial attention (which we can also correlate with haptic and optic
perception as theorized by Riegl and others) as follows: "wandering, total
("summarnaia") orientation in space — and active, well-aimed 'taking in' of
partial, small, the unexpected."304
All these strategies can be directly applied to the design of virtual
navigable spaces (and interactive multimedia in general). In particular, Kabakov
is very successful in making the viewers of his installations carefully read
significant amounts of text included in them — something which represents a
constant challenge for new media designers. His constant emphasis on always
thinking about the viewer's attention and reaction to what she will encounter —
"the reaction of the viewer during her movement through the installation is the
main concern of her designer… The loss of the viewer's attention is the end of the
installation"305 — is also an important lesson to new media designers who often
forget that what they are designing is not an object in itself but a viewer's
experience in time and space.
I have used the word "strategy" to refer to Kabakov’s techniques on
purpose. To evoke the terminology of The Practice of Everyday Life by French
writer Michel de Certeau, Kabakov uses strategies to impose a particular matrix
of space, time, experience and meaning on his viewers; they, in their turn, use
"tactics" to create their own trajectories (this is a term actually used by de
Certeau) within this matrix. If Kabakov is perhaps the most accomplished
architect of navigable spaces, de Certeau can very well be their best theoretician.
Like Kabakov, he never dealt with computer media directly, and yet his The
Practice of Everyday Life has a multitude of ideas directly applicable to new
media. His general notion of a user's “tactics,” which create their own
trajectories through the spaces defined by others (both metaphorically, and, in the
case of spatial tactics, literally), is a good model for thinking about computer users
navigating through computer spaces they did not design:
Although they are composed with the vocabularies of established
languages (those of television, newspapers, supermarkets, or museum
sequences) and although they remain subordinated to prescribed
syntactical forms (temporal modes of schedules, paradigmatic orders of
spaces, etc.), the trajectories trace out the ruses of other interests and
desires that are neither determined, nor captured by, the system in which
they develop.306
The Navigator and the Explorer
Why is navigable space such a popular construct in new media? What are the historical
origins and precedents of this form?
In his famous 1863 essay "The Painter of Modern Life", Charles Baudelaire
documented the new modern male urban subject — the flâneur.307 (Recent writings
on visual culture, film theory, cultural history and cyberculture have
already invoked the figure of the flâneur much too often; my justification for
invoking it once again here is that I hope to use it in new ways.) An anonymous
observer, the flâneur navigates through the space of a Parisian crowd, recording
and immediately erasing the faces and the figures of the passers-by in his memory.
From time to time, his gaze meets the gaze of a passing woman, engaging her in a
split-second virtual affair, only to be unfaithful to her with the next female passer-
by. The flâneur is only truly at home in one place — moving through the crowd.
Baudelaire writes: "To the perfect spectator, the impassioned observer, it is an
immense joy to make his domicile amongst numbers, amidst fluctuation and
movement, amidst the fugitive and infinite… To be away from home, and yet to
feel at home; to behold the world, to be in the midst of the world and yet to remain
hidden from the world." There is a theory of navigable virtual spaces hidden here,
and we can turn to Walter Benjamin to help us in articulating it. According to
Benjamin, the flâneur’s navigation transforms the space of the city: "The Crowd is
the veil through which the familiar city lures the flâneur like a phantasmagoria. In
it the city is now a landscape, now a room."308 The navigable space thus is a
subjective space, its architecture responding to the subject’s movement and
emotion. In the case of the flâneur moving through the physical city, this
transformation of course only happens in the flâneur’s perception, but in the case
of navigation through a virtual space, the space can literally change, becoming a
mirror of the user’s subjectivity. The virtual spaces built on this principle can be
found in such films as Waliczky's The Garden and Dark City (Alex Proyas,
1998).
Following European tradition, the subjectivity of the flâneur is determined
by his interaction with a group — even though it is a group of strangers. In place
of a close-knit community of a small-scale traditional society (Gemeinschaft) we
now have an anonymous association of a modern society (Gesellschaft).309 We can
interpret the flâneur’s behavior as a response to this historical shift. It is as though
he is trying to compensate for the loss of a close relationship with his group by
inserting himself into the anonymous crowd. He thus exemplifies the historical
shift from Gemeinschaft to Gesellschaft, and the fact that he only feels at home in
the crowd of strangers shows the psychological price paid for modernization. Still,
the subjectivity of the flâneur is, in its essence, intersubjectivity: the exchange of
glances between him and the other human beings.
A very different image of a navigation through space — and of
subjectivity — is presented in the novels of nineteenth century American writers
such as James Fenimore Cooper (1789-1851) or Mark Twain (1835-1910). The
main character of Cooper's novels, the wilderness scout Natty Bumppo, alias
Leatherstocking, navigates through spaces of nature rather than culture. Similarly,
in Twain's Huckleberry Finn, the narrative is organized around the voyage of the
two boy heroes down the Mississippi River. Instead of the thickness of the urban
human crowd which is the milieu of a Parisian flâneur, the heroes of these
American novels are most at home in the wilderness, away from the city. They
navigate forests and rivers, overcoming obstacles and fighting enemies. The
subjectivity is constructed through the conflicts between the subject and nature,
and between the subject and his enemies, rather than through interpersonal
relations within a group. This structure finds its ultimate expression in the unique
American form, the Western, and its hero, the cowboy — a lonely explorer who
only occasionally shows up in town to get a drink at the bar. Rather than
providing the home for the cowboy, as it does for the flâneur, the town is a hostile
place, full of conflict, which eventually erupts into the inevitable showdown.
Both the flâneur and the explorer find their expression in different subject
positions, or phenotypes, of new media users. Media theoretician and activist
Geert Lovink describes the figure of the present-day media user and Net surfer
whom he calls the Data Dandy. Although Lovink's reference is Oscar Wilde
rather than Baudelaire, his Data Dandy exhibits the behaviors which also qualify
him to be called a Data Flâneur. "The Net is to the electronic dandy what the
metropolitan street was for the historical dandy."310 A perfect aesthete, the Data
Dandy loves to display his private and totally irrelevant collection of data to other
Net users. "Wrapped in the finest facts and the most senseless gadgets, the new
dandy deregulates the time economy of the info = money managers... if the
anonymous crowd in the streets was the audience of the Boulevard dandy, the
logged-in Net-users are that of the data dandy."311 While displaying his
dandyism, the data dandy does not want to be above the crowd; like Baudelaire's
flâneur, he wants to lose himself in its mass, to be moved by the semantic vectors
of mass media icons, themes and trends. As Lovink points out, a data dandy "can
only play with the rules of the Net as a non-identity. What is exclusivity in the age
of differentiation?...Data dandyism is born of an aversion of being exiled into a
subculture of one's own."312 Although Lovink positions Data Dandy exclusively
in data space ("Cologne and pink stockings have been replaced by precious
Intel"), the Data Dandy does have a dress code of his own. This look is popular
with new media artists of the 1990s: no labels, no distinct design, no bright colors
or extravagant shapes — a non-identity which is nevertheless paraded as style and
which in fact is carefully constructed (as I learned while shopping in Berlin in
1997 with Russian net.artist Alexei Shulgin). The designers who exemplify this
style in the 1990s are Hugo Boss and Prada, whose restrained no-style style
contrasts with the opulence of Versace and Gucci, the stars of the 1980s era of
excess. The new style of non-identity perfectly corresponds to the rise of the Net,
where endless mailing lists, newsgroups, and sites dilute any single topic, image
or idea — "On the Net, the only thing which appears as a mass is information
itself... Today's new theme is tomorrow's 23 newsgroups."313
If the Net surfer, who keeps posting to mailing lists and newsgroups and
accumulating endless data, is a reincarnation of Baudelaire's flâneur, the user
navigating a virtual space assumes the position of the nineteenth century explorer,
a character from Cooper and Twain. This is particularly true for the navigable
spaces of computer games. The dominance of spatial exploration in games
exemplifies the classical American mythology in which the individual discovers
his identity and builds character by moving through space. Correspondingly, in
many American novels and short stories (O. Henry, Hemingway) narrative is
driven by the character’s movements in the outside space. In contrast, in
nineteenth-century European novels there is not much movement in physical space, because
the action takes place in a psychological space. From this perspective, most
computer games follow the logic of American rather than European narrative.
Their heroes are not developed and their psychology is not represented. But, as
these heroes move through space, defeating enemies, acquiring resources and,
more importantly, skill, they are "building character." This is particularly true for
Role Playing Games (RPG) whose narrative is one of self-improvement. But it
also holds for other game genres (action, adventure, simulators) which put the
user in command of a character (Doom, Mario, Tomb Raider). As the character
progresses through the game, the user herself or himself acquires new skills and
knowledge. She learns how to outwit the mutants lurking in Doom levels, how to
defeat the enemies with just a few kicks in Tomb Raider, how to solve the secrets
of the playful world in Mario, and so on.314
While movement through space as a means of building character is one
theme of American frontier mythology, another is exploring and "culturing"
unknown space. This theme is also reflected in computer games’ structure. A
typical game begins at some point in a large unknown space; in the course of the
game, the player has to explore this space, mapping out its geography and
unraveling its secrets. In the case of games organized into discrete levels such as
Doom, the player has to systematically investigate all the spaces of a given level
before he can move to the next level. In other games which take place over one
large territory, the game play gradually involves larger and larger parts of this
territory (Adventure, WarCraft).
This is one possible theory, one historical trajectory: from flâneur to Net
surfer; from nineteenth century American explorer to the explorer of navigable
virtual space. Although this section focuses on navigating a space in a literal
sense, i.e. moving through a 3D virtual space, this concept is also a key metaphor
used to conceptualize new media. From the 1980s concept of cyberspace to the
1990s software such as Netscape Navigator, interacting with computerized data
and media has been consistently framed in spatial terms. Computer scientists
adopted this metaphor as well: they use the term navigation to refer to different
methods of organizing and accessing hypermedia, even though a 3D virtual space
interface is not at all the most common method. For instance, in his Elements of
Hypermedia Design Peter Gloor lists “seven design concepts for navigation in
dataspace”: linking, searching, sequentialization, hierarchy, similarity, mapping,
guides and agents.315 Thus, “navigating the Internet” includes following the
hyperlinks, using menus commonly provided by Web sites, as well as using
search engines. If we accept this spatial metaphor, both the nineteenth century
European flâneur and the American explorer find their reincarnation in the figure
of the net surfer. We may even correlate these two historical figures with the
names of the two most popular Web browsers: the flâneur of Baudelaire — Netscape
Navigator; an explorer of Cooper, Twain and Hemingway — Internet Explorer.
Of course, names apart, these two browsers are functionally quite similar.
However, given that they both focus on a single user navigating through the Web
sites rather than more communal experiences, such as newsgroups, mailing lists,
text-based chat and IRC, we can say that they privilege the explorer rather than
the flâneur — a single user navigating through an unknown territory rather than a
member of a group, even if this group is a crowd of strangers. And although
different software solutions have been developed to make Internet navigation
more of a social experience — for instance, allowing remote users to
simultaneously navigate the same Web site together; or allowing the user to see
who already accessed a particular document — an individual navigation through
the “history-free” data still remained the norm at the end of the 1990s.
Kino-Eye and Simulators
It is also possible to construct a different trajectory which will lead from the
Parisian flanerie to navigable computer spaces. In Window Shopping film
historian Anne Friedberg presents an archeology of a mode of perception which,
according to her, characterizes modern cinematic, televisual, and cyber cultures
and which she calls a “mobilized virtual gaze.”316 This mode combines two
conditions: “a received perception mediated through representation” and a travel
“in an imaginary flanerie through an imaginary elsewhere and an imaginary
elsewhen.”317 According to Friedberg’s archeology, this mode emerged when a
new nineteenth century technology of virtual representation — photography —
merged with the mobilized gaze of tourism, urban shopping and flanerie.318 As
can be seen, Friedberg connects Baudelaire's flâneur with a range of other modern
practices: “The same impulses which send flâneurs through the arcades,
traversing the pavement and wearing thin their shoe leather, sent shoppers into the
department stores, tourists to exhibitions, spectators into the panorama, diorama,
wax museum, and cinema.”319 The flâneur occupies the privileged position
among these practices because he embodied most strongly the desire to combine
perception with motion through a space. All that remained in order to arrive at a
“mobilized virtual gaze” was to virtualize this perception — something which
cinema accomplished in the last decade of the nineteenth century.
While Friedberg’s account ends with television and does not consider new
media, the form of navigable virtual space fits well in her historical trajectory.
Navigation through a virtual space, whether in a computer game, a motion
simulator, data visualizations or a 3D human-computer interface, follows the logic
of a “mobilized virtual gaze.” Instead of Parisian streets, shopping windows and the
faces of the passers-by, the virtual flâneur travels through virtual streets,
highways and planes of data; the eroticism of a split-second virtual affair with a
passer-by of the opposite sex is replaced with the excitement of locating and
opening a particular file or zooming into the virtual object. Just as the original
flâneur of Baudelaire, the virtual flâneur is happiest on the move, clicking from
one object to another, traversing room after room, level after level, data volume
after data volume.
Thus, just as a database form can be seen as an expression of ‘database
complex,’ an irrational desire to preserve and store everything, navigable space is
not just a purely functional interface. It is also an expression and gratification of
psychological desire; a state of being; a subject position — or rather, a subject’s
trajectory. If the subject of modern society was looking for refuge from the chaos
of the real world in the stability and balance of the static composition of a
painting, and later in cinema’s image, the subject of the information society finds
peace in the knowledge that she can slide over endless fields of data, locating any
morsel of information with the click of a button, zooming through file systems
and networks. She is comforted not by the equilibrium of shapes and colors, but
by the variety of data manipulation operations at her control.
Does this mean that we have reached the end of the trajectory described by
Friedberg? While still enjoying a privileged place in computer culture, flanerie now
shows its age. Here we can make an analogy with the history of GUI (Graphical
User Interface). Developed at Xerox PARC in the 1970s and commercialized by
Apple in the early 1980s, it was appropriate when a typical user’s hard drive
contained dozens or even hundreds of files. But for the next stage of Net-based
computing in which the user is accessing millions of files it is no longer
sufficient.320 Bypassing the ability to display and navigate the files graphically,
the user resorts to a text-based search engine. Similarly, while a “mobilized
virtual gaze,” described by Friedberg, was a significant advancement over earlier
more static methods of data organization and access (static image, text, catalog,
library), in the information age its “bandwidth” is too limited. Moreover, a simple
simulation of movement through a physical space defeats a computer’s new
capabilities of data access and manipulation. Thus, for a virtual flâneur such
operations as search, segmentation, hyperlinking, visualization and data
mining are more satisfying than just navigating through a simulation of a physical
space.
In the 1920s Dziga Vertov already understood this very well. A Man with
a Movie Camera is an important point in the trajectory which leads from
Baudelaire's flanerie to Aspen Movie Map, Doom and VRML worlds not simply
because Vertov’s film is structured around the camera’s active exploration of city
spaces, and not only because it fetishizes the camera’s mobility, but also because Vertov wanted to
overcome the limits of human vision and human movement through space to
arrive at more efficient ways of data access. However, the data he worked with is
raw visible reality — not reality digitized and stored in computer’s memory as
numbers. Similarly, his interface was a film camera, i.e. an anthropomorphic
simulation of human vision — not computer algorithms. Thus Vertov stands half-
way between Baudelaire's flâneur and computer user: no longer just a pedestrian
walking through a street, but not yet Gibson’s data cowboy who zooms through
pure data armed with data mining algorithms.
In his research on what can be called “kino-eye interface,” Vertov
systematically tried different ways to overcome what he thought were the limits of
human vision. He mounted cameras on the roof of a building and a moving
automobile; he slowed down and sped up film speed; he superimposed a number of
images together in time and space (temporal montage and montage within a shot).
A Man with a Movie Camera is not only a database of city life in the 1920s, a
database of film techniques, and a database of new operations of visual
epistemology, but it is also a database of new interface operations which together
aim to go beyond a simple human navigation through a physical space.
Along with A Man with a Movie Camera, another key point in the
trajectory from the navigable space of a nineteenth century city to the virtual
navigable computer space is the flight simulator. At the same time that Vertov was
working on his film, a young American engineer, E.A. Link, Jr., developed the first
commercial flight simulator. Significantly, Link’s patent for his simulator filed in
1930 refers to it as a “Combination Training Device for Student Aviators and
Entertainment Apparatus.”321 Thus, rather than being an after-thought, the
adaptation of flight simulator technology to consumer entertainment which took
place in the 1990s was already envisioned by its inventor. Link’s design was a
simulation of a pilot’s cockpit with all the controls, but, in contrast to a modern
simulator, it had no visuals. In short, it was a motion ride without a movie. In the
1960s, visuals were added by using new video technology. A video camera was
mounted on a movable arm positioned over a room size model of an airport. The
movement of the camera was synchronized with the simulator controls; its image
was transmitted to a video monitor in the cockpit. While useful, this approach was
limited because it was based on the physical reality of an actual model set. As we saw
in the “Compositing” section, a filmed and edited image is a better simulation
technology than a physical construction; and a virtual image controlled by a
computer is better still. Not surprisingly, soon after interactive 3D computer
graphics technology was developed, it was applied to produce visuals for the
simulators by one of its developers. In 1968, Ivan Sutherland, who had already
pioneered interactive computer-aided design (“Sketchpad,” 1962) and virtual
reality (1967), formed a company to produce computer-based simulators. In the
1970s and 1980s simulators were one of the main applications of real-time 3D
computer graphics technology, thus determining to a significant degree the way
this technology was developed (see “Synthetic Realism as Bricolage”). For
instance, simulation of particular landscape features which are typically seen by a
pilot, such as flat and mountain terrain, sky with clouds, and fog, all became
important research problems.322 The application of interactive graphics for
simulators has also shaped the imagination of researchers regarding how this
technology can be used. It naturalized a particular idiom: flying through a
simulated spatial environment.
Thus, one of the most common forms of navigation used today in
computer culture — flying through spatialized data — can be traced back to the
1970s military simulators. From Baudelaire's flâneur strolling through physical
streets we move to Vertov's camera mounted on a moving car and then to the
virtual camera of a simulator which represents the viewpoint of a military pilot.
Although it was not an exclusive factor, the end of the Cold War played an
important role in the extension of this military mode of perception into general
culture. Until 1990, such companies as Evans and Sutherland, Boeing and
Lockheed were busy developing multi-million dollar simulators. As the military orders
dried up, they had to look for consumer applications of their technology. During
the 1990s, these and other companies converted their expensive simulators into
arcade games, motion rides and other forms of location-based entertainment. By
the end of the decade, Evans and Sutherland’s list of products included image
generators for use in military and aviation simulators; a virtual set technology for
use in television production; Cyber Fighter, a system of networked game stations
modeled after networked military simulators; and Virtual Glider, an immersive
location-based entertainment station.323 As the military budgets continued to
diminish and entertainment budgets soared, entertainment and military often came
to share the same technologies and to employ the same visual forms. Probably the
most graphic example of the ongoing circular transfer of technology and
imagination between the military and the civilian sector in new media is the case
of Doom. Originally developed and released over the Internet as a consumer game
in 1993 by id Software, it was soon picked up by the U.S. Marine Corps, who
customized it into a military simulator for group combat training.324 Instead of
using multi-million dollar simulators, the military could now train soldiers on a $50
game. The Marines, who were involved in the modifications, then went on to
form their own company in order to market the customized Doom as a
commercial game.
The discussion of the military origins of navigable space form would be
incomplete without acknowledging the pioneering work of Paul Virilio. In his
brilliant 1984 book War and Cinema, Virilio documented numerous parallels
between military and film cultures of the twentieth century, including the use of a
mobile camera moving through space in military aerial surveillance and in
cinematography.325 Virilio went on to suggest that while space was the main
category of the nineteenth century, the main category of the twentieth century was
time. As already discussed in “Teleaction,” for Virilio, telecommunication
technology eliminates the category of space altogether as it makes every point on
Earth as accessible as any other — at least in theory. This technology also leads to
real time politics, which require instant reactions to the events transmitted at the
speed of light, and ultimately can only be handled efficiently by computers
responding to each other without human intervention. From a post-Cold War
perspective, Virilio’s theory can be seen as another example of the imagination
transfer from the military to civilian sector. In this case, techno-politics of the
Cold War nuclear arms equilibrium between the two superpowers, which at any
moment were able to strike each other at any point on Earth, came to be seen by
Virilio as a fundamentally new stage of culture, where real time triumphs over
space.
Although Virilio did not write on computer interface, the logic of his
books suggests that the ideal computer interface for a culture of real time politics
would be the War Room in Dr. Strangelove or: How I Learned to Stop Worrying
and Love the Bomb (Stanley Kubrick, 1964) with its direct lines of
communication between the generals and the pilots; or DOS command lines with
their military economy of command and response, rather than the more
spectacular but inefficient VRML worlds. Yet, uneconomical and inefficient as it
may be, navigable space interface is thriving across all areas of new media. How
can we explain its popularity? Is it simply a result of cultural inertia? A left-over
from the nineteenth century? A way to make the ultimately alien space of a
computer compatible with humans by anthropomorphizing it, superimposing a
simulation of a Parisian flânerie over abstract data? A relic of Cold War culture?
While all these answers make sense, it would be unsatisfactory to see
navigable space as only the end of a historical trajectory, rather than as a new
beginning. The few computer spaces discussed here point toward some of the
aesthetic possibilities of this form; more possibilities are contained in the works of
modern painters, installation artists and architects. Theoretically as well,
navigable space represents a new challenge. Rather than only considering
topology, geometry and logic of a static space, we need to take into account the
new way in which space functions in computer culture: as something traversed by
a subject, as a trajectory rather than an area. But computer culture is not the only
field where the use of the category of navigable space makes sense. I will now
briefly look at two other fields — anthropology and architecture — where we find
more examples of “navigable space imagination.”
In his book Non-places: Introduction to an Anthropology of
Supermodernity, French anthropologist Marc Auge advances the hypothesis that
“supermodernity produces non-places, meaning spaces which are not themselves
anthropological places and which, unlike Baudelairean modernity, do not
integrate with earlier places.”326 Place is what anthropologists have studied
traditionally; it is characterized by stability, and it supports stable identity,
relations and history.327 Auge's main source for his distinction between place and
space, or non-place, is Michel de Certeau: “Space, for him, is a ‘frequented place,’
‘an intersection of moving bodies’: it is the pedestrians who transform a street
(geometrically defined as a place by town planners) into a space”; it is an
animation of a place by the motion of a moving body.328 Thus, from one
perspective we can understand place as a product of cultural producers, while
non-places are created by users; in other words, non-place is an individual
trajectory through a place. From another perspective, in supermodernity,
traditional places are replaced by equally institutionalized non-places, a new
architecture of transit and impermanence: hotel chains and squats, holiday clubs
and refugee camps, supermarkets, airports and highways. Non-place becomes the
new norm, the new way of existence.
It is interesting that as the subject who exemplifies the condition of
supermodernity, Auge picks the counterpart to the pilot or user of a flight
simulator — an airline passenger. “Alone, but one of many, the user of a non-place
has contractual relations with it.” This contract relieves the person of his usual
determinants. “He becomes no more than what he does or experiences in the role
of passenger, customer or driver.”329 Auge concludes that “as anthropological
places create the organically social, so non-places create solitary contractuality,”
something which he sees as the very opposite of a traditional object of sociology:
“Try to imagine a Durkheimian analysis of a transit lounge at Roissy!”330
Architecture by its very definition stands on the side of order, society and
rules; it is thus a counterpart of sociology as it deals with regularities, norms and
"strategies" (to use de Certeau’s term). Yet the very awareness of these
assumptions underlying architecture led many contemporary architects to focus
their attention on the activities of users who through their "speech acts"
"reappropriate the space organized by the techniques of sociocultural production"
(de Certeau).331 Architects come to accept that the structures they design will be
modified by users’ activities, and that these modifications represent an essential
part of architecture. They also took up the challenge of "a Durkheimian analysis
of a transit lounge at Roissy," putting their energy and imagination into the design of
non-places such as an airport (Kansai International Airport in Osaka by Renzo
Piano), a train terminal (Waterloo International Terminal in London by Nicholas
Grimshaw) or a highway control station (Steel Cloud or Los Angeles West Coast
Gateway by Asymptote Architecture group).332 Probably the ultimate in non-place
architecture has been the one-million-square-meter Euralille project, which
redefined the existing city of Lille, France as the transit zone between the
Continent and London. The project attracted some of the most interesting
contemporary architects: Rem Koolhaas designed the masterplan while Jean
Nouvel built Centre Euralille containing a shopping center, a school, a hotel, and
apartments next to the train terminal. Centered around the entrance to the
Chunnel, the underground tunnel for cars which connects the Continent and
England, and the terminal for the high speed train which travels between Lille,
London, Brussels and Paris, Euralille is a space of navigation par excellence; a
mega-non-place. Like the network players of Doom, Euralille users emerge from
trains and cars to temporarily inhabit a zone defined through their trajectories; an
environment "to just wander around inside of" (Robyn Miller); "an intersection of
moving bodies" (de Certeau).
EVE and Place
We have come a long way since Spacewar (1962) and Computer Space (1971) —
at least, in terms of graphics. The images of these early computer games seem to
have more in common with abstract paintings of Malevich and Mondrian than
with the photorealistic renderings of Quake (1996) and Unreal (1998). But
whether this graphics evolution was also accompanied by a conceptual evolution
is another matter. Given the richness of modern concepts of space developed by
artists, architects, filmmakers, art historians and anthropologists, our computer
spaces have a long way to go.
Often the way to go forward is to go back. As this section suggested, the
designers of virtual spaces may find a wealth of relevant ideas by looking at
twentieth century art, architecture, film and other arts. Similarly, some of the
earliest computer spaces, such as Spacewar and Aspen Movie Map, contained
aesthetic possibilities which are still waiting to be explored. As a conclusion, I
will discuss two more works by Jeffrey Shaw, who draws upon various cultural
traditions of space construction and representation probably more systematically
than any other new media artist.
While Friedberg’s concept of virtual mobile gaze is useful in allowing us
to see the connections between a number of technologies and practices of spatial
navigation, such as Panorama, cinema and shopping, it can also make us blind to
the important differences between them. In contrast, Shaw’s EVE (1993 — ) and
Place: A User’s Manual (1995) emphasize both similarities and differences
between various technologies of navigation.333 In these works, Shaw evokes the
navigation methods of Panorama, cinema, video and VR. But rather than
collapsing different technologies into one, Shaw "layers" them side by side.
That is, he literally encloses the interface of one technology within the interface of
another. For instance, in the case of EVE the visitors find themselves inside a
large semi-sphere reminiscent of the 19th century Panorama. The projectors
located in the middle of the sphere throw a rectangular image on the inside
surface of the semi-sphere. In this way, the interface of cinema (an image
enclosed by a rectangular frame) is placed inside the interface of Panorama (a
semi-spherical enclosed space). In Place: A User’s Manual a different "layering"
takes place: Panorama interface is placed inside a typical computer space
interface. The user navigates a virtual landscape using first-person perspective
characteristic of VR, computer games and navigable computer spaces in general.
Inside this landscape are eleven cylinders with photographs mapped on them.
Once the user moves inside one of these cylinders, she switches to a mode of
perception typical of Panorama tradition.
By placing interfaces of different technologies next to each other within a
single work, Shaw foregrounds the unique logic of seeing, spatial access and
user behavior characteristic of each technology. The tradition of the framed
image, i.e. a representation which exists within the larger physical space which
contains the viewer (painting, cinema, computer screen), meets the tradition of
the "total" simulation, or “immersion,” i.e. a simulated space which encloses the
viewer (Panorama, VR).
Another historical dichotomy staged for us by Shaw is between the
traditions of collective and individualized viewing in screen-based arts. The first
tradition spans from magic lantern shows to twentieth century cinema. The second
passes from the camera obscura, stereoscope and kinetoscope to the head-mounted
displays of VR. Both have their dangers. In the first tradition, an individual's
subjectivity can be dissolved in a mass-induced response. In the second,
subjectivity is defined through the interaction of an isolated subject with an
object at the expense of intersubjective dialogue. In the case of viewers'
interactions with computer installations, as I already noted when talking about
Osmose, something quite new begins to emerge: a combination of individualized
and collective spectatorship. The interaction of one viewer with the work (via a
joystick, a mouse, or a head mounted sensor) becomes in itself a new text for
other viewers, situated within the work's arena, so to speak. This affects the
behavior of this viewer who acts as a representative for the desires of others, and
who is now oriented both to them and to the work.
EVE rehearses the whole Western history of simulation, functioning as a
kind of Plato's cave in reverse: visitors progress from the real world inside the
space of simulation where instead of mere shadows they are presented with
technologically enhanced (via stereo) images, which look more real than their
normal perceptions.334 At the same time, EVE's enclosed round shape refers us
back to the fundamental modern desire to construct a perfect self-sufficient
utopia, whether visual (the nineteenth-century panorama) or social. (For instance,
after the 1917 Russian Revolution, architect G.I. Gidoni designed a monument to the
Revolution in the form of a semi-transparent globe which could hold several
thousand spectators.) Yet, rather than being presented with a simulated world
which has nothing to do with the real space of the viewer (as in typical VR), the
visitors who enter EVE's enclosed space discover that EVE's apparatus shows the
outside reality they just left. Moreover, instead of being fused in a single
collective vision (Gesamtkunstwerk, cinema, mass society) the visitors are
confronted with a subjective and partial view. The visitors only see what one
person wearing a head-mounted sensor chooses to show them, i.e. they are
literally limited by this person's point of view. In addition, instead of a 360° view
they see a small rectangular image — a mere sample of the world outside. The
one visitor wearing a sensor, and thus literally acting as an eye for the rest of the
audience, occupies many positions at once — a master subject, a visionary who
shows the audience what is worth seeing and at the same time just an object, an
interface between them and outside reality, i.e., a tool for others; a projector, a
light and a reflector all at once.
Having examined the two key forms of new media — database and navigable
space — it is tempting to see their privileged role in computer culture as a sign of
a larger cultural change. If we use Auge’s distinction between modernity and
supermodernity, the following scheme can be established:
• modernity — "supermodernity"
• narrative (= hierarchy) — database, hypermedia, network (= flattening
of hierarchy)
• space — navigable space (trajectory through space)
• static architecture — “liquid architecture.” 335
• geometry and topology as theoretical models for cultural and social
analysis — trajectory, vector, flow as theoretical categories
As can be seen from this scheme, the two “supermodern” forms of database and
navigable space are complementary in their effects on the forms of modernity. On
the one hand, a narrative is “flattened” into a database. A trajectory through
events and/or time becomes a flat space. On the other hand, a flat space of
architecture or topology is narrativized, becoming a support for individual users’
trajectories.
But this is only one possible scheme. What is, however, clear, is that we
have left modernity for something else. We are still searching for names to
describe it. Yet the names which we come up with — “supermodernity,”
“transmodernity,” “second modern” – all seem to reflect the sense of the
continuity of this new stage with the old. If the 1980s concept of
“post-modernism” implied a break with modernity, we now seem to
prefer to think of cultural history as a continuous trajectory through a single conceptual
and aesthetic space. Having lived through the twentieth century we learned all too
well the human price of “breaking with the past,” “building from scratch,”
“making new” and other similar claims — be it in the case of an aesthetic, moral
or social system. The claim that new media should be totally new is only one in
the long list of such claims.
Such a notion of a continuous trajectory is more compatible with human
anthropology and phenomenology. Just as a human body moves through physical
space in a continuous trajectory, the notion of history as a continuous trajectory is, in
my view, preferable to the one which postulates epistemological breaks or
paradigm shifts from one era to the next. These notions, which Michel Foucault and
Thomas Kuhn articulated in the 1960s, belong to the aesthetics of modernist
montage of Eisenstein and Godard rather than to our own era of the aesthetics of
continuity as exemplified by compositing, morphing and navigable spaces.336
They also seem to have projected onto a diachronic plane of history the
traumatic synchronic division of their time — the split between the Capitalist
West and the Communist East. But, with the official (although not necessarily
actual) collapse of this split in the 1990s, we have seen how history reasserted its
continuity in powerful and dangerous ways. The comeback of nationalism and
religion; the desire to erase everything associated with the Communist regime and
to return to the pre-1917 or pre-1945 situation (in the case of Russia and Eastern Europe,
respectively) are only some of the more dramatic signs of this process. The price
of a radical break with the past is that the historical trajectory suddenly stopped in
its development simply keeps accumulating potential energy until one day it
reasserts itself with new force, breaking out into the open and crushing whatever
new was created while it was stopped.
In this book I have chosen to emphasize the continuities between the new
media and the old, the interplay between historical repetition and innovation. I
wanted to show how new media appropriates old forms and conventions of
different media, in particular cinema. Like a river, cultural history can’t suddenly
change its course; its movement is that of a spline rather than a set of straight lines
between points. In short I wanted to create trajectories through the space of
cultural history which would pass through new media thus grounding it in what
came before.
VI. What is Cinema?
It is useful to think about the relations between cinema and new media in terms of
two main vectors. The first vector goes from cinema to new media, and it
constitutes the backbone of this book. Chapters I — V used history and theory of
cinema to map out the logic which drives the technical and stylistic development
of new media. I also traced the key role which cinematic language is playing in
new media interfaces — both traditional HCI (interface of the operating system
and software applications) and what I called “cultural interfaces” — the interfaces
between the human user and cultural data.
The second vector goes in the opposite direction: from computers to
cinema. How does computerization affect our very concept of moving images?
Does it offer new possibilities for film language? Has it led to the development of
totally new forms of cinema? This last chapter is devoted to these questions. In
part I already started dealing with them in the “Compositing” section and in the “Illusion”
chapter. Since the main part of that chapter focused on the new identity of a still
computer generated image, it is logical that we now extend our inquiry to include
moving images.
Before proceeding I would like to offer two lists. My first list is a
summary of how (at the time of writing — 1999) I think about the effects of
computerization on cinema proper:
1. Use of computer techniques in traditional filmmaking:
1.1. 3D computer animation / digital compositing. Examples:
"Titanic" (James Cameron, 1997); "The City of Lost Children" (Marc Caro
and J.P. Jeunet, 1995).
1.2. Digital painting. Example: "Forrest Gump" (Robert Zemeckis, 1994).
1.3. Virtual Sets. Example: "Ada" (Lynn Hershman, 1997).
1.4. Virtual Actors / Motion capture. Example: "Titanic."
2. New forms of computer-based cinema:
2.1. Motion rides / location-based entertainment. Example: rides produced
by Douglas Trumbull.
2.2. “Typographic cinema”: film + graphic design + typography.
Examples: film title sequences.
2.3. Net.cinema: films designed exclusively for Internet distribution.
Example: New Venue, one of the first online sites devoted to showcasing short
digital films. In 1998 it accepted only QuickTime files under 5 MB.
2.4. Hypermedia interfaces to a film which allow non-linear access at
different scales. Examples: “WaxWeb” (David Blair, 1994-1999); Stephen
Mamber’s database interface to Hitchcock’s “Psycho” (Mamber, 1996-).
2.5. Interactive movies and games which are structured around film-like
sequences. These sequences can be created using traditional film techniques
(example: “Johnny Mnemonic” game) or computer animation (example: “Blade
Runner” game). (The pioneer of interactive cinema is experimental filmmaker
Grahame Weinbren, whose laserdisks Sonata and The Erl King are the true classics
of this new form.) Note that it is hard to draw a strict line between such
interactive movies and many other games which may not use traditional film
sequences yet follow many other conventions of film language in their structure.
From this perspective, the majority of 1990s computer games can be actually
considered interactive movies.
2.6. Animated, filmed, simulated or hybrid sequences which follow film
language, and appear in HCI, Web sites, computer games and other areas of new
media. Examples: transitions and QuickTime movies in Myst; FMV (full motion
video) opening in Tomb Raider and many other games.
The first section of this chapter, “Digital Cinema and the History of a Moving
Image,” will focus on 1.1 — 1.3. The second section, “New Language of
Cinema,” will use examples drawn from 2.3 — 2.6.337
Note that this list does not include such new production technologies as
DV (digital video) or new distribution technologies such as digital film projection
or network film distribution which by 1999 was already used in Hollywood on an
experimental basis; nor do I mention the growing number of Web sites devoted to
distribution of films.338 Although all these developments will undoubtedly have
an important effect on the economics of film production and distribution, they do not
appear to have a direct effect on film language, which is my main concern here.
My second, highly tentative list summarizes some of the distinct
qualities of a computer-based image. This list pulls together arguments presented
throughout the book so far. As I already noted in Chapter 1, I feel that it is
important to pay attention not only to the new properties of a computer image
which can be logically deduced from its new “material” status, but also to how
images are actually used in computer culture. Therefore a number of properties
on this list reflect the typical usage of images, rather than some “essential” properties they
may have because of their digital status. It is also legitimate to think of some of
these qualities as particular consequences of the oppositions which define a
concept of representation, summarized in the Introduction:
1. Computer-based image is discrete, since it is broken into pixels. This makes it
more like a human language (but not in the semiotic sense of having distinct
units of meaning).
2. Computer-based image is modular, since it typically consists of a number
of layers whose contents often correspond to meaningful parts of the image.
3. Computer-based image consists of two levels, a surface appearance and the
underlying code (which may be the pixel values, a mathematical function or
HTML code). In terms of its “surface,” an image participates in the dialog
with other cultural objects. In terms of its code, an image exists on the same
conceptual plane as other computer objects. (The surface-code pair can be
related to signifier — signified, base — superstructure, unconscious —
conscious pairs. So, just as a signifier exists in a structure with other signifiers
of a language, a “surface” of an image, i.e. its “contents,” enters in dialog with
all other images in a culture.)
4. Computer-based images are typically compressed using lossy compression
techniques, such as JPEG. Therefore, the presence of noise (in the sense of
undesirable artifacts and loss of original information) is their essential, rather
than accidental, quality.
5. An image acquires the new role of an interface (for instance, imagemaps on
the Web, or the image of a desktop as a whole in a GUI). Thus the image becomes
an image-interface. In this role it functions as a portal into another world, like an
icon in the Middle Ages or a mirror in modern literature and cinema. Rather than
staying on its surface, we expect to go “into” the image. In effect, every
computer user becomes Carroll’s Alice. An image can function as an interface
because it can be “wired” to programming code; thus clicking on the image
activates a computer program (or its part).
6. The new role of an image as image-interface competes with its older role as
representation. Therefore, conceptually, a computer image is situated between
two opposing poles: an illusionistic window into a fictional universe and a
tool for computer control. The task of new media design and art is to learn how
to combine these two competing roles of an image.
7. Visually, this conceptual opposition translates into the opposition between
depth and surface, between a window into a fictional universe and a control
panel.
8. Along with functioning as image-interfaces, computer images also function
as image-instruments. If an image-interface controls a computer, an
image-instrument allows the user to remotely affect physical reality in real time. This
ability not just to act but to “teleact” distinguishes the new computer-based
image-instrument from old image-instruments. Additionally, if before,
image-instruments such as maps were clearly distinguished from illusionistic images
such as paintings (although recall Alpers’s argument that classical Dutch
painting combines both concepts), computer images often combine both
functions.
9. A computer image is frequently hyperlinked to other images, texts, and other
media elements. Rather than being a self-enclosed entity it points, leads to,
directs the user outside of itself towards something else. A moving image may
also include hyperlinks (for instance, in QuickTime format). We can say that a
hyperlinked image, and hypermedia in general, “externalizes” Peirce’s idea of
infinite semiosis and Derrida’s concept of infinite deferral of meaning —
although this does not mean that this “externalization” automatically
legitimizes these concepts. Rather than celebrating “the convergence of
technology and critical theory,” we should use new media technology as an
opportunity to question our accepted critical concepts and models.
10. Variability and automation, these general principles of new media, also apply
to images. For example, using a computer program a designer can
automatically generate infinite versions of the same image which can vary in
size, resolution, colors, composition and so on.
11. From a single image which represented the “cultural unit” of a previous period
we move to a database of images. Thus if the hero of Antonioni’s Blow-up
(1966) was looking for truth within a single photographic image, the
equivalent of this operation in a computer age is to work with a whole
database of many images, searching and comparing them with each other.
(Although many contemporary films include scenes of image search, none of
them makes it a subject of a film the way Blow-up focuses on zooming into a
photograph. From this perspective, it is interesting that fifteen years later
Blade Runner still applies “old” cinematic logic in relation to a computer-
based image. In a well-known scene the hero uses voice commands to direct a
futuristic computer device to pan and zoom into an image. In reality, since
the 1950s the military has used different computer techniques for image analysis
to automatically identify objects represented in an image, detect changes in
images over time, etc. which relied on databases of images.339) Any unique
image you may desire probably already exists on the Internet or in some
database. As I already noted, today the problem is no longer how to create the
right image, but how to find an already existing one.
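Properties 1, 3 and 10 on this list can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the 3 x 3 grayscale grid BASE and the helper names scale, invert and variants are hypothetical and are not taken from any actual imaging program. It treats an image as a grid of discrete pixel values (its "code") and shows how a program might automatically derive several versions of the same image that differ in size and tonality.

```python
from itertools import product

# A tiny grayscale "image": a grid of discrete pixel values (0-255).
# This is the "code" level; what a viewer sees is its "surface".
BASE = [
    [0, 64, 128],
    [64, 128, 192],
    [128, 192, 255],
]


def scale(image, factor):
    """Enlarge an image by an integer factor by repeating each pixel."""
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in image
        for _ in range(factor)
    ]


def invert(image):
    """Return the tonal negative of an image."""
    return [[255 - pixel for pixel in row] for row in image]


def variants(image, factors=(1, 2, 3), inverted=(False, True)):
    """Yield ((factor, is_inverted), version) for every parameter combination."""
    for factor, is_inverted in product(factors, inverted):
        version = scale(image, factor)
        if is_inverted:
            version = invert(version)
        yield (factor, is_inverted), version


if __name__ == "__main__":
    for (factor, is_inverted), version in variants(BASE):
        size = f"{len(version[0])}x{len(version)}"
        print(f"factor={factor} inverted={is_inverted} size={size}")
```

With the default parameters the sketch derives six versions from one source image; adding further parameters (cropping, color mapping, resolution) would multiply the number of versions in the same way.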
Since a computer-based moving image, just like its analog predecessor, is simply a
sequence of still images, all these properties apply to it as well. To delineate the
new qualities of a computer-based still image I compared it with other types of
modern images commonly used before it — drawing, a map, a painting and most
importantly, a still photograph. It would be logical to begin discussion of the
computer-based moving image by also relating it to the two most common types of
moving images it replaces in its turn — the film image and an animated image.
The first section, “Digital Cinema and the History of a Moving Image” does
precisely this. It asks how the shift to computer-based representation and
production processes redefines the identity of a moving image and the
relationship between cinema and animation. This section also invokes the
question of computer-based illusionism, considering it in relation to animation,
analog cinema and digital cinema. The following section “The New Language of
Cinema” presents the examples of some of the new directions for film language
— or, more generally, the language of moving images — opened up by
computerization. My examples come from different areas where computer-based
moving images are used: digital films, net.films, self-contained hypermedia, and
Web sites.
Digital Cinema and the History of a Moving Image
Cinema, the Art of the Index
Most discussions of cinema in the computer age have focused on the possibilities
of interactive narrative. It is not hard to understand why: since the majority of
viewers and critics equate cinema with storytelling, computer media is understood
as something which will let cinema tell its stories in a new way. Yet as exciting as
the ideas of a viewer participating in a story, choosing different paths through the
narrative space and interacting with characters may be, they only address one
aspect of cinema which is neither unique nor, as many will argue, essential to it:
narrative.
The challenge which computer media poses to cinema extends far beyond
the issue of narrative. Computer media redefines the very identity of cinema. In a
symposium which took place in Hollywood in the Spring of 1996, one of the
participants provocatively referred to movies as "flatties" and to human actors as
"organics" and "soft fuzzies."340 As these terms accurately suggest, what used to
be cinema's defining characteristics have become just the default options, with
many others available. When one can "enter" a virtual three-dimensional space, to
view flat images projected on the screen is hardly the only option. When, given
enough time and money, almost everything can be simulated in a computer, to
film physical reality is just one possibility.
This "crisis" of cinema's identity also affects the terms and the categories
used to theorize cinema's past. French film theorist Christian Metz wrote in the
1970s that "Most films shot today, good or bad, original or not, 'commercial' or
not, have as a common characteristic that they tell a story; in this measure they all
belong to one and the same genre, which is, rather, a sort of 'super-genre' ['sur-
genre']."341 In identifying fictional films as a "super-genre" of twentieth century
cinema, Metz did not bother to mention another characteristic of this genre
because at that time it was too obvious: fictional films are live action films, i.e.
they largely consist of unmodified photographic recordings of real events which
took place in real physical space. Today, in the age of photorealistic 3D computer
animation and digital compositing, invoking this characteristic becomes crucial in
defining the specificity of twentieth century cinema. From the perspective of a
future historian of visual culture, the differences between classical Hollywood
films, European art films and avant-garde films (apart from abstract ones) may
appear less significant than this common feature: that they relied on lens-based
recordings of reality. This section is concerned with the effect of computerization
on cinema as defined by its "super-genre" as fictional live action film.342
During cinema's history, a whole repertoire of techniques (lighting, art
direction, the use of different film stocks and lenses, etc.) was developed to modify
the basic record obtained by a film apparatus. And yet behind even the most
stylized cinematic images we can discern the bluntness, the sterility, the banality
of early nineteenth century photographs. No matter how complex its stylistic
innovations, the cinema has found its base in these deposits of reality, these
samples obtained by a methodical and prosaic process. Cinema emerged out of
the same impulse which engendered naturalism, court stenography and wax
museums. Cinema is the art of the index; it is an attempt to make art out of a
footprint.
Even for director Andrey Tarkovsky, film-painter par excellence, cinema's
identity lay in its ability to record reality. Once, during a public discussion in
Moscow sometime in the 1970s, he was asked whether he was
interested in making abstract films. He replied that there can be no such thing.
Cinema's most basic gesture is to open the shutter and to start the film rolling,
recording whatever happens to be in front of the lens. For Tarkovsky, an abstract
cinema is thus impossible.
But what happens to cinema's indexical identity if it is now possible to
generate photorealistic scenes entirely in a computer using 3D computer
animation; to modify individual frames or whole scenes with the help of a digital
paint program; to cut, bend, stretch and stitch digitized film images into
something which has perfect photographic credibility, although it was never
actually filmed?
This section will address the meaning of these changes in the filmmaking
process from the point of view of the larger cultural history of the moving image.
Seen in this context, the manual construction of images in digital cinema
represents a return to nineteenth century pre-cinematic practices, when images
were hand-painted and hand-animated. At the turn of the twentieth century,
cinema was to delegate these manual techniques to animation and define itself as
a recording medium. As cinema enters the digital age, these techniques are again
becoming commonplace in the filmmaking process. Consequently, cinema can
no longer be clearly distinguished from animation. It is no longer an indexical
media technology but, rather, a sub-genre of painting.
This argument will be developed in two stages. I will first follow a
historical trajectory from nineteenth century techniques for creating moving
images to twentieth-century cinema and animation. Next I will arrive at a
definition of digital cinema by abstracting the common features and interface
metaphors of a variety of computer software and hardware which are currently
replacing traditional film technology. Seen together, these features and metaphors
suggest a distinct logic of a digital moving image. This logic subordinates the
photographic and the cinematic to the painterly and the graphic, destroying
cinema's identity as a media art. In the beginning of the next section “New
Language of Cinema” I will examine different production contexts which already
use digital moving images — Hollywood films, music videos, CD-ROM-based
games and other stand-alone hypermedia — in order to see if and how this logic
has begun to manifest itself.
A Brief Archeology of Moving Pictures
As testified by its original names (kinetoscope, cinematograph, moving pictures),
cinema was understood, from its birth, as the art of motion, the art which finally
succeeded in creating a convincing illusion of dynamic reality. If we approach
cinema in this way (rather than the art of audio-visual narrative, or the art of a
projected image, or the art of collective spectatorship, etc.), we can see it
superseding previous techniques for creating and displaying moving images.
These earlier techniques shared a number of common characteristics. First,
they all relied on hand-painted or hand-drawn images. The magic lantern slides
were painted at least until the 1850s; so were the images used in the
Phenakistiscope, the Thaumatrope, the Zootrope, the Praxinoscope, the
Choreutoscope and numerous other nineteenth century pre-cinematic devices.
Even Muybridge's celebrated Zoopraxiscope lectures of the 1880s featured not
actual photographs but colored drawings painted after the photographs.343
Not only were the images created manually, they were also manually
animated. In Robertson's Phantasmagoria, which premiered in 1799, magic
lantern operators moved behind the screen in order to make projected images
appear to advance and withdraw.344 More often, an exhibitor used only his hands,
rather than his whole body, to put the images into motion. One animation
technique involved using mechanical slides consisting of a number of layers. An
exhibitor would slide the layers to animate the image.345 Another technique was
to slowly move a long slide containing separate images in front of a magic lantern
lens. Nineteenth century optical toys enjoyed in private homes also required
manual action to create movement — twirling the strings of the Thaumatrope,
rotating the Zootrope's cylinder, turning the Viviscope's handle.
It was not until the last decade of the nineteenth century that the automatic
generation of images and their automatic projection were finally combined. A
mechanical eye became coupled with a mechanical heart; photography met the
motor. As a result, cinema — a very particular regime of the visible — was born.
Irregularity, non-uniformity, the accident and other traces of the human body,
which previously inevitably accompanied moving image exhibitions, were
replaced by the uniformity of machine vision.346 A machine, like a
conveyor belt, was now spitting out images, all sharing the same appearance, all
the same size, all moving at the same speed, like a line of marching soldiers.
Cinema also eliminated the discrete character of both space and movement
in moving images. Before cinema, the moving element was visually separated
from the static background as with a mechanical slide show or Reynaud's
Praxinoscope Theater (1892).347 The movement itself was limited in range and
affected only a clearly defined figure rather than the whole image. Thus, typical
actions would include a bouncing ball, a raised hand or eyes, a butterfly moving
back and forth over the heads of fascinated children — simple vectors charted
across still fields.
Cinema's most immediate predecessors share something else. As the
nineteenth-century obsession with movement intensified, devices which could
animate more than just a few images became increasingly popular. All of them —
the Zootrope, the Phonoscope, the Tachyscope, the Kinetoscope — were based on
loops, sequences of images featuring complete actions which can be played
repeatedly. The Thaumatrope (1825), in which a disk with a different image
painted on each face was rapidly rotated by twirling strings attached to it, was in
its essence a loop in its most minimal form: two elements replacing one another in
succession. In the Zootrope (1867) and its numerous variations, approximately a
dozen images were arranged around the perimeter of a circle.348 The Mutoscope,
popular in America throughout the 1890s, increased the duration of the loop by
placing a larger number of images radially on an axle.349 Even Edison's
Kinetoscope (1892-1896), the first modern cinematic machine to employ film,
continued to arrange images in a loop.350 Fifty feet of film translated to an
approximately 20 second long presentation — a genre whose potential
development was cut short when cinema adopted a much longer narrative form.
From Animation to Cinema
Once the cinema was stabilized as a technology, it cut all references to its origins
in artifice. Everything which characterized moving pictures before the twentieth
century — the manual construction of images, loop actions, the discrete nature of
space and movement — all of this was delegated to cinema's bastard relative, its
supplement, its shadow — animation. Twentieth century animation became a
depository for nineteenth century moving image techniques left behind by
cinema.
The opposition between the styles of animation and cinema defined the
culture of the moving image in the twentieth century. Animation foregrounds its
artificial character, openly admitting that its images are mere representations. Its
visual language is more aligned to the graphic than to the photographic. It is
discrete and self-consciously discontinuous: crudely rendered characters moving
against a stationary and detailed background; sparsely and irregularly sampled
motion (in contrast to the uniform sampling of motion by a film camera — recall
Jean-Luc Godard's definition of cinema as "truth 24 frames per second"), and
finally space constructed from separate image layers.
In contrast, cinema works hard to erase any traces of its own production
process, including any indication that the images which we see could have been
constructed rather than recorded. It denies that the reality it shows often does not
exist outside of the film image, the image which was arrived at by photographing
an already impossible space, itself put together with the use of models, mirrors,
and matte paintings, and which was then combined with other images through
optical printing. It pretends to be a simple recording of an already existing reality
— both to a viewer and to itself.351 Cinema's public image stressed the aura of
reality "captured" on film, thus implying that cinema was about photographing
what existed before the camera, rather than "creating the 'never-was'" of special
effects.352 Rear projection and blue screen photography, matte paintings and
glass shots, mirrors and miniatures, push development, optical effects and other
techniques which allowed filmmakers to construct and alter the moving images,
and thus could reveal that cinema was not really different from animation, were
pushed to cinema's periphery by its practitioners, historians and critics.353
In the 1990s, with the shift to computer media, these marginalized
techniques moved to the center.
Cinema Redefined
A visible sign of this shift is the new role which computer generated special
effects have come to play in the Hollywood industry in the 1990s. Many blockbusters
have been driven by special effects; feeding on their popularity, Hollywood has
even created a new mini-genre of "The Making of..." videos and books which
reveal how special effects are created.
I will use special effects from 1990s Hollywood films for illustrations of
some of the possibilities of digital filmmaking. Until recently, Hollywood studios
were the only ones who had the money to pay for digital tools and for the labor
involved in producing digital effects. However, the shift to digital media affects
not just Hollywood, but filmmaking as a whole. As traditional film technology is
universally being replaced by digital technology, the logic of the filmmaking
process is being redefined. What I describe below are the new principles of digital
filmmaking which are equally valid for individual or collective film productions,
regardless of whether they are using the most expensive professional hardware
and software or its amateur equivalents.
Consider, then, the following principles of digital filmmaking:
1. Rather than filming physical reality it is now possible to generate film-like
scenes directly in a computer with the help of 3D computer animation.
Therefore, live action footage is displaced from its role as the only possible
material from which the finished film is constructed.
2. Once live action footage is digitized (or directly recorded in a digital format),
it loses its privileged indexical relationship to pro-filmic reality. The computer
does not distinguish between an image obtained through the photographic
lens, an image created in a paint program or an image synthesized in a 3D
graphics package, since they are made from the same material — pixels. And
pixels, regardless of their origin, can be easily altered, substituted one for
another, and so on. Live action footage is reduced to just another graphic,
no different than images which were created manually.354
3. If live action footage was left intact in traditional filmmaking, now it
functions as raw material for further compositing, animating and morphing.
As a result, while retaining visual realism unique to the photographic process,
film obtains the plasticity which was previously only possible in painting or
animation. To use the suggestive title of a popular morphing software, digital
filmmakers work with "elastic reality." For example, the opening shot of
Forrest Gump (Robert Zemeckis, Paramount Pictures, 1994; special effects by
Industrial Light and Magic) tracks an unusually long and extremely intricate
flight of a feather. To create the shot, the real feather was filmed against a
blue background in different positions; this material was then animated and
composited against shots of a landscape.355 The result: a new kind of realism,
which can be described as "something which is intended to look exactly
as if it could have happened, although it really could not."
4. Previously, editing and special effects were strictly separate activities. An
editor worked on ordering sequences of images together; any intervention
within an image was handled by special effects specialists. The computer
collapses this distinction. The manipulation of individual images via a paint
program or algorithmic image processing becomes as easy as arranging
sequences of images in time. Both simply involve "cut and paste." As this
basic computer command exemplifies, modification of digital images (or other
digitized data) is not sensitive to distinctions of time and space or of
differences of scale. So, re-ordering sequences of images in time, compositing
them together in space, modifying parts of an individual image, and changing
individual pixels become the same operation, conceptually and practically.
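The fourth principle can be made concrete with a toy model. What follows is only a sketch under stated assumptions: a film is represented as a Python list of frames, a frame as a list of rows, and a row as a list of pixel values; the names make_frame and cut_and_paste are hypothetical and do not come from any actual editing system. The point is that one and the same function re-orders frames in time and pixels in space, and that it cannot tell a "filmed" frame from a "painted" one.

```python
# Hypothetical toy model: a film is a list of frames, a frame is a list of
# rows, a row is a list of pixel values (plain integers here).

def make_frame(width, height, value):
    """Return a height x width frame filled with one pixel value."""
    return [[value] * width for _ in range(height)]

def cut_and_paste(sequence, src_start, src_end, dst_index):
    """Cut sequence[src_start:src_end] and paste it at dst_index.

    Works identically on a list of frames (time) or a list of pixels
    (space): the operation does not know which it is handling.
    dst_index refers to positions in the sequence after the cut.
    """
    clip = sequence[src_start:src_end]
    rest = sequence[:src_start] + sequence[src_end:]
    return rest[:dst_index] + clip + rest[dst_index:]

# "Live action" and "painted" frames are the same kind of data.
filmed = [make_frame(4, 2, v) for v in (10, 20, 30)]  # stands in for scanned footage
painted = make_frame(4, 2, 99)                        # stands in for a paint-program image
film = filmed + [painted]

# Re-order in time: move the painted frame to the front of the film.
film = cut_and_paste(film, 3, 4, 0)

# Re-order in space: move the last two pixels of a row to its front.
row = [1, 2, 3, 4]
row = cut_and_paste(row, 2, 4, 0)
```

Nothing in cut_and_paste refers to time, space, scale or the origin of the data, which is the sense in which these operations become the same, conceptually and practically.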
Given the preceding principles, we can define digital film in this way:
digital film = live action material + painting + image processing +
compositing + 2D computer animation + 3D computer animation
Live action material can either be recorded on film or video or directly in a digital
format.356 Painting, image processing and computer animation refer to the
processes of modifying already existent images as well as creating new ones. In
fact, the very distinction between creation and modification, so clear in film-based
media (shooting versus darkroom processes in photography, production versus
post-production in cinema) no longer applies to digital cinema, since each image,
regardless of its origin, goes through a number of programs before making it to
the final film.357
Let us summarize these principles. Live action footage is now only raw
material to be manipulated by hand: animated, combined with 3D computer
generated scenes and painted over. The final images are constructed manually
from different elements; and all the elements are either created entirely from
scratch or modified by hand. Now we can finally answer the question "what is
digital cinema?" Digital cinema is a particular case of animation which uses live
action footage as one of its many elements.
This can be re-read in view of the history of the moving image sketched
earlier. Manual construction and animation of images gave birth to cinema and
slipped into the margins...only to re-appear as the foundation of digital cinema.
The history of the moving image thus makes a full circle. Born from animation,
cinema pushed animation to its boundary, only to become one particular case of
animation in the end.
The relationship between "normal" filmmaking and special effects is
similarly reversed. Special effects, which involved human intervention into
machine recorded footage and which were therefore delegated to cinema's
periphery throughout its history, become the norm of digital filmmaking.
The same logic applies for the relationship between production and post-
production. Cinema traditionally involved arranging physical reality to be filmed
through the use of sets, models, art direction, cinematography, etc. Occasional
manipulation of recorded film (for instance, through optical printing) was
negligible compared to the extensive manipulation of reality in front of a camera.
In digital filmmaking, shot footage is no longer the final point but just raw
material to be manipulated in a computer where the real construction of a scene
will take place. In short, the production becomes just the first stage of post-
production.
The following example illustrates this new relationship between different
stages of the filmmaking process. Traditional on-set filming for Star Wars:
Episode 1 — The Phantom Menace (George Lucas, 1999) was done in just 65
days. The post-production, however, stretched over two years, since ninety-five
percent of the film (approximately 2,000 shots out of the total 2,200) was
constructed on a computer.358
Here are two more examples to further illustrate the shift from re-
arranging reality to re-arranging its images. From the analog era: for a scene in
Zabriskie Point (1970), Michelangelo Antonioni, trying to achieve a particularly
saturated color, ordered a field of grass to be painted. From the digital era: to
create the launch sequence in Apollo 13 (Universal Studios, 1995; special
effects by Digital Domain), the crew shot footage at the original location of the
launch at Cape Canaveral. The artists at Digital Domain scanned the film and
altered it on computer workstations, removing recent building construction,
adding grass to the launch pad and painting the skies to make them more
dramatic. This altered film was then mapped onto 3D planes to create a virtual set
which was animated to match a 180-degree dolly movement of a camera
following a rising rocket.359
The last example brings us to another conceptualization of digital cinema
— as painting. In his book-length study of digital photography, William J.
Mitchell focuses our attention on what he calls the inherent mutability of a digital
image: "The essential characteristic of digital information is that it can be
manipulated easily and very rapidly by computer. It is simply a matter of
substituting new digits for old... Computational tools for transforming,
combining, altering, and analyzing images are as essential to the digital artist as
brushes and pigments to a painter."360 As Mitchell points out, this inherent
mutability erases the difference between a photograph and a painting. Since a film
is a series of photographs, it is appropriate to extend Mitchell's argument to digital
film. With an artist being able to easily manipulate digitized footage either as a
whole or frame by frame, a film in a general sense becomes a series of
paintings.361
Hand-painting digitized film frames, made possible by a computer, is
probably the most dramatic example of the new status of cinema. No longer
strictly locked in the photographic, it opens itself towards the painterly. It is also
the most obvious example of the return of cinema to its nineteenth century origins
— in this case, to hand-crafted images of magic lantern slides, the
Phenakistiscope, the Zootrope.
We usually think of computerization as automation, but here the result is
the reverse: what was previously automatically recorded by a camera now has to
be painted one frame at a time. But not just a dozen images, as in the nineteenth
century, but thousands and thousands. We can draw another parallel with the
practice, common in the early days of silent cinema, of manually tinting film
frames in different colors according to a scene's mood.362 Today, some of the
most visually sophisticated digital effects are often achieved using the same
simple method: painstakingly altering by hand thousands of frames. The frames
are painted over either to create mattes ("hand drawn matte extraction") or to
directly change the images, as, for instance, in Forrest Gump, where President
Kennedy was made to speak new sentences by altering the shape of his lips, one
frame at a time.363 In principle, given enough time and money, one can create
what will be the ultimate digital film: 90 minutes, i.e., 129,600 frames completely
painted by hand from scratch, but indistinguishable in appearance from live
photography.
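The frame count follows from simple arithmetic: 90 minutes times 60 seconds times 24 frames per second. A minimal sketch, assuming the standard sound-film rate of 24 frames per second; the function names and the minutes_per_frame figure are hypothetical, chosen only to give a sense of the scale of labor involved.

```python
def frame_count(minutes, fps=24):
    """Number of frames in a film of the given length at a fixed frame rate."""
    return minutes * 60 * fps

def painting_hours(minutes, minutes_per_frame, fps=24):
    """Total labor in hours if every frame takes minutes_per_frame to paint.

    minutes_per_frame is a hypothetical figure supplied by the caller,
    not a number taken from any actual production.
    """
    return frame_count(minutes, fps) * minutes_per_frame / 60

frames = frame_count(90)                          # 90 * 60 * 24
hours = painting_hours(90, minutes_per_frame=30)  # assumed half an hour per frame
```

Under the assumed half hour per frame, the total comes to tens of thousands of hours, which is why the text qualifies the idea with "given enough time and money."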
The concept of digital cinema as painting can be also developed in a
different way. I would like to compare the shift from analog to digital filmmaking
to the shift from fresco and tempera to oil painting in the early Renaissance. A painter
making fresco has limited time before the paint dries; and once it is dried, no
further changes to the image are possible. Similarly, a traditional filmmaker has
limited means to modify images once they are recorded on film. Medieval tempera
painting, in turn, can be compared to the practice of special effects
during the analog period of cinema. A painter working with tempera could modify
and re-work the image, but the process was quite painstaking and slow. Medieval
and early Renaissance masters would spend up to six months on a painting a few
inches tall. The switch to oils greatly liberated painters by allowing them to
quickly create much larger compositions (think, for instance, of the works by
Veronese and Titian) as well as to modify them as long as necessary. This change
in painting technology led the Renaissance painters to create new kinds of
compositions, new pictorial space and even narratives. Similarly, by allowing a
filmmaker to treat a film image as an oil painting, digital technology redefines
what can be done with cinema.
If digital compositing and digital painting can be thought of as an
extension of cel animation techniques (since composited images are stacked
in depth parallel to each other, as cels on an animation stand), the newer method of
computer-based post-production makes filmmaking a subset of animation in a
different way. In this method the live action, photographic stills and/or graphic
elements are positioned in a 3D virtual space. This gives the director the ability to
freely move the virtual camera through this space, dollying and panning. Thus
cinematography is subordinated to 3D computer animation. We may think of this
method as an extension of the multiplane animation camera. However, if the camera
mounted over a multiplane stand could only move perpendicular to the images,
now it can move in an arbitrary trajectory. An example of a commercial film
which relies on this newer method, which one day may become the standard of
filmmaking (because it gives the director the most flexibility), is Disney’s Aladdin; an
example of an independent work which fully explores the new aesthetic
possibilities of this method without subordinating it to the traditional cinematic
realism is The Forest by Tamas Waliczky (1994).
In discussing digital compositing in the “Compositing” section I pointed out
that it can be thought of as an intermediary step from 2D images to 3D computer
representation. The newer post-production method represents the next logical step
towards 100% 3D computer generated scenes. Instead of the 2D space of a “traditional”
composite, we now have the layers of moving images positioned in a virtual 3D
space.
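The difference between a flat composite and layers placed in a virtual 3D space can be sketched with an elementary pinhole projection. This is only an illustration under stated assumptions: the Layer class, the project_x function, the layer names and all the coordinates are hypothetical and are not drawn from any particular compositing package. Each layer is reduced to the horizontal position of its center and its depth; the virtual camera is a point that can be moved freely.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    x: float  # horizontal position of the layer's center in the scene
    z: float  # depth of the layer in the scene

def project_x(layer, cam_x, cam_z, focal_length=1.0):
    """Pinhole projection of a layer's center onto the camera's image plane."""
    depth = layer.z - cam_z
    if depth <= 0:
        raise ValueError("layer is behind the camera")
    return focal_length * (layer.x - cam_x) / depth

# Two hypothetical layers: a near live action element and a distant painting.
layers = [Layer("actor", 0.0, 2.0), Layer("matte painting", 0.0, 10.0)]

# Move the virtual camera one unit sideways and measure how far each
# layer's image shifts on screen.
shifts = {
    layer.name: project_x(layer, 1.0, 0.0) - project_x(layer, 0.0, 0.0)
    for layer in layers
}
```

The near layer shifts on screen more than the distant one, so a sideways camera move produces parallax between the layers. A flat 2D composite has no such depth relation between its elements.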
The reader who followed my analysis of the new possibilities of digital
cinema may wonder why I have stressed the parallels between digital cinema and
the pre-cinematic techniques of the nineteenth century but did not mention
twentieth century avant-garde filmmaking. Did not the avant-garde filmmakers
already explore many of these new possibilities? To take the notion of cinema as
painting, Len Lye, one of the pioneers of abstract animation, was painting directly
on film as early as 1935; he was followed by Norman McLaren and Stan
Brakhage, the latter extensively covering shot footage with dots, scratches,
splattered paint, smears and lines in an attempt to turn his films into equivalents
of Abstract Expressionist painting. More generally, one of the major impulses in
all of avant-garde filmmaking, from Leger to Godard, was to combine the
cinematic, the painterly and the graphic — by using live action footage and
animation within one film or even a single frame, by altering this footage in a
variety of ways, or by juxtaposing printed texts and filmed images.
When the avant-garde filmmakers collaged multiple images within a
single frame, or painted and scratched film, or revolted against the indexical
identity of cinema in other ways, they were working against "normal" filmmaking
procedures and the intended uses of film technology. (Film stock was not
designed to be painted on.) Thus they operated on the periphery of commercial
cinema not only aesthetically but also technically.
One general effect of the digital revolution is that avant-garde aesthetic
strategies became embedded in the commands and interface metaphors of
computer software.364 In short, the avant-garde became materialized in a
computer. Digital cinema technology is a case in point. The avant-garde strategy
of collage reemerged as a "cut and paste" command, the most basic operation one
can perform on digital data. The idea of painting on film became embedded in
paint functions of film editing software. The avant-garde move to combine
animation, printed texts and live action footage is repeated in the convergence of
animation, title generation, paint, compositing and editing systems into single all-
in-one packages. Finally, another move to combine a number of film images
together within one frame (for instance, in Leger's 1924 Ballet Mécanique or in
A Man with a Movie Camera) also became legitimized by technology, since all
editing software, including Photoshop, Premiere, After Effects, Flame, and
Cineon, by default assumes that a digital image consists of a number of separate
image layers. All in all, what used to be exceptions for traditional cinema became
the normal, intended techniques of digital filmmaking, embedded in technology
design itself.365
From Kino-Eye to Kino-Brush
In the twentieth century, cinema has played two roles at once. As a media
technology, cinema's role was to capture and to store visible reality. The difficulty
of modifying images once they were recorded was exactly what gave cinema its
value as a document, assuring its authenticity. The same rigidity of the film image
has defined the limits of cinema as I defined it earlier, i.e. the super-genre of live
action narrative. Although it includes within itself a variety of styles — the result
of the efforts of many directors, designers and cinematographers — these styles
share a strong family resemblance. They are all children of the recording process
which uses lenses, regular sampling of time and photographic media. They are all
children of a machine vision.
The mutability of digital data impairs the value of cinema recordings as
documents of reality. In retrospect, we can see that twentieth century cinema's
regime of visual realism, the result of automatically recording visual reality, was
only an exception, an isolated accident in the history of visual representation
which has always involved, and now again involves the manual construction of
images. Cinema becomes a particular branch of painting — painting in time. No
longer a kino-eye, but a kino-brush.366
The privileged role played by the manual construction of images in digital
cinema is one example of a larger trend: the return of pre-cinematic moving
image techniques. Marginalized by the twentieth century institution of live action
narrative cinema which relegated them to the realms of animation and special
effects, these techniques reemerge as the foundation of digital filmmaking. What
was supplemental to cinema becomes its norm; what was at its boundaries comes
into the center. Computer media returns to us the repressed of the cinema.
As the examples discussed in this section suggest, the directions which
were closed off at the turn of the century when cinema came to dominate the
modern moving image culture are now again beginning to be explored. Moving
image culture is being redefined once again; cinematic realism is being
displaced from being its dominant mode to become only one option among many.
New Language of Cinema
Cinematic and Graphic: Cinegratography
3D animation, compositing, mapping, paint retouching: in commercial cinema,
these radical new techniques are mostly used to solve technical problems while
traditional cinematic language is preserved unchanged. Frames are hand-painted
to remove wires which supported an actor during shooting; a flock of birds is
added to a landscape; a city street is filled with crowds of simulated extras.
Although most Hollywood releases now involve digitally manipulated scenes, the
use of computers is always carefully hidden.367
Appropriately, in Hollywood the practice of simulating traditional film language
received the name “invisible effects,” defined as “computer-enhanced scenes that
fool the audience into believing the shots were produced with live actors on
location, but are really composed of a mélange of digital and live action
footage.”368
Commercial narrative cinema still continues to hold on to the classical
realist style where images function as un-retouched photographic records of some
events which took place in front of the camera. So when Hollywood cinema uses
computers to create fantastic, impossible reality, this is done through the
introduction of various non-human characters such as aliens, mutants and robots.
We never notice the pure arbitrariness of their colorful and mutating bodies, of the
beams of energy emanating from their eyes, of the whirlpools of particles
emanating from their wings, because they are made perceptually consistent with
the set, i.e. they look like something which could have existed in a three-
dimensional space and therefore could have been photographed.
But how do filmmakers motivate turning familiar reality such as a human
body or a landscape into something physically impossible in our world? Such
transformations are motivated by the movie’s narrative. The shiny metallic body
of Terminator in Terminator 2 is possible because the Terminator is a cyborg sent
from the future; the rubber-like body of Jim Carrey in The Mask (Russell, 1994)
is possible because his character wears a mask with magical powers. Similarly, in
What Dreams May Come (PolyGram Filmed Entertainment, Ward, special effects
by Mass.Illusions and others, 1998) the fantastic landscape made of swirling
brushstrokes where the main hero is transported after his death is motivated by the
unique status of this location.
While embracing computers as a productivity tool, cinema refuses to give
up its unique cinema-effect, an effect which, according to film theorist Christian
Metz's penetrating analysis made in the 1970s, depends upon narrative form, the
reality effect and cinema's architectural arrangement all working together.369
Towards the end of his essay, Metz wonders whether in the future non-narrative
films may become more numerous; if this happens, he suggests that cinema will
no longer need to manufacture its reality effect. Electronic and digital media have
already brought about this transformation. Beginning in the 1980s, new cinematic
forms have emerged which are not linear narratives, which are exhibited on a
television or a computer screen, rather than in a movie theater — and which
simultaneously give up cinematic realism.
What are these forms? First of all, there is the music video. Probably not
by accident, the genre of music video came into existence exactly at the time
when electronic video effects devices were entering editing studios. Importantly,
just as music videos often incorporate narratives within them, but are not linear
narratives from start to finish, they rely on film (or video) images, but change
them beyond the norms of traditional cinematic realism. The manipulation of
images through hand-painting and image processing, hidden in Hollywood
cinema, is brought into the open on a television screen. Similarly, the construction
of an image from heterogeneous sources is not subordinated to the goal of
photorealism but functions as an aesthetic strategy. The genre of music video has
been a laboratory for exploring numerous new possibilities of manipulating
photographic images made possible by computers — the numerous points which
exist in the space between the 2D and the 3D, cinematography and painting,
photographic realism and collage. In short, it is a living and constantly expanding
textbook for digital cinema.
A detailed analysis of the evolution of music video imagery (or, more
generally, broadcast graphics in the electronic age) deserves a separate treatment
and I will not try to take it up here. Instead, I will discuss another new cinematic
non-narrative form, CD-ROM-based games, which, in contrast to music video,
relied on the computer for storage and distribution from the very beginning. And,
unlike music video designers who were consciously pushing traditional film or
video images into something new, the designers of CD-ROMs arrived at a new
visual language unintentionally while attempting to emulate traditional cinema.
In the late 1980s, Apple began to promote the concept of computer
multimedia; and in 1991 it released QuickTime software to enable an ordinary
personal computer to play movies. However, for the next few years the computer
did not perform its new role very well. First, CD-ROMs could not hold anything
close to the length of a standard theatrical film. Secondly, the computer would not
smoothly play a movie larger than the size of a stamp. Finally, the movies had to
be compressed, degrading their visual appearance. Only in the case of still images
was the computer able to display photographic-like detail at full screen size.
Because of these particular hardware limitations, the designers of CD-
ROMs had to invent a different kind of cinematic language in which a range of
strategies, such as discrete motion, loops, and superimposition, previously used in
nineteenth century moving image presentations, in twentieth century animation,
and in the avant-garde tradition of graphic cinema, were applied to photographic
or synthetic images. This language synthesized cinematic illusionism and the
aesthetics of graphic collage, with its characteristic heterogeneity and
discontinuity. The photographic and the graphic, divorced when cinema and
animation went their separate ways, met again on a computer screen.
The graphic also met the cinematic. The designers of CD-ROMs were
aware of the techniques of twentieth century cinematography and film editing, but
they had to adapt these techniques both to an interactive format and to hardware
limitations. As a result, the techniques of modern cinema and of the nineteenth
century moving image have merged in a new hybrid language which can be called
“cinegratography.”
We can trace the development of this language by analyzing a few well-
known CD-ROM titles. The best selling game Myst (Broderbund, 1993) unfolds
its narrative strictly through still images, a practice which takes us back to magic
lantern shows (and to Chris Marker's La Jetée).370 But in other ways Myst relies
on the techniques of twentieth century cinema. For instance, the CD-ROM uses
simulated camera turns to switch from one image to the next. It also employs the
basic technique of film editing to subjectively speed up or slow down time. In the
course of the game, the user moves around a fictional island by clicking on a
mouse. Each click advances a virtual camera forward, revealing a new view of a
3D environment. When the user begins to descend into the underground
chambers, the spatial distance between the points of view of each two consecutive
views sharply decreases. If before the user was able to cross a whole island with
just a few clicks, now it takes a dozen clicks to get to the bottom of the stairs! In
other words, just as in traditional cinema, Myst slows down time to create
suspense and tension.
In Myst, miniature animations are sometimes embedded within the still
images. In the next best-selling CD-ROM 7th Guest (Virgin Games, 1993), the
user is presented with video clips of live actors superimposed over static
backgrounds created with 3D computer graphics. The clips are looped, and the
moving human figures clearly stand out against the backgrounds. Both of these
features connect the visual language of 7th Guest to nineteenth century pro-
cinematic devices and twentieth century cartoons rather than to cinematic
verisimilitude. But like Myst, 7th Guest also evokes distinctly modern cinematic
codes. The environment where all action takes place (an interior of a house) is
rendered using a wide angle lens; to move from one view to the next a camera
follows a complex curve, as though mounted on a virtual dolly.
Next, consider the CD-ROM Johnny Mnemonic (Sony Imagesoft, 1995).
Produced to complement the fiction film of the same title, marketed not as a
"game" but as an "interactive movie," and featuring full screen video throughout,
it comes closer to cinematic realism than the previous CD-ROMs — yet it is still
quite distinct from it. With all action shot against a green screen and then
composited with graphic backgrounds, its visual style exists within a space
between cinema and collage.
It would not be entirely inappropriate to read this short history of the
digital moving image as a teleological development which replays the emergence
of cinema a hundred years earlier. Indeed, as computers' speed keeps increasing,
the CD-ROM designers have been able to go from a slide show format to the
superimposition of small moving elements over static backgrounds and finally to
full-frame moving images. This evolution repeats the nineteenth century
progression: from sequences of still images (magic lantern slide presentations) to
moving characters over static backgrounds (for instance, in Reynaud's
Praxinoscope Theater) to full motion (the Lumieres' cinematograph). Moreover,
the introduction of QuickTime in 1991 can be compared to the introduction of the
Kinetoscope in 1892: both were used to present short loops, both featured
images approximately two by three inches in size, both called for private viewing
rather than collective exhibition. The two technologies appear to play a similar
cultural role. If in the early 1890s the public patronized Kinetoscope parlors
where peep-hole machines presented them with the latest marvel — tiny moving
photographs arranged in short loops, exactly a hundred years later computer
users were equally fascinated with tiny QuickTime movies which turned a
computer into a film projector, however imperfect.371 Finally, the Lumieres' first
film screenings of 1895 which shocked their audiences with huge moving images
found their parallel in 1995 CD-ROM titles where the moving image finally fills
the entire computer screen (for instance, in Johnny Mnemonic). Thus, exactly a
hundred years after cinema was officially "born," it was reinvented on a computer
screen.
But this is only one reading. We no longer think of the history of cinema
as a linear march towards only one possible language, or as a progression towards
more and more accurate verisimilitude. Rather, we have come to see its history as
a succession of distinct and equally expressive languages, each with its own
aesthetic variables, each new language closing off some of the possibilities of the
previous one — a cultural logic not dissimilar to Kuhn's analysis of scientific
paradigms.372 Similarly, instead of dismissing visual strategies of early
multimedia titles as a result of technological limitations, we may want to think of
them as an alternative to traditional cinematic illusionism, as a beginning of
digital cinema's new language.
For the computer / entertainment industries, these strategies represent only
a temporary limitation, an annoying drawback that needs to be overcome. This is
one important difference between the situation at the end of the nineteenth and the
end of the twentieth centuries: if cinema was developing towards the still open
horizon of many possibilities, the development of commercial multimedia, and of
corresponding computer hardware (compression boards, storage formats such as
DVD), is driven by a clearly defined goal: the exact duplication of cinematic
realism. So if a computer screen, more and more, emulates cinema's screen, this
is not an accident but a result of conscious planning by the computer and
entertainment industry. But this drive to turn new media into a simulation of
classical film language, which parallels the encoding of cinema’s techniques in
software interfaces and hardware itself, described in the “Cultural Interfaces” section,
is just one direction for new media development among numerous others. I will
next examine a number of new media and old media objects which point towards
other possible trajectories.
New Temporality: Loop as a Narrative Engine
One of the underlying assumptions of this book is that by looking at the history of
visual culture and media, and in particular cinema, we can find many strategies
and techniques relevant to new media design. Put differently, in order to develop
new aesthetics of new media we should pay as much attention to cultural
history as to the computer’s unique new possibilities to generate, organize,
manipulate and distribute data.
As we scan through cultural history (which includes the history of new
media up until the time of research), three kinds of situations will be particularly
relevant for us:
• when an earlier interesting strategy or technique was abandoned or
forced into “underground” without fully developing its potential;
• when an earlier strategy can be understood as a response to
technological constraints (I am using this more technical term on
purpose instead of the more ideologically loaded “limitations”) similar to
the constraints of new media;
• when an earlier strategy was used in a situation similar to a particular
situation faced by new media designers. For instance, montage was a
strategy to deal with the modularity of a film (how do you join separate
shots?) as well as with the problem of coordinating different media types
such as images and sound. Both of these situations are being faced
once again today by new media designers.
I already used these principles in discussing the parallels between nineteenth
century pro-cinematic techniques and the language of new media; they also
guided me in thinking about animation (the “underground” of 20th century
cinema) as the basis for digital cinema’s new language. I will now use a particular
parallel between early cinematic and new media technology to highlight another
older technique useful to new media: the loop. Characteristically, many new media
products, be they cultural objects (such as games) or software (various media players
such as QuickTime Player), use loops in their design while treating them as
temporary technological limitations. I, however, want to think about the loop as a source
of new possibilities for new media.373
As already mentioned in the previous section, all nineteenth century pro-
cinematic devices, up to Edison's Kinetoscope, were based on short loops. As "the
seventh art" began to mature, it banished the loop to the low-art realms of the
instructional film, the pornographic peep-show and the animated cartoon. In
contrast, narrative cinema has avoided repetitions; like modern Western fictional
forms in general, it put forward a notion of human existence as a linear
progression through numerous unique events.
Cinema's birth from a loop form was reenacted at least once during its
history. In one of the sequences of A Man with a Movie Camera, Vertov shows
us a cameraman standing in the back of a moving automobile. As he is being
carried forward by an automobile, he cranks the handle of his camera. A loop, a
repetition, created by the circular movement of the handle, gives birth to a
progression of events — a very basic narrative which is also quintessentially
modern: a camera moving through space recording whatever is in its way. In what
seems to be a reference to cinema's primal scene, these shots are intercut with the
shots of a moving train. Vertov even re-stages the terror which the Lumieres' film
supposedly provoked in its audience; he positions his camera right along the train
track so the train runs over our point of view a number of times, crushing us again
and again.
Early digital movies shared the same limitations of storage as nineteenth
century pro-cinematic devices. This is probably why the loop playback function
was built into the QuickTime interface, thus giving it the same weight as the
VCR-style "play forward" function. So, in contrast to films and videotapes, QuickTime
movies were supposed to be played forward, backward or looped. Computer
games also heavily relied on loops. Since it was not possible to animate in real
time every character, the designers stored short loops of character’s motion — for
instance, an enemy soldier or a monster walking back and forth — which would
be recalled at the appropriate times in the game. Internet pornography also heavily
relied on loops. Many sites featured numerous “channels” which were supposed
to stream either feature-length films or “live feeds”; in reality they would
usually play short loops (a minute or so) over and over. Sometimes a few films
would be cut into a number of short loops which would become the content of 100,
500 or 1000 channels.374
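The stored character loops mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from any actual game: the frame names and the functions `frame_at` and `play` are hypothetical. A short list of frames is kept in memory and replayed for as long as needed by wrapping the game clock around the length of the loop.

```python
# Hypothetical frame names for a stored walk cycle (not from any real game).
WALK_CYCLE = ["step_left", "pass", "step_right", "pass"]


def frame_at(tick, frames):
    """Return the frame shown at a given tick; the modulo makes the loop wrap."""
    return frames[tick % len(frames)]


def play(frames, ticks):
    """Return the frames displayed over a number of consecutive ticks."""
    return [frame_at(t, frames) for t in range(ticks)]


if __name__ == "__main__":
    # Four stored frames are enough to fill any amount of screen time.
    print(play(WALK_CYCLE, 10))
```

The point of the design is economy: the memory cost is fixed by the length of the loop, not by how long the character stays on screen.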
The history of new media tells us that hardware limitations never go
away: they disappear in one area only to come back in another. One example of
this which I already noted is the hardware limitations of the 1980s in the area of
3D computer animation. In the 1990s they returned in a new area: Internet-based
real-time virtual worlds. What used to be the slow speed of CPUs became
slow bandwidth. As a result the 1990s VRML worlds look like the pre-rendered
animations done ten years earlier.
A similar logic applies to loops. Earlier QuickTime movies and
computer games heavily relied on loops. As the CPU speed increased and larger
storage media such as CD-ROM and DVD became available, the use of loops in
stand-alone hypermedia declined. However, online virtual worlds such as Active
Worlds came to use loops extensively, as they provide a cheap (in terms of
bandwidth and computation) way of adding some signs of “life” to their
geometric-looking environments.375 Similarly, we may expect that when digital
videos appear on small displays in our cellular phones, personal organizers
such as the Palm Pilot or other wireless communication devices, they will once again
be arranged in short loops because of bandwidth, storage, or CPU limitations.
Can the loop be a new narrative form appropriate for the computer
age?376 It is relevant to recall that the loop gave birth not only to cinema but also
to computer programming. Programming involves altering the linear flow of data
through control structures, such as "if/then" and "repeat/while"; the loop is the
most elementary of these control structures. Most computer programs are based
on repetitions of a set number of steps; this repetition is controlled by the
program’s main loop. So if we strip the computer of its usual interface and
follow the execution of a typical computer program, the computer will reveal
itself to be another version of Ford's factory, with a loop as its conveyer belt.
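To make the comparison concrete, here is a minimal sketch in Python of a program organized around a main loop. It is an illustration only, and the names `run` and `tasks` are hypothetical: the same fixed set of steps is repeated for each item on the "conveyor belt," with a "repeat/while" structure controlling the repetition and an "if/then" structure altering the flow inside it.

```python
def run(tasks):
    """Process a queue of numbers with one main loop and return a log of results."""
    queue = list(tasks)   # copy, so the caller's list is left untouched
    log = []
    while queue:          # "repeat/while": the main loop
        item = queue.pop(0)            # step 1: take the next item off the belt
        if item < 0:                   # "if/then": alter the flow
            log.append(("skipped", item))
            continue
        log.append(("done", item * 2))  # step 2: perform the same operation
    return log


if __name__ == "__main__":
    print(run([1, -2, 3]))
```

Although each pass through the loop repeats the same steps, the log grows from start to end, so repetition and sequential progression coexist in the same program.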
As the practice of computer programming illustrates, the loop and the
sequential progression do not have to be thought of as mutually exclusive. A
computer program progresses from start to end by executing a series of loops.
Another illustration of how these two temporal forms can work together is
Möbius House by Dutch team UN Studio/Van Berkel & Bos.377 In this house a
number of functionally different areas are arranged one after another in the form
of a Möbius strip, thus forming a loop. As the narrative of the day progresses
from one activity to the next, the inhabitants move from area to area.
Traditional cel animation similarly combines a narrative and a loop. In
order to save labor, animators arrange many actions, such as movements of
characters’ legs, eyes and arms, into short loops and repeat them over and over.
Thus, as already discussed in the previous section, in a typical twentieth century
cartoon a large proportion of motions involves loops. This principle is taken to the
extreme in Rybczynski’s Tango. Subjecting live action footage to the logic of
animation, Rybczynski arranges the trajectory of every character through space as
a loop. These loops are further composited together resulting in a complex and
intricate time-based structure. At the same time, the overall “shape” of this
structure is governed by a number of narratives. The film begins in an empty
room; next the loops of characters’ trajectories through this room are added, one
by one. The end of the film mirrors its beginning as the loops are “deleted” in
reverse order, also one by one. This metaphor for the progression of a human life
(we are born alone, gradually form relations with other humans, and eventually
die alone) is also supported by another narrative: the first character to appear in
the room is a young boy, the last one is an old woman.
The concept of a loop as an “engine” which puts the narrative in motion
becomes the foundation of a brilliant interactive TV program, Akvaario (Aquarium),
by a number of graduate students at Helsinki’s University of Art and Design
(Professor and Media Lab coordinator: Minna Tarkka).378 In contrast to many
new media objects which combine the conventions of cinema, print and HCI,
Akvaario aims to preserve the continuous flow of traditional cinema, while adding
interactivity to it. Along with an earlier game, Johnny Mnemonic (Sony, 1995), as
well as the pioneering interactive laserdisc computer installations by Graham
Weinbren done in the 1980s, this project is a rare example of a new media
narrative which does not rely on the oscillation between non-interactive and
interactive segments (see the “Illusion, Narrative and Interactivity” section for the
analysis of this temporal oscillation).
Using the already familiar convention of games such as Tamagotchi
(1996-), the program asks TV viewers to “take charge” of a fictional human
character.379 Most shots which we see show this character engaged in different
activities in his apartment: eating dinner, reading a book, staring into space. The
shots replace each other following standard conventions of film and TV editing.
The result is something which looks at first like a conventional, although very
long, movie (the program was projected to run for three hours every day over the
course of a few months), even though the shots are selected in real time by a
computer program from a database of a few hundred different shots.
By choosing one of the four buttons which are always present on the
bottom of the screen, the viewers control the character’s motivation. When a button is
pressed, a computer program selects a sequence of particular shots to follow the
shot currently playing. Because of the visual, spatial and referential discontinuity
between shots typical of standard editing, the result is something which the
viewer interprets as a conventional narrative. A film or television viewer
does not expect that any two shots which follow one another have to display the
same space or subsequent moments of time. Therefore in Akvaario a computer
program can “weave” an endless narrative by choosing from a database of
different shots. What gives the resulting “narrative” sufficient continuity is that
almost all shots show the same character.
Akvaario is one of the first examples of what in the previous chapter I called a
“database narrative.” It is, in other words, a narrative which fully utilizes many
features of the database organization of data. It relies on our abilities to classify
database records according to different dimensions, to sort through records, to
quickly retrieve any record, as well as to “stream” a number of different records
continuously one after another.
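The four database operations just listed (classifying, sorting, retrieving, and streaming records) can be illustrated with a small Python sketch. Everything in it is an assumption made for the sake of the example: the `Shot` record, its fields, and the sample data are hypothetical and do not describe how Akvaario was actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shot:
    shot_id: int
    activity: str  # what the character is doing
    mood: str      # a second, hypothetical dimension of classification
    seconds: int   # running time of the shot


SHOTS = [
    Shot(1, "eating", "calm", 8),
    Shot(2, "reading", "calm", 12),
    Shot(3, "eating", "restless", 5),
    Shot(4, "staring", "restless", 20),
]


def classify(shots, dimension):
    """Group shot ids by the value of one field (one 'dimension')."""
    groups = {}
    for shot in shots:
        groups.setdefault(getattr(shot, dimension), []).append(shot.shot_id)
    return groups


def sort_by(shots, dimension):
    """Return shot ids ordered by one field."""
    return [s.shot_id for s in sorted(shots, key=lambda s: getattr(s, dimension))]


def retrieve(shots, shot_id):
    """Fetch a single record by its id."""
    return next(s for s in shots if s.shot_id == shot_id)


def stream(shots, shot_ids):
    """Yield records one after another, in the requested order."""
    for shot_id in shot_ids:
        yield retrieve(shots, shot_id).activity


if __name__ == "__main__":
    print(classify(SHOTS, "activity"))
    print(sort_by(SHOTS, "seconds"))
    print(list(stream(SHOTS, [2, 4, 1])))
```

The same collection of records supports many different orderings; a narrative in this sense is one particular stream drawn from the database.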
In Akvaario the loop becomes the way to bridge linear narrative and
interactive control. When the program begins, a few shots keep following each
other in a loop. After users choose the character’s motivation by pressing a button,
this loop becomes a narrative. Shots stop repeating and a sequence of new shots is
displayed. If no button is pressed again, the narrative turns back into a loop, i.e. a
few shots start repeating over and over. In Akvaario a narrative is born from a
loop and it returns to a loop. The historical birth of modern fictional cinema
out of the loop returns as a condition of cinema’s rebirth as an interactive form.
Rather than being an archaic leftover, a reject from cinema’s evolution, the use of
the loop in Akvaario suggests a new temporal aesthetics for computer-based cinema.
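The alternation between loop and narrative described above can be sketched as a very small state machine. The following Python code is a hedged illustration, not a reconstruction of the Akvaario software: the shot names, the motivations, and the function `playback` are all hypothetical. A few idle shots repeat until a button press queues a sequence of new shots; once that sequence is exhausted, playback falls back into the loop.

```python
# Hypothetical data: a short idle loop and one shot sequence per motivation.
IDLE_LOOP = ["idle_1", "idle_2", "idle_3"]
SEQUENCES = {
    "hungry": ["walk_to_kitchen", "eat_dinner"],
    "bored": ["pick_up_book", "read_book"],
}


def playback(presses, length):
    """Return the shots shown over `length` steps.

    `presses` maps a step number to the motivation chosen at that step.
    """
    shown = []
    queue = []     # narrative shots waiting to be displayed
    idle_pos = 0   # current position inside the idle loop
    for step in range(length):
        if step in presses:
            queue = list(SEQUENCES[presses[step]])  # loop becomes narrative
            idle_pos = 0
        if queue:
            shown.append(queue.pop(0))
        else:
            shown.append(IDLE_LOOP[idle_pos % len(IDLE_LOOP)])  # back to loop
            idle_pos += 1
    return shown


if __name__ == "__main__":
    print(playback({4: "hungry"}, 9))
```

In this sketch the loop is the default state of the system and the narrative is an interruption of it, which is one way of reading the claim that the narrative is born from a loop and returns to it.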
Jean-Louis Boissier's Flora petrinsularis realizes some of the possibilities
contained in the loop form in a different way.380 This CD-ROM is based on
Rousseau's Confessions. It opens with a white screen, containing a numbered list.
Clicking on each item leads us to a screen containing two windows, positioned
side by side. Both windows show the same video loop made from a few different
shots. The two loops are offset from each other in time. Thus, the images
appearing in the left window reappear in a moment on the right and vice versa, as
though an invisible wave is running through the screen. This wave soon becomes
materialized: when we click inside the windows we are taken to a new screen
which also contains two windows, each showing a loop of a rhythmically vibrating
water surface. The loops of water surfaces can be thought of as two sine waves
offset in phase. This structure, then, functions as a “meta-text” of the structure in
the first screen. In other words, the loops of water surface act as a diagram of the
loop structure which controls the correlations between shots in the first screen,
similar to how Marey and the Gibsons diagrammed human motion in their film
studies at the beginning of the twentieth century.
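The two-window structure can be modeled with a short Python sketch. It is offered as an assumption-laden illustration rather than a description of Boissier's actual implementation: the shot labels, the one-step offset, and the function names are hypothetical. Both windows read from the same loop, but the right window lags behind the left, so each image reappears on the right a moment after it was seen on the left; the second function expresses the same idea as two sine waves offset in phase.

```python
import math

LOOP = ["A", "B", "C", "D"]  # hypothetical labels for the shots in the loop


def windows_at(tick, offset=1):
    """Return (left, right) images; the right window lags by `offset` ticks."""
    left = LOOP[tick % len(LOOP)]
    right = LOOP[(tick - offset) % len(LOOP)]
    return left, right


def water(t, phase=0.0, period=4.0):
    """Height of an idealized water surface modeled as a sine wave."""
    return math.sin(2 * math.pi * t / period - phase)


if __name__ == "__main__":
    for tick in range(4):
        print(tick, windows_at(tick))
    # A quarter-period phase offset equals a delay of one time unit here.
    print(water(2.0, phase=math.pi / 2), water(1.0))
```

With a period of four time units, a phase offset of a quarter cycle is the same as a delay of one unit, which is why the water loops can serve as a diagram of the offset video loops.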
As each mouse click reveals another loop, the viewer becomes an editor,
but not in a traditional sense. Rather than constructing a singular narrative
sequence and discarding material which is not used, here the viewer brings to the
forefront, one by one, numerous layers of looped actions which seem to be taking
place all at once, a multitude of separate but co-existing temporalities. The viewer
is not cutting but re-shuffling. In a reversal of Vertov's sequence where a loop
generated a narrative, the viewer's attempt to create a story in Flora petrinsularis leads
to a loop.
It is useful to analyze the loop structure of Flora petrinsularis using
montage theory. From this perspective, the repetition of images in two adjoining
windows can be interpreted as an example of what Eisenstein called rhythmical
montage. At the same time, Boissier takes montage apart, so to speak. The shots
which in traditional temporal montage would follow each other in time here appear next
to each other in space. In addition, rather than being “hard-wired” by an editor in
only one possible structure, here the shots can appear in different combinations
since they are activated by a user moving a mouse across the windows.
At the same time, it is possible to find more traditional temporal montage
in this work as well: for instance, the move from the first screen, which shows a
close-up of a woman, to a second screen, which shows water surfaces, and back to
the first screen. This move can be interpreted as traditional parallel editing. In
cinema, parallel editing involves alternating between two subjects. For instance, a
chase sequence may go back and forth between the images of two cars, one
pursuing another. However, in our case the water images are always present
“underneath” the first set of images. So the logic here is again one of co-existence
rather than that of replacement, typical of cinema (see my discussion of spatial
montage below).
The loop which structures Flora petrinsularis on a number of levels
becomes a metaphor for human desire which can never achieve resolution. It can
also be read as a comment on cinematic realism. What are the minimal conditions
necessary to create the impression of reality? As Boissier demonstrates, in the
case of a field of grass, a close-up of a plant or a stream, just a few looped frames
become sufficient to produce the illusion of life and of linear time.
Steven Neale describes how early film demonstrated its authenticity by
representing moving nature: "What was lacking [in photographs] was the wind,
the very index of real, natural movement. Hence the obsessive contemporary
fascination, not just with movement, not just with scale, but also with waves and
sea spray, with smoke and spray."381 What for early cinema was its biggest pride
and achievement — a faithful documentation of nature's movement — becomes
for Boissier a subject of ironic and melancholic simulation. As the few frames are
looped over and over, we see blades of grass shifting slightly back and forth,
rhythmically responding to the blow of a non-existent wind which is almost
approximated by the noise of a computer reading data from a CD-ROM.
Something else is being simulated here as well, perhaps unintentionally.
As you watch the CD-ROM, the computer periodically staggers, unable to
maintain a consistent data rate. As a result, the images on the screen move in
uneven bursts, slowing and speeding up with human-like irregularity. It is as
though they are brought to life not by a digital machine but by a human operator,
cranking the handle of the Zootrope a century and a half ago...
Spatial Montage
Along with taking on a loop, Flora petrinsularis can also be seen as a step towards
what I will call a spatial montage. Instead of a traditional singular frame of
cinema, Boissier uses two images at once, positioned side by side. This can be
thought of as the simplest case of a spatial montage. In general, spatial montage would
involve a number of images, potentially of different sizes and proportions,
appearing on the screen at the same time. This by itself of course does not result
in montage; it is up to the filmmaker to construct a logic which drives which images
appear together, when they appear and what kind of relationships they enter with
each other.
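One way to picture such a logic is as a score which assigns each image a region of the screen and an interval of time. The Python sketch below is purely illustrative: the `Panel` record, its fields, and the sample values are hypothetical and are not drawn from any particular work. It answers the two questions raised above, namely which images appear together and when.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Panel:
    name: str
    x: int        # position and size of the image on the screen
    y: int
    w: int
    h: int
    start: float  # time (in seconds) at which the image appears
    end: float    # time at which it disappears


# Hypothetical score: three images of different sizes and durations.
SCORE = [
    Panel("woman", 0, 0, 320, 240, 0.0, 10.0),
    Panel("water", 320, 0, 320, 240, 2.0, 10.0),
    Panel("grass", 160, 240, 160, 120, 5.0, 8.0),
]


def visible_at(score, t):
    """Return the names of all images on screen at time t."""
    return [p.name for p in score if p.start <= t < p.end]


def share_screen(a, b):
    """True if two images are ever on screen at the same time."""
    return a.start < b.end and b.start < a.end


if __name__ == "__main__":
    for t in (1.0, 6.0, 9.0):
        print(t, visible_at(SCORE, t))
```

Temporal montage is the special case in which no two intervals overlap; spatial montage begins as soon as `visible_at` returns more than one image.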
Spatial montage represents an alternative to traditional cinematic temporal
montage, replacing its traditional sequential mode with a spatial one. Ford's
assembly line relied on the separation of the production process into a set of
repetitive, sequential, and simple activities. The same principle made computer
programming possible: a computer program breaks a task into a series of
elemental operations to be executed one at a time. Cinema followed this logic of
industrial production as well. It replaced all other modes of narration with a
sequential narrative, an assembly line of shots which appear on the screen one at a
time. A sequential narrative turned out to be particularly incompatible with a
spatial narrative which played a prominent role in European visual culture for
centuries. From Giotto's fresco cycle at Cappella degli Scrovegni in Padua to
Courbet's A Burial at Ornans, artists presented a multitude of separate events
within a single space, be it the fictional space of a painting or the physical space
which can be taken in by the viewer all at once. In the case of Giotto’s fresco cycle
and many other fresco and icon cycles, each narrative event is framed separately
but all of them can be viewed together in a single glance. In other cases, different
events are represented as taking place within a single pictorial space. Sometimes,
events which formed one narrative but were separated in time were depicted
within a single painting. More often, the painting’s subject became an excuse to
show a number of separate “micro-narratives” (for instance, works by
Hieronymus Bosch and Pieter Bruegel). All in all, in contrast to cinema's
sequential narrative, in spatial narrative all the "shots" were accessible to a viewer
at once. Like nineteenth century animation, spatial narrative did not disappear
completely in the 20th century; but just like animation, it came to be relegated to a
minor form of Western culture — comics.
It is not accidental that the marginalization of spatial narrative and the
privileging of the sequential mode of narration coincided with the rise of the historical
paradigm in the human sciences. Cultural geographer Edward Soja has argued that
the rise of history in the second half of the nineteenth century coincided with the
decline in spatial imagination and the spatial mode of social analysis.382
According to Soja, it is only in the last decades of the twentieth century that this
mode made a powerful comeback, as exemplified by the growing importance of
such concepts as “geopolitics” and “globalisation” as well as by the key role
the analysis of space played in theories of post-modernism. Indeed, although some of
the best thinkers of the twentieth century such as Freud, Panofsky and Foucault
were able to combine historical and spatial modes of analysis in their theories, they
probably represent an exception rather than the norm. The same holds for film
theory, which, from Eisenstein in the 1920s to Deleuze in the 1980s, focused on
temporal rather than spatial structures of film.
Twentieth century film practice has elaborated complex techniques of
montage between different images replacing each other in time; but the possibility
of what can be called "spatial montage" between simultaneously co-existing
images was not explored as systematically. (Thus cinema, too, was given to historical
imagination at the expense of the spatial one.) The notable exceptions include the
use of split screen by Abel Gance in Napoléon in the 1920s and also by the
American experimental filmmaker Stan VanDerBeek in the 1960s; also some
other works, or rather, events, of the 1960s “expanded cinema” movement, and,
last but not least, the legendary multi-image multimedia presentation shown in the
Czech Pavilion at the 1967 World Expo. Emil Radok’s Diapolyekran consisted
of 112 separate cubes. One hundred and sixty different images could be
projected onto each cube. Radok was able to “direct” each cube separately. To the
best of my knowledge, since this project nobody has tried again to create a spatial
montage of this complexity in any technology.
Traditional film and video technology was designed to completely fill a
screen with a single image; thus, to explore spatial montage a filmmaker had to
work “against” the technology. This in part explains why so few tried to do this.
But when, in the 1970s, the screen became a bit-mapped computer display, with
individual pixels corresponding to memory locations which can be dynamically
updated by a computer program, the one image/one screen logic was broken. Since
the Xerox PARC Alto workstation, GUIs have used multiple windows. It would be
logical to expect that cultural forms based on moving images will eventually adopt
similar conventions. In the 1990s some computer games such as GoldenEye
(Nintendo/Rare, 1997) already used multiple windows to present the same action
simultaneously from different viewpoints. We may expect that computer-based
cinema will eventually have to follow the same direction, especially when the
limitations of communication bandwidth disappear and the resolution of
displays significantly increases, from the typical 1-2K in 2000 to 4K, 8K or
beyond. I believe that the next generation of cinema (broadband cinema) will
add multiple windows to its language. When this happens, the tradition of spatial
narrative which twentieth-century cinema suppressed will re-emerge once again.
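The link between a bit-mapped display and multiple windows can be made concrete with a minimal sketch. It assumes a tiny 16 by 8 "screen" held in one flat list, so that each pixel corresponds to one memory location; the names make_screen, draw_window and render are hypothetical and do not describe the Alto or any actual windowing system.

```python
# Hypothetical sketch: a bit-mapped screen as one flat block of memory.
WIDTH, HEIGHT = 16, 8

def make_screen(width=WIDTH, height=HEIGHT, fill="."):
    """One memory cell per pixel, stored row by row."""
    return [fill] * (width * height)

def draw_window(screen, x, y, w, h, value, width=WIDTH):
    """Update only the memory cells that fall inside one window."""
    for row in range(y, y + h):
        for col in range(x, x + w):
            # Pixel (col, row) lives at address row * width + col.
            screen[row * width + col] = value

def render(screen, width=WIDTH):
    """Turn the flat memory back into rows of text for display."""
    return ["".join(screen[i:i + width]) for i in range(0, len(screen), width)]

screen = make_screen()
draw_window(screen, 0, 0, 8, 8, "A")   # left window
draw_window(screen, 8, 0, 8, 4, "B")   # top-right window
draw_window(screen, 8, 4, 8, 4, "C")   # bottom-right window
print("\n".join(render(screen)))
```

Because a program can rewrite any subset of cells, nothing forces the screen to hold a single image: three "windows" share it here, and each can be updated without touching the others.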
Looking back at visual culture and art of the previous centuries gives
many ideas for how spatial narrative can be further developed in a computer; but
what about spatial montage? In other words, what will happen if we combine two
different cultural traditions: informationally dense visual narratives of
Renaissance and Baroque painters with “attention demanding” shot juxtapositions
of twentieth century film directors? "My boyfriend came back from war!," a Web-
based work by the young Moscow artist Olga Lialina, can be read as an
exploration of this direction.383 Using the capability of HTML to create frames
within frames, Lialina leads us through a narrative which begins with a single
screen. This screen becomes progressively divided into more and more frames as
we follow different links. Throughout, an image of a human couple and of a
constantly blinking window remain on the left part of the screen. These two images
enter into new combinations with texts and images on the right part, which keep
changing as the user interacts with the work. As the narrative activates different
parts of the screen, montage in time gives way to montage in space. Put
differently, we can say that montage acquires a new spatial dimension. In addition
to montage dimensions already explored by cinema (differences in images'
content, composition, movement) we now have a new dimension: the position of
the images in space in relation to each other. In addition, as images do not replace
each other (as in cinema) but remain on the screen throughout the movie, each
new image is juxtaposed not just with one image which preceded it, but with all
the other images present on the screen.
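The progressive division of the screen, and the way each new image is juxtaposed with all the images already present, can be sketched as a tree of frames. This is an illustration under stated assumptions, not a reconstruction of Lialina's HTML: the Frame class, the follow_link method and the content labels are all hypothetical.

```python
from itertools import combinations

class Frame:
    """A region of the screen: either a leaf showing content or split into sub-frames."""

    def __init__(self, content):
        self.content = content
        self.children = []

    def follow_link(self, kept, new):
        """Split this leaf in two: the old content stays, new content appears beside it."""
        self.children = [Frame(kept), Frame(new)]
        self.content = None
        return self.children[1]  # the newly created frame

    def visible(self):
        """All contents currently on screen, in left-to-right order."""
        if not self.children:
            return [self.content]
        return [c for child in self.children for c in child.visible()]

screen = Frame("couple")
frame = screen.follow_link("couple", "window")
frame = frame.follow_link("window", "text 1")
frame = frame.follow_link("text 1", "text 2")

images = screen.visible()
pairs = list(combinations(images, 2))  # every possible juxtaposition on screen
print(images)
print(len(pairs))
```

In temporal montage each new image meets only its predecessor; here four co-present images already yield six possible juxtapositions, and the number grows with every link followed.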
The logic of replacement, characteristic of cinema, gives way to the logic
of addition and co-existence. Time becomes spatialized, distributed over the
surface of the screen. In spatial montage, nothing is potentially forgotten, nothing
is erased. Just as we use computers to accumulate endless texts, messages, notes
and data, and just as a person, going through life, accumulates more and more
memories, with the past slowly acquiring more weight than the future, spatial
montage can accumulate events and images as it progresses through its narrative.
In contrast to cinema's screen, which primarily functioned as a record of
perception, here the computer screen functions as a record of memory.
As I already noted, spatial montage can also be seen as an aesthetics
appropriate for the user experience of multi-tasking and the multiple windows of the GUI.
In the text of his lecture “Of other spaces” Michel Foucault writes: “We are now
in the epoch of simultaneity: we are in the epoch of juxtaposition, the epoch of near
and far, of the side-by-side, of the dispersed…our experience of the world is less
that of a long life developing through time than that of a network that connects points
and intersects with its own skein…”384 Writing this in the early 1970s, Foucault
appears to prefigure not only the network society, exemplified by the Internet (“a
network which connects points”) but also the GUI (“epoch of simultaneity…of the
side-by-side”). The GUI allows users to run a number of software applications at
the same time; and it uses the convention of multiple overlapping windows to
present both data and controls. The construct of the desktop, which presents the
user with multiple icons which are all simultaneously and continuously “active”
(since they all can be clicked at any time), follows the same logic of
“simultaneity” and of “side-by-side.” On the level of computer programming, this
logic corresponds to object-oriented programming. Instead of a single program
which, like Ford’s assembly line, is executed one statement at a time, in the object-
oriented paradigm a number of objects send messages to each other. These objects
are all active simultaneously. The object-oriented paradigm and the multiple windows of
the GUI work together; an object-oriented approach was in fact used to program the
original Macintosh GUI, which replaced the “one command at a time” logic of
DOS with the logic of simultaneity of multiple windows and icons.
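The idea of objects that stay "active" and respond to messages can be shown in a few lines. This is a minimal sketch, assuming a hypothetical Icon class and a send helper; it is not the Macintosh toolbox or any real GUI framework. Note also that Python still executes these calls one after another: the simultaneity lies in every object being available to receive a message at any moment, not in true parallel execution.

```python
class Icon:
    """A desktop object that can receive a message at any time."""

    def __init__(self, name):
        self.name = name
        self.open = False

    def receive(self, message):
        # Each object decides for itself how to respond to a message.
        if message == "click":
            self.open = not self.open
        return f"{self.name}: {'open' if self.open else 'closed'}"

# All icons exist side by side; none has to wait for another to finish.
desktop = {name: Icon(name) for name in ("Mail", "Notes", "Trash")}

def send(target, message):
    """Deliver a message to whichever object the user picks."""
    return desktop[target].receive(message)

log = [send("Notes", "click"), send("Mail", "click"), send("Notes", "click")]
state = {name: icon.open for name, icon in desktop.items()}
print(log)
print(state)
```

Each click changes the overall state of the desktop, yet no fixed order of commands is imposed on the user.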
The spatial montage of "My boyfriend came back from war!" follows this
logic of simultaneity of the modern GUI. The multiple and simultaneously active icons
and windows of the GUI become the multiple and simultaneously active frames and
hyperlinks of this Web artwork. Just as the GUI user can click on any icon at any
time, changing the overall “state” of the computer environment, the user of
Lialina’s site can activate different hyperlinks which are all simultaneously
present. Each action either changes the contents of a single frame or creates new
frame(s). In either case, the “state” of the screen as a whole is affected. The result
is a new cinema where the diachronic dimension is no longer privileged over the
synchronic dimension, time is no longer privileged over space, sequence is no
longer privileged over simultaneity, and montage in time is no longer privileged
over montage within a shot.
Cinema as an Information Space
As we saw in the “Cultural Interfaces” section, cinema language, which originally was
an interface to narrative taking place in 3D space, is now becoming an interface to
all types of computer data and media. I discussed how such elements of this
language as rectangular framing, mobile camera, image transitions, montage in
time and montage within an image reappear in general purpose HCI, in interfaces
of software applications and in cultural interfaces.
Yet another way to think about new media interfaces in relation to cinema
is to interpret the latter as an information space. If HCI is an interface to computer
data, and a book is an interface to text, cinema can be thought of as an interface to
events taking place in 3D space. Just as painting before it, cinema presented us
with familiar images of visible reality (interiors, landscapes, human characters)
arranged within a rectangular frame. The aesthetics of these arrangements
ranges from extreme scarcity to extreme density. Examples of the former are
paintings by Morandi and shots in Late Spring (Yasujiro Ozu, 1949);
examples of the latter are paintings by Bosch and Bruegel (and much of Northern
Renaissance painting in general), and many shots in A Man with a Movie
Camera.385 It would be only a small leap to relate this density of “pictorial
displays” to the density of contemporary information displays such as Web
portals, which may contain a few dozen hyperlinked elements, or the interfaces of
popular software packages, which similarly present the user with dozens of
commands at once. Can contemporary information designers learn from
the information displays of the past: particular films, paintings and other visual
forms which follow the aesthetics of density?
In making such a connection I rely on the work of art historian Svetlana
Alpers, who claimed that in contrast to Italian Renaissance painting, primarily
concerned with narration, Dutch painting of the seventeenth century is focused on
description.386 While the Italians subordinated details to the narrative action,
creating a clear hierarchy of the viewer’s attention, in Dutch paintings particular
details and, consequently, the viewer’s attention, are more evenly distributed
throughout the whole image. While functioning as a window into an illusionary
space, the Dutch painting is also a loving catalog of numerous objects, different
material surfaces and light effects painted in minute detail (works by Vermeer, for
instance). The dense surfaces of these paintings can be easily related to
contemporary interfaces; in addition, they can also be related to the future
aesthetics of the moving image, when digital displays will move far beyond the
resolution of analog television and film.
The trilogy of computer films by Paris-based filmmaker Christian
Boustani develops such an aesthetics of density. Taking his inspiration from
Renaissance Dutch painting as well as from classical Japanese art, Boustani uses
digital compositing to achieve an information density unprecedented in film.
While this density was typical of the old art he draws on, it was never before
achieved in cinema. In Brugge (1995) Boustani recreates the images typical of
winter landscape scenes in Dutch seventeenth-century painting. His next film, A
Viagem (The Voyage, 1998), achieves even higher information density; some
shots of the film use as many as 1600 separate layers.
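What it means to build a shot out of many layers can be illustrated with the standard "over" operator of digital compositing. The sketch below is an assumption-laden illustration, not a description of Boustani's actual production pipeline: it flattens a stack of single grayscale pixels, each given as a hypothetical (value, alpha) pair with both numbers between 0 and 1.

```python
def over(top, bottom):
    """Porter-Duff 'over' for one pixel; each pixel is (gray_value, alpha) in 0..1."""
    top_value, top_alpha = top
    bottom_value, bottom_alpha = bottom
    alpha = top_alpha + bottom_alpha * (1.0 - top_alpha)
    if alpha == 0.0:
        return (0.0, 0.0)  # both pixels fully transparent
    value = (top_value * top_alpha
             + bottom_value * bottom_alpha * (1.0 - top_alpha)) / alpha
    return (value, alpha)

def composite(layers):
    """Flatten a stack of single-pixel layers listed from back to front."""
    result = (0.0, 0.0)  # start from a fully transparent pixel
    for layer in layers:
        result = over(layer, result)
    return result

# Three layers: an opaque dark background, a half-transparent light layer,
# and a fully transparent layer that contributes nothing.
layers = [(0.2, 1.0), (0.8, 0.5), (1.0, 0.0)]
value, alpha = composite(layers)
print(round(value, 3), round(alpha, 3))
```

The same loop works for three layers or for 1600: every layer remains a separate set of numbers until the final image is computed, which is what lets each element be kept in focus independently.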
This new cinematic aesthetics of density seems to be highly appropriate
for our age. If, from a city street to a Web page, we are surrounded by highly
dense information surfaces, it is appropriate to expect a similar logic from cinema.
(In the same fashion, we may think of spatial montage as reflecting another
contemporary daily experience: working with a number of different applications
at once on a computer. If we are now used to distributing and rapidly switching our
attention from one program to another, from one set of windows and commands to
another set, we may find multiple streams of audio-visual information presented
simultaneously more satisfying than the single stream of traditional cinema.)
It is appropriate that some of the densest shots of A Viagem recreate a
Renaissance marketplace, this symbol of emerging capitalism, which was probably
responsible for the new density of Renaissance painting (think, for instance, of
Dutch still lifes, which function as a kind of store display window aiming to
overwhelm the viewer and seduce her into making a purchase). In the same way,
in the 1990s the commercialization of the Internet was responsible for the new
density of Web pages. By the end of the decade all home pages of big companies
and Internet portals had become indexes containing dozens of entries in small type.
If every small area of the screen can potentially contain a lucrative ad or a link to
a page with one, this leaves no place for the aesthetics of emptiness and
minimalism. Thus it is not surprising that the commercialized Web joined the same
aesthetics of information density and competing signs and images which
characterizes visual culture in a capitalist society in general.
If Lialina’s spatial montage relies on HTML frames and actions of the user
to activate images appearing in these frames, Boustani’s spatial montage is more
purely cinematic and painterly. He combines the mobility of the camera and the
movement of objects characteristic of cinema with the “hyper-realism” of old Dutch
painting, which presented everything “in focus.” In analog cinema, the inevitable
“depth of field” artifact acted as a limit to the information density of an image.
The achievement of Boustani is to create images where every detail is in focus
and yet the overall image is easily readable. This could only be done through
digital compositing. By reducing visible reality to numbers, the computer makes
it possible for us to literally see in a new way. If, according to Benjamin, early
twentieth-century cinema used the close-up "to bring things 'closer' spatially and
humanly," "to get hold of an object at very close range," and, as a result,
destroyed their aura, the digital composites of Boustani can be said to bring objects
close to the viewer without “extracting” them from their places in the world.
(Of course, an opposite interpretation is also possible: we can say that Boustani’s
digital eye is super-human. Similar to the argument in the “Synthetic Image and its
subject” section, his vision can be interpreted as the gaze of a cyborg or a computer
vision system which can see things equally well at any distance.)
Scrutinizing the prototypical perceptual spaces of modernity — the
factory, the movie theater, the shopping arcade — Walter Benjamin insisted on
the contiguity between the perceptual experiences in the workplace and outside of
it:
Whereas Poe's passers-by cast glances in all directions which still
appeared to be aimless, today's pedestrians are obliged to do so in order to
keep abreast of traffic signals. Thus technology has subjected the human
sensorium to a complex kind of training. There came a day when a new
and urgent need for stimuli was met by the film. In a film, perception in
the form of shocks was established as a formal principle. That which
determines the rhythm of production on a conveyer belt is the basis of the
rhythm of reception in the film.387
For Benjamin, the modern regime of perceptual labor, where the eye is constantly
asked to process stimuli, equally manifests itself in work and leisure. The eye is
trained to keep pace with the rhythm of industrial production at the factory and to
navigate through the complex visual semiosphere beyond the factory gates. It is
appropriate to expect that the computer age will follow the same logic, presenting
the users with similarly structured perceptual experiences at work and at home, on
a computer screen and outside of it. Indeed, as I already noted, we now use the
same interfaces for work and for leisure, a condition exemplified most
dramatically by Web browsers. Another example is the use of the same interfaces
in flight and military simulators, in computer games modeled after these
simulators, and in the actual controls of planes and other vehicles (recall the
popular perception of the Gulf War as a “video game war”). But if Benjamin appears to
regret that the subjects of an industrial age lost the pre-modern freedom of perception,
now regimented by the factory, the modern city and film, we may instead think of
the information density of our own workspaces as a new aesthetic challenge,
something to explore rather than to condemn. Similarly, we should explore the
aesthetic possibilities of all aspects of the user’s experience with a computer, this key
experience of modern life: dynamic windows of the GUI, multi-tasking, search
engines, databases, navigable space, and others.
Cinema as a Code
When radically new cultural forms appropriate for the age of wireless
telecommunication, multitasking operating systems and information appliances
arrive, what will they look like? How would we even know they are here?
Would future films look like the "data shower" from the movie The Matrix? Does the
famous fountain at Xerox PARC, in which the strength of the water stream reflects
the behavior of the stock market, with stock data arriving in real time over the
Internet, represent the future of public sculpture?
We don't yet know the answers to these questions. However, what artists
and critics can do is point out the radically new nature of new media by staging —
as opposed to hiding — its new properties. As my last example, I will discuss
Vuk Cosic's ASCII films, which effectively stage one characteristic of computer-
based moving images — their identity as a computer code.388
It is worthwhile to relate Cosic's films to both Zuse’s "found footage
movies" from the 1930s, which I invoke in the beginning of this book, and to the
first all-digital feature-length movie made sixty years later, Lucas's Star Wars:
Episode I, The Phantom Menace.389 Zuse superimposes digital code over the
film images. Lucas follows the opposite logic: in his film, digital code “lies
under” his images. That is, given that most images in the film were put together
on computer workstations, during the post-production process they were pure
digital data. The frames were made up of numbers rather than bodies, faces,
and landscapes. The Phantom Menace can, therefore, be called the first feature-
length commercial abstract film: two hours' worth of frames made up of matrices
of numbers. But this is hidden from the audience.
What Lucas hides, Cosic reveals. His ASCII films "perform" the new
status of media as digital data. The ASCII code that results when an image is
digitized is displayed on the screen. The result is as satisfying poetically as it is
conceptually — for what we get is a double image: a recognizable film image and
an abstract code together. Both are visible at once. Thus, rather than erasing the
image in favor of the code as in Zuse's film, or hiding the code from us as in
Lucas's film, in ASCII films the code and the image coexist.
Like the VinylVideo project by Gebhard Sengmüller, which records TV
programs and films on old vinyl disks,390 Cosic's ASCII initiative391 is a
systematic program of translating media content from one obsolete format into
another. These projects remind us that since at least the 1960s the operation of
media translation has been at the core of our culture. Films transferred to video;
video transferred from one video format to another; video transferred to digital
data; digital data transferred from one format to another: from floppy disks to Jaz
drives, from CD-ROMs to DVDs; and so on, indefinitely. Artists noticed this
new logic of culture early on: in the 1960s, Roy Lichtenstein and Andy Warhol
already made media translation the basis of their art. Sengmüller and Cosic
understand that the only way to deal with the built-in media obsolescence of
modern society is by ironically resurrecting dead media. Sengmüller translates old
TV programs into vinyl disks; Cosic translates old films into ASCII images.392
Why do I call ASCII images an obsolete media format? Before printers
capable of outputting raster digital images became widely available toward the
end of the 1980s, it was commonplace to make printouts of images on dot matrix
printers by converting the images into ASCII code. I was surprised that in 1999 I
was still able to find the appropriate program on my UNIX system. Called simply
"toascii," the command, according to the UNIX system manual page for the
program, "prints textual characters that represent the black and white image used
as input."
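The kind of conversion such a program performs can be sketched in a few lines. This is not the toascii program itself, whose internals I do not describe here; it is a minimal illustration that assumes a grayscale image given as rows of values from 0 (black) to 255 (white), and a hypothetical seven-character ramp running from dense to blank.

```python
# Characters ordered from dark (dense ink) to light (blank); an assumed ramp.
RAMP = "@#*+-. "

def to_ascii(image, ramp=RAMP):
    """Map each grayscale value (0 = black, 255 = white) to one character."""
    lines = []
    for row in image:
        chars = [ramp[value * (len(ramp) - 1) // 255] for value in row]
        lines.append("".join(chars))
    return lines

# A tiny 6 x 5 "image": a dark ring on a white ground.
image = [
    [255, 255,   0,   0, 255, 255],
    [255,   0, 128, 128,   0, 255],
    [  0, 128, 255, 255, 128,   0],
    [255,   0, 128, 128,   0, 255],
    [255, 255,   0,   0, 255, 255],
]

art = to_ascii(image)
print("\n".join(art))

# The same first row seen as code rather than as picture.
codes = [ord(ch) for ch in art[0]]
print(codes)
```

The printout is readable as a picture, while the list of character codes is the same row read as data: a small instance of the double image in which code and image coexist.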
The reference to the early days of computing is not unique to Cosic but is shared
by other net.artists. Jodi.org, the famous net.art project created by the artistic team
of Joan Heemskerk and Dirk Paesmans, often evokes DOS commands and the
characteristic green color of computer terminals from the 1980s;393 the Russian
net.artist Alexei Shulgin performed music in the late 1990s using an old
386 PC.394 But in the case of ASCII code, its use evokes not only a peculiar
episode in the history of computer culture but a number of earlier forms of media
and communication technologies as well. ASCII is an abbreviation of American
Standard Code for Information Interchange. The code was originally developed
for teleprinters and was only later adopted for computers in the 1960s. A
teleprinter was a twentieth-century telegraph system that translated the input from
a typewriter keyboard into a series of coded electric impulses, which were then
transmitted over communications lines to a receiving system, which decoded the
pulses and printed the message onto a paper tape or other medium. Teleprinters
were introduced in the 1920s and were widely used until the 1980s (Telex being
the most popular system), when they were gradually replaced by fax and
computer networks.395
ASCII code was itself an extension of an earlier code invented by Jean-
Maurice-Emile Baudot in 1874. In Baudot code, each letter of the alphabet is
represented by a five-unit combination of current-on or current-off signals of
equal duration. ASCII code extends Baudot code by using seven-unit
combinations to represent 128 different symbols; stored in eight "bits" (one
"byte"), its extended variants can represent 256. Baudot code itself was an
improvement over the Morse code invented for early electric telegraph systems
in the 1830s. And so on.
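The arithmetic behind these codes is simple: a combination of n on/off units can distinguish 2 to the power n symbols. The sketch below assumes only that; the helper names symbols and encode are hypothetical, and the bit patterns printed are those of the ASCII code numbers Python reports through ord(), not entries from Baudot's original table, which is not reproduced here. Standard ASCII defines 128 codes (seven units); a full eight-bit byte leaves room for 256.

```python
def symbols(units):
    """Number of distinct on/off combinations for a code of the given length."""
    return 2 ** units

def encode(text, units):
    """Write each character's code number as a fixed-length string of 0s and 1s."""
    return [format(ord(ch), f"0{units}b") for ch in text]

# Five units (Baudot), seven units (standard ASCII), eight units (one byte).
print(symbols(5), symbols(7), symbols(8))

# The word CODE as a series of current-off (0) and current-on (1) units.
print(encode("CODE", 8))
```

Each added unit doubles the number of symbols the code can carry, which is why the move from five units to a byte takes the repertoire from 32 to 256.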
The history of ASCII code compresses a number of technological and
conceptual developments which led to (but, I am sure, will not stop at) modern
digital computers: cryptography, real-time communication, communication
network technology, coding systems. By juxtaposing ASCII code with the history
of cinema, Cosic accomplishes what can be called an artistic compression. That
is, along with staging the new status of moving images as a computer code, he
also “encodes” in these images many key issues of computer culture and new
media art.
As this book has argued, in the computer age, cinema, along with other established
cultural forms, indeed becomes precisely a code. It is now used to communicate
all types of data and experiences; and its language is encoded in interfaces and
defaults of software programs and hardware itself. Yet, while on the one hand
new media strengthens existing cultural forms and languages, including the
language of cinema, it simultaneously “opens” them up for redefinition. The
elements of their interfaces become separated from the types of data they were
traditionally connected to. Further, what was previously in the background, on the
margins, comes into the center. For instance, animation comes to challenge live
cinema; spatial montage comes to challenge temporal montage; database comes to
challenge narrative; the search engine comes to challenge the encyclopedia; and, last
but not least, online distribution of culture challenges traditional “off-line” formats.
To use a metaphor from computer culture, new media turns all culture and
cultural theory into “open source.” This “opening up” of all cultural techniques,
conventions, forms and concepts is ultimately the most positive cultural effect of
computerization — the opportunity to see the world and the human being anew, in
ways which were not available to A Man with a Movie Camera.
NOTES
1
http://www.nettime.org
2
http://www.rhizome.org
3
Phong, B.T. “Illumination for Computer Generated Pictures,” Communications
of the ACM, Volume 18, no. 6 (June 1975): 311-317.
5
Thomas S. Kuhn, The Structure of Scientific Revolutions, 2nd ed. (Chicago:
University of Chicago Press, 1970).
6
By virtual worlds I mean 3D computer-generated interactive environments. This
definition fits a whole range of 3D computer environments already in existence:
high-end VR works which feature head-mounted displays and photorealistic
graphics; arcade, CD-ROM and on-line multi-player computer games; QuickTime
VR movies; VRML (The Virtual Reality Modeling Language) scenes; and
graphical chat environments such as The Palace and Active Worlds.
Virtual worlds represent an important trend across computer culture,
consistently promising to become a new standard in human-computer interfaces
and in computer networks. (For a discussion of why this promise may never be
fulfilled, see “Navigable Space” section.) For example, Silicon Graphics
developed a 3-D file system which was showcased in the movie Jurassic Park.
Sony used a picture of a room as an interface in its MagicLink personal
communicator. Apple's short-lived E-World greeted its users with a drawing of a
city. Web designers often use pictures of buildings, aerial views of cities, and
maps as interface metaphors. In the words of the scientists from Sony's The Virtual
Society Project (www.csl.sony.co.jp/project/VS/), "It is our belief that future
online systems will be characterized by a high degree of interaction, support for
multi-media and most importantly the ability to support shared 3-D spaces. In our
vision, users will not simply access textual based chat forums, but will enter into
3-D worlds where they will be able to interact with the world and with other users
in that world."
7
Tzvetan Todorov, Introduction to Poetics, trans. by Richard Howard
(Minneapolis: University of Minnesota Press, 1981), 6.
8
Examples of software standards include operating systems such as UNIX,
Windows and Mac OS; file formats (JPEG, MPEG, DV, QuickTime, RTF,
WAV); scripting languages (HTML, Javascript); programming languages (C++,
Java); communication protocols (TCP-IP); the conventions of HCI (e.g. dialog
boxes, copy and paste commands, help pointer); and also unwritten conventions,
such as the 640 by 480 pixel image size which was used for more than a decade.
Hardware standards include storage media formats (ZIP, JAZ, CD-ROM, DVD),
port types (serial, USB, Firewire), bus architectures (PCI), and RAM types.
9
Vkutemas was a Moscow art and design school in the 1920s which united most
Left avant-garde artists; it functioned as a counterpart of Bauhaus in Germany.
10
Qtd. in Beaumont Newhall, The History of Photography from 1839 to the
Present Day. Revised and Enlarged Edition, fourth edition (New York: The
Museum of Modern Art, 1964), 18.
11
Newhall, The History of Photography, 17-22.
12
Charles Eames, A Computer Perspective: Background To The Computer Age,
1990 edition (Cambridge, Mass.: Harvard University Press, 1990), 18.
13
David Bordwell and Kristin Thompson, Film Art: An Introduction, fifth edition
(New York: The McGraw-Hill Companies), 15.
14
Eames, A Computer Perspective, 22-27, 46-51, 90-91.
15
Eames, A Computer Perspective, 120.
16
Isaac Victor Kerlow and Judson Rosebush, Computer Graphics for Designers
and Artists (New York: Van Nostrand Reinhold Company, 1986), 14.
17
Kerlow and Rosebush, Computer Graphics, 21.
18
Roland Barthes, Elements of Semiology (New York: Hill and Wang, 1968), 64.
19
I discuss the particular cases of computer automation of visual communication
in more detail in "Automation of Sight from Photography to Computer Vision,"
Electronic Culture: Technology and Visual Representation, edited by Timothy
Druckrey and Michael Sand (New York: Aperture, 1996); "Mapping Space:
Perspective, Radar and Computer Graphics,” SIGGRAPH '93 Visual Proceedings,
edited by Thomas Linehan, 143-147 (New York: ACM, 1993).
20
http://www.mrl.nyu.edu/improv/, accessed June 29, 1999.
21
http://www-white.media.mit.edu/vismod/demos/smartcam/, accessed June 29,
1999.
22
http://pattie.www.media.mit.edu/people/pattie/CACM-95/alife-cacm95.html,
accessed June 29, 1999.
23
This research was pursued by different groups at the MIT lab. See for instance
the home page of the Gesture and Narrative Language Group,
http://gn.www.media.mit.edu/groups/gn/, accessed June 29, 1999.
24
See http://www.virage.com/products, accessed June 29, 1999.
25
http://agents.www.media.mit.edu/groups/agents/projects/, accessed June 29,
1999.
26
See my “Avant-Garde as Software,” in Ostranenie, edited by Stephen Kovats
(Frankfurt and New York: Campus Verlag, 1999).
(http://visarts.ucsd.edu/~manovich)
27
For an experiment in creating different multimedia interfaces to the same text,
see my Freud-Lissitzky Navigator (http://visarts.ucsd.edu/~manovich/FLN).
28
http://jefferson.village.virginia.edu/wax/, accessed October 24, 1999.
29
Frank Halasz and Mayer Schwartz, “The Dexter Hypertext Reference Model,”
Communications of the ACM (New York: ACM, 1994), 30.
30
Noam Chomsky, Syntactic Structures, reprint edition (Peter Lang Publishing,
1978).
31
“How Marketers ‘Profile’ Users,” USA Today (November 9, 1999), 2A.
32
See http://www.three.org. Our conversations helped me to clarify my ideas,
and I am very grateful to Jon for the ongoing exchange.
33
Marcos Novak, lecture at “Interactive Frictions” conference, University of
Southern California, Los Angeles, June 6, 1999.
34
Grahame Weinbren, “In the Ocean of Streams of Story,” Millennium Film Journal
28 (Spring 1995),
http://www.sva.edu/MFJ/journalpages/MFJ28/GWOCEAN.HTML.
35
Rick Moody, Demonology, first published in Conjunctions, reprinted in The
KGB Bar Reader, qtd. in Vince Passaro, “Unlikely Stories,” Harper’s Magazine
vol. 299, no. 1791 (August 1999), 88-89.
36
Albert Abramson, Electronic Motion Pictures. A History of Television Camera
(Berkeley: University of California Press, 1955), 15-24.
37
Charles Musser, The Emergence of Cinema: The American Screen to 1907
(Berkeley: University of California Press, 1994), 65.
38
Mitchell, The Reconfigured Eye (Cambridge, Mass.: The MIT Press, 1992), 6.
39
Mitchell, The Reconfigured Eye, 6.
40
Mitchell, The Reconfigured Eye, 49.
41
Ernst Gombrich analyzes "the beholder's share" in decoding the missing
information in visual images in his classic Art and Illusion. A Study in the
Psychology of Pictorial Representation (Princeton: Princeton University Press,
1960).
42
The notion that computer interactive art has its origins in new art forms of the
1960s is explored in Söke Dinkla, "The History of the Interface in Interactive
Art," ISEA (International Symposium on Electronic Art) 1994 Proceedings
(http://www.uiah.fi/bookshop/isea_proc/nextgen/08.html, accessed August 12,
1998); "From Participation to Interaction: Toward the Origins of Interactive Art,"
in Lynn Hershman Leeson, ed. Clicking In: Hot Links to a Digital Culture
(Seattle: Bay Press, 1996): 279-290. See also Simon Penny, “Consumer Culture
and the Technological Imperative: The Artist in Dataspace,” in Simon Penny, ed.,
Critical Issues in Electronic Media (Albany, New York: State University of New
York Press, 1993): 47-74.
43
This argument relies on a cognitivist perspective which stresses the active
mental processes involved in comprehension of any cultural text. For an example
of the cognitivist approach in film studies, see David Bordwell and Kristin Thompson,
Film Art: an Introduction; David Bordwell, Narration in the Fiction Film
(Madison, Wisconsin: University of Wisconsin Press, 1989).
44
For a more detailed analysis of this trend, see my article "From the
Externalization of the Psyche to the Implantation of Technology," in Mind
Revolution: Interface Brain/Computer, edited by Florian Rötzer (München:
Akademie Zum Dritten Jahrtausend, 1995), 90-100.
45
Qtd. in Allan Sekula, "The Body and the Archive," October 39 (1987): 51.
46
Hugo Münsterberg, The Photoplay: A Psychological Study (New York: D.
Appleton & Co., 1916), 41.
47
Sergei Eisenstein, "Notes for a Film of 'Capital,'" trans. Maciej Sliwowski, Jay
Leyda, and Annette Michelson, October 2 (1976): 10.
48
Timothy Druckrey, "Revenge of the Nerds. An Interview with Jaron Lanier,"
Afterimage (May 1991), 9.
49
Fredric Jameson, The Prison-house of Language: a Critical Account of
Structuralism and Russian Formalism (Princeton, N.J.: Princeton University
Press, 1972).
50
Jürgen Habermas, The Theory of Communicative Action, trans. Thomas
McCarthy (Boston: Beacon Press, 1984-).
51
Druckrey, "Revenge of the Nerds,” 6.
52
Sigmund Freud, Standard Edition of the Complete Psychological Works
(London: Hogarth Press, 1953), 4: 293.
53
Edward Bradford Titchener, A Beginner's Psychology (New York: The
Macmillan Company, 1915), 114.
54
George Lakoff, "Cognitive Linguistics," Versus 44/45 (1986): 149.
55
Philip Johnson-Laird, Mental Models: Towards a Cognitive Science of
Language, Inference, and Consciousness (Cambridge: Cambridge University
Press, 1983).
56
Louis Althusser introduced his influential notion of ideological interpellation in
his "Ideology and Ideological State Apparatuses (Notes Towards an
Investigation)," in Lenin and Philosophy, trans. by Ben Brewster (New York:
Monthly Review Press, 1971).
57
Steven Johnson’s Interface Culture makes a claim for the cultural significance
of the computer interface.
58
Other examples of cultural theories which rely on “non-transparency of the
code” idea are Yuri Lotman’s theory of secondary modeling systems, George
Lakoff’s cognitive linguistics, Jacques Derrida’s critique of logocentrism and
Marshall McLuhan’s media theory.
59
http://www.ntticc.or.jp/permanent/index_e.html, accessed July 15, 1999.
60
Brad A. Myers, "A Brief History of Human Computer Interaction
Technology," technical report CMU-CS-96-163 and Human Computer Interaction
Institute Technical Report CMU-HCII-96-103 (Pittsburgh, Pennsylvania:
Carnegie Mellon University, Human-Computer Interaction Institute, 1996).
61
http://www.xanadu.net/the.project, accessed December 1, 1997.
62
XML, which is promoted as the replacement for HTML, enables any user to
create her own customized markup language. Thus, the next stage in computer culture
may involve authoring not simply new Web documents but new languages. For
more information on XML, see http://www.ucc.ie/xml, accessed December 1,
1997.
63
http://www.hotwired.com/rgb/antirom/index2.html, accessed December 1,
1997.
64
See, for instance, Mark Pesce, "Ontos, Eros, Noos, Logos," keynote address for
ISEA (International Symposium on Electronic Arts) 1995,
http://www.xs4all.nl/~mpesce/iseakey.html, accessed December 1, 1997.
65
http://www.backspace.org/iod, accessed July 15, 1999.
66
http://www.netomat.net, accessed July 15, 1999.
67
Roman Jakobson, "Deux aspects du langage et deux types d'aphasie", in Temps
Modernes, no. 188 (January 1962).
68
XML diversifies the types of links available by including bi-directional links,
multi-way links and links to a span of text rather than a simple point.
69
This may imply that new digital rhetoric may have less to do with arranging
information in a particular order and more to do simply with selecting what is
included and what is not included in the total corpus being presented.
70
See http://www.aw.sgi.com/pages/home/pages/products/pages/poweranimator_film_sgi/index.html,
accessed December 1, 1997.
71
In The Address of the Eye Vivian Sobchack discusses the three metaphors of
frame, window and mirror which underlie modern film theory. The metaphor of a
frame comes from modern painting and is central to formalist theory which is
concerned with signification. The metaphor of window underlies realist film
theory (Bazin) which stresses the act of perception. Realist theory follows Alberti
in conceptualizing the cinema screen as a transparent window onto the world.
Finally, the metaphor of a mirror is central to psychoanalytic film theory. In terms
of these distinctions, my discussion here is concerned with the window metaphor.
The distinctions themselves, however, open up a very productive space for
thinking further about the relationships between cinema and computer media, in
particular the cinema screen and the computer window. Vivian Sobchack, The
Address of the Eye: a Phenomenology of Film Experience (Princeton: Princeton
University Press, 1992).
72
Jacques Aumont et al., Aesthetics of Film (Austin: Texas University Press,
1992), 13.
73
By VR interface I mean the common forms of a head-mounted or head-coupled
directed display employed in VR systems. For a popular review of such displays
written when the popularity of VR was at its peak, see Steve Aukstakalnis and
David Blatner, Silicon Mirage: The Art and Science of Virtual Reality (Berkeley,
CA: Peachpit Press, 1992), 80-98. For a more technical treatment, see Dean
Kocian and Lee Task, "Visually Coupled Systems Hardware and the Human
Interface" in Virtual Environments and Advanced Interface Design, edited by
Woodrow Barfield and Thomas Furness III (New York and Oxford: Oxford
University Press, 1995), 175-257.
74
See Kocian and Task for details on field of view of various VR displays.
Although it varies widely between different systems, the typical size of the field
of view in commercial head-mounted displays (HMD) available in the first part of
the 1990s was 30-50°.
75
http://webspace.sgi.com/WebSpace/Help/1.1/index.html, accessed December
1, 1997.
76
See John Hartman and Josie Wernecke, The VRML 2.0 Handbook: Building
Moving Worlds on the Web (Reading, Mass.: Addison-Wesley Publishing
Company, 1996), 363.
77
Examples of an earlier trend are Return to Zork (Activision, 1993) and The 7th
Guest (Trilobyte/Virgin Games, 1993). Examples of the later trend are Soulblade
(Namco, 1997) and Tomb Raider (Eidos, 1996).
78
Critical literature on computer games, and in particular on their visual
language, remains slim. Useful facts on the history of computer games, descriptions of
different genres, and interviews with designers can be found in Chris
McGowan and Jim McCullaugh, Entertainment in the Cyber Zone (New York:
Random House, 1995). Another useful source is J.C. Herz, Joystick Nation: How
Videogames Ate Our Quarters, Won Our Hearts, and Rewired Our Minds
(Boston: Little, Brown and Company, 1997).
79
Dungeon Keeper (Bullfrog Productions, 1997).
80
For a more detailed discussion of the history of computer imaging as gradual
automation, see my articles "Mapping Space: Perspective, Radar and Computer
Graphics," and "Automation of Sight from Photography to Computer Vision."
81
Moses Ma's presentation, panel "Putting a Human Face on Cyberspace:
Designing Avatars and the Virtual Worlds They Live In," SIGGRAPH '97,
August 7, 1997.
82
Li-wei He, Michael Cohen, David Salesin, “The Virtual Cinematographer: A
Paradigm for Automatic Real-Time Camera Control and Directing,” SIGGRAPH
’96 (http://research.microsoft.com/SIGGRAPH96/96/VirtualCinema.htm).
83
See http://www.artcom.de/projects/invisible_shape/welcome.en, accessed
December 1, 1997.
84
Jay David Bolter and Richard Grusin, Remediation: Understanding New Media
(Cambridge, Mass.: The MIT Press, 1999), 19.
85
See Svetlana Alpers, The Art of Describing: Dutch Art in the Seventeenth
Century (Chicago: University of Chicago Press, 1983). See particularly the chapter
"Mapping Impulse."
86
This historical connection is illustrated by popular flight simulator games
where the computer screen is used to simulate the control panel of a plane, i.e. the
very type of object from which computer interfaces have developed. The
conceptual origin of modern GUI in a traditional instrument panel can be seen
even more clearly in the first graphical computer interfaces of the late 1960's and
early 1970's which used tiled windows. The first tiled window interface was
demonstrated by Douglas Engelbart in 1968.
87
My analysis here focuses on the continuities between the computer screen and
the representational conventions and technologies preceding it. For alternative
readings which take up the differences between the two, see the excellent articles by
Vivian Sobchack, “Nostalgia for a Digital Object: Regrets on the Quickening of
QuickTime,” in Millennium Film Journal (Winter 2000) and Norman Bryson,
“Summer 1999 at TATE,” available from Tate Gallery, 413 West 14th Street, New
York City. Bryson writes: “Though the [computer] screen is able to present a
scenographic depth, it is obviously unlike the Albertian or Renaissance Window;
its surface never vanishes before the imaginary depths behind it, it never truly
opens into depth. But the PC screen does not behave like the modernist image,
either. It cannot foreground the materiality of the surface (of pigments on canvas)
since it has no materiality to speak of, other than the play of shifting light.” Both
Sobchack and Bryson also stress the difference between the traditional image frame
and the multiple windows of a computer screen. Bryson: “basically the whole order
of the frame is abolished, replaced by the order of superimposition or tiling.”
88
The degree to which a frame that acts as a boundary between the two spaces is
emphasized seems to be proportional to the degree of identification expected from
the viewer. Thus, in cinema, where the identification is most intense, the frame as
a separate object does not exist at all — the screen simply ends at its boundaries
— while both in painting and in television the framing is much more pronounced.
89
Here I agree with the parallel suggested by Anatoly Prokhorov between
window interface and montage in cinema.
90
For these origins, see, for instance, C.W. Ceram, Archeology of the Cinema
(New York: Harcourt, Brace & World, Inc., 1965).
91
Beaumont Newhall, Airborne Camera (New York: Hastings House, Publishers,
1969).
92
This is more than a conceptual similarity. In the late 1920s John H. Baird
invented "phonovision," the first method for the recording and the playing back of
a television signal. The signal was recorded on an Edison phonograph record by a
process very similar to making an audio recording. Baird named his recording
machine "phonoscope." Albert Abramson, Electronic Motion Pictures (University
of California Press, 1955), 41-42.
93
Echoes of War (Boston: WGBH Boston, n.d.), videotape.
94
Ibid.
95
Ibid.
96
On SAGE, see the excellent social history of early computing by Paul Edwards, The
Closed World: Computers and the Politics of Discourse in Cold War America
(MIT Press, 1996). For a shorter summary of his argument, see Paul Edwards, "The
Closed World. Systems discourse, military policy and post-World War II U.S.
historical consciousness," in Cyborg Worlds: The Military Information Society,
eds. Les Levidow and Kevin Robins (London: Free Association Books, 1989).
See also Howard Rheingold, Virtual Reality (New York: Simon & Schuster, Inc.,
1991), 68-93.
97
Edwards (1989), 142.
98
"Retrospectives II: The Early Years in Computer Graphics at MIT, Lincoln
Lab, and Harvard," in SIGGRAPH '89 Panel Proceedings (New York: The
Association for Computing Machinery, 1989), 22-24.
99
Ibid., 42-54.
100
Rheingold, 105.
101
Qtd. in Rheingold, 104.
102
Roland Barthes, "Diderot, Brecht, Eisenstein," in Images-Music-Text, ed.
Stephen Heath (New York: Farrar, Straus and Giroux, 1977), 69-70.
103
Ibid.
104
While in the following I discuss the immobility of the subject of a screen in
the context of the history of representation, we can also relate this condition to the
history of communication. In Ancient Greece, communication was understood as
an oral dialogue between people. It was also assumed that physical movement
stimulated dialogue and the process of thinking. Aristotle and his pupils walked
around while discussing philosophical problems. In the Middle Ages, a shift
occurred from a dialogue between subjects to communication between a subject
and an information storage device, i.e., a book. A Medieval book chained to a
table can be considered a precursor to the screen which “fixes” its subject in
space.
105
As summarized by Martin Jay, "Scopic Regimes of Modernity," in Vision and
Visuality, edited by Hal Foster (Seattle: Bay Press, 1988), 7.
106
Qtd. in Ibid., 7.
107
Ibid., 8.
108
Qtd. in Ibid., 9.
109
For a survey of perspectival instruments, see Martin Kemp, The Science of
Art (New Haven: Yale University Press, 1990), 167-220.
110
Ibid., 171-172.
111
Ibid., 200.
112
Ibid.
113
Anesthesiology emerges approximately at the same time.
114
Walter Benjamin, "The Work of Art in the Age of Mechanical Reproduction,"
in Illuminations, ed. Hannah Arendt (New York: Schocken Books, 1969), 238.
115
Anne Friedberg, Window Shopping: Cinema and the Postmodern (Berkeley:
University of California Press, 1993), 2.
116
See, for instance, David Bordwell, Janet Staiger and Kristin Thompson, The
Classical Hollywood Cinema (New York: Columbia University Press, 1985).
117
Qtd. in Ibid., 215.
118
Ibid., 214.
119
Friedberg, 134. She refers to Jean-Louis Baudry, "The Apparatus:
Metapsychological Approaches to the Impression of Reality in the Cinema," in
Narrative, Apparatus, Ideology, ed. Philip Rosen (New York: Columbia
University Press, 1986) and Charles Musser, The Emergence of Cinema: The
American Screen to 1907 (New York: Charles Scribner and Sons, 1990).
120
Qtd. in Baudry, 303.
121
Friedberg, 28.
122
A typical VR system adds other ways of moving around, for instance, the
ability to move forward in a single direction by simply pressing a button on a
joystick. However, to change the direction the user still has to change the position
of his/her body.
123
Rheingold, 104.
124
Ibid., 105.
125
Ibid., 109.
126
Marta Braun, Picturing Time: The Work of Etienne-Jules Marey (1830-1904)
(Chicago: The University of Chicago Press, 1992), 34-35.
127
Rheingold, 201-209.
128
Qtd. in Ibid., 201.
129
Here I disagree with Friedberg who writes, "Phantasmagorias, panoramas,
dioramas — devices that concealed their machinery — were dependent on the
relative immobility of their spectators." (23)
130
In some nineteenth-century panoramas the central area was occupied by the
simulation of a vehicle consistent with the subject of the panorama, such as part
of a ship. We can say that in this case the virtual space of the simulation completely
takes over the physical space. That is, physical space has no identity of its own –
not even such minimal negative identity as emptiness. It completely serves the
simulation.
131
I am referring here to Rem Koolhaas's unrealized project for a new building for
ZKM in Karlsruhe, Germany. See Rem Koolhaas and Bruce Mau, S, M, L, XL
(Penguin, 1998).
132
Sampling across media is the subject of the Ph.D. dissertation (in progress) by
Tarleton Gillespie (Department of Communication, University of California, San
Diego); morphing is the subject of Vivian Sobchack, ed., Meta-Morphing: Visual
Transformation and the Culture of Quick-Change (University of Minnesota Press,
1999).
133
See my article "'Real' Wars: Esthetics and Professionalism in Computer
Animation," Design Issues 6, no. 1 (Fall 1991): 18-25.
134
Switch 5, no. 2 (http://switch.sjsu.edu/CrackingtheMaze).
135
Peter Eisenman, Diagram Diaries (New York: Universe Publishing, 1999),
238-239.
136
Issey Miyake Making Things, an exhibition at Fondation Cartier, Paris,
October 13, 1998 – January 17, 1999.
137
http://www.viewpoint.com
138
http://www.adobe.com
139
http://www.macromedia.com
140
http://www.aw.sgi.com
141
http://www.apple.com/quicktime/authoring/tutorials.html, accessed September
26, 1999.
142
http://geocities.yahoo.com
143
http://www.turneupheat.com, accessed August 4, 1999.
144
E.H. Gombrich, Art and Illusion; Roland Barthes, "The Death of the Author,"
in Image, Music, Text, ed. Stephen Heath (New York: Farrar, Straus and Giroux,
1977).
145
Barthes, "The Death of the Author," 142.
146
Bulat Galeyev, Soviet Faust. Lev Theremin — Pioneer Of Electronic Art (in
Russian) (Kazan, 1995), 19.
147
http://www.microsoft.com; http://www.macromedia.com, accessed September
22, 1999.
148
Herbert Muschamp, “Blueprint: The Shock of the Familiar,” The New York
Times Magazine (December 13, 1998), 66.
149
Musser, The Emergence of Cinema.
150
Fredric Jameson, “Postmodernism and Consumer Society,” in Postmodernism
and its Discontents, edited by E. Ann Kaplan (London and New York: Verso,
1988), 15.
151
Jameson, “Postmodernism and Consumer Society,” 20.
152
Peter Lunenfeld discusses the relevance of Frampton to new media in his
Snap to the Grid (Cambridge, Mass.: The MIT Press, forthcoming).
153
Hollis Frampton, "The Withering Away of the State of the Art," in Circles of
Confusion (Rochester: Visual Studies Workshop), 169.
154
Thomas Porter and Tom Duff, “Compositing Digital Images,” Computer
Graphics vol. 18, no. 3 (July 1984): 253-259.
155
http://www.apple.com/quicktime/resources/qt4/us/help/QuickTime%20Help.htm,
accessed September 26, 1999.
156
http://drogo.cset.it/mpeg, accessed September 26, 1999.
157
For an excellent theoretical analysis of morphing, see Vivian Sobchack, “‘At
the Still Point of the Turning World’: Meta-Morphing and Meta-Stasis,” in Vivian
Sobchack, ed., Meta-Morphing: Visual Transformation and the Culture of Quick-
Change (University of Minnesota Press, 1999).
158
Terence Riley, The Un-private House (New York: The Museum of Modern
Art, 1999).
159
On the presentational system of early cinema, see Charles Musser, The
Emergence of Cinema: The American Screen to 1907 (Berkeley: University of
California Press, 1990), 3.
160
Paul Johnson, The Birth of the Modern: World Society 1815-1830 (London:
Orion House, 1992), 156.
161
The examples of Citizen Kane and Ivan the Terrible are from Aumont et al.,
Aesthetics of Film (Austin: Texas University Press, 1992), 41.
162
Dziga Vertov, "Kinoki. Perevorot" (Kinoki. A revolution), LEF 3 (1923): 140.
163
Jean-Luc Godard, Son + Image, edited by Raymond Bellour (New York: The
Museum of Modern Art, 1992), 171.
164
Ibid.
165
See Paula Parisi, “Lunch on the Deck of the Titanic,” Wired 6.02 (February
1998).
166
IMadGINE. Virtual Advertising for Live Sport Events. A promotional flyer by
ORAD, P.O. Box 2177, Kfar Saba 44425, Israel, 1998.
167
Sergei Eisenstein, “The Filmic Fourth Dimension,” in Film Form, trans. by Jay
Leyda (San Diego, New York, London: Harcourt Brace & Company, 1949).
168
Eisenstein, “A Dialectical Approach to Film Form,” in Film Form.
169
Eisenstein, “Statement,” in Film Form, and “Synchronization of Senses,” in
Film Sense, trans. by Jay Leyda (San Diego, New York, London: Harcourt Brace
& Company, 1942).
170
For an excellent theoretical analysis of QuickTime and the digital moving image in
general, see Vivian Sobchack, “Nostalgia for a Digital Object: Regrets on the
Quickening of QuickTime.”
171
Private communication, Helsinki, October 4, 1999.
172
Nelson Goodman, Languages of Art, second edition (Indianapolis and
Cambridge: Hackett Publishing Company, 1976), 252-253.
173
Roland Barthes, “From Work to Text,” trans. Stephen Heath, in Image-Music-
Text (New York: Hill and Wang, 1977).
174
www.yahoo.com, accessed March 27, 1999.
175
Brenda Laurel, quoted in Rebecca Coyle, "The Genesis of Virtual Reality," in
Future Visions: New Technologies of the Screen, edited by Philip Hayward and
Tana Wollen (London: British Film Institute, 1993), 162.
176
Fisher, 430. Emphasis mine — LM.
177
Fisher defines telepresence as "a technology which would allow remotely
situated operators to receive enough sensory feedback to feel like they are really
at a remote location and are able to do different kinds of tasks." Scott Fisher,
"Visual Interface Environments," in The Art of Human-Computer Interface
Design, edited by Brenda Laurel (Reading, Mass.: Addison-Wesley Publishing
Company, Inc., 1990), 427.
178
I am grateful to Thomas Elsaesser for suggesting the term “image-instrument”
and also making a number of other suggestions regarding the “Teleaction” section as
a whole.
179
Bruno Latour, "Visualization and Cognition: Thinking with Eyes and Hands,"
Knowledge and Society: Studies in the Sociology of Culture Past and Present 6
(1986): 1-40.
180
Ibid., 22.
181
Ibid., 8.
182
http://telegarden.aec.at, accessed March 27, 1999.
183
Walter Benjamin, "The Work of Art in the Age of Mechanical
Reproduction," in Illuminations, ed. Hannah Arendt (New York: Schocken Books,
1969).
184
Paul Virilio, "Big Optics," in On Justifying the Hypothetical Nature of Art and
The Non-Identicality Within The Object World, ed. Peter Weibel (Köln, 1992).
Virilio's argument can also be found in his other texts, for instance, "Speed and
Information: Cyberspace Alarm!" in CTHEORY
(www.ctheory.com/a30-cyberspace_alarm.html) and Open Sky, trans. by Julie Rose
(London and New York: Verso, 1997).
185
Virilio, "Big Optics," 90.
186
Jonathan Crary, Techniques of the Observer: On Vision
and Modernity in the Nineteenth Century (Cambridge: The MIT Press, 1990), 10.
187
This point is argued in Mitchell, The Reconfigured Eye.
188
Jacques Lacan, The Four Fundamental Concepts of
Psycho-Analysis, ed. Jacques-Alain Miller (New York and London: W.W.Norton,
1978), 95.
189
Martin Jay, Downcast Eyes: The Denigration of Vision in
Twentieth-Century French Thought (Berkeley: University of California Press, 1993).
190
For a detailed analysis of this story, see Stephen Bann, The True Vine: On
Visual Representation and the Western Tradition (Cambridge: Cambridge
University Press, 1989).
191
Onyx is a faster version of RealityEngine which was also manufactured by
Silicon Graphics. See www.sgi.com
192
I am grateful to Peter Lunenfeld for pointing out this connection to me.
193
For an overview of the early history of computer art which includes the
discussion of the “turn to illusionism,” see Frank Dietrich, "Visual Intelligence:
The First Decade of Computer Art," in Computer Graphics, 1985.
194
Andre Bazin, What is Cinema? (Berkeley: University of California Press,
1967-71); Stephen Bann, The True Vine: on Visual Representation and the
Western Tradition (Cambridge, England, and New York: Cambridge University
Press, 1989).
195
On the history of illusionism in cinema, see the influential theoretical analysis
by Jean-Louis Comolli, "Machines of the Visible," in The Cinematic Apparatus,
edited by Teresa De Lauretis and Stephen Heath (New York: St. Martin's Press,
1980). I discuss Comolli's argument in more detail in the “Synthetic Realism as
Bricolage” section below.
196
André Bazin, What is Cinema? Vol. 1 (Berkeley: University of California
Press, 1967), 20.
197
Bazin, What is Cinema? Vol. 1, 21.
198
Bazin, What is Cinema? Vol. 1, 20.
199
Bazin, What is Cinema? Vol. 1, 36-37.
200
Jean-Louis Comolli, "Machines of the Visible," 122.
201
Bordwell, David and Janet Staiger. "Technology, Style and Mode of
Production," in David Bordwell, Janet Staiger and Kristin Thompson, The
Classical Hollywood Cinema, 243-261.
202
Cook, R., L. Carpenter and E. Catmull. "The Reyes Image Rendering
Architecture." Computer Graphics. 21.4 (1987): 95. Emphasis mine - L.M.
203
Cynthia Goodman, Digital Visions (New York: Harry N. Abrams, Inc., 1987),
22, 102.
204
Carpenter, L., A. Fournier and D. Fussell. "Fractal Surfaces."
Communications of the ACM. 1981.
205
Gardner, Geoffrey Y. "Simulation of Natural Scenes Using Textured Quadric
Surfaces." Computer Graphics. 18.3 (1984): 21-30.
Gardner, Geoffrey Y. "Visual Simulation of Clouds." Computer Graphics. 19.3
(1985): 297-304.
206
Gardner (1984), 19.
207
Reeves, William T. "Particle Systems — A Technique for Modeling a Class
of Fuzzy Objects." ACM Transactions on Graphics. 2.3 (1983): 91-108.
208
Magnenat-Thalmann, Nadia and Daniel Thalmann. "The Direction of Synthetic
Actors in the Film 'Rendezvous a Montreal'." IEEE Computer Graphics and
Applications. December 1987.
209
Carignan, M., Yang, Y., Thalmann, N., and Thalmann, D. "Dressing
Animated Synthetic Actors with Complex Deformable Clothes." Computer
Graphics. 26.2 (1992): 99-104.
210
Anjyo, K., Usami, Y., and Kurihara, T. "A Simple Method for Extracting the
Natural Beauty of Hair." Computer Graphics. 26.2 (1992): 111-120.
211
Steve Neale, Cinema and Technology (Bloomington: Indiana University
Press, 1985), 52.
212
The following are just a few well-known “classics” in the field devoted to
this research: Nelson Max, "Vectorized procedure models for natural terrain:
waves and islands in the sunset," Computer Graphics 15.3 (1981); Ken Perlin, "An
Image Synthesizer," Computer Graphics 19.3 (1985): 287-296; William
T. Reeves, "Particle Systems — A Technique for Modeling a Class of Fuzzy
Objects," ACM Transactions on Graphics 2.3 (1983): 91-108; William T. Reeves
and Ricki Blau, "Approximate and Probabilistic Algorithms for Shading and
Rendering Structured Particle Systems," Computer Graphics 19.3 (1985): 313-322.
213
http://www.worlds.com, accessed September 9, 1999.
214
http://www.activeworlds.com, accessed September 9, 1999.
215
Cynthia Goodman, Digital Visions, 18-19.
216
J. F. Blinn, "Simulation of Wrinkled Surfaces," Computer Graphics (August
1978): 286-92.
217
The research in VR aims to go beyond the screen image in order to simulate
both the perceptual and bodily experience of reality.
218
See Roman Jakobson, "Closing Statement: Linguistics and Poetics," in Style
In Language, ed. Thomas Sebeok (Cambridge, Mass.: The MIT Press, 1960).
219
Walter Benjamin, "The Work of Art in the Age of Mechanical Reproduction,"
in Illuminations, ed. Hannah Arendt (New York: Schocken Books, 1969).
220
Private communication, September 1995, St. Petersburg.
221
On theories of suture in relation to cinema, see chapter 5 of Kaja Silverman,
The Subject of Semiotics (New York: Oxford University Press, 1983).
222
www.adweek.com, January 18, 1999.
223
http://www.plumbdesign.com/thesaurus/, accessed May 14, 1999.
224
According to Janet Murray, digital environments have four essential
properties: they are procedural, participatory, spatial and encyclopedic. As can be
seen, spatial and encyclopedic can be correlated with the two forms I describe
here: navigable space and a database. Janet Murray, Hamlet on the Holodeck –
The Future of Narrative in Cyberspace (Cambridge, Mass.: The MIT Press, 1997),
73.
225
Sigfried Giedion, Mechanization Takes Command, a Contribution to
Anonymous History (New York: Oxford University Press, 1948).
226
"database," Britannica Online.