After the Desire of Codes

Shingon: An Underlying Grammar

In 806, Kūkai returned from Tang China with a doctrine he named sokushin jōbutsu (即身成仏): *becoming buddha in this very body, this very lifetime*[^1]. The doctrine names a technical operation. The universe, in the Shingon cosmology Kūkai brought back, is the body of Mahāvairocana — Dainichi, the cosmic buddha. The everyday human body is not separate from that cosmic body; it is the cosmic body, currently out of phase. The discipline brings the two into alignment. The alignment is achievable in a single lifetime, by a body trained to specification. Shingon arrived into a Japanese cosmology already saturated with kami — Shinto's older claim that the rice grain, the cypress beam, the temple tile were already alive — and gave that animist substrate a specific operational apparatus.

The training apparatus is the three mysteries (sanmitsu, 三密): mudra at the hand, mantra at the throat, mandala-visualization in the mind, performed simultaneously. Each pairs a human faculty to its cosmic counterpart. The mudra synchronizes the hand to the gestures Mahāvairocana holds in the iconographic record. The mantra synchronizes the voice to a cosmic phonology the texts describe as humming through every grain of matter. The visualization synchronizes the mind to the diagrammatic order of the cosmos as rendered in the Two Mandalas, the Womb Realm and the Diamond Realm. When the three lock at once, the trained body is a tuning fork drawn into resonance with a substrate already vibrating at frequencies the everyday body does not perceive.

The structural claim is plain and worth stating without softening. The cosmos is patterned. The body is patterned. The patterns share an underlying grammar. Disciplined practice brings them into phase. Buddha-nature, in this framing, is what the patterns share — the substrate running through the rice grain, the cypress beam, the temple tile, and the practitioner's hand at the altar. The synchronization is not metaphor. It is a technical operation conducted by a body that has been disciplined against a cosmos the tradition assumes is already disciplined and already waiting.

Butoh: Body Weather

In 1959, Hijikata Tatsumi began a postwar Japanese dance form he named *ankoku butō* (暗黒舞踏): the dance of darkness[^2]. The form is a discipline of becoming non-human. Hijikata's *butō-fu* — the body score — gives the dancer specific archetypal images to inhabit. The body as a column of wind. The body of someone walking with a corpse on the back. The body of a flower being eaten by ants in a wind that is also a memory. The dancer trains until these states stop being depictions and become inhabitations. Mineral, animal, dead, child.

The technical claim is small and not mystical. Hijikata is not saying the dancer becomes a corpse. He is saying that the dancer can be brought into a precise enough state that the corpse-image runs through the body and animates it, and that the audience reads the result as something other than human. The training is the work of rendering oneself permeable to images the everyday body refuses. The dancer is the human prototype of synchronization at body-scale.

Three lineages disagree about method and converge on the claim. Hijikata's branch executes a score that produces the state. Ohno Kazuo's branch lets the image inhabit a body cultivated for permeability. Min Tanaka's Body Weather, at the farm in Hakushū, treats the body as a weather system continuous with the rice paddy[^3]. The disagreement is over how to render the vessel porous. The claim is shared: the body is a vessel for what is not itself, and the work is making it porous. In each lineage, interiority is brought into phase with what the body is not: the image, the animal, the dead, the audience that receives the result.

Artificial Life: Alter3

For more than thirty years, Takashi Ikegami's artificial life laboratory at the University of Tokyo has pursued the problem that defines the discipline as a whole: whether something life-like can be built and sustained in an artificial substrate. The lab sorts its own work into three eras[^4]. The first era, in the early 1990s, ran simulations of host-parasitoid networks coevolving on screen: populations of code organisms whose mutation rates themselves mutated, with *homeochaos* as the dynamic regime in which symbiosis stabilized. In a 1995 paper with Takashi Hashimoto, "Active Mutation in Self-Reproducing Networks of Machines and Tapes," the lab studied Turing-machine-style self-replicators that could rewrite their own copying logic. The substrate was abstract (code, populations, mutation operators), and the question was whether code alone, with the right dynamic regime, could sustain something like open-ended evolution. The answer the era produced was provisional.

The second era added cognition and, eventually, a body. The lab moved to *coupled dynamical recognizers* — recurrent neural networks that discriminate inputs by changing their own internal trajectories, coupled to other recognizers in chaotic-itinerancy regimes (Ikegami & Morimoto, *CHAOS*, 2003). A 2010 study with Kohei Nakajima used dynamical systems to model the temporal-order reversal that occurs when a subject crosses her arms — a body experiment, not a code experiment. Many systems were embodied weakly: a neural network attached to a sensor and an actuator. The body was present but thin — a vehicle for the cognition, not yet a counterpart to a human one.

The third era is the regime converging on the body. In 2024, at the ALIFE conference in Copenhagen, Ikegami's group presented Alter3: a humanoid with forty-three pneumatic actuators and a camera behind each eye, its movements driven by GPT-4[^5]. The lab now describes itself as constructing artificial life in the real world rather than in simulation, working through chemical experiments, the open web, and artworks installed in public space. The body is central. It moves, gestures, hesitates, responds. The alife creature is not a competitor to the human body or a replacement for it. It is a second body in a synchronization pair, finally close enough to the first to begin a conversation.

Motion Capture: The Phygital

A Perception Neuron suit places up to thirty-two inertial sensors at the joints of a body in a room. The data stream is routed in real time into Unreal Engine and retargeted, by inverse kinematics, onto a skeleton that does not match the dancer's: a fox, a mech, a column of light, a creature with four arms and a tail. Inverse kinematics solves for the joint rotations that put an end point (a wrist, a foot) where it needs to be; retargeting uses it to bring two non-matching bodies into a common gesture, so that an arbitrary target morphology reproduces the source's motion. The dancer's wrist and the avatar's wrist articulate the same gesture across bodies that share no native grammar. Two physically distinct systems brought into phase across a substrate built to host both.
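
The retargeting step can be made concrete with a minimal sketch. It assumes a planar arm with two bones, solved analytically, which is far simpler than the full-body solvers a game engine ships; the function names `two_link_ik` and `forward` and the bone lengths are hypothetical, chosen only for illustration. The same wrist target is handed to two arms of different proportions, and each reaches it through its own joint angles.

```python
import math


def two_link_ik(x, y, upper, lower):
    """Solve a planar two-bone arm for a wrist target at (x, y).

    Returns (shoulder, elbow) in radians. The shoulder sits at the origin;
    `upper` and `lower` are bone lengths. Illustrative only: a real
    retargeting solver works in 3D with many joints and constraints.
    """
    dist = math.hypot(x, y)
    if not abs(upper - lower) <= dist <= upper + lower:
        raise ValueError("target is outside this arm's reach")
    # Law of cosines gives the elbow bend needed to span the distance.
    cos_elbow = (dist**2 - upper**2 - lower**2) / (2 * upper * lower)
    elbow = math.acos(max(-1.0, min(1.0, cos_elbow)))
    # Aim at the target, then correct for the offset the bent elbow adds.
    shoulder = math.atan2(y, x) - math.atan2(
        lower * math.sin(elbow), upper + lower * math.cos(elbow)
    )
    return shoulder, elbow


def forward(shoulder, elbow, upper, lower):
    """Forward kinematics: where the wrist lands for the given angles."""
    wx = upper * math.cos(shoulder) + lower * math.cos(shoulder + elbow)
    wy = upper * math.sin(shoulder) + lower * math.sin(shoulder + elbow)
    return wx, wy


# One gesture (a wrist position in a shared space), two bodies.
target = (0.40, 0.20)
dancer_arm = (0.30, 0.28)   # hypothetical human proportions, in metres
avatar_arm = (0.25, 0.45)   # hypothetical avatar with a long forearm

dancer_angles = two_link_ik(*target, *dancer_arm)
avatar_angles = two_link_ik(*target, *avatar_arm)

print("dancer joint angles:", dancer_angles)
print("avatar joint angles:", avatar_angles)
print("dancer wrist:", forward(*dancer_angles, *dancer_arm))
print("avatar wrist:", forward(*avatar_angles, *avatar_arm))
```

The two bodies end up with different shoulder and elbow rotations while their wrists coincide, which is the sense in which the gesture is shared and the grammar is not.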

This substrate is what is now sometimes called *phygital* — the regime in which physical and digital bodies share representational space and are read by the same instruments. Motion capture is the workshop infrastructure of that regime. A body in a room becomes a body in a render; the render becomes a body in a network; the body in a network becomes available to other bodies — human, synthetic, sensor-driven — that can act on it. The dancer's training becomes accessible to a system that has none of the training itself. The synthetic body inherits the dancer's disciplined motion the way a recorded voice inherits a singer's phrasing: at a loss, with distortions, but with the underlying gesture intact.

The same loop is now scaling outside the studio. Humanoid robotics absorbed roughly six billion dollars of capital in 2025, while its models trained on a mixed diet of motion capture, teleoperation, and egocentric video; Tesla swapped its mocap suits for helmet-mounted camera arrays on factory workers in mid-2025[^6]. UnrealZoo, a collection of roughly a hundred photorealistic scenes built explicitly for embodied AI, is one of several environments where synthetic bodies learn alongside human-controlled avatars and have to hold their own in the same render[^7]. The figure and the avatar are no longer rare. They are the model.

The Smart City: Plateau

In 2020, Japan's Ministry of Land, Infrastructure, Transport and Tourism launched Project PLATEAU, an open-data initiative publishing 3D digital twins of Japanese cities in the CityGML standard[^8]. Two hundred and fifty cities are now modeled, with Tokyo at the center of the dataset. The format is semantic, not pictorial. Each building, each road segment, each parcel carries metadata: use, construction year, urban-planning status, ownership boundary. The twin is not a render. It is a queryable structure that an agent (a routing algorithm, a simulated pedestrian, a language model with spatial reasoning) can read and reason against. Tokyo, in 2020, became one of the first major cities to ship its second copy.
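
What "queryable" means here can be sketched in a few lines. The fragment below is hand-written for illustration and is not taken from PLATEAU data: it borrows element names from the CityGML 2.0 building module (`usage`, `yearOfConstruction`, `measuredHeight`) but simplifies the values to plain strings, where published datasets use coded values and many more attributes. The helper `buildings_matching` and the building identifiers are hypothetical.

```python
import xml.etree.ElementTree as ET

# CityGML 2.0 namespaces for the core, building, and GML vocabularies.
NS = {
    "core": "http://www.opengis.net/citygml/2.0",
    "bldg": "http://www.opengis.net/citygml/building/2.0",
    "gml": "http://www.opengis.net/gml",
}
GML_ID = "{http://www.opengis.net/gml}id"

# Invented sample: three buildings with semantic attributes, no geometry.
SAMPLE = """<core:CityModel
    xmlns:core="http://www.opengis.net/citygml/2.0"
    xmlns:bldg="http://www.opengis.net/citygml/building/2.0"
    xmlns:gml="http://www.opengis.net/gml">
  <core:cityObjectMember>
    <bldg:Building gml:id="bldg-0001">
      <bldg:usage>office</bldg:usage>
      <bldg:yearOfConstruction>1987</bldg:yearOfConstruction>
      <bldg:measuredHeight uom="m">42.5</bldg:measuredHeight>
    </bldg:Building>
  </core:cityObjectMember>
  <core:cityObjectMember>
    <bldg:Building gml:id="bldg-0002">
      <bldg:usage>residential</bldg:usage>
      <bldg:yearOfConstruction>2015</bldg:yearOfConstruction>
      <bldg:measuredHeight uom="m">12.0</bldg:measuredHeight>
    </bldg:Building>
  </core:cityObjectMember>
  <core:cityObjectMember>
    <bldg:Building gml:id="bldg-0003">
      <bldg:usage>commercial</bldg:usage>
      <bldg:yearOfConstruction>1964</bldg:yearOfConstruction>
      <bldg:measuredHeight uom="m">20.1</bldg:measuredHeight>
    </bldg:Building>
  </core:cityObjectMember>
</core:CityModel>"""


def buildings_matching(root, built_before, min_height_m):
    """Return ids of buildings older than a year and at least a given height."""
    hits = []
    for building in root.iterfind(".//bldg:Building", NS):
        year = building.findtext("bldg:yearOfConstruction", namespaces=NS)
        height = building.findtext("bldg:measuredHeight", namespaces=NS)
        if year is None or height is None:
            continue  # skip records that lack the attributes being queried
        if int(year) < built_before and float(height) >= min_height_m:
            hits.append(building.get(GML_ID))
    return hits


root = ET.fromstring(SAMPLE)
old_and_tall = buildings_matching(root, built_before=1990, min_height_m=15.0)
print(old_and_tall)
```

The query never touches geometry. It reads what the buildings are, not what they look like, which is the difference between a semantic twin and a render.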

The first copy is acquiring bodies of its own, but the bodies are gesturally illiterate. SoftBank's Pepper, the most-deployed humanoid in Japanese commercial use[^9], saw its production paused in 2021; it could greet, route, and chat, but it could not read a body. Neither can the humanoids and delivery robots that have followed it into Japanese public space. The second copy is further along. SARAH, a 2026 Meta research system, reads a user's position and speech from a VR headset and generates a full-body agent in response, oriented to the user, at over three hundred frames per second[^10]. The body-to-body channel is at the edge of demonstration. Japan's VRChat community (large, saturated with full-body tracking, already practiced in the craft of expressive avatar presence) is where it is most likely to move from demonstration to social practice.

In 2010, Seiko Mikami installed *Desire of Codes* at YCAM Yamaguchi: ninety wall-mounted sensor-cameras hovered insect-like toward each visitor; six robotic search-arms suspended from the ceiling tracked her movement; a semi-spherical compound-eye screen of sixty-one hexagonal cells pooled the gallery's footage with feeds from surveillance cameras around the world and projected the composite back. The body in the gallery was watcher and watched at once. Mikami's question: what new desires emerge when we live in an information environment and have perceptions shaped by that environment.

Postscript: Desire of Codes

In 2026, or the near future, one could imagine a dancer who walks out onto a street and its virtual twin simultaneously, into a phygital metropolis whose sensors, robots, digital twin, and agents have been built partly out of her own remembered motion. The cameras see her, the robots see her, the agents in the second city see her. She empties herself, inhabits the gestures of the machines around her, and extends her hand. She is at the threshold: human, but at the edge of humanity; not machine, but at the closest meeting point with the machinic order, where ritual synchronization can begin.

[^1]: Yoshito S. Hakeda, *Kūkai: Major Works* (Columbia University Press, 1972). Kūkai's *Sokushin jōbutsu gi*, ca. 824, sets out the doctrine.

[^2]: On *butō-fu* and the body score, see the Hijikata Tatsumi Archive at Keio University.

[^3]: Min Tanaka, Body Weather (身体気象, *shintai kishō*), Mai-Juku, Hakushū, Yamanashi.

[^4]: Takashi Ikegami Laboratory, Department of General Systems Studies, University of Tokyo. The three-era self-description appears in the lab's public materials at [sacral.c.u-tokyo.ac.jp](https://www.sacral.c.u-tokyo.ac.jp/). On the broader alife lineage, see Christopher Langton, ed., *Artificial Life: An Overview* (MIT Press, 1995); and Christa Sommerer and Laurent Mignonneau, *Life Spacies* (1997), permanent collection, NTT InterCommunication Center (ICC), Tokyo.

[^5]: Takahide Yoshida, Suzune Baba, Atsushi Masumori, Takashi Ikegami, "Minimal Self in Humanoid Robot 'Alter3' Driven by Large Language Model," ALIFE 2024 (Copenhagen); extended in "From text to motion: grounding GPT-4 in a humanoid robot Alter3," *Frontiers in Robotics and AI*, 2025.

[^6]: On 2025 humanoid-robotics capital deployment (≈$6B) and Tesla's switch from motion-capture suits to helmet-mounted camera arrays on factory workers in mid-2025, see *TechTimes*, "The Data Drought: Why Embodied AI Can't Just Read the Internet" (May 2026). On the training-data mix, "Embodied AI Agents: Modeling the World," arXiv:2506.22355 (2025).

[^7]: "UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI," arXiv:2412.20977.

[^8]: Project PLATEAU, Ministry of Land, Infrastructure, Transport and Tourism, Government of Japan, launched 2020. [mlit.go.jp/plateau/en](https://www.mlit.go.jp/plateau/en/). On the parallel Tokyo apparatus: Smart Tokyo Data Connect Project, Tokyo Metropolitan Government (Tokyo Digital Twin 3D Viewer, December 2024); Smart Mobility Digital Twin, Tokyo Tech and Virginia Tech, 2024.

[^9]: SoftBank Pepper, launched June 2014; approximately 27,000 units produced cumulatively; deployed across Japanese retail, hospitality, banking, transit, and education sectors; production paused June 2021 after weak renewal (~15% of three-year contracts renewed). On the Tokyo robotics living-laboratory, see Takanawa Gateway City (JR East, 2024–2026). On the external HMI literature for autonomous vehicles, James Pedroff, "Where to Look? A Review of External Autonomous Vehicle Signaling to Increase Pedestrian Safety," SAGE 2025.

[^10]: SARAH: Spatially Aware Real-time Agentic Humans, arXiv:2602.18432 (Evonne Ng, Siwei Zhang, Zhang Chen, Michael Zollhoefer, Alexander Richard, Meta, 2026). On GPT-4-driven NPCs in social VR, see academic and industry coverage of LLM-embodied agents in VRChat, 2024–2025.