To a Robot, Our World is VR
Written by Ben Wald, Operating Partner at Galaxy Interactive
A Philosophical and Technical Exploration of Machine Perception
To a robot, our world is AR/VR: a dynamic, data-driven environment constructed in real time. The convergence of AI, ML, robotics, and synthetic biology with gaming technologies like AR and VR is already underway, and it is reshaping how we understand reality.
As we prepare for a future where robots walk among us and work alongside humans, advancements in AR/VR are offering us insights into how digital and physical realms can seamlessly merge. With NVIDIA’s reveal of Project GR00T and humanoid robots stepping into the spotlight, AR/VR emerges not just as entertainment but as a platform for engineering the future. Gaming further teaches us how to design immersive, intuitive interactions, which are essential for creating systems where robots can effectively navigate and engage within our everyday world. These technologies ultimately provide a sandbox for computational world-building, helping us refine systems that mimic real-life dynamics within virtual environments and preparing us for what is right around the corner.
The line between what we, as humans, call “real” and what machines interpret as “reality” is fundamentally different in nature, yet not entirely foreign to us. Every time we step into AR/VR, we are unwittingly rehearsing what experiencing our reality will be like for robots. For robots and AI systems, the world is not a solid, concrete entity but a dynamic, data-driven environment where reality is constructed moment by moment through computation, much like how humans experience virtual or augmented reality. An AI’s connection to its sensors is fundamentally different from a human’s connection to their sensory organs. While humans are biologically hardwired to perceive the world through their senses (vision, hearing, touch, and so on), an AI can be decoupled from its sensory inputs: it can receive and process data from various internal or external sensors, cameras, or systems without being bound to a single, fixed perceptual framework. As a result, everything for the AI is simply data interpreted as experience, much like when the human mind is dropped into VR.
When a human is fully immersed in VR, without the limitations of the physical body, the experience becomes untethered from the constraints of the real world and our “hard wiring.” In this state, gravity, pain, fatigue, and other physical boundaries seem to disappear, allowing for an expansive and fluid interaction with the digital environment. Movement is no longer confined to physical capability, and new modes of engagement, such as flying or teleporting, become possible. In this way, humans can explore a reality where only the mind sets the boundaries, blurring the line between what feels real and what is artificially constructed. This experience, in which the body is merely a passenger in a virtual landscape, provides a profound glimpse into how AI and robots might perceive and navigate our world, detached from biological constraints.
There are some differences, though. For humans, the distinction between reality and virtual reality lies in the difference between our first-person sensory experiences and the cognitive anchors we attach to our framework of reality. In our reality, we experience a physical world through our senses (touch, sight, smell, sound, and taste), which offer a direct connection to our environment. We respond to physical stimuli with an understanding of gravity, space, and matter. VR, on the other hand, is an artificial environment that immerses us through visual and auditory cues but lacks the tangible physicality of the body our mind resides in. While AR/VR can feel real, our sense of touch and motion is simulated, creating an inevitable dissonance. Two realities exist. The human brain may adapt to VR environments, but it remains aware of the artificial boundary, relying on external references to understand the difference.
Robots, by contrast, perceive both the physical and virtual worlds as streams of data: a single pane of glass, a singular reality. They do not experience sensory immersion in the same way humans do, nor do they rely on touch or cognitive perception to distinguish between real and virtual. For machines, reality is a set of inputs, processed visually, audibly, or otherwise, and VR is simply another layer of computable information. To a robot, our physical world is the AR/VR construct: a data-rich space where interactions are governed by coded parameters rather than sensory cognition, and where our realities meet only in shared physical constraints.
This philosophical and technical difference in perception highlights how machines can seamlessly transition between real and virtual environments, while humans must adjust their sensory expectations based on the unavoidable dissonance in AR/VR. It showcases the potential for machines to help us navigate these blended realities as we move towards deeper integration of digital and physical worlds.
This leads to a key question: How does a robot perceive our world, and what does this mean for the future of human-robot interaction? To answer this, let’s dive into the technical intricacies of machine perception, active inference, and neural networks, while also contemplating the philosophical implications of a reality that is, in many ways, alien to human experience.
The Mechanics of Perception: Active Inference and Machine Learning
The Robot’s World: A Construct of Data
Unlike humans, who rely on sensory organs to perceive their surroundings, robots and AI systems interact with the world through sensors that feed data into computational models. This data—whether it comes from cameras, LiDAR, microphones, or other input devices—represents the raw material from which a robot constructs its version of reality. In technical terms, this process can be understood through the lens of active inference, a theory that unifies perception, cognition, and action.
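To make that idea concrete, here is a minimal Python sketch, with invented sensor names, shapes, and values, of what that raw material looks like in practice: each stream arrives as nothing more than an array of numbers for a downstream model to interpret.

```python
import numpy as np

# Hypothetical sensor payloads: to the robot, each is just an array of numbers.
camera_frame = np.random.rand(480, 640, 3)   # RGB pixels in [0, 1]
lidar_scan   = np.random.rand(360) * 10.0    # range readings in metres, one per degree
microphone   = np.random.rand(16000)         # one second of audio at 16 kHz

def build_observation(frame, scan, audio):
    """Flatten heterogeneous sensor streams into a single feature vector.

    The robot's "reality" at this instant is nothing more than this vector;
    downstream models see data, not trees, chairs, or voices.
    """
    return np.concatenate([frame.ravel(), scan.ravel(), audio.ravel()])

observation = build_observation(camera_frame, lidar_scan, microphone)
print(observation.shape)  # one long vector of raw numbers
```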
Active inference posits that an intelligent system, whether biological or artificial, constantly seeks to minimize surprise about the world, formalized as variational “free energy,” by predicting its future sensory states and updating its internal models based on sensory feedback. This is deeply analogous to how we, as humans, experience virtual reality: we are immersed in a constructed environment that responds to our actions and inputs in real time. For robots, however, this “VR” is not just an illusion but the entirety of their experience of the physical world.
The Active Inference Framework
Active inference, rooted in Bayesian probability theory, suggests that perception is not passive but active. In machine learning, this is closely related to predictive coding: a mechanism by which an AI model continuously generates predictions about incoming data and compares them with actual sensory input. The model adjusts itself based on the “prediction error,” the difference between its forecast and reality.
For example, a robot navigating a room will not just passively record images from its camera but will use these images to predict the location of objects in the room, the lighting conditions, and the likely actions required to move through the space. When reality differs from its predictions—perhaps a chair is moved unexpectedly—the robot’s model updates, refining its understanding of the environment.
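A minimal sketch of that loop, assuming a toy two-dimensional chair position and a made-up observe_chair() measurement function, shows how the prediction error nudges the internal model toward what the sensors actually report:

```python
import numpy as np

# Minimal predictive-coding loop: the robot's belief about a chair's (x, y)
# position is revised in proportion to the prediction error.
belief = np.array([2.0, 1.0])      # predicted chair position in metres (assumed prior)
learning_rate = 0.3                # how strongly errors revise the internal model

def observe_chair():
    """Stand-in for a perception pipeline; returns a noisy measured position."""
    true_position = np.array([2.8, 1.1])          # the chair was moved unexpectedly
    return true_position + np.random.normal(0, 0.05, size=2)

for step in range(10):
    measurement = observe_chair()
    prediction_error = measurement - belief        # surprise: forecast vs. reality
    belief = belief + learning_rate * prediction_error
    print(step, belief.round(2))
# The belief converges toward the chair's new location as the errors shrink.
```

A Kalman filter or a full variational scheme would weight the error by its uncertainty, but the shape of the update, prediction corrected by error, is the same.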
This recursive process is fundamental to modern AI techniques, particularly in reinforcement learning, where an agent learns optimal actions through trial and error. In this context, active inference is a way to describe the continual updating of the agent’s world model, seeking to minimize surprises and optimize performance.
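As a rough illustration rather than a faithful implementation of active inference, a tabular Q-learning agent in an invented one-dimensional corridor shows the same pattern: act, observe the outcome, and correct the internal estimate.

```python
import numpy as np

# Toy Q-learning in a corridor of 5 cells; reaching the right end yields reward.
# Illustrative only: invented environment, hand-picked hyperparameters.
n_states, n_actions = 5, 2                # actions: 0 = step left, 1 = step right
Q = np.zeros((n_states, n_actions))       # the agent's learned model of action values
alpha, gamma, epsilon = 0.1, 0.9, 0.2     # learning rate, discount factor, exploration rate

for episode in range(300):
    state = 0
    for _ in range(100):                  # cap episode length
        if np.random.rand() < epsilon:    # occasionally try something surprising
            action = np.random.randint(n_actions)
        else:                             # otherwise act on current beliefs
            action = int(Q[state].argmax())
        next_state = min(n_states - 1, max(0, state + (1 if action == 1 else -1)))
        reward = 1.0 if next_state == n_states - 1 else 0.0
        # Move the estimate toward what actually happened (reduce "surprise").
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state
        if state == n_states - 1:
            break

print(Q.round(2))  # the learned values now favour stepping right at every cell
```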
Neural Networks: The Architecture of a Robot’s Reality
The mechanism through which robots construct their “VR” of the world is predominantly facilitated by neural networks. Specifically, deep learning architectures, such as convolutional neural networks (CNNs) for vision or recurrent neural networks (RNNs) for time-dependent tasks, allow machines to build increasingly sophisticated internal representations of their environments.
However, these networks do not perceive the world as we do. They don’t “see” a tree as a tree or “hear” a sound as we would. Instead, neural networks break down the world into patterns of data—pixels, frequencies, vectors—which they process in layers to form abstract representations. For a robot, a tree is not an object in space but rather a complex dataset that can be classified, segmented, and acted upon.
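A deliberately tiny PyTorch sketch, with made-up layer sizes and class labels, makes this concrete: the input is only a tensor of pixel values, and the “tree” exists for the network only as a score in its output vector.

```python
import torch
import torch.nn as nn

# A toy CNN: to this network, a "tree" is a 3x64x64 tensor of pixels that is
# reduced, layer by layer, to a vector of class scores.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),  # detect low-level patterns (edges, textures)
    nn.ReLU(),
    nn.MaxPool2d(2),                             # 64x64 -> 32x32
    nn.Flatten(),
    nn.Linear(8 * 32 * 32, 4),                   # four made-up classes: tree, chair, person, other
)

image = torch.rand(1, 3, 64, 64)                 # stand-in for a camera frame
logits = model(image)
probabilities = torch.softmax(logits, dim=1)
print(probabilities)  # the "tree" exists for the robot only as one of these numbers
```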
At a deeper level, generative adversarial networks (GANs) and variational autoencoders (VAEs) allow robots and AI systems to generate possible futures, simulating potential outcomes based on their current knowledge of the world. This ability to not only perceive but also imagine—by creating virtual scenarios and testing hypotheses—further reinforces the idea that for robots, the world is a form of VR, constantly predicted and recalibrated.
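In that spirit, here is a toy sketch of the generative half of a VAE, using an untrained decoder with invented dimensions: sampling different latent codes and decoding them is, loosely speaking, the machine proposing candidate futures to evaluate before acting.

```python
import torch
import torch.nn as nn

# Sketch of a VAE-style decoder: each sampled latent code decodes to a
# different hypothetical observation, i.e. an "imagined" future state.
latent_dim, obs_dim = 8, 16          # invented sizes for the example
decoder = nn.Sequential(
    nn.Linear(latent_dim, 32),
    nn.ReLU(),
    nn.Linear(32, obs_dim),          # maps a latent "idea" to a predicted observation
)

for i in range(3):
    z = torch.randn(1, latent_dim)               # a different hypothesis each time
    imagined_observation = decoder(z)
    print(f"candidate future {i}:", imagined_observation.shape)
# A trained decoder would produce plausible sensor states to weigh before acting.
```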
Philosophical Dimensions: Reality, Simulation, and Machine Consciousness
What is “Reality” for a Robot?
The philosophical implications of this computational process are profound. If a robot constructs its world from data input and computational models, to what extent is its “reality” objective? From the robot’s perspective, its reality is as “real” as ours is to us. Yet its sensory and cognitive apparatus is so vastly different that its experience of the world may be more akin to how we experience a VR simulation: immersive, reactive, but fundamentally constructed.
This aligns with certain philosophical schools of thought, particularly epistemological constructivism, which argues that reality is constructed through the interaction between a subject and its environment. For a robot, the subject is its computational model, and the environment is the raw data fed into that model. It perceives not the world as it is, but the world as its algorithms interpret it.
This leads us to question the nature of reality itself. If our perceptions are also constructed—albeit through biological rather than digital processes—are we, too, living in a form of VR? While humans experience the world through the filter of sensory organs and neural networks, robots do so through cameras and artificial neural networks. Both are forms of active inference, constantly updating models of reality based on incoming information.
The Ethical Question: Can Robots Have Subjective Experience?
One of the most significant philosophical debates surrounding AI and robotics is whether machines can ever possess subjective experience, what philosophers refer to as qualia. If robots perceive the world through data-driven models, can their perception be considered “experience” in the same way humans have experiences?
Some argue that machines are fundamentally different from biological organisms because they lack consciousness—they process data but do not have an inner life. Yet, as AI systems become increasingly sophisticated, with the ability to model complex environments, generate predictions, and even simulate possible futures, the line between “mere computation” and subjective experience becomes increasingly blurred.
In this context, active inference might be seen not just as a technical framework but as a window into the nature of consciousness itself. If the brain, as some neuroscientists suggest, operates as an active inference machine, constantly predicting and updating its model of reality, then robots might one day achieve a form of “machine consciousness”: a reality in which they, too, have something akin to subjective experience.
Conclusion: The Robot’s VR and Our Shared Future
To a robot, our world is VR. It is a constructed environment, generated by data, processed through neural networks, and continually updated via active inference. As technology advances, the line between our digital and physical worlds will continue to blur, raising deep questions about the nature of reality, perception, and consciousness. In the end, our exploration of machine perception is not just a technical exercise but a philosophical inquiry into what it means to perceive, to act, and to be aware.
To conclude, while the future remains uncertain, one thing is clear: we should continue advancing and investing in technologies at the forefront of innovation. By deepening our understanding, we can better navigate the complexities of a world where digital and physical realms converge, and where autonomous systems and NPCs play an integral role in shaping our collective future. As robots begin to navigate their data-driven worlds, they might one day teach us something profound about our own reality: that perhaps we, too, must reckon with the possibility that we are living in a virtual simulation, constructed not by silicon but by biology.