The Realities of Robot Deployment: What It Takes for Embodied AI to Succeed
The hype around robots ignores the unstructured-environment problem.
2026 is undoubtedly the year of Physical AI.
Robots have been making headlines and dominating social feeds over the past year. We’re seeing humanoids step onto the stage, pulling off coordinated choreographies and impressive stunts. At the same time, they are being marketed as future home companions, promising to handle everyday, general-purpose tasks.

Robots were once designed for very specific tasks, but today we are sold the dream that they can evolve into general-purpose assistants capable of handling almost everything. The vision many companies push is Embodied AI, a concept that has become a buzzword for the next generation of intelligent robots.
What often goes unmentioned is that many of these impressive “autonomous” demos are recorded in tightly controlled environments, or with a human discreetly teleoperating the robot from behind the scenes. Building robots that act reliably in the real world is far more challenging than these glossy videos suggest. This article dives into the realities of robot deployment, why Embodied AI alone is not enough, and what it takes for it to succeed.
Contents
- What Embodied AI Really Means
- The Reality of Deploying Robots in the Wild
- Lessons from Field Deployment
  - 🔧 Technical Limitations
  - ⚙️ Operational Challenges
  - 💰 Financial Considerations
- Why Embodied AI Alone Isn’t Enough
- What It Takes to Succeed
What Embodied AI Really Means
Embodied AI is a field of artificial intelligence that integrates AI with physical systems so that they can dynamically interact with their environment. The key distinction from traditional robotics is the ability to operate beyond preprogrammed behaviours: these systems perceive, reason, and respond in real time. This adaptive loop can be illustrated through the three stages of See, Think, and Act shown in the diagram below.

It is an exciting area because it brings together multiple disciplines, such as robotics, computer vision, and reinforcement learning. Much of the recent progress in Embodied AI focuses on improving the Think step: performing adaptive reasoning, prediction, and planning under changing environments and with imperfect sensors. World models support this by providing robots with an internal representation of their surroundings, allowing them to predict future states and anticipate the consequences of their actions.
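The See, Think, Act loop can be sketched in a few lines. This is a toy illustration, not a deployed system: the 0.5 m stopping distance, the 0.1 m step, and the simulated readings are all arbitrary assumptions chosen to show the structure of the loop.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    obstacle_distance_m: float  # distance to the nearest obstacle ahead

def see(raw_reading_m: float) -> Observation:
    # See: turn a raw sensor reading into a structured observation.
    return Observation(obstacle_distance_m=raw_reading_m)

def think(obs: Observation) -> str:
    # Think: choose an action given the current observation.
    return "stop" if obs.obstacle_distance_m < 0.5 else "advance"

def act(action: str, position_m: float) -> float:
    # Act: apply the chosen action to the (simulated) world.
    return position_m + 0.1 if action == "advance" else position_m

# The real loop runs continuously; here we drive it with three
# simulated readings of an obstacle getting closer.
position_m = 0.0
for reading in [2.0, 1.0, 0.3]:
    position_m = act(think(see(reading)), position_m)
# The robot advances twice, then stops once the obstacle is within 0.5 m.
```

The point of the sketch is the closed loop itself: each action changes the world, which changes the next observation, which changes the next decision.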
This vision of autonomy is why Embodied AI remains a challenging scientific frontier, with many of its capabilities still far from solved. Yet even before reaching this level of intelligence, today’s robots, which rely on far simpler mechanisms, already struggle once they leave controlled environments.
The Reality of Deploying Robots in the Wild
As part of GovTech’s AI Practice, we worked on a pilot project that deployed robots in real environments. Under the Workplace Safety and Health (Work at Heights) Regulations 2013, it is mandatory to install effective guardrails or barriers at any open side or opening where a person could fall more than two metres. The regulation also specifies how these barriers should be built and installed so they function effectively, as reflected in the diagram below.

We wanted to explore automated safety barrier breach detection using robots and video analytics. Such a system could serve as an early warning mechanism, helping site teams detect hazards sooner, rely less on manual inspections, and maintain stronger compliance with safety regulations.
Through this project, we aimed to understand whether our long-term vision of autonomous construction site supervision is achievable, with safety barrier breach detection as one of many potential use cases. In this future, robots would conduct autonomous patrols, detect breaches, and alert stakeholders to possible non-compliance, reducing the need for constant on-site supervision and supporting more scalable, remote oversight.

To ground the project in real-world feasibility, we structured the work around two parallel tracks to examine whether onboard video analytics could detect breaches in practice, and whether the robot could reliably navigate construction sites, as seen in the diagram above.
Lessons from Field Deployment
Through this pilot project, we learned some hard lessons on the ground while evaluating the feasibility of the solution. We encountered technical, operational, and financial challenges, and this was where the gap between controlled testing and real-world conditions became clear.

🔧 A. Technical Limitations
Technical limitations showed up across the robot’s hardware, its software, and the computer vision solution we built for the project.
>> A.1. Navigation Stability

Unpredictable construction terrain made it difficult for the robot to judge whether it was safe to proceed. Loose rocks, uneven soil, and deep-water ponds created conditions where autonomous assessment was highly unreliable. In many cases, the robot could not determine whether a path was stable enough to traverse, so human judgment was required. As a result, human oversight remained essential, and full autonomy was not feasible across large portions of the site.
Most robots operating in dynamic environments adopt SLAM-based algorithms, which allow them to build and update maps while moving. Household robots such as robot vacuums are a common example, where the vacuum continuously remaps new obstacles that appear within its surroundings. However, SLAM alone proved unsuitable for construction sites: the robot we tested struggled to reliably detect boundaries such as no-go zones or sudden edge drops.

For this pilot project, we therefore adopted a waypoint navigation approach. This method operates on a known map: the robot is guided between predefined waypoints while avoiding obstacles in real time. It requires humans to map the area, post-process the map (as seen in the diagram above), and define the route before deployment. On a construction site where conditions change daily, this continuous need for remapping renders the solution semi-autonomous, since human input is still necessary.
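The waypoint idea can be sketched minimally, with obstacle avoidance omitted: the robot simply heads straight for each predefined waypoint in turn. The coordinates and the 0.5 m step size below are hypothetical.

```python
import math

def follow_waypoints(start, waypoints, step=0.5):
    """Greedy waypoint follower: head straight for each predefined
    waypoint in turn. Real deployments layer obstacle avoidance on
    top; this sketch shows only the predefined-route idea."""
    x, y = start
    path = [(x, y)]
    for wx, wy in waypoints:
        while math.hypot(wx - x, wy - y) > step:
            d = math.hypot(wx - x, wy - y)
            # take a fixed-size step towards the current waypoint
            x += step * (wx - x) / d
            y += step * (wy - y) / d
            path.append((x, y))
        x, y = wx, wy  # snap onto the waypoint once within one step
        path.append((x, y))
    return path

# Trace a short L-shaped route between two hypothetical waypoints.
path = follow_waypoints((0.0, 0.0), [(1.0, 0.0), (1.0, 1.0)])
```

Note that everything here assumes the map and route are already known, which is exactly why human remapping effort remains in the loop.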

Even with waypoint navigation, obstacle avoidance was not always successful. A robot typically relies on its LiDAR to detect obstacles in front of it, which works reasonably well indoors. Construction sites, however, have uneven and rocky terrain. To allow the robot to move across such ground, we set a height threshold so that it ignores low-height objects, preventing it from constantly stopping for minor debris or small rocks. This global threshold filtering, however, also introduced blind spots: the robot occasionally missed low-height obstacles, such as the base of a traffic cone or a person’s foot, which posed safety concerns.
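The blind-spot trade-off of a global height threshold is easy to see in a short sketch. The 0.15 m threshold and the sample points are illustrative, not our actual calibration.

```python
def filter_ground_clutter(points, height_threshold=0.15):
    """Drop LiDAR returns below a global height threshold (metres
    above ground). This stops the robot halting for pebbles, but it
    also blinds it to genuinely low obstacles below the threshold."""
    return [p for p in points if p[2] >= height_threshold]

# (x, y, z) returns from a hypothetical scan:
points = [
    (1.0, 0.0, 0.05),   # small rock: correctly ignored
    (2.0, 0.5, 0.10),   # base of a traffic cone: wrongly ignored
    (1.5, -0.3, 0.40),  # barrier post: kept as an obstacle
]
obstacles = filter_ground_clutter(points)
```

A single scalar cannot separate "safe debris" from "low hazard"; that distinction needs semantics, which is what the world-model discussion below is about.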
💡 With advancements in Embodied AI: Navigation capabilities could be significantly improved. More capable world models may allow robots to build richer representations of their surroundings and navigate more safely. In our current system, obstacle detection is based mainly on objects detected in front of the robot, and route planning selects the lowest-cost terrain path. However, the system has little awareness of what those objects are, or whether the terrain itself is stable enough to traverse.
Unlike our current system, which relies mainly on LiDAR geometry, a world model integrates information from multiple sensors, attaches semantic meaning to objects, and predicts how the environment may change, allowing risk to be estimated rather than assumed. Although these approaches are still at an early stage, these developments could eventually enable fully autonomous navigation in such complex environments.
>> A.2. Sensor Constraints
In practice, sensors often struggle to capture the information needed for reliable perception. Through this pilot project, we observed multiple ways in which sensor performance degraded.
Noisy and unreliable perception data for detection. The computer vision solution struggled with blurry RGB images and noisy depth-camera data. In some cases, it was challenging even for the human eye to recognise objects. While upgrading to a higher-quality camera may help, it does not fully resolve the underlying problems of motion blur and challenging foreground-background separation.

Furthermore, weather conditions can degrade sensor performance. Most of the hardware and sensors used were individually IP-rated for dust and water resistance, but rain or fog can still cause occlusion when lenses are covered in raindrops. Most autonomous vehicle companies adopt mitigation techniques such as hydrophobic coatings that help water roll off, pulsed air-puffers that blow droplets away, or tiny wipers and nozzles to clear the sensors. Further algorithmic processing can also be used to enhance degraded images for detection.
For LiDAR, rain introduced a different kind of distortion: each raindrop can reflect a LiDAR pulse, producing ghost points that appear as tiny floating objects. Rain-filtering algorithms and sensor fusion with other sensors such as cameras and radar may help to limit these effects. Even so, overall performance in adverse weather may still be limited.
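One common family of rain filters is the radius outlier filter: genuine surfaces produce clustered returns, while raindrop ghost points tend to be isolated. A brute-force sketch follows; the radius, neighbour count, and sample cloud are illustrative, and a production stack would use a KD-tree-backed implementation (e.g. Open3D's `remove_radius_outlier`) rather than this O(n²) loop.

```python
import math

def radius_outlier_filter(points, radius=0.3, min_neighbors=2):
    """Keep a point only if at least `min_neighbors` other points lie
    within `radius` of it; isolated 'ghost' returns are dropped."""
    kept = []
    for i, p in enumerate(points):
        neighbors = sum(
            1 for j, q in enumerate(points)
            if i != j and math.dist(p, q) <= radius
        )
        if neighbors >= min_neighbors:
            kept.append(p)
    return kept

# Three clustered returns (a real surface) plus one isolated ghost point.
cloud = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (5.0, 5.0, 5.0)]
filtered = radius_outlier_filter(cloud)
```

The trade-off mirrors the height threshold above: tune the filter too aggressively and small real objects start being discarded along with the rain.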
One key limitation is the lack of available sensors to detect or measure vertical drops. Depth of drop is essential for identifying safety barrier breaches in our use case, and detecting such drops is fundamental for autonomous navigation (as mentioned in A.1. Navigation Stability above). Without a robot-mounted sensor capable of reliably detecting and measuring drops, autonomy remains severely constrained in construction environments.
>> A.3. System Robustness
Overall, the pilot exposed several weaknesses in the robot’s robustness under real-world conditions.
During the pilot, the robot spent long hours on open construction sites with no shade. Prolonged sun exposure caused it to overheat, leading to unexpected shutdowns without a graceful recovery procedure. These incidents raised concerns about the robot’s outdoor robustness and its ability to withstand harsh environmental conditions. Additional cooling mechanisms may help to reduce the likelihood of such occurrences.

We also observed occasional actuator or motor malfunctions. At one point, the quadruped robot glitched and abruptly jumped and flipped onto its back. Although rare, such incidents highlight the need to anticipate faults arising from dust buildup, wear and tear, or environmental stress, which is an area that tends to be overlooked.
Another issue involved operational instability during field trials. The robot occasionally failed to initialise autonomous navigation or teleoperation modes, particularly during earlier trials. These issues required frequent on-site troubleshooting despite prior successful lab testing. The need for manual intervention to reset or recover the system indicated that the robot was not yet ready for full-scale deployment and raised questions about how much autonomy it could reliably achieve without human oversight.
>> A.4. Software Generalisability
Developing the software for this use case was challenging. The sensor constraints described earlier in A.2. Sensor Constraints meant that perception data was often noisy, which made it hard to build a detection system that behaved consistently across different environments.

Unexpected edge cases emerged early during data collection and complicated solution development. For instance, some sites had designated walkway openings within the safety barriers, which had to be distinguished from actual breaches to avoid false alarms. In other cases, additional netting was installed over the barriers, introducing noise into the depth map and requiring further adjustments to the logic.
The broader challenge is that the solution struggles to generalise in the wild. Real-world data is mostly out-of-distribution, and our dataset covered only a limited number of sites. During evaluation, we deliberately held out one entire site from calibration to test performance on unseen conditions. The system performed noticeably worse, reinforcing how difficult generalisation is in practice.
In practice, computer vision systems still rely heavily on rules. A machine learning model may not be explicitly programmed like a series of if-else statements, but it is usually surrounded by additional conditions and thresholds for downstream use. As more edge cases appear, the system becomes increasingly complex, while still failing to handle everything that happens on site.
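A hypothetical example of this rule-wrapping in practice: a detector emits (label, confidence, gap width) triples, and hand-crafted thresholds and exemptions decide what becomes an alert. All names and values here are invented for illustration, not taken from our deployed system.

```python
def breach_alert(detections, conf_threshold=0.6, min_gap_m=0.4, max_gap_m=3.0):
    """Hand-crafted rules wrapped around a detector's raw output.
    Each detection is (label, confidence, gap_width_m); every threshold
    and exemption is the kind of logic that accumulates on site."""
    alerts = []
    for label, conf, gap in detections:
        if conf < conf_threshold:
            continue  # rule 1: drop low-confidence detections
        if label == "walkway_opening":
            continue  # rule 2: designated openings are not breaches
        if not (min_gap_m <= gap <= max_gap_m):
            continue  # rule 3: implausible gap widths are treated as noise
        alerts.append((label, gap))
    return alerts

detections = [
    ("breach", 0.90, 1.0),           # genuine breach: alerted
    ("walkway_opening", 0.95, 1.2),  # designated opening: exempted
    ("breach", 0.30, 1.0),           # low confidence: dropped
    ("breach", 0.80, 10.0),          # implausible width: treated as noise
]
alerts = breach_alert(detections)
```

Each new edge case (netting over barriers, unusual openings) tends to add another rule, which is exactly how such systems grow brittle.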
💡 With advancements in Embodied AI: As multimodal language models begin to support richer 3D understanding, it may eventually become possible to leverage RGB-D data for zero-shot spatial measurement for breach detection. This could reduce the amount of hand-crafted logic required and improve generalisability to unseen scenes, although reliable zero-shot measurement remains an active research direction.
⚙️ B. Operational Challenges
Beyond technical limitations, operational challenges were another major issue in integrating robots into existing construction site procedures.
>> B.1. Procedure Integration

Early on-site robot trials caused disruptions to regular operations. Robots required temporary dedicated walkways to avoid heavy machinery and high levels of supervision from multiple stakeholders. This setup was resource-intensive and unsustainable for long-term deployment.
This reliance on temporary walkways highlighted a deeper issue: the absence of clear, site-wide Standard Operating Procedures (SOPs) for integrating robots into existing workflows, and such SOPs are difficult to develop in practice. For robots to be operationally feasible, they need well-defined yet adaptable SOPs, such as clear right-of-way protocols that apply consistently across the site. More thought is needed on how these SOPs should be designed so that robots can be integrated effectively.
>> B.2. Training Requirements
The human factor played a significant role in operational feasibility. Workers need to be trained to operate and manage robots independently across different sites. Without such training, robot operations will continue to rely on close supervision from multiple stakeholders, including the robot vendor. This is not sustainable in the long term, because the goal is for dedicated on-site roles to eventually take full responsibility for the robots.
Beyond operators, all other site personnel also require basic safety awareness training to interact safely around robots. During field trials, many workers were justifiably curious, which sometimes led to distractions, increased safety risks, and workflow disruptions. To enable safe and scalable deployment in the long run, baseline awareness training should be provided to all site personnel, supported by regular on-site briefings, clear signage, and clearly marked operational zones.
>> B.3. Liability and Insurance

Robot operations pose significant liability risks in the absence of available insurance coverage. Introducing robots on active construction sites adds safety risks to an already dangerous environment, including navigation failures, collisions, malfunctions, and poor adaptability to site conditions.
On our project, the site contractor did not have insurance that covered robot-related incidents. A scan of available insurance offerings for ground robots suggested that coverage in this area is still uncommon and often treated as high-risk, whereas insurance for drones is already more widely available.
During the field trials we conducted, temporary risk mitigation measures were put in place to manage these uncertainties. However, these measures do not resolve the underlying long-term liability questions, which remain a key barrier to sustainable deployment.
>> B.4. Infrastructure Support
Beyond operational procedures, infrastructure readiness was also a key constraint for further scaling robot deployments.

Charging requirements posed practical challenges for deployment. Fixed charging solutions were impractical because construction site layouts changed frequently over time. This made it difficult to install durable, weather-proof auto-docking stations that could remain usable across different project phases.
Stable network connectivity was equally essential for reliable robot operations. Video streaming and automated alerting depend on network availability, yet network infrastructure was not uniformly available across all site areas, especially in newly developed towns. Network readiness therefore needs to be assessed early, as additional infrastructure may be required to support deployment.
💰 C. Financial Considerations
>> C.1. Cost Viability
Integrating robots into existing workflows entails high upfront and ongoing costs with unclear returns on investment. Developing a viable solution requires significant R&D investment, while deployment introduces continuous maintenance and troubleshooting costs. Human oversight remains necessary to supervise operations, recover systems during faults, and manage safety risks, reducing the potential for labour savings.
Additional costs may also arise from staff training, workflow adjustments, and operational downtime when failures occur. A comprehensive cost-benefit analysis is therefore required to assess whether such a solution can be financially viable at scale.
Why Embodied AI Alone Isn’t Enough
When we zoom out from these findings, a pattern becomes clear.
Most blockers we encountered were technical, operational, or financial in nature. They were not primarily intelligence problems. Advancements in a robot’s cognitive capabilities alone are insufficient for robust real-world robot deployment.
First, the physical world is inherently unpredictable. Real environments introduce rare events and shifting conditions that are hard to capture fully in simulation or training data. Although Embodied AI has advanced significantly, it has not yet shown consistent reliability in such situations. This makes continuous monitoring, iterative refinement, and active intervention during deployment essential, as failures in real-world conditions are inevitable.
Second, hardware adds additional points of failure and is subject to physical wear and tear. Our experience showed failures across the stack, from inaccurate sensors to faulty motors: sensors have inherent limitations, while mechanical components can overheat, degrade, or malfunction. The physical world imposes limits that intelligence alone cannot overcome.
Third, deployment must align with human workflows and economic realities. Even with higher levels of autonomy, humans remain essential for supervision, troubleshooting, and recovery. Operational workflows must be redesigned around robots, and downtime, maintenance, and calibration create a heavy support burden for humans. In practice, the cost of supporting the robots often outweighed the value they delivered. Intelligence does not eliminate organisational friction, nor does it automatically make the economics work.
Finally, real-world deployment raises unresolved questions around safety, liability, and trust. Robots operating around people introduce physical risks that organisations cannot ignore. When failures occur, responsibility is often unclear. Increased autonomy also introduces security concerns, while the absence of established standards for responsible embodied AI makes it difficult to provide strong guarantees. Without these assurances, trust becomes a major barrier to deployment.
Taken together, these factors explain why embodied AI, even with improved world models and reasoning, is not enough to reliably move robots from the lab into the field.
What It Takes to Succeed
Robots will eventually leave the lab, and it is only a matter of time before they become ubiquitous in our everyday environments. Advances in world models, better hardware, and richer training data will all contribute to that future, but embodied AI alone is not the magic key.
Building robots that operate reliably in the real world also depends on the broader ecosystem: physical platforms, operating environments, deployment workflows, safety and liability frameworks, and economic viability.
Embodied AI is an important piece of the puzzle, but it is not the whole picture. Until the surrounding ecosystem matures, robots will continue to shine in demos while struggling in everyday conditions.
Acknowledgements
A huge thank-you to colleagues from GovTech, partner agencies, vendors, and site teams who worked alongside us throughout this project. I’m also grateful to the colleagues who reviewed early drafts of this article and offered thoughtful feedback.
