
December 10, 2025
Category:
Physical AI
To understand why Memo matters, you need to understand what Tony and Cheng were working on before they dropped out of their Stanford PhDs.
Making Robot Learning Accessible with Tony Zhao
Tony Zhao arrived at Stanford after studying at UC Berkeley, where he worked with robotics pioneers Sergey Levine and Dan Klein. At Stanford, advised by Professor Chelsea Finn, he tackled one of robotics' most frustrating problems: why did effective robot learning require such expensive hardware?
His answer was ALOHA, A Low-cost Open-source Hardware System for Bimanual Teleoperation. The name might sound technical, but the insight was beautifully simple. What if you could teach robots complex tasks using hardware that cost under $20,000 instead of $200,000? Tony and his team showed that low-cost arms could thread zip ties, juggle ping pong balls, and slot batteries with precision, tasks that previously required expensive high-end equipment.
And ALOHA was just the beginning. Soon after, Tony developed ACT (Action Chunking with Transformers), a learning algorithm that taught robots to predict actions in chunks rather than one tiny step at a time. Think of it like learning to tie your shoes: you don't memorize every millimeter of movement; you learn the sequence of motions as connected chunks. This approach allowed robots to master difficult tasks with just 10-15 minutes of demonstration data.
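The chunking idea can be illustrated with a toy sketch. This is not the ACT implementation: `toy_chunk_policy` is a hypothetical stand-in for a learned model, and the one-dimensional "dynamics" are invented purely for illustration. It only shows the control-loop difference between asking a policy for one action at a time and asking it for a short sequence.

```python
def toy_chunk_policy(observation, chunk_size):
    """Hypothetical stand-in for a learned policy.

    Given the current 1-D position, return the next `chunk_size` target positions.
    """
    return [observation + step + 1 for step in range(chunk_size)]


def run_with_chunks(policy, start_obs, horizon, chunk_size):
    """Query the policy once per chunk, then execute the whole chunk before re-querying."""
    obs, executed, queries = start_obs, [], 0
    while len(executed) < horizon:
        chunk = policy(obs, chunk_size)
        queries += 1
        # Truncate the final chunk so we never execute past the horizon.
        for action in chunk[: horizon - len(executed)]:
            executed.append(action)
            obs = action  # toy dynamics: the state becomes the commanded position
    return executed, queries


# Same 10-step trajectory, but far fewer policy decisions when acting in chunks.
executed, queries = run_with_chunks(toy_chunk_policy, start_obs=0, horizon=10, chunk_size=5)
_, single_step_queries = run_with_chunks(toy_chunk_policy, start_obs=0, horizon=10, chunk_size=1)
print(queries, single_step_queries)  # 2 10
```

Fewer decision points per task means fewer chances for small prediction errors to compound, which is one intuition for why chunked prediction can help on long, precise motions.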
Then came Mobile ALOHA, which added mobility to the mix. Now robots could navigate while manipulating objects, rolling into the kitchen, grabbing items, and bringing them back. Tony's research wasn't just academically interesting; it was making robot learning practical and accessible. His papers have been cited over 6,400 times, and robotics companies around the world now build on his open-source work.
Before committing fully to Sunday, Tony spent time at Google DeepMind and interned at Tesla, working on projects that would influence both Autopilot and the Optimus humanoid robot. But increasingly, he felt that the real breakthrough wasn't going to happen in a lab; it was going to happen in people's homes.
Cheng's Revolution: Teaching Robots Through Diffusion
While Tony was making robot hardware cheaper and more capable, Cheng Chi was solving a different puzzle. How do you make robot learning more intelligent and adaptable?
Cheng started his PhD at Columbia University under Professor Shuran Song, then moved to Stanford when Song joined the faculty there. His background included stints at the autonomous vehicle company Nuro, an internship at Apple, and research work at Toyota Research Institute. But his breakthrough came when he brought an idea from generative AI into robotics.
In 2023, Cheng published Diffusion Policy, a paper that applied diffusion models to robot learning. Diffusion models are the same technology behind AI image generators like Stable Diffusion. The results were stunning. When benchmarked across 15 different manipulation tasks, Diffusion Policy outperformed every other method by an average of 46.9%.
What made diffusion models so effective for robots? They naturally handle situations where there are multiple valid ways to accomplish a task, they work well with complex action sequences, and they train more reliably than other approaches. In interviews, Cheng would later say: "I realized after training more tasks, that my code hadn't been changed for a few months. The only thing that changed was the data, and whenever the robot doesn't work, it's not the code, it's the data."
The realization that robotics had become a data problem rather than an algorithm problem would become central to Sunday's entire strategy.
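The point about multiple valid ways to accomplish a task can be made concrete with a toy sketch. This is not the Diffusion Policy code and it does not train a network: `toy_denoiser` is a hypothetical hand-written stand-in for a learned model, and the loop is a simplified annealed-noise refinement rather than a full diffusion sampler. It assumes a one-dimensional action with two equally good answers, -1 and +1 (say, passing an obstacle on the left or the right).

```python
import random

MODES = (-1.0, 1.0)  # assumed: two equally valid actions


def toy_denoiser(action):
    """Hypothetical stand-in for a learned network: points toward the nearest valid mode."""
    nearest = min(MODES, key=lambda mode: abs(mode - action))
    return nearest - action


def sample_action(rng, steps=50, step_size=0.2):
    """Start from Gaussian noise and refine it step by step while the injected noise shrinks."""
    action = rng.gauss(0.0, 1.0)
    for i in range(steps):
        noise_scale = 0.06 * (1 - (i + 1) / steps)  # anneals to zero at the last step
        action += step_size * toy_denoiser(action) + noise_scale * rng.gauss(0.0, 1.0)
    return action


rng = random.Random(0)
samples = [sample_action(rng) for _ in range(200)]
left = sum(1 for s in samples if s < 0)
print(f"{left} samples near -1, {len(samples) - left} samples near +1")
```

Each sample commits to one of the two valid actions. A model that simply averaged its demonstrations would output something near 0, which in the obstacle example means driving straight into it.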
Cheng also developed UMI (Universal Manipulation Interface), which used hand-held grippers to capture human demonstrations that could be transferred to robots. The gripper was portable, inexpensive, and could collect data anywhere, in restaurants, homes, offices, without needing to bring a full robot along. UMI was selected as a Best Systems Paper Finalist at one of robotics' top conferences.
The Decision to Build
By early 2024, both Tony and Cheng faced a choice. They could either continue their prestigious academic careers at Stanford or bet everything that household robotics was ready to make the leap from research lab to living room.
They chose the garage.

Teaching Robots by Living Your Life
Most companies training home robots face an impossible problem. You need robots to collect training data, but you need training data to make robots that work. It's expensive, time-consuming, and limiting. You can't exactly ship research robots to hundreds of homes to learn how different families load their dishwashers.
Sunday's breakthrough was recognizing that you don't need robots to teach robots; you need humans.
The Skill Capture Glove: Your Movements Become Robot Memories
Imagine slipping on a pair of high-tech gloves before doing your dishes. As you pick up each plate, scrub a pan, or organize items in the dishwasher, these gloves, which Sunday calls Skill Capture Gloves, record everything: the exact motion of your hands, the force you apply, the tactile feedback, and the subtle adjustments you make without thinking.
Sunday has distributed over 2,000 pairs of these gloves to what they call "memory developers," real people in real homes performing real household tasks. Over 500 homes have participated, generating approximately 10 million episodes of authentic household routines. That's 10 million examples of how humans actually do chores, in their actual homes, with their actual dishes and laundry and clutter.
This approach is radically different from competitors like 1X Technologies, whose NEO robot relies on VR teleoperation. With 1X's approach, remote operators wearing VR headsets control the robot while viewing live feeds of your home, essentially puppeteering the machine to collect training data. When journalists tested NEO at 1X's Palo Alto headquarters, it took over a minute to retrieve a water bottle and five minutes to load a dishwasher, all under human remote control. The VR approach has fundamental problems: operators' hands go numb without force feedback, leading to aggressive movements; movements feel unnatural because you're controlling a robot interface rather than doing the task; and critically, you need the physical robot present for every demonstration, making data collection slow and expensive.
With Sunday's gloves, a family in Seattle can teach Memo how to handle their specific dishware while another family in Miami demonstrates laundry folding techniques, all simultaneously, all contributing to a shared knowledge base without any robots present. The gloves encourage natural, dexterous movements because you're actually doing the task, not puppeteering a robot from afar. Ship gloves overnight, and data collection begins immediately. No robot transportation, no complex setup, no remote operators viewing your home.

The scalability difference is stark. If 1X wants to double their training data, they need to double their operator hours and robot deployments. If Sunday wants to double their data, they ship more gloves. The marginal cost of each new data episode drops as the glove network grows, while VR teleoperation maintains high costs per episode. This is why Sunday can collect 10 million episodes while competitors struggle to scale beyond thousands.
Even more importantly, the gloves capture how humans naturally perform tasks with full sensory feedback, not how someone awkwardly controls a robot through VR. Cheng Chi explained the problem directly: VR teleoperation results in "aggressive" robot movements that are unsafe for delicate items because operators can't feel what they're touching. Sunday's gloves record the practiced expertise of people who've loaded dishwashers thousands of times, capturing the embodied knowledge that makes humans so much better at household tasks than any robot has been.
This data collection infrastructure creates a compounding advantage. Every additional home using gloves generates training data that improves Memo's capabilities, attracting more participants, generating more data, further improving performance. A competitor starting today would need years to replicate Sunday's 10 million episode dataset, and by that time, Sunday would have 50 million or 100 million episodes. The data moat grows deeper every day.
Sunday co-designed Memo's hands to mirror the glove's shape and sensors, minimizing the translation gap from human demonstrations to robot actions. Their Skill Transform pipeline converts human movements into robot-compatible commands, but because the hardware was designed in parallel, the embodiment gap is smaller than trying to map VR-controlled robot actions to autonomous operation.
In an industry obsessed with humanoid forms and impressive VR demos, Sunday made the contrarian bet to build superior data infrastructure first, then build the robot to learn from it. The Skill Capture Glove looks less impressive in marketing videos than VR-controlled humanoids, but it addresses the fundamental problem: how to collect the billions of diverse, high-quality demonstrations that AI needs to achieve reliability in the infinite complexity of real homes.
Of course, human hands and robot grippers aren't the same. You have 27 bones in your hand and more than 30 joints. Memo has mechanical grippers with different constraints and capabilities.
That's where Skill Transform comes in. Skill Transform is Sunday's proprietary pipeline that converts human demonstrations into robot-executable actions. But here's the clever part. Memo's hands were specifically co-designed to mirror the glove's shape and sensors. It's like Sunday designed the glove and the robot hand to be translation-compatible from the start. Any skill you demonstrate with the glove, Memo can learn to master.
ACT-1: A Robot Brain Trained on Human Expertise
All that data flows into ACT-1, Sunday's foundation model for robotics. The name is both technical (it builds on Tony's ACT research from Stanford) and aspirational (in a field fond of comparing everything to GPT, this is "ACT-1", the foundation for a new generation).
What makes ACT-1 special is right there in its tagline: "A frontier robot foundation model trained on zero robot data." Every bit of training came from human demonstrations captured through the gloves. ACT-1 has never seen a robot do anything; it only knows how humans do things, and it has learned to translate that knowledge into robot movements.
This approach enables three capabilities that set Memo apart:
- Ultra Long-Horizon Tasks: Memo can clear an entire table, walk the dishes to the kitchen, load the dishwasher, start the cycle, and return. In demonstrations, Memo has completed tasks requiring navigation over 130 feet and 33 unique interactions with 21 different objects. Real household chores aren't atomic actions; they're complex sequences, and Memo understands that.
- Zero-Shot Generalization: Because ACT-1 trained on data from hundreds of different homes, Memo can walk into a house it's never seen before and just... work. Sunday has tested this by deploying Memo in Airbnb rentals with no prior exposure. Faced with different layouts, different dishware, and different clutter patterns, Memo adapts because it learned from diversity rather than memorizing one specific environment.
- Advanced Dexterity: The force-feedback data from the gloves allows Memo to handle delicate items appropriately. Handling wine glasses, folding socks, and organizing shoes are tasks that require fine motor control and appropriate grip strength. It's the difference between a robot that can move objects and a robot that can handle your belongings with care.
Memo: Friendly by Design, Capable by Engineering

Why Memo Doesn't Have Legs
Walk into most robotics labs and you'll see companies racing to build bipedal humanoid robots. Walking on two legs is technically impressive, generates great marketing videos, and seems like the "right" way to make a general-purpose robot.
Sunday took the opposite approach, and their reasoning is refreshingly practical.
Bipedal locomotion is one of the hardest problems in robotics. You need sophisticated balance algorithms, powerful actuators in every leg joint, real-time adjustments to maintain stability, and even then, falls are a constant risk. All that engineering effort goes into just staying upright and moving around.
Memo rolls on a wheeled base, basically a sophisticated, heavy-duty version of a very large Roomba. The base is stable, reliable, and frankly not that impressive. But that's the point. By making locomotion simple and stable, Sunday freed up all their engineering resources to focus on what actually matters for household tasks: manipulation.
The wheeled base also brings safety advantages that matter when you're operating in family homes. If Memo loses power, it doesn't fall over; it just stops. There's no risk of a humanoid robot toppling onto a pet or a child. The wide, stable base means Memo can't tip over even when reaching for items on high shelves.
The Height-Adjustable Torso: Vertical Reach Without Complexity
Rising from Memo's wheeled base is a height-adjustable central column. Think of it as a telescoping torso. This allows Memo to reach items from floor level all the way up to countertops, giving it the vertical working range it needs for kitchen and laundry tasks without the mechanical complexity of human-like joints.
Need to pick up toys from the floor? Memo lowers down. Time to reach items on the counter? The torso extends upward. It's mechanically simpler than legs and hips, more reliable, and perfectly suited for the tasks Memo needs to perform.
The Arms and Hands: Where the Magic Happens
This is where Sunday invested their engineering focus. Memo has two anthropomorphic arms with dual grippers, designed for bimanual manipulation: using both hands together for tasks that require it.
The grippers mirror the Skill Capture Glove design, which means the translation from human demonstration to robot execution is as direct as possible. They include sensors for force feedback and tactile sensing, allowing Memo to adjust grip strength for different objects. Memo can handle wine glasses with a gentle grasp and grip heavier pots with more force.
The manipulation system prioritizes safe, deliberate movements over industrial speed. Memo isn't trying to set records for how fast it can load a dishwasher. It's trying to do it reliably, safely, and without breaking your dishes.
The Silicone-Clad Exterior
Sunday wrapped Memo in a soft, silicone-clad exterior that serves multiple purposes. Aesthetically, it makes Memo look less like an industrial machine and more like a friendly household appliance. The soft exterior is approachable; it doesn't look intimidating in your living room.
But the design is also functional. The soft exterior reduces injury risk during human-robot interactions. If Memo bumps into you (or you bump into Memo), the padded surface is forgiving. For families with children and pets, this safety consideration matters enormously.
The overall aesthetic is somewhere between "sophisticated appliance" and "friendly robot." Memo doesn't try to look human, but it doesn't look coldly mechanical either. It's its own thing. Practical, capable, and yes, kind of cute.

The Form Follows Function Philosophy
In interviews, Tony explained Sunday's design philosophy: "Whenever we see something that we can accelerate with simplification, we'll go simplify that." The gripper design illustrates this perfectly. Rather than giving Memo five independent fingers like a human hand, they combined them into a three-pronged clamp because "most of the time when we use those fingers, we use them together."
This functionalist approach permeates Memo's design. Every component exists to serve the mission of reliable household task completion, not to impress with unnecessary complexity or achieve some idealized humanoid form. It's engineering with purpose.
What Memo Can Actually Do (And What It's Still Learning)
Sunday has been refreshingly honest about Memo's capabilities and limitations. In a robotics industry prone to flashy demos that don't represent real-world performance, this honesty is notable.
Current Capabilities
Making Coffee: Memo can navigate to a messy kitchen, locate the espresso machine, add grounds, start the brew, and clean up afterward. The "messy kitchen" detail matters. This isn't a sterile lab environment.
Clearing Tables and Loading Dishwashers: This is Memo's showcase task, demonstrated publicly with impressive statistics: 130+ feet of autonomous navigation and 33 unique interactions with 21 different objects, including delicate dinnerware. Memo can clear a dinner table, transport items to the kitchen, properly load them into the dishwasher, and start the cycle.
Folding Laundry: The dexterity required for fabric manipulation makes this particularly challenging. Memo can fold various clothing items, though as with any new capability, reliability improves with more training data.
Tidying and Organizing: Putting away shoes, organizing household items, and clearing clutter, all tasks that require understanding object categories and appropriate storage locations.
Brewing Beverages: Beyond espresso, Memo can handle various drink preparation tasks that combine precision, sequencing, and safety (handling liquids and hot items).

The Honest Caveats
Reliability on tasks like table clearing and dishwasher loading remains an active area of improvement. Sunday openly states that as they move toward deployment in actual homes, continuous refinement is necessary. Polished demos can overstate real-world readiness, and the key question, as technology journalists have noted, is whether Memo performs consistently in different homes without engineers on hand to troubleshoot.
Tony and Cheng are direct about this challenge: "The promise of AI robotics isn't back-flipping or dancing demos, but robots that work in messy, real-world situations." They're building for reliability over impressiveness, even if that makes for less viral marketing videos.
The Beta Program
On November 19, 2025, Sunday opened applications for the Founding Family Beta program. In late 2026, they'll select 50 households to become early adopters, each receiving an individually numbered Memo robot along with direct support from the Sunday team.
This measured approach reveals Sunday's priorities. They're not rushing to mass production to hit arbitrary timelines. They're validating real-world performance, gathering feedback from actual households, and iterating before scaling.
The beta families will effectively become co-developers, their experiences shaping future capabilities and refinements. It's a collaborative approach that acknowledges the complexity of deploying robots in the infinite variability of home environments.
By 2027-2028, Sunday anticipates being able to ship at scale, with manufacturing costs potentially dropping below $10,000 per unit as they transition from CNC machining and hand-painting to injection molding for volume production.
Why Memo Might Actually Succeed Where Others Failed
The home robotics graveyard is vast. From the ambitious promises of the 1960s to the disappointing vacuuming robots that couldn't handle chair legs, the field has more failures than successes. Why might Sunday be different?
The "GPT Moment" in Physical AI
Tony and Cheng describe the current state of physical AI using an illuminating analogy. When GPT-3 launched, it showed that transformer architectures could scale and produce interesting results, "signs of life" for language models. But it wasn't until ChatGPT, with more data and fine-tuning, that we got something transformative for consumers.
Robotics, they argue, is between those two moments. The fundamental learning algorithms (transformers, diffusion policies) work. The technical question is whether massive data scaling will produce similar emergent capabilities for physical tasks.
Sunday's entire strategy is a bet that it will, which is why they've invested so heavily in data collection infrastructure rather than novel algorithms. The Skill Capture Glove system isn't just clever; it's potentially the unlock that allows scaling from thousands to millions to eventually billions of demonstration episodes.
The Data Moat Grows Deeper Every Day
Every home using a Skill Capture Glove generates new training data that improves Memo's capabilities. This creates a virtuous cycle: better performance attracts more beta participants, generating more data, which further improves performance.
A competitor starting today would need years to replicate Sunday's 10 million episode dataset, even with equivalent technology. And by that time, Sunday will have 50 million episodes, or 100 million. The data advantage compounds.

The Stanford Network Effect
The founders' academic reputation has created a talent attraction advantage. Top robotics researchers want to work with Tony and Cheng because of their published work. Sunday's team of 25 includes engineers and researchers from Stanford, Tesla, DeepMind, Waymo, Meta, and Neuralink, a density of expertise that accelerates development velocity.
Functional Simplicity as Strategy
While competitors wrestle with bipedal balance and fall recovery, Sunday focuses every engineering hour on manipulation quality. The wheeled design isn't a compromise; it's a strategic choice that lets them be excellent at what matters for household tasks.
A Robot in Every Home
"There are plenty of companies building on our work to advance their research," Tony said at Memo's launch, "but what we are building here is something that is even larger: to put a robot in every home."
It's an audacious goal, one that recalls the personal computer revolution's democratizing mission. Just as computers went from million-dollar mainframes to devices in every household, Sunday envisions capable robots becoming as common as dishwashers or washing machines.
The timing might finally be right. The convergence of validated learning algorithms, declining compute costs, massive training datasets, and practical hardware design suggests we may be approaching the reliability threshold necessary for consumer robotics to work.
Watching History Being Made From a Garage
There's something compelling about the image of two former Stanford PhD students, working through the night in a Mountain View garage, surrounded by 3D printers and robot prototypes, building something that could change how millions of families live.
It recalls the founding mythology of so many transformative technologies: starting small, solving hard problems, believing in a vision when the path forward isn't certain. Tony and Cheng dropped out of prestigious PhD programs, left promising academic careers, and bet everything on the conviction that household robotics was ready.
Memo is the result of that bet: a soft, silicone-clad, wheeled robot that wants nothing more than to do your dishes and fold your laundry. It learned from watching real families in real homes, trained on 10 million episodes of human expertise, powered by some of the most sophisticated AI research ever applied to manipulation tasks.
And somehow, despite all that complexity, despite the transformers and diffusion models and data pipelines, despite the Stanford PhDs and the millions in funding, Memo just looks friendly. Approachable. Maybe that's the real innovation: not just building a capable robot, but building one that feels right to live with.
The first 50 beta families will tell us if Sunday got it right. But watching from the outside, it's hard not to root for the Stanford dropouts in the garage, teaching robots to be helpful, one household chore at a time.
Want to follow Sunday's journey? Visit sunday.ai to learn more about Memo and the future of household robotics.
Note: Sunday Robotics launched Memo in November 2025 after 18 months in stealth. The company has raised $35 million and is accepting applications for its Founding Family Beta program, with first deployments planned for late 2026.
