As I got to my desk on Monday morning, I had a little moment of ick as I realized I had left last week’s coffee mug on my desk over the weekend. Not a big deal - I just dropped it in the dishwasher - but it’s a problem I’m all too familiar with. I forget about my empty coffee mug all the time!
Like any smart machine enthusiast, I decided to over-engineer a solution to my simple tidiness problem.
The idea is relatively simple: a Viam Rover navigates around the office and looks for coffee mugs. When it finds one it reports it so someone (me) can clean it up.
I like this project because it brings together several parts of the Viam platform to make a helpful smart machine:
- Viam Rover
- ML Model Service to detect coffee mugs
- SLAM Service to build a map
- Motion Service to navigate
I put together a small team and we got to work using Viam’s flexible and composable platform to build a prototype.
How we built the office tidiness rover
Defining the mission
We designed the rover to follow a predetermined route of waypoints around the office. The rover would follow a very simple control loop to accomplish the task:
- Navigate to a waypoint
- Check if there are any coffee mugs
- Rotate 90°
- Check again for coffee mugs
- Repeat until facing back in the original direction
- Proceed to the next waypoint
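To make the loop concrete, here is a minimal sketch in Go. It does not call the Viam SDK: the `Rover` interface, the `Waypoint` and `Sighting` types, and the `fakeRover` used to exercise the loop are hypothetical stand-ins for whatever navigation, base, and vision calls a real machine would use.

```go
package main

import "fmt"

// Waypoint is a hypothetical 2D map position in meters.
type Waypoint struct{ X, Y float64 }

// Rover is a hypothetical interface for illustration; it is not the Viam SDK API.
type Rover interface {
	NavigateTo(wp Waypoint) error
	Rotate(degrees float64) error
	DetectMugs() (int, error)
}

// Sighting records where mugs were seen and in which direction the rover faced.
type Sighting struct {
	At      Waypoint
	Heading int // degrees turned since arriving at the waypoint
	Count   int
}

// Patrol visits each waypoint and checks for mugs in four directions, 90° apart.
// After the fourth rotation the rover faces its original direction again.
func Patrol(r Rover, route []Waypoint) ([]Sighting, error) {
	var sightings []Sighting
	for _, wp := range route {
		if err := r.NavigateTo(wp); err != nil {
			return sightings, fmt.Errorf("navigate to %+v: %w", wp, err)
		}
		for turn := 0; turn < 4; turn++ {
			n, err := r.DetectMugs()
			if err != nil {
				return sightings, fmt.Errorf("detect at %+v: %w", wp, err)
			}
			if n > 0 {
				sightings = append(sightings, Sighting{At: wp, Heading: turn * 90, Count: n})
			}
			if err := r.Rotate(90); err != nil {
				return sightings, fmt.Errorf("rotate at %+v: %w", wp, err)
			}
		}
	}
	return sightings, nil
}

// fakeRover is an in-memory stand-in so the loop can run without hardware.
type fakeRover struct {
	pos     Waypoint
	heading int
	mugs    map[Waypoint]int // waypoint -> heading at which a mug is visible
}

func (f *fakeRover) NavigateTo(wp Waypoint) error { f.pos = wp; return nil }

func (f *fakeRover) Rotate(degrees float64) error {
	f.heading = (f.heading + int(degrees)) % 360
	return nil
}

func (f *fakeRover) DetectMugs() (int, error) {
	if h, ok := f.mugs[f.pos]; ok && h == f.heading {
		return 1, nil
	}
	return 0, nil
}

func main() {
	rover := &fakeRover{mugs: map[Waypoint]int{Waypoint{X: 1, Y: 0}: 180}}
	route := []Waypoint{{X: 0, Y: 0}, {X: 1, Y: 0}}
	sightings, err := Patrol(rover, route)
	fmt.Printf("sightings: %+v, err: %v\n", sightings, err)
}
```

Keeping the hardware behind a small interface like this also makes the loop easy to test at a desk before the rover ever moves.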
For now, the rover would navigate autonomously but it would not start itself or dock for self-charging. We decided to keep it straightforward – we’d put the batteries in the rover and start the service.
Detecting coffee mugs
We needed to train an ML model in order to actually detect the coffee mugs. Fortunately, this was pretty easy to do!
First, we set up the Data Capture Service to collect camera images to use as training data. Then we trained a detection model using Viam’s ML Model Service once the images were uploaded.
Unfortunately, the model wasn’t very good. We realized that the data was too homogeneous. In our second attempt, we collected images from varying environments. That model performed a lot better!
Note: you can use a transform camera to visualize your ML model in action.
The last thing we needed to figure out was how to actually see the coffee mugs. Since the rover is low to the ground, it was impossible to see up on top of desks. We fixed that by mounting a 4-foot aluminum pole to the top of the rover and taping a USB camera to the top.
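Once a detector is running, the "check for coffee mugs" step usually comes down to filtering its output by label and confidence. The sketch below shows one way to do that in Go. The `Detection` type, the `"mug"` label, and the 0.6 threshold are assumptions for illustration, not values from our model or types from the Viam SDK.

```go
package main

import "fmt"

// Detection is a hypothetical, simplified detector output.
type Detection struct {
	Label      string
	Confidence float64 // 0.0 to 1.0
}

// FilterByLabel keeps detections that match label and meet minConfidence.
func FilterByLabel(dets []Detection, label string, minConfidence float64) []Detection {
	var kept []Detection
	for _, d := range dets {
		if d.Label == label && d.Confidence >= minConfidence {
			kept = append(kept, d)
		}
	}
	return kept
}

func main() {
	// Made-up detections for a single camera frame.
	frame := []Detection{
		{Label: "mug", Confidence: 0.91},
		{Label: "mug", Confidence: 0.32},
		{Label: "bottle", Confidence: 0.88},
	}
	mugs := FilterByLabel(frame, "mug", 0.6)
	fmt.Printf("found %d mug(s)\n", len(mugs))
}
```

The right threshold depends on the model and the environment, so it is worth tuning against real camera frames rather than picking a number up front.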
Map creation and navigation
In order to follow a route of waypoints around the office, we needed to build a map of the office! We used an RPlidar A3 mounted on top of the rover to build the map using the SLAM Service.
Originally we wanted to build a map of the whole office, but the SLAM implementation performs better with smaller maps. We picked a smaller area of the office and started our data collection.
Once we had a map, we could then use the built-in motion service to move the rover autonomously!
Putting it all together
With all of the pieces, it was time to put together our control loop. Using Viam’s Golang SDK, the control flow for our rover became trivial.
We ended up making our smart machine a little smarter with a few adjustments:
- Retry navigation to the next waypoint when it fails. Navigation is not perfect, but we were almost always successful in getting to our destination by trying again.
- Pick waypoints that are closer together. It is challenging to compute a route for a destination that is farther away. Picking points near each other (about a meter or two apart) made for more efficient navigation.
- Use diverse training data for the ML model. If every training image in your dataset shows the same coffee mug, the model will only be good at detecting that one mug. We found this out the hard way.
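The first two adjustments are easy to express in code. The Go sketch below is illustrative only: `Retry`, `Densify`, `Point`, and `errPlanFailed` are hypothetical names, the 1.5 m spacing is an assumed value in the "meter or two" range, and the navigation call is simulated rather than taken from the Viam SDK.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// Point is a hypothetical 2D map position in meters.
type Point struct{ X, Y float64 }

// errPlanFailed simulates a navigation failure in this sketch.
var errPlanFailed = errors.New("path planning failed")

// Retry runs op up to attempts times and returns nil on the first success.
func Retry(attempts int, op func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", attempts, err)
}

// Densify adds evenly spaced intermediate points so that no leg of the
// route is longer than maxStep meters.
func Densify(route []Point, maxStep float64) []Point {
	if len(route) == 0 || maxStep <= 0 {
		return route
	}
	out := []Point{route[0]}
	for i := 1; i < len(route); i++ {
		a, b := route[i-1], route[i]
		dist := math.Hypot(b.X-a.X, b.Y-a.Y)
		steps := int(math.Ceil(dist / maxStep))
		for s := 1; s <= steps; s++ {
			t := float64(s) / float64(steps)
			out = append(out, Point{X: a.X + t*(b.X-a.X), Y: a.Y + t*(b.Y-a.Y)})
		}
	}
	return out
}

func main() {
	route := Densify([]Point{{X: 0, Y: 0}, {X: 3, Y: 0}}, 1.5)
	for _, p := range route {
		calls := 0
		// Simulated navigation: fails on the first try, succeeds on the second.
		err := Retry(3, func() error {
			calls++
			if calls < 2 {
				return errPlanFailed
			}
			return nil
		})
		fmt.Printf("waypoint %+v reached after %d call(s), err: %v\n", p, calls, err)
	}
}
```

Wrapping the last error with `%w` keeps the original failure available to callers through `errors.Is`, which helps when deciding whether to skip a waypoint or stop the run.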
In the end, our rover was able to detect the test coffee mugs we left scattered on our desks.
Teamwork makes the dream work
One of the best parts of this project is that we could divide and conquer to get it done. While Anjali and I trained the ML model, Zack and Roxy were hard at work on building maps and navigation.
Our team does not come from a traditional robotics background but we were able to put together a successful project! We learned a lot along the way.
- Training an ML model is really easy. In just a couple of hours, we had a very accurate coffee mug detector. Adding new detection types (like plastic bottles) would be trivial.
- Small iterations got us to our goal quickly. We broke the mission down into the smallest tasks we could. This allowed us to quickly make progress with achievable steps along the way.
- Logging is really helpful. Debugging a smart machine can be challenging, but adding extra logging can really illuminate what your robot is up to.
- Ask for help. There were moments we found ourselves at a standstill. Collaborative problem-solving allowed us to maintain momentum. And when things got really tricky, we sought guidance from domain experts at Viam, including our SLAM and ML teams. For anyone encountering roadblocks, the Viam Discord is an invaluable resource to find support!
The possibility of commercialization
I could already see the potential of the office tidiness rover for aiding our cleaning staff, not just with mugs, but other clutter like empty lunch containers and plastic bottles around the building as well. Training and deploying machine learning models for other messes slated for cleanup could save everyone time and effort that could be used for more productive and strategic tasks.
This is how many of Viam’s customers start out - perhaps they are tinkering with machines in their garage as part of a hobby, and realize that they have an idea useful enough to start a business around. Viam’s platform makes it straightforward to test machines out, add functionality on the fly, and iterate until you have a machine ready for a viable business.
Where do we go from here?
On a related note, we’re super excited about Viam’s Modular Registry, which makes it extremely easy to distribute code for wider use. We plan to expose our service in a module so that anyone can make a map, provide an ML model, and run their own in-office monitoring.
We also had some ideas for how to make it better:
- Train our ML model to detect more clutter on desks: plates, silverware, trash, and more
- Capture sensor data about the environment: temperature, humidity, gases, and more
- Expand the map to cover more of the office
- Autonomous charging and automated runs
- Send reports with the locations of the coffee mugs
What clutter do you want to detect in your office? Share with us and other smart machine enthusiasts in our Discord community.