Product
December 22, 2023

Enhanced datasets: Viam's simplified data solution

Natalia Jacobwitz
Product Manager
Explore Viam's enhanced datasets for streamlined data curation and model training, enhancing your ML projects with ease and precision.

Curating data and training models in Viam just got a whole lot more powerful! Now with datasets in Viam, you can organize the data you want to train models on and get valuable information about the dataset before you train. Kicking off a training (or retraining) session has never been easier with the button built into your dataset view. 

To train a model the previous way, you had to filter down to the exact data you wanted to train on, whether it was data within a time range, with certain tags applied, or from a specific component. 

We got feedback that this was confusing, so we wanted to streamline the flow and make sure it was clear to our users exactly which data their model would be training on.

In addition to the confusing filtering-to-train flow, there was no option to manually select which images to train on. Say you liked 20 specific images that were clear and each demonstrated something new. You couldn’t choose to train your model on just those 20 images.

What's new with datasets

Data curation with Datasets

Explore app.viam.com/data/view to see all your organization's images that you synced to the cloud via Viam’s data capture and sync capabilities. 

Open the image details panel by clicking on any image, and use the dropdown for datasets to add the image to existing datasets or create a new named dataset. 

Notice the Datasets dropdown on the bottom right of this image.

Repeat this process for each image, curating the perfect dataset for your needs.

Effortless labeling and real-time analysis

Label each image as you curate the dataset, or, once the dataset is assembled, navigate to app.viam.com/data/datasets to label its images individually.

The dataset provides real-time insights into its labels, showing the total number of images and how many of them carry each classification tag and bounding box label.

The dataset stats are especially helpful because they show whether any images are missing labels and let you easily filter down to those images.
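
To make those stats concrete, here is a minimal Python sketch of how per-label counts and a list of unlabeled images could be computed. It is an illustration only, not Viam's implementation or API: the ImageRecord type, its fields, and the dataset_stats function are hypothetical names invented for this example, and the images live in a plain in-memory list.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    """Hypothetical stand-in for one dataset image; not a Viam type."""
    image_id: str
    tags: list[str] = field(default_factory=list)         # classification tags
    bbox_labels: list[str] = field(default_factory=list)  # one entry per bounding box


def dataset_stats(images):
    """Count images per tag and per bounding box label, and list unlabeled images."""
    tag_counts = Counter()
    bbox_counts = Counter()
    unlabeled = []
    for img in images:
        # Deduplicate with set() so an image with two boxes of the
        # same label is counted once for that label.
        tag_counts.update(set(img.tags))
        bbox_counts.update(set(img.bbox_labels))
        if not img.tags and not img.bbox_labels:
            unlabeled.append(img.image_id)
    return {
        "total": len(images),
        "tags": dict(tag_counts),
        "bbox_labels": dict(bbox_counts),
        "unlabeled": unlabeled,
    }


# Made-up sample data for illustration.
sample = [
    ImageRecord("img-1", tags=["menorah"]),
    ImageRecord("img-2", bbox_labels=["candle", "candle"]),
    ImageRecord("img-3"),
]
stats = dataset_stats(sample)
print(stats["unlabeled"])  # ['img-3']
```

The "unlabeled" list plays the role of the missing-labels filter described above: it tells you which images still need attention before training.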

Simplified model training

Once your dataset is curated, initiating model training is a breeze. 

Click "train model" from within the specific dataset, and you'll be guided to a page where you can choose to train a new model or a new version of an existing one.

The 'train a model' page within app.viam.com.

Specify the model type and labels for training, and then give your model a name. 

As you select your model type and training labels, you are proactively informed if your dataset doesn’t have enough images, enough images with a specific label, or enough bounding boxes.

The message you'll receive if the number of images you're training against is too small.
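
The kind of check behind such a message can be sketched in a few lines of Python. This is a hedged illustration, not how Viam performs the check: the training_warnings function is hypothetical, and the min_images and min_per_label thresholds are placeholder assumptions, since this post does not state the actual limits.

```python
def training_warnings(label_counts, total_images, selected_labels,
                      min_images=10, min_per_label=5):
    """Return a list of warnings; an empty list means no problems were found.

    The default thresholds are illustrative assumptions, not Viam's limits.
    """
    warnings = []
    if total_images < min_images:
        warnings.append(
            f"dataset has {total_images} images; at least {min_images} expected"
        )
    for label in selected_labels:
        # A label that never appears in the dataset counts as zero images.
        count = label_counts.get(label, 0)
        if count < min_per_label:
            warnings.append(
                f"label '{label}' appears on {count} images; "
                f"at least {min_per_label} expected"
            )
    return warnings


# Example: a small dataset with one under-represented label.
for message in training_warnings({"candle": 12, "flame": 2}, 20, ["candle", "flame"]):
    print(message)
```

Running the check as labels are selected, rather than after training starts, is what lets a problem surface before any training time is spent.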

Then you can kick off the training with the click of a button.

Later, when you look at your trained models, each one links to the dataset it was trained with, so it is clear which data the model was trained on and easy to understand and replicate training runs.

Easy repeated training of additional models on the same data

Should you find that the resulting model isn’t as good as you would like, or you want to train a slightly different model on the same data, you can easily navigate back to that named dataset and train or retrain on it.

This is a huge improvement over the previous flow, where you had to filter down to the exact data all over again.

Collaborating with Datasets

Because datasets live in your organization, you can share a dataset with anyone who is also an organization owner. You can also work together to add images to a dataset and to tag and label them, so you don’t have to do it all yourself.

Taking action and moving forward 

I have personally been using this feature for my robots. In fact, I just successfully trained a machine learning model within minutes using a dataset consisting of just 20 images and a bunch of bounding boxes.

The dataset I trained with for my MenoRobot 2.0 project.

Get started today by following our up-to-date tutorial on how to capture data and train a model using Try Viam. This tutorial will show you how to create a new dataset, add captured images to the dataset, and train on this dataset, all with the updated functionality described in this post. 

Happy curation and training!
