Exporting data

To download your labeled data, you need to create a release on the Releases tab. A release is a snapshot of your dataset at a specific point in time.

By clicking the download link of a release, you obtain a release file in JSON format. This release file contains all information about the dataset, tasks, samples, and labels in the release.

Note that segmentation masks are encoded as PNG images that appear black when opened in an image viewer. The PNG nevertheless contains all the necessary information. Read more here.
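The near-black appearance follows from how ids end up in the pixel values. The sketch below is illustrative only: it assumes (this is not stated on this page) that each pixel of an RGBA PNG packs an integer instance id into its R, G and B channels in little-endian order and that the alpha channel carries no id information. Under that assumption, small ids become red-channel values such as 1 or 2, which look black on screen. The array is synthetic and `decode_instance_ids` is a hypothetical helper, not part of the SDK.

```python
import numpy as np

def decode_instance_ids(rgba):
    """Turn an (H, W, 4) uint8 RGBA array into an (H, W) array of integer ids.

    Assumption: the id is packed little-endian into R, G, B; alpha is ignored.
    """
    rgba = np.asarray(rgba, dtype=np.uint32)
    r, g, b = rgba[..., 0], rgba[..., 1], rgba[..., 2]
    return r | (g << 8) | (b << 16)

# Synthetic 2x2 "mask" holding ids 0, 1, 2 and 300 (300 = 44 + 1 * 256)
pixels = np.array(
    [[[0, 0, 0, 255], [1, 0, 0, 255]],
     [[2, 0, 0, 255], [44, 1, 0, 255]]],
    dtype=np.uint8,
)
ids = decode_instance_ids(pixels)
print(ids)
```

For real release files, prefer the loading utilities of the Python SDK over a hand-written decoder like this one.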

Exporting the release file to different formats

You can export the release file to different formats with the Python SDK. Use the export_dataset utility function for this, setting the export_format parameter to one of the following values:

| Value | Description |
| --- | --- |
| coco-instance | COCO instance segmentation format |
| coco-panoptic | COCO panoptic segmentation format |
| yolo | YOLO Darknet object detection format |
| instance | Grayscale PNGs where the values correspond to instance ids |
| semantic | Grayscale PNGs where the values correspond to category ids |
| instance-color | Colored PNGs where the colors correspond to different instances |
| semantic-color | Colored PNGs where the colors correspond to different categories, with colors as configured in the label editor settings when available |

Example:

# pip install segments-ai
from segments import SegmentsDataset
from segments.utils import export_dataset
# Initialize a SegmentsDataset from the release file
release_file = 'flowers-v1.0.json'
dataset = SegmentsDataset(release_file, labelset='ground-truth', filter_by=['labeled', 'reviewed'])
# Export to COCO panoptic format
export_dataset(dataset, export_format='coco-panoptic')

Alternatively, you can use the initialized SegmentsDataset to loop through the samples and labels, and visualize or process them in any way you please:

import matplotlib.pyplot as plt
from segments.utils import get_semantic_bitmap

for sample in dataset:
    # Print the sample name and list of labeled objects
    print(sample['name'])
    print(sample['annotations'])

    # Show the image
    plt.imshow(sample['image'])
    plt.show()

    # Show the instance segmentation label
    plt.imshow(sample['segmentation_bitmap'])
    plt.show()

    # Show the semantic segmentation label
    semantic_bitmap = get_semantic_bitmap(sample['segmentation_bitmap'], sample['annotations'])
    plt.imshow(semantic_bitmap)
    plt.show()
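
Conceptually, turning an instance bitmap into a semantic one means replacing every instance id with the category_id of the matching annotation. The sketch below illustrates that mapping with NumPy on a synthetic bitmap. `to_semantic` is a hypothetical helper written for illustration; it is not claimed to be how `get_semantic_bitmap` is implemented, and it assumes that id 0 marks unlabeled pixels.

```python
import numpy as np

def to_semantic(instance_bitmap, annotations, background=0):
    """Replace each instance id with its category id (illustrative only)."""
    instance_bitmap = np.asarray(instance_bitmap)
    semantic = np.full(instance_bitmap.shape, background, dtype=np.int64)
    for annotation in annotations:
        # Every pixel belonging to this instance gets the instance's category
        semantic[instance_bitmap == annotation["id"]] = annotation["category_id"]
    return semantic

# Synthetic data: three instances, two of which share category 1
instance_bitmap = np.array([[0, 1, 1],
                            [2, 2, 3]])
annotations = [
    {"id": 1, "category_id": 1},
    {"id": 2, "category_id": 1},
    {"id": 3, "category_id": 2},
]
semantic_bitmap = to_semantic(instance_bitmap, annotations)
print(semantic_bitmap)
```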

Structure of the release file

The general structure of the release file is as follows:

{
    "name": "first release",
    "description": "This is a first release of Segments.ai playground dataset",
    "created_at": "2020-07-09 10:20:19.888887+00:00",
    "dataset": {
        "name": "flowers",
        "task_type": "segmentation-bitmap",
        "task_attributes": {...}, # the categories etc.
        "labelsets": [
            ** list of labelsets **
        ],
        "samples": [
            ** list of samples **
        ]
    }
}
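
Because the release file is ordinary JSON, it can also be inspected with Python's standard library. The following is a minimal sketch that uses a small inline stand-in for a release file; the contents are trimmed and made up for illustration, and the code assumes that `samples` is a list.

```python
import json

# Inline stand-in for the contents of a downloaded release file (illustrative data)
release_text = """
{
  "name": "first release",
  "created_at": "2020-07-09 10:20:19.888887+00:00",
  "dataset": {
    "name": "flowers",
    "task_type": "segmentation-bitmap",
    "labelsets": [{"name": "ground-truth", "description": ""}],
    "samples": [{"name": "donuts.jpg", "attributes": {}, "labels": {}}]
  }
}
"""

# For a real file, read it from disk with json.load instead
release = json.loads(release_text)
dataset = release["dataset"]
labelset_names = [labelset["name"] for labelset in dataset["labelsets"]]
sample_names = [sample["name"] for sample in dataset["samples"]]

print(release["name"], "-", dataset["name"], f"({dataset['task_type']})")
print("labelsets:", labelset_names)
print("samples:", sample_names)
```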

Label set

Each labelset entry contains the labelset's name and description:

{
    "name": "ground-truth",
    "description": ""
}

Sample

Each sample entry contains information about the sample (name, image URL, ...) and its labels, indexed by labelset.

{
    "name": "donuts.jpg",
    "attributes": {
        "image": {
            "url": "https://segmentsai-prod.s3.eu-west-2.amazonaws.com/assets/segments/3b8b3da2-f09a-494b-999e-37250dfbf5b6.jpg"
        }
    },
    "labels": {
        /** list of labels, indexed by labelset **/
    }
}

Label

Each label contains basic information such as the time it was created, the user who created it, and its status (e.g. LABELED). The attributes field contains all information about the labeled objects. Its contents depend on the labeling type (segmentation or bounding boxes) and are described in more detail here.

{
    "label_status": "LABELED",
    "attributes": {
        "format_version": "0.1",
        "annotations": [
            {
                "id": 1,
                "category_id": 1
            },
            {
                "id": 2,
                "category_id": 1
            },
            {
                "id": 3,
                "category_id": 1
            },
            {
                "id": 4,
                "category_id": 1
            }
        ],
        "segmentation_bitmap": {
            "url": "https://segmentsai-prod.s3.eu-west-2.amazonaws.com/assets/segments/504e7633-ef51-49c3-8b0e-d4eb9100532d.png"
        }
    }
}
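
To illustrate how the sample, label, and annotation fields fit together, the sketch below counts labeled objects per category for one labelset. The samples are inline, trimmed stand-ins modelled on the examples above (the second sample name is made up). `count_categories` is a hypothetical helper, and it assumes that an unlabeled sample has either no entry or a null entry for the labelset.

```python
from collections import Counter

def count_categories(samples, labelset="ground-truth"):
    """Count annotations per category_id across samples for one labelset."""
    counts = Counter()
    for sample in samples:
        label = sample.get("labels", {}).get(labelset)
        if not label:  # assumption: unlabeled samples have no entry or None
            continue
        for annotation in label["attributes"]["annotations"]:
            counts[annotation["category_id"]] += 1
    return counts

samples = [
    {
        "name": "donuts.jpg",
        "labels": {
            "ground-truth": {
                "label_status": "LABELED",
                "attributes": {
                    "format_version": "0.1",
                    "annotations": [
                        {"id": 1, "category_id": 1},
                        {"id": 2, "category_id": 1},
                        {"id": 3, "category_id": 1},
                        {"id": 4, "category_id": 1},
                    ],
                },
            }
        },
    },
    # Hypothetical sample without a label in this labelset
    {"name": "unlabeled.jpg", "labels": {"ground-truth": None}},
]

category_counts = count_categories(samples)
print(dict(category_counts))
```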

Please refer to this blog post for an example of training a model on exported data.