To download your labeled data, you need to create a release on the Releases tab. A release is a snapshot of your dataset at a specific point in time.
By clicking the download link of a release, you obtain a release file in JSON format. This release file contains all information about the dataset, tasks, samples, and labels in the release.
Note that the segmentation masks are encoded as a PNG image that appears black when opened in an image viewer. The image nevertheless contains all the necessary information; the encoding is described below.
You can export the release file to COCO format with the Python SDK:
    # pip install segments-ai
    from segments import SegmentsDataset
    from segments.utils import export_dataset

    # Initialize a SegmentsDataset from the release file
    release_file = 'flowers-v1.0.json'
    dataset = SegmentsDataset(release_file, labelset='ground-truth', filter_by=['labeled', 'reviewed'])

    # Export to COCO panoptic format
    export_dataset(dataset, export_format='coco-panoptic')
Alternatively, you can use the initialized SegmentsDataset to loop through the samples and labels, and visualize or process them in any way you please:
    import matplotlib.pyplot as plt
    from segments.utils import get_semantic_bitmap

    for sample in dataset:
        # Print the sample name and list of labeled objects
        print(sample['name'])
        print(sample['annotations'])

        # Show the image
        plt.imshow(sample['image'])
        plt.show()

        # Show the instance segmentation label
        plt.imshow(sample['segmentation_bitmap'])
        plt.show()

        # Show the semantic segmentation label
        semantic_bitmap = get_semantic_bitmap(sample['segmentation_bitmap'], sample['annotations'])
        plt.imshow(semantic_bitmap)
        plt.show()
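To make the relationship between the instance label and the semantic label concrete, here is a minimal pure-Python sketch of the idea. It is not the SDK's implementation: it assumes that each pixel of the instance bitmap holds an annotation id and that 0 means "unlabeled", and the helper name `to_semantic` is hypothetical.

```python
def to_semantic(instance_bitmap, annotations):
    """Map each instance id in a 2D grid to its category id (hypothetical helper)."""
    # Lookup table from annotation id to category id.
    id_to_category = {a["id"]: a["category_id"] for a in annotations}
    # Ids without a matching annotation (assumed: 0 = unlabeled) map to 0.
    return [[id_to_category.get(pixel, 0) for pixel in row] for row in instance_bitmap]


# Toy data: two labeled objects on a 2x3 grid.
annotations = [{"id": 1, "category_id": 1}, {"id": 2, "category_id": 3}]
instance_bitmap = [
    [0, 1, 1],
    [2, 2, 0],
]

semantic_bitmap = to_semantic(instance_bitmap, annotations)
print(semantic_bitmap)
```

In the instance bitmap every object has its own id, whereas in the semantic bitmap all objects of the same category share one value.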
The general structure of the release file is as follows:
    {
        "name": "first release",
        "description": "This is a first release of Segments.ai playground dataset",
        "created_at": "2020-07-09 10:20:19.888887+00:00",
        "dataset": {
            "name": "flowers",
            "task_type": "segmentation-bitmap",
            "task_attributes": {...},  # the categories etc.
            "labelsets": [** list of labelsets **],
            "samples": {** list of samples **}
        }
    }
Each labelset entry contains the labelset's name and description:
    {
        "name": "ground-truth",
        "description": ""
    }
Each sample entry contains information about the sample (name, image URL, ...) and a list of labels.
    {
        "name": "donuts.jpg",
        "attributes": {
            "image": {
                "url": "https://segmentsai-prod.s3.eu-west-2.amazonaws.com/assets/segments/3b8b3da2-f09a-494b-999e-37250dfbf5b6.jpg"
            }
        },
        "labels": {/** list of labels, indexed by labelset **/}
    }
Each label contains information about the label_status (LABELED or REVIEWED) and provides a list of annotations (labeled objects) together with a segmentation bitmap.
    {
        "label_status": "LABELED",
        "attributes": {
            "format_version": "0.1",
            "annotations": [
                {"id": 1, "category_id": 1},
                {"id": 2, "category_id": 1},
                {"id": 3, "category_id": 1},
                {"id": 4, "category_id": 1}
            ],
            "segmentation_bitmap": {
                "url": "https://segmentsai-prod.s3.eu-west-2.amazonaws.com/assets/segments/504e7633-ef51-49c3-8b0e-d4eb9100532d.png"
            }
        }
    }
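If you prefer to read the release file directly rather than through the SDK, the structure above can be walked with the standard library. The sketch below is an illustration under assumptions: it treats `samples` as a JSON array, treats `labels` as an object keyed by labelset name, and uses a small inline stand-in instead of a real release file. The helper name `category_counts` and the sample `unlabeled.jpg` are hypothetical.

```python
import json
from collections import Counter

# Tiny inline stand-in for a release file (assumed layout, see the lead-in).
release_json = """
{
  "name": "first release",
  "dataset": {
    "name": "flowers",
    "labelsets": [{"name": "ground-truth", "description": ""}],
    "samples": [
      {
        "name": "donuts.jpg",
        "labels": {
          "ground-truth": {
            "label_status": "LABELED",
            "attributes": {
              "annotations": [
                {"id": 1, "category_id": 1},
                {"id": 2, "category_id": 1},
                {"id": 3, "category_id": 2}
              ]
            }
          }
        }
      },
      {"name": "unlabeled.jpg", "labels": {}}
    ]
  }
}
"""


def category_counts(release, labelset="ground-truth", statuses=("LABELED", "REVIEWED")):
    """Count annotations per category for every sample with an accepted label status."""
    counts = {}
    for sample in release["dataset"]["samples"]:
        # Skip samples that have no label in the requested labelset.
        label = sample.get("labels", {}).get(labelset)
        if not label or label.get("label_status") not in statuses:
            continue
        annotations = label["attributes"]["annotations"]
        counts[sample["name"]] = Counter(a["category_id"] for a in annotations)
    return counts


release = json.loads(release_json)
counts = category_counts(release)
for name, per_category in counts.items():
    print(name, dict(per_category))
```

The `statuses` argument plays a role similar to `filter_by` in `SegmentsDataset`: samples whose label is missing or has another status are skipped.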
The segmentation_bitmap url refers to a 32-bit RGBA PNG image. The alpha channel is set to 255, and the remaining 24-bit values in the RGB channels correspond to instance_ids.
Because instance ids are usually small numbers compared to this large 24-bit range, these PNG images appear black in an image viewer.
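The encoding can be illustrated with a small pure-Python sketch. The byte order is an assumption here (red as the least significant byte, then green, then blue); the text above only states that the RGB channels carry a 24-bit value, so check it against your own data or rely on the SDK to load the bitmap. The function names are hypothetical.

```python
def encode_instance_id(instance_id):
    """Pack a 24-bit id into an (R, G, B, A) tuple. Byte order is an assumption."""
    if not 0 <= instance_id < 2**24:
        raise ValueError("instance id must fit in 24 bits")
    r = instance_id & 0xFF          # assumed least significant byte
    g = (instance_id >> 8) & 0xFF
    b = (instance_id >> 16) & 0xFF  # assumed most significant byte
    return (r, g, b, 255)           # alpha is always 255


def decode_instance_id(pixel):
    """Recover the id from an (R, G, B, A) pixel, ignoring the alpha channel."""
    r, g, b, _alpha = pixel
    return r | (g << 8) | (b << 16)


pixel = encode_instance_id(4)
print(pixel)                                  # (4, 0, 0, 255)
print(decode_instance_id(pixel))              # 4
print(decode_instance_id((44, 1, 0, 255)))    # 300
```

An id of 4 becomes the color (4, 0, 0), which is visually indistinguishable from black. This is why the file looks empty in a viewer even though every labeled pixel carries a valid id.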
Please refer to this blog post for an example of training a model on exported data.