Hugging Face

Hugging Face (🤗) Datasets is a library for accessing and sharing machine learning datasets. It features powerful data processing methods to quickly get your dataset ready for training deep learning models.

Export

You can export a Segments dataset release as a 🤗 Dataset using the release2dataset function in the Python SDK:

from segments import SegmentsClient
from segments.huggingface import release2dataset

# Get a specific dataset release
client = SegmentsClient("YOUR_API_KEY")
release = client.get_release("jane/flowers", "v0.1")

# Convert it to a 🤗 Dataset
hf_dataset = release2dataset(release)

The returned object is a 🤗 Dataset object. The columns of the exported dataset depend on the task type of the dataset, and closely follow our documented sample and label formats. The columns can be inspected as follows:

>>> hf_dataset.features

{
  'name': Value(dtype='string', id=None),
  'uuid': Value(dtype='string', id=None),
  'status': Value(dtype='string', id=None),
  'image': Image(decode=True, id=None),
  'label.annotations': [{'id': Value(dtype='int32', id=None), 'category_id': Value(dtype='int32', id=None)}],
  'label.segmentation_bitmap': Image(decode=True, id=None)
}
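
Individual samples can be indexed like a regular Python sequence. As a quick sanity check (a sketch, assuming the flowers release exported above):

sample = hf_dataset[0]
print(sample["name"], sample["status"])
print(sample["image"].size)         # images are decoded as PIL.Image objects
print(sample["label.annotations"])  # list of {id, category_id} dicts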

The 🤗 Dataset documentation explains how you can modify the structure and contents of the exported dataset in all kinds of ways (each is sketched in code below):

  • Reorder the rows
  • Rename and remove columns
  • Apply a processing function to each row in the dataset
  • Use as a PyTorch or TensorFlow dataset
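
Each of these operations is a single call on the exported dataset. A minimal sketch, using the column names from the export above:

# Reorder the rows
shuffled = hf_dataset.shuffle(seed=42)

# Rename and remove columns
renamed = hf_dataset.rename_column("image", "pixel_values")
trimmed = hf_dataset.remove_columns(["uuid", "status"])

# Apply a processing function to each row in the dataset
def add_num_annotations(row):
    row["num_annotations"] = len(row["label.annotations"])
    return row

processed = hf_dataset.map(add_num_annotations)

# Use as a PyTorch dataset (use .with_format("tf") for TensorFlow)
torch_dataset = hf_dataset.with_format("torch")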

Publish to the Hugging Face Hub

You can also easily publish your dataset to the Hugging Face Hub:

hf_dataset.push_to_hub("jane/flowers") # This is the name of a HF user/dataset
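
Once published, the dataset can be loaded back anywhere with the datasets library (a private repository additionally requires a Hub token):

from datasets import load_dataset

# push_to_hub uploads the data under a "train" split by default
hf_dataset = load_dataset("jane/flowers", split="train")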

Train models on your dataset

This tutorial shows how you can fine-tune a semantic segmentation model on a custom dataset exported from Segments. The process is similar for other task types.
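
To give a rough idea of what that looks like, here is a minimal fine-tuning sketch using SegFormer from 🤗 Transformers, not the full tutorial. The checkpoint name, num_labels, and output_dir are placeholders; it assumes a segmentation release like the flowers example above, with segments.utils.get_semantic_bitmap converting the instance bitmaps into category-id maps:

import numpy as np
from segments.utils import get_semantic_bitmap
from transformers import (
    SegformerForSemanticSegmentation,
    SegformerImageProcessor,
    Trainer,
    TrainingArguments,
)

# Segments bitmaps encode instance ids; convert them to category-id maps
def convert_bitmap(example):
    example["label.segmentation_bitmap"] = get_semantic_bitmap(
        example["label.segmentation_bitmap"], example["label.annotations"]
    )
    return example

semantic_dataset = hf_dataset.map(convert_bitmap)

processor = SegformerImageProcessor()
model = SegformerForSemanticSegmentation.from_pretrained(
    "nvidia/mit-b0",  # placeholder checkpoint; any SegFormer variant works
    num_labels=2,     # set to the number of categories in your dataset
)

def transform(batch):
    # Turn PIL images and category maps into pixel_values / labels tensors
    return processor(
        images=batch["image"],
        segmentation_maps=[
            np.array(m, dtype=np.int64)
            for m in batch["label.segmentation_bitmap"]
        ],
        return_tensors="pt",
    )

semantic_dataset.set_transform(transform)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="segformer-flowers",  # placeholder output directory
        remove_unused_columns=False,     # keep raw columns for set_transform
    ),
    train_dataset=semantic_dataset,
)
trainer.train()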
