Sample formats

A sample is a data point you want to label. Samples come in different types, like an image, a 3D point cloud, or a video sequence. When uploading (client.add_sample()) or downloading (client.get_sample()) a sample using the Python SDK, the format of the attributes field depends on the type of sample. The different formats are described here.

The section Import data shows how you can obtain URLs for your assets.

Image

Supported image formats: jpeg, png, bmp.

{
    "image": {
        "url": "https://example.com/image.jpg"
    }
}

If the image file is on your local computer, you should first upload it to our asset storage service (using upload_asset()) or to another cloud storage service.

Image sequence

Supported image formats: jpeg, png, bmp.

{ 
  "frames": [
    {
      "image": {
        "url": "https://example.com/frame_00001.jpg"
      },
      "name": "frame_00001" // optional
    },
    {
      "image": {
        "url": "https://example.com/frame_00002.jpg"
      },
      "name": "frame_00002"
    },
    {
      "image": {
        "url": "https://example.com/frame_00003.jpg"
      },
      "name": "frame_00003"
    }
  ]
} 
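As a minimal sketch, the attributes for an image sequence can be assembled in Python from a list of frame URLs. The helper name `image_sequence_attributes` is hypothetical and not part of the SDK, and deriving the optional `name` from the file name is only one possible convention.

```python
import json
from pathlib import PurePosixPath
from urllib.parse import urlparse


def image_sequence_attributes(urls):
    """Build an image-sequence attributes dict from an ordered list of image URLs."""
    frames = []
    for url in urls:
        # Use the file name without its extension as the optional frame name.
        name = PurePosixPath(urlparse(url).path).stem
        frames.append({"image": {"url": url}, "name": name})
    return {"frames": frames}


urls = [f"https://example.com/frame_{i:05d}.jpg" for i in range(1, 4)]
attributes = image_sequence_attributes(urls)
print(json.dumps(attributes, indent=2))
```

The resulting dict is what you would pass as the attributes of the sample when calling client.add_sample().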

3D point cloud

On Segments.ai, the up direction is defined along the z-axis, i.e. the vector (0, 0, 1) points up. If you upload point clouds with a different up direction, you might have trouble navigating the point cloud.

{
    "pcd": {
        "url": "https://example.com/pointcloud.pcd",
        "type": "pcd"
    },
    "images": [
        { ... },
        { ... },
        { ... }
    ], // optional
    "name": "frame_00001", // optional
    "timestamp": "00001", // optional
    "ego_pose": {
        "position": {
            "x": -2.7161461413869947,
            "y": 116.25822288149078,
            "z": 1.8348751887989483
        },
        "heading": {
            "qx": -0.02111296123795955,
            "qy": -0.006495469416730261,
            "qz": -0.008024565904865688,
            "qw": 0.9997181192298087
        }
    },
    "default_z": -1, // optional, 0 by default
    "bounds": { // optional
        "min_z": -1,
        "max_z": 3
    }
}
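If your point clouds use a different up direction, you can rotate them before uploading. The sketch below assumes a right-handed source frame with the y-axis pointing up and rotates it by 90 degrees about the x-axis so that (0, 1, 0) maps to (0, 0, 1). The function name is hypothetical, and any ego poses or camera extrinsics would need the same rotation applied.

```python
def y_up_to_z_up(points):
    """Rotate (x, y, z) points by +90 degrees about the x-axis: (x, y, z) -> (x, -z, y)."""
    return [(x, -z, y) for x, y, z in points]


points_y_up = [(1.0, 2.0, 3.0), (0.0, 1.0, 0.0)]
points_z_up = y_up_to_z_up(points_y_up)

# The old up vector (0, 1, 0) now lies along the z-axis.
print(points_z_up)
```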

Point cloud data

See 3D point cloud formats for the supported file formats.

{
    "url": "https://example.com/pointcloud.bin",
    "type": "kitti"
}

If the point cloud file is on your local computer, you should first upload it to our asset storage service (using upload_asset()) or to another cloud storage service.

Camera image

A calibrated or uncalibrated reference image corresponding to a point cloud. The reference images can be opened in a new tab from within the labeling interface. You can determine the layout of the images by setting the row and col attributes on each image. If you also supply the calibration parameters (and distortion parameters if necessary), the main point cloud view can be set to the image to obtain a fused view.

{
    "name": "Camera example 1", // optional
    "url": "https://example.com/image.jpg",
    "row": 0,
    "col": 0,
    "intrinsics": { // optional
        "intrinsic_matrix": [
            [1266.417203046554, 0, 816.2670197447984],
            [0, 1266.417203046554, 491.50706579294757],
            [0, 0, 1]
        ]
    },
    "extrinsics": { // optional
        "translation": {
            "x": -0.012463384576629082,
            "y": 0.76486688894964,
            "z": -0.3109103442096661
        },
        "rotation": {
            "qx": 0.713640516187247,
            "qy": -0.001134052598226082,
            "qz": 0.0036449450274057696,
            "qw": 0.7005017073187271
        }
    },
    "distortion": { // optional
        "model": "fisheye",
        "coefficients": {
            "k1": -0.0539124,
            "k2": -0.0101993,
            "k3": -0.00202017,
            "k4": 0.00120938
        }
    },
    "camera_convention": "OpenCV", // optional
    "rotation": 1.5708 // optional
}

If the image file is on your local computer, you should first upload it to our asset storage service (using upload_asset()) or to another cloud storage service.
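A camera image entry can be built programmatically as in the following sketch. The helper names are hypothetical, and the layout of the intrinsic matrix (focal lengths on the diagonal, principal point in the last column) is assumed from the standard pinhole model, which matches the example values above.

```python
import json


def pinhole_intrinsics(fx, fy, cx, cy):
    """Assumed standard pinhole layout: [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]."""
    return {"intrinsic_matrix": [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]}


def camera_image(url, row, col, name=None, intrinsics=None,
                 extrinsics=None, distortion=None):
    """Build a camera image dict, leaving out optional fields that are not given."""
    image = {"url": url, "row": row, "col": col}
    optional = {
        "name": name,
        "intrinsics": intrinsics,
        "extrinsics": extrinsics,
        "distortion": distortion,
    }
    image.update({key: value for key, value in optional.items() if value is not None})
    return image


camera = camera_image(
    "https://example.com/image.jpg",
    row=0,
    col=0,
    name="Camera example 1",
    intrinsics=pinhole_intrinsics(1266.4, 1266.4, 816.3, 491.5),
)
print(json.dumps(camera, indent=4))
```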

Camera intrinsics

{
    "intrinsic_matrix": [
        [1266.417203046554, 0, 816.2670197447984],
        [0, 1266.417203046554, 491.50706579294757],
        [0, 0, 1]
    ]
}

Camera extrinsics

{
    "translation": {
        "x": -0.012463384576629082,
        "y": 0.76486688894964,
        "z": -0.3109103442096661
    },
    "rotation": {
        "qx": 0.713640516187247,
        "qy": -0.001134052598226082,
        "qz": 0.0036449450274057696,
        "qw": 0.7005017073187271
    }
}

Distortion

// Fisheye
{ 
    "model": "fisheye",
    "coefficients": {
        "k1": -0.0539124,
        "k2": -0.0101993,
        "k3": -0.00202017,
        "k4": 0.00120938
    }
}
// Brown-Conrady
{ 
    "model": "brown-conrady",
    "coefficients": {
        "k1": -0.2916058942,
        "k2": 0.0763231072,
        "k3": 0.0,
        "p1": 0.0014829263,
        "p2": -0.0019540316
    }
}

Ego pose

The pose of the sensor used to capture the 3D point cloud data. This can be helpful if you want to obtain cuboids in world coordinates, or when your sensor is moving. In the latter situation, supplying an ego pose with each frame will ensure that static objects do not move when switching between frames.

{
    "position": {
        "x": -2.7161461413869947,
        "y": 116.25822288149078,
        "z": 1.8348751887989483
    },
    "heading": {
        "qx": -0.02111296123795955,
        "qy": -0.006495469416730261,
        "qz": -0.008024565904865688,
        "qw": 0.9997181192298087
    }
}

Segments.ai uses 32-bit floats for the point positions. Keep in mind that 32-bit floats have limited precision. In fact, only 24 bits can be used to represent the number itself (the significand, excluding the sign bit), or about 7.22 decimal digits. If you want to keep two decimal places, this only leaves 5.22 decimal digits, so the numbers shouldn't be larger than 10^5.22 ≈ 165958.
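The arithmetic can be checked with a short script. This is an illustrative sketch using only the Python standard library; `to_float32` is a hypothetical helper that round-trips a value through a 32-bit float. Note that without rounding the digit count to 7.22 first, the limit comes out slightly higher, at roughly 1.68 × 10^5.

```python
import math
import struct


def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


digits = 24 * math.log10(2)   # decimal digits carried by a 24-bit significand
limit = 10 ** (digits - 2)    # largest magnitude that still keeps two decimals
print(f"{digits:.2f} decimal digits, two decimals kept up to about {limit:.0f}")

# Small coordinates survive the round trip closely; large ones lose precision.
for value in (165958.12, 16777217.0):
    stored = to_float32(value)
    print(f"{value} is stored as {stored} (error {abs(stored - value):.4f})")
```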

To avoid rounding problems, it is best practice to subtract the ego position of the first frame from all other ego positions. This way, the first ego position is set to (0, 0, 0) and the subsequent ego positions are relative to (0, 0, 0). In your export script, you can add the ego position of the first frame back to the object positions.
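The re-centering step might look like the sketch below. It assumes each frame is a dict with an `ego_pose` in the format shown above; the function names `recenter_ego_poses` and `restore_position` are hypothetical and not part of the SDK.

```python
AXES = ("x", "y", "z")


def recenter_ego_poses(frames):
    """Subtract the first frame's ego position from every frame's ego position.

    Returns the re-centered frames and the subtracted origin, which you need
    to keep in order to restore world coordinates after export.
    """
    origin = dict(frames[0]["ego_pose"]["position"])
    recentered = []
    for frame in frames:
        position = frame["ego_pose"]["position"]
        new_pose = {
            "position": {axis: position[axis] - origin[axis] for axis in AXES},
            "heading": dict(frame["ego_pose"]["heading"]),  # headings are unchanged
        }
        recentered.append({**frame, "ego_pose": new_pose})
    return recentered, origin


def restore_position(position, origin):
    """Add the stored origin back to a position, e.g. in an export script."""
    return {axis: position[axis] + origin[axis] for axis in AXES}


heading = {"qx": 0.0, "qy": 0.0, "qz": 0.0, "qw": 1.0}
frames = [
    {"ego_pose": {"position": {"x": 1000.0, "y": 2000.0, "z": 3.0}, "heading": heading}},
    {"ego_pose": {"position": {"x": 1001.5, "y": 2000.0, "z": 3.0}, "heading": heading}},
]
recentered, origin = recenter_ego_poses(frames)
print(recentered[1]["ego_pose"]["position"])
print(restore_position(recentered[1]["ego_pose"]["position"], origin))
```

Only positions are shifted here; a pure translation leaves the heading quaternions untouched.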

3D point cloud sequence

{ 
  "frames": [
    { ... },
    { ... },
    { ... }
  ]
} 

Multi-sensor sequence

{
  "sensors": [
    {
      "name": "Lidar", 
      "task_type": "pointcloud-cuboid-sequence",
      "attributes": { ... }
    },
    {
      "name": "Camera 1", 
      "task_type": "image-vector-sequence",
      "attributes": { ... } 
    },
    ...
  ]
}
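A multi-sensor sample can be assembled from per-sensor attributes as in this sketch. The helper names are hypothetical, and the empty `frames` lists are placeholders only: in practice, each sensor's attributes would hold the full sequence for that sensor.

```python
import json


def sensor(name, task_type, attributes):
    """Describe one sensor: a display name, a task type, and its sequence attributes."""
    return {"name": name, "task_type": task_type, "attributes": attributes}


def multi_sensor_attributes(sensors):
    """Wrap a list of sensor dicts in the multi-sensor sample format."""
    return {"sensors": list(sensors)}


# Placeholder sequences; replace with real point cloud and image sequences.
lidar_sequence = {"frames": []}
camera_sequence = {"frames": []}

attributes = multi_sensor_attributes([
    sensor("Lidar", "pointcloud-cuboid-sequence", lidar_sequence),
    sensor("Camera 1", "image-vector-sequence", camera_sequence),
])
print(json.dumps(attributes, indent=2))
```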

Sensor

Text

{ 
    "text": "Example text sample." 
}

To upload text samples in bulk, see file formats.
