
Algorithms file format specification

The algorithms spec is the file format used by videostitch-cmd. It uses full JSON syntax. Algorithms are defined by populating the root object with a few named members.

Objects

VideoStitch will interpret the following objects. An object is of course free to have other members in addition to these.

Root

videostitch-cmd will interpret the following optional members of the root object:

| Member | Type | Default value | Description |
|---|---|---|---|
| outputPtv | list | Optional | If provided, this value specifies the output ptv file that will contain the results of the "algorithms" and the settings from the input ptv. |

The table below shows the members of "outputPtv":

| Member | Type | Default value | Description |
|---|---|---|---|
| name | string | No default | Name of the output ptv file. |
| outputFile | list | Optional | An optional list to specify the format of the final output in the new ptv file. |

outputFile

The "outputFile" contains two optional members.

| Member | Type | Default value | Description |
|---|---|---|---|
| name | string | Optional | An optional string to specify the name of the output file. |
| type | string | Optional | An optional string to specify the output extension. |

An example setting

  "outputPtv": [
    {
      "name" : "outputFinal.ptv",
      "outputFile": {
        "name" : "out-vs",
        "type" : "jpg"
      }
    }
  ]

Algorithms

The "algorithms" member specifies which algorithms are called from the command line. In the tables below, mandatory members are shown in bold. If not set, optional members take the default value specified in the "Default value" column.

| Member | Type | Default value | Description |
|---|---|---|---|
| name | string | mask | Type of algorithm. One of the types below. |

"mask" algorithm:

The "mask" algorithm optimizes for the blending mask and blending order.

| Member | Type | Default value | Description |
|---|---|---|---|
| list_frames | list | [0] | The list of frames used for the optimization. |
| blending_order | bool | true | Whether the blending order is optimized. If not, the order of the inputs is used as the blending order. |
| seam | bool | true | Whether the optimal seams are computed. |

An example setting for the "mask" algorithm is:

   {
     "algorithms" : [
     {
       "name": "mask",
       "config" : {
         "list_frames" : [0],
         "blending_order" : true,
         "seam" : true
       }
     }
     ]
   }
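
Because the format is plain JSON, such a file can also be generated by a script. The following is a minimal Python sketch, assuming that "outputPtv" and "algorithms" sit side by side in the root object as the examples in this document suggest. The helper name `build_mask_spec` is hypothetical and not part of videostitch-cmd.

```python
import json


def build_mask_spec(list_frames=(0,), blending_order=True, seam=True):
    """Return a root object requesting the "mask" algorithm (hypothetical helper)."""
    return {
        # Output ptv settings, copied from the "outputPtv" example of this document.
        "outputPtv": [
            {
                "name": "outputFinal.ptv",
                "outputFile": {"name": "out-vs", "type": "jpg"},
            }
        ],
        "algorithms": [
            {
                "name": "mask",
                "config": {
                    "list_frames": list(list_frames),
                    "blending_order": blending_order,
                    "seam": seam,
                },
            }
        ],
    }


spec = build_mask_spec(list_frames=[0, 50])
text = json.dumps(spec, indent=2)  # serialized form, ready to be written to a file
print(text)
```

Serializing through `json.dumps` guarantees strict JSON (booleans written as `true`/`false`, no trailing commas), which is easy to get wrong when editing the file by hand.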

"autocrop" algorithm:

The "autocrop" algorithm detects the crop circle for the fish-eye images. The first frame of the inputs is used for detection.

| Member | Type | Default value | Description |
|---|---|---|---|
| neighborThreshold | int | 40 | A threshold used for the image binarization step. Indicates the similarity of neighboring pixels to be considered as a connected component. |
| differenceThreshold | int | 500 | A threshold used for the binarization step. Indicates the similarity of a pixel to the seed pixel to be considered as a connected component. |
| fineTuneMarginSize | int | 100 | A pre-defined range around the coarse circle's samples to look for the fine scale samples, e.g. 0 indicates that the fine scale samples are the coarse samples, while 100 indicates that the fine scale samples will be searched in the range [-100..100] px around the coarse samples. |
| scaleRadius | double | 0.98 | A parameter used to scale the fine scale circle's radius. |
| circleImage | bool | false | Whether to dump an image with the crop circle overlaid on top of the original image. |

An example setting for the "autocrop" algorithm is:

   {
     "algorithms" : [
     {
       "name": "autocrop",
       "config" : {
         "neighborThreshold" : 40,
         "differenceThreshold" : 500,
         "fineTuneMarginSize" : 100,
         "circleImage" : false,
         "scaleRadius" : 0.98		 
       }
     }
     ]
   }

"calibration" algorithm:

The "calibration" algorithm runs a calibration with presets for the rig.

| Member | Type | Default value | Description |
|---|---|---|---|
| config | list | Optional | The list of calibration configuration parameters. |
| apply_presets_only | bool | Optional | If true, applies the rig presets to the PanoDefinition without actually calibrating. Meant to obtain a template project definition out of inputs and presets. |
| improve_mode | bool | Optional | If true, calibration will reuse past control points found in the JSON tree. |
| auto_iterate_fov | bool | Optional | If true, calibration will try to estimate the FOV of the lens. |
| single_focal | bool | Optional | If true, the optimizer will estimate a single (fu,fv) pair of parameters for all lenses in the rig. |
| dump_calibration_snapshots | bool | Optional | If true, calibration will dump pictures with control points at various stages of the procedure. |
| deshuffle_mode | bool | Optional | If true, calibration will extract key points and reorder the video inputs using the rig presets, before running the full calibration. |
| deshuffle_mode_only | bool | Optional | If true, calibration will extract key points and reorder the video inputs using the rig presets, without running the full calibration. |
| deshuffle_mode_preserve_readers_order | bool | Optional | If true, deshuffling will preserve the order of input readers in the returned PanoDefinition and reorder the geometries. Otherwise, the deshuffling will keep the order of geometries and reorder the input readers. |
| use_synthetic_keypoints | bool | false | If true, the calibration algorithm will generate artificial keypoints from the PanoDefinition geometry, to cover the input areas where no real keypoint was extracted and preserve the PanoDefinition geometry. |
| synthetic_keypoints_grid_width | int | 5 | Grid width to generate artificial keypoints in each input picture. |
| synthetic_keypoints_grid_height | int | 5 | Grid height to generate artificial keypoints in each input picture. |
| cp_extractor | list | Optional | The list of parameters for the control point detector and matcher. |
| extractor | string | Optional | The name of the control point detector. |
| matcher_norm | string | Optional | The name of the control point matcher. |
| octaves | int | Optional | The number of octaves used for the detection. |
| sublevels | int | Optional | The number of sublevels used for the detection. |
| threshold | double | Optional | Detection threshold. |
| nndr_ratio | double | Optional | Ratio between the first and second best score to claim a right match. |
| cp_filter | list | Optional | Parameters for the control point filter, which uses RANSAC. |
| angle_threshold | double | Optional | Angle threshold to validate a control point rotated from one lens to the other. |
| min_ratio_inliers | double | Optional | Minimum ratio of inliers, before RANSAC. |
| min_samples_for_fit | int | Optional | Minimum number of control points to run RANSAC. |
| proba_draw_outlier_free_samples | double | Optional | Target probability to achieve for RANSAC. |
| decimating_grid_size | double | Optional | Grid size for decimation (w.r.t. picture size). |
| rig | list | Optional | The list of rig preset parameters. |
| name | string | Mandatory | The name of the rig presets. |
| rigcameras | list | Mandatory | The list of camera presets. |
| yaw_mean | double | Mandatory | Mean value of the yaw rotation angle for the optimizer. |
| yaw_variance | double | Mandatory | Variance value of the yaw rotation angle for the optimizer. |
| pitch_mean | double | Mandatory | Mean value of the pitch rotation angle for the optimizer. |
| pitch_variance | double | Mandatory | Variance value of the pitch rotation angle for the optimizer. |
| roll_mean | double | Mandatory | Mean value of the roll rotation angle for the optimizer. |
| roll_variance | double | Mandatory | Variance value of the roll rotation angle for the optimizer. |
| camera | string | Mandatory | Camera preset name (from the list of cameras). |
| cameras | list | Mandatory | The list of camera presets. |
| name | string | Mandatory | The name of the camera presets. |
| format | string | Mandatory | The format of the lens projection. |
| width | int | Mandatory | The width of the camera pictures. |
| height | int | Mandatory | The height of the camera pictures. |
| fu_mean | double | Mandatory | Mean value of the horizontal focal for the optimizer. |
| fu_variance | double | Mandatory | Variance value of the horizontal focal for the optimizer. |
| fv_mean | double | Mandatory | Mean value of the vertical focal for the optimizer. |
| fv_variance | double | Mandatory | Variance value of the vertical focal for the optimizer. |
| cu_mean | double | Mandatory | Mean value of the horizontal center of projection for the optimizer. |
| cu_variance | double | Mandatory | Variance value of the horizontal center of projection for the optimizer. |
| cv_mean | double | Mandatory | Mean value of the vertical center of projection for the optimizer. |
| cv_variance | double | Mandatory | Variance value of the vertical center of projection for the optimizer. |
| distorta_mean | double | Mandatory | Mean value of the "a" distortion parameter for the optimizer. |
| distorta_variance | double | Mandatory | Variance value of the "a" distortion parameter for the optimizer. |
| distortb_mean | double | Mandatory | Mean value of the "b" distortion parameter for the optimizer. |
| distortb_variance | double | Mandatory | Variance value of the "b" distortion parameter for the optimizer. |
| distortc_mean | double | Mandatory | Mean value of the "c" distortion parameter for the optimizer. |
| distortc_variance | double | Mandatory | Variance value of the "c" distortion parameter for the optimizer. |
| list_frames | list | Optional | The list of frames used for the optimization. |
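
Since the rig and camera presets carry many mandatory members, it can help to check a presets object before handing it to the calibration. The following is a minimal Python sketch, not part of VideoStitch. The function name `missing_members` is hypothetical, and the grouping of members into "rig", "rigcameras" and "cameras" is an assumption inferred from the sample presets shown later in this document for "calibration_presets_maker".

```python
# Mandatory members, taken from the table above.
RIG_MEMBERS = ("name", "rigcameras")
RIGCAMERA_MEMBERS = tuple(
    f"{angle}_{stat}"
    for angle in ("yaw", "pitch", "roll")
    for stat in ("mean", "variance")
) + ("camera",)
CAMERA_MEMBERS = ("name", "format", "width", "height") + tuple(
    f"{param}_{stat}"
    for param in ("fu", "fv", "cu", "cv", "distorta", "distortb", "distortc")
    for stat in ("mean", "variance")
)


def missing_members(presets):
    """Return the paths of mandatory preset members that are absent."""
    missing = []
    rig = presets.get("rig")
    if rig is not None:  # "rig" itself is optional; its members are not
        missing += [f"rig.{key}" for key in RIG_MEMBERS if key not in rig]
        for i, rigcam in enumerate(rig.get("rigcameras", [])):
            missing += [
                f"rig.rigcameras[{i}].{key}"
                for key in RIGCAMERA_MEMBERS
                if key not in rigcam
            ]
    for i, cam in enumerate(presets.get("cameras", [])):
        missing += [f"cameras[{i}].{key}" for key in CAMERA_MEMBERS if key not in cam]
    return missing


# Deliberately incomplete presets, for illustration only.
incomplete = {
    "rig": {"name": "my_rig", "rigcameras": [{"yaw_mean": 0, "camera": "camera_0"}]},
    "cameras": [],
}
for path in missing_members(incomplete):
    print("missing:", path)
```

Extra members are ignored on purpose, since objects are free to carry members beyond the ones listed here.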

"calibration_presets_maker" algorithm:

The "calibration_presets_maker" algorithm takes an input PanoDefinition, and creates calibration presets (i.e. "rig" and "cameras" JSON objects) from it, to be used by the "calibration" algorithm.

The returned PanoDefinition also contains the calibration presets, and its control point list is cleared.

| Member | Type | Default value | Description |
|---|---|---|---|
| output | string | none | The name of the JSON file receiving the results of the algorithm. |
| name | string | Mandatory | The name of the rig presets. |
| focal_std_dev_value_percentage | double | 5.0 | Standard deviation of focal parameters, expressed as a percentage of the input values. |
| center_std_dev_width_percentage | double | 10.0 | Standard deviation of center parameters, expressed as a percentage of the width of the input. |
| distort_std_dev_value_percentage | double | 50.0 | Standard deviation of distortion parameters, expressed as a percentage of the input values. |
| yaw_std_dev | double | 5.0 | Standard deviation of yaw angles, in degrees. |
| pitch_std_dev | double | 5.0 | Standard deviation of pitch angles, in degrees. |
| roll_std_dev | double | 5.0 | Standard deviation of roll angles, in degrees. |
| translation_x_std_dev | double | 0.0 | Standard deviation of X translations, in meters. |
| translation_y_std_dev | double | 0.0 | Standard deviation of Y translations, in meters. |
| translation_z_std_dev | double | 0.0 | Standard deviation of Z translations, in meters. |

An example setting for the "calibration_presets_maker" algorithm is:

   {
     "algorithms" : [
       {
         "name": "calibration_presets_maker",
         "output": "output_presets.json",
         "config" : {
           "name" : "Orah Tetra 4i",
           "focal_std_dev_value_percentage" : 15.0,
           "center_std_dev_width_percentage" : 10.0,
           "distort_std_dev_value_percentage" : 50,
           "yaw_std_dev" : 1.0,
           "pitch_std_dev" : 1.0,
           "roll_std_dev" : 1.0,
           "translation_x_std_dev" : 0.07,
           "translation_y_std_dev" : 0.07,
           "translation_z_std_dev" : 0.0
         }
       }
     ]
   }

An example output of the "calibration_presets_maker" algorithm is:

{
  "rig" : {
    "name" : "Orah Tetra", 
    "rigcameras" : [
      {
        "angle_unit" : "degrees", 
        "yaw_mean" : 0, 
        "pitch_mean" : 0, 
        "roll_mean" : 0, 
        "yaw_variance" : 16414, 
        "pitch_variance" : 16414, 
        "roll_variance" : 16414, 
        "camera" : "camera_0"
      },
      ...
      {
        "angle_unit" : "degrees", 
        "yaw_mean" : 93.068, 
        "pitch_mean" : -67.2862, 
        "roll_mean" : -68.2285, 
        "yaw_variance" : 16414, 
        "pitch_variance" : 16414, 
        "roll_variance" : 16414, 
        "camera" : "camera_3"
      }
    ]
  }, 
  "cameras" : [
    {
      "name" : "camera_0", 
      "format" : "circular_fisheye_opt", 
      "width" : 1920, 
      "height" : 1440, 
      "fu_mean" : 1035.6, 
      "fu_variance" : 24130.5, 
      "fv_mean" : 1046.15, 
      "fv_variance" : 24624.7, 
      "cu_mean" : 885.214, 
      "cu_variance" : 36864, 
      "cv_mean" : 814.362, 
      "cv_variance" : 36864, 
      "distorta_mean" : 0, 
      "distorta_variance" : 0, 
      "distortb_mean" : -0.401758, 
      "distortb_variance" : 0.0403524, 
      "distortc_mean" : 0, 
      "distortc_variance" : 0
    },
    ...
    {
      "name" : "camera_3", 
      "format" : "circular_fisheye_opt", 
      "width" : 1920, 
      "height" : 1440, 
      "fu_mean" : 1035.6, 
      "fu_variance" : 24130.5, 
      "fv_mean" : 1046.15, 
      "fv_variance" : 24624.7, 
      "cu_mean" : 974.67, 
      "cu_variance" : 36864, 
      "cv_mean" : 690.774, 
      "cv_variance" : 36864, 
      "distorta_mean" : 0, 
      "distorta_variance" : 0, 
      "distortb_mean" : -0.311981, 
      "distortb_variance" : 0.024333, 
      "distortc_mean" : 0, 
      "distortc_variance" : 0
    }
  ]
}
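
The focal, center and distortion variances in this output are consistent with the percentage settings of the example configuration, if one assumes that each variance is the square of the standard deviation and that the standard deviation is the given percentage of the reference value. The Python sketch below reproduces the camera_0 numbers under that assumption. The relation is inferred from the sample values, not from the implementation, the function name is hypothetical, and the angle variances are left out.

```python
def variance_from_percentage(reference, percentage):
    """Variance of a parameter whose standard deviation is `percentage` % of `reference`."""
    std_dev = reference * percentage / 100.0
    return std_dev ** 2


# Reference values and percentages taken from the example config and output (camera_0).
fu_variance = variance_from_percentage(1035.6, 15.0)            # 15 % of fu_mean
fv_variance = variance_from_percentage(1046.15, 15.0)           # 15 % of fv_mean
cu_variance = variance_from_percentage(1920, 10.0)              # 10 % of the image width
distortb_variance = variance_from_percentage(-0.401758, 50.0)   # 50 % of distortb_mean

print(fu_variance, fv_variance, cu_variance, distortb_variance)
```

Note that the center variance uses the image width (1920) for both cu and cv, in line with the description of center_std_dev_width_percentage.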

"epipolar" algorithm (experimental, used for debugging purposes):

The "epipolar" algorithm takes a calibrated pano definition, a list of frames, and input points and/or matched input points, to produce pictures showing:

  • a spherical grid of points (optional)
  • the location of points or point pairs
  • the epipolar curves corresponding to the points or point pairs
  • the estimated depth of point pairs

The algorithm additionally logs the estimated error-free stitching distance at the "Info" logger level. The point pairs can be automatically detected and matched.

It also computes the rig's minimum stitching distance (i.e. sphere scale), at which points seen by at least 2 cameras become visible by only one, and writes it as a floating-point value in meters to the output JSON file. In addition, it saves an equirectangular 8-bit gray-level picture, "output_min_stitching_distance.png", whose size is the output panorama size of the input panorama definition and in which one intensity level corresponds to 1 cm. This scaling can be controlled by the image_max_output_depth parameter below.

| Member | Type | Default value | Description |
|---|---|---|---|
| list_frames | list | [0] | The list of frames used for the optimization. |
| spherical_grid_radius | double | 0 | If non-zero, generates a spherical grid of points with this radius. |
| auto_point_matching | bool | true | If true, automatically matches points, in addition to the ones provided. |
| decimating_grid_size | double | 0.04 | Grid size for decimation of matched points (w.r.t. picture size). |
| single_points | list | empty | List of single points. |
| input_index | int | mandatory | Element of single_points list: camera input index of a single point. |
| x | float | mandatory | Element of single_points list: "x" coordinate of a single point in its input. |
| y | float | mandatory | Element of single_points list: "y" coordinate of a single point in its input. |
| matched_points | list | empty | List of matched pairs of points. |
| input_index0 | int | mandatory | Element of matched_points list: camera input index of the first point in a pair. |
| x0 | float | mandatory | Element of matched_points list: "x" coordinate of the first point in a pair, in its input. |
| y0 | float | mandatory | Element of matched_points list: "y" coordinate of the first point in a pair, in its input. |
| input_index1 | int | mandatory | Element of matched_points list: camera input index of the second point in a pair. |
| x1 | float | mandatory | Element of matched_points list: "x" coordinate of the second point in a pair, in its input. |
| y1 | float | mandatory | Element of matched_points list: "y" coordinate of the second point in a pair, in its input. |
| image_max_output_depth | float | 2.55 | Depth value mapped to 255 in the output depth image. Leaving it at 2.55 means that 255 will be mapped to 2.55 meters, i.e. one gray level is 1 cm. |

An example setting for the "epipolar" algorithm is:

{
  "algorithms" : [
    {
      "name": "epipolar",
      "config" : {
        "auto_point_matching" : false,
        "single_points" : [
          {
            "input_index" : 0,
            "x" : 1000,
            "y" : 720
          },
          {
            "input_index" : 1,
            "x" : 600,
            "y" : 600
          }
        ],
        "matched_points" : [
          {
            "input_index0" : 3,
            "x0" : 144,
            "y0" : 445,
            "input_index1" : 2,
            "x1" : 76,
            "y1" : 931
          }
        ],
        "list_frames" : [
          1,
          200
        ]
      }
    }
  ]
}

An example JSON output is:

{
  "minStitchingDistance" : 0.073859989643096924
}
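
To relate this distance to the gray levels of "output_min_stitching_distance.png", the image_max_output_depth scaling can be applied by hand. The Python sketch below assumes a linear mapping in which gray level 255 corresponds to image_max_output_depth meters, with values clamped to the 8-bit range. Both function names are hypothetical, and the clamping is an assumption rather than documented behavior.

```python
def gray_level_to_meters(gray_level, image_max_output_depth=2.55):
    """Depth in meters for an 8-bit gray level, assuming a linear mapping."""
    if not 0 <= gray_level <= 255:
        raise ValueError("gray_level must be in [0, 255]")
    return gray_level * image_max_output_depth / 255.0


def meters_to_gray_level(depth, image_max_output_depth=2.55):
    """Nearest 8-bit gray level for a depth in meters (clamping is an assumption)."""
    level = round(depth / image_max_output_depth * 255.0)
    return max(0, min(255, level))


# With the default scaling, one gray level is about 1 cm.
step = gray_level_to_meters(1)

# Gray level closest to the example "minStitchingDistance" value.
level = meters_to_gray_level(0.073859989643096924)
print(step, level)
```

With a larger image_max_output_depth the picture covers a deeper range at the cost of a coarser step per gray level.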

"scoring" algorithm:

The "scoring" algorithm takes a calibrated pano definition and produces a list of scores per frame: a score between 0 and 1, based on the normalized cross correlation of input pairs in overlap in the equirectangular projection, and the ratio of uncovered pixels (i.e. holes in the projection).

| Member | Type | Default value | Description |
|---|---|---|---|
| output | string | none | The name of the JSON file receiving the results of the algorithm. |
| first_frame | int | 0 | First frame number for the scoring. |
| last_frame | int | 0 | Last frame number for the scoring. |

An example setting for the "scoring" algorithm is:

   {
     "algorithms" : [
       {
         "name": "scoring",
         "output": "output_scoring.ptv",
         "config" : {
           "first_frame" : 0,
           "last_frame" : 0
         }
       }
     ]
   }

An example output of the "scoring" algorithm is:

  [
    {
      "frame_number" : 0,
      "score" : 0.609535,
      "uncovered_ratio" : 0.0705967
    },
    {
      "frame_number" : 1,
      "score" : 0.609598,
      "uncovered_ratio" : 0.0705967
    },
    {
      "frame_number" : 2,
      "score" : 0.609391,
      "uncovered_ratio" : 0.0705967
    }
  ]
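
Such an output is straightforward to post-process. As a minimal Python sketch, assuming the output file holds a JSON array shaped like the example above, the following picks the best-scoring frame and averages the scores. The variable names are illustrative only.

```python
import json

# Example scoring output, copied from this document.
scores_json = """
[
  {"frame_number": 0, "score": 0.609535, "uncovered_ratio": 0.0705967},
  {"frame_number": 1, "score": 0.609598, "uncovered_ratio": 0.0705967},
  {"frame_number": 2, "score": 0.609391, "uncovered_ratio": 0.0705967}
]
"""

frames = json.loads(scores_json)

# Frame with the highest score.
best = max(frames, key=lambda frame: frame["score"])

# Average score and worst (largest) uncovered ratio over all frames.
mean_score = sum(frame["score"] for frame in frames) / len(frames)
worst_uncovered = max(frame["uncovered_ratio"] for frame in frames)

print(best["frame_number"], mean_score, worst_uncovered)
```

In practice the string would be replaced by the contents of the file named in the "output" member.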