i.ann.maskrcnn.train

Train your Mask R-CNN network

i.ann.maskrcnn.train [-esbn] training_dataset=name [model=name] classes=string [,string,...] logs=name name=string [epochs=integer] [steps_per_epoch=integer] [rois_per_image=integer] [images_per_gpu=integer] [gpu_count=integer] [mini_mask_size=integer [,integer,...]] [validation_steps=integer] [images_min_dim=integer] [images_max_dim=integer] [backbone=string] [--verbose] [--quiet] [--qq] [--ui]

Example:

i.ann.maskrcnn.train training_dataset=name classes=string logs=name name=string

grass.script.run_command("i.ann.maskrcnn.train", training_dataset, model=None, classes, logs, name, epochs=200, steps_per_epoch=3000, rois_per_image=64, images_per_gpu=1, gpu_count=1, mini_mask_size=None, validation_steps=100, images_min_dim=256, images_max_dim=1280, backbone="resnet101", flags=None, verbose=False, quiet=False, superquiet=False)

Example:

gs.run_command("i.ann.maskrcnn.train", training_dataset="name", classes="string", logs="name", name="string")

Parameters

training_dataset=name [required]
    Path to the dataset with images and masks
    Name of input directory
model=name
    Path to the .h5 file to use as initial values
    Keep empty to train from scratch
classes=string [,string,...] [required]
    Names of classes separated with ","
logs=name [required]
    Path to the directory in which models will be saved
    Name of input directory
name=string [required]
    Name for output models
epochs=integer
    Number of epochs
    Default: 200
steps_per_epoch=integer
    Steps per epoch
    Default: 3000
rois_per_image=integer
    How many ROIs to train per image
    Default: 64
images_per_gpu=integer
    Number of images per GPU
    Bigger number means faster training but needs a bigger GPU
    Default: 1
gpu_count=integer
    Number of GPUs to be used
    Default: 1
mini_mask_size=integer [,integer,...]
    Size of the mini mask, separated with ","
    Keep empty to use full-sized masks. Mini masks save memory at the expense of precision
validation_steps=integer
    Number of validation steps
    Bigger number means more accurate estimation of the model precision
    Default: 100
images_min_dim=integer
    Minimum length of images sides
    Images will be resized to have their shortest side at least of this value (has to be a multiple of 64)
    Default: 256
images_max_dim=integer
    Maximum length of images sides
    Images will be resized to have their longest side of this value (has to be a multiple of 64)
    Default: 1280
backbone=string
    Backbone architecture
    Allowed values: resnet50, resnet101
    Default: resnet101
-e
    Pretrained weights were trained on different classes / resolution / sizes
-s
    Hold out 10 % of images from training and save their list to the logs directory
-b
    Also train batch normalization layers (not recommended for small batches)
-n
    No resizing or padding of images (images must be of the same size)
--help
    Print usage summary
--verbose
    Verbose module output
--quiet
    Quiet module output
--qq
    Very quiet module output
--ui
    Force launching GUI dialog

training_dataset : str, required
    Path to the dataset with images and masks
    Name of input directory
    Used as: input, dir, name
model : str, optional
    Path to the .h5 file to use as initial values
    Keep empty to train from scratch
    Used as: input, file, name
classes : str | list[str], required
    Names of classes separated with ","
logs : str, required
    Path to the directory in which models will be saved
    Name of input directory
    Used as: input, dir, name
name : str, required
    Name for output models
epochs : int, optional
    Number of epochs
    Default: 200
steps_per_epoch : int, optional
    Steps per epoch
    Default: 3000
rois_per_image : int, optional
    How many ROIs to train per image
    Default: 64
images_per_gpu : int, optional
    Number of images per GPU
    Bigger number means faster training but needs a bigger GPU
    Default: 1
gpu_count : int, optional
    Number of GPUs to be used
    Default: 1
mini_mask_size : int | list[int] | str, optional
    Size of the mini mask, separated with ","
    Keep empty to use full-sized masks. Mini masks save memory at the expense of precision
validation_steps : int, optional
    Number of validation steps
    Bigger number means more accurate estimation of the model precision
    Default: 100
images_min_dim : int, optional
    Minimum length of images sides
    Images will be resized to have their shortest side at least of this value (has to be a multiple of 64)
    Default: 256
images_max_dim : int, optional
    Maximum length of images sides
    Images will be resized to have their longest side of this value (has to be a multiple of 64)
    Default: 1280
backbone : str, optional
    Backbone architecture
    Allowed values: resnet50, resnet101
    Default: resnet101
flags : str, optional
    Allowed values: e, s, b, n
    e
        Pretrained weights were trained on different classes / resolution / sizes
    s
        Hold out 10 % of images from training and save their list to the logs directory
    b
        Also train batch normalization layers (not recommended for small batches)
    n
        No resizing or padding of images (images must be of the same size)
verbose: bool, optional
    Verbose module output
    Default: False
quiet: bool, optional
    Quiet module output
    Default: False
superquiet: bool, optional
    Very quiet module output
    Default: False

DESCRIPTION

i.ann.maskrcnn.train allows the user to train a Mask R-CNN model on their own dataset. The dataset has to be prepared in a predefined structure.

DATASET STRUCTURE

The training dataset should have the following structure:

dataset-directory

  • imagenumber
      • imagenumber.jpg (training image)
      • imagenumber-class1-number.png (mask for one instance of class1)
      • imagenumber-class1-number.png (mask for another instance of class1)
      • ...
  • imagenumber2
      • imagenumber2.jpg
      • imagenumber2-class1-number.png (mask for one instance of class1)
      • imagenumber2-class2-number.png (mask for another class instance)
      • ...

The described directory structure is required. Images must be *.jpg files with 3 channels (for example RGB); masks must be *.png files containing values between 1 and 255 inside an object instance and 0 elsewhere. A separate mask file must be provided for each object instance, distinguished by the suffix number.
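These naming rules can be verified before training starts. The following is a minimal sketch using only the Python standard library; the check_dataset helper and its regular expression are illustrative, not part of the module:

```python
import os
import re

# Mask files are expected to follow: <imagenumber>-<class>-<number>.png
MASK_RE = re.compile(r"^(?P<image>.+)-(?P<cls>[^-]+)-(?P<idx>\d+)\.png$")

def check_dataset(dataset_dir, classes):
    """Return a list of problems found in a dataset directory tree."""
    problems = []
    for subdir in sorted(os.listdir(dataset_dir)):
        path = os.path.join(dataset_dir, subdir)
        if not os.path.isdir(path):
            continue
        files = os.listdir(path)
        # Each image directory must contain <imagenumber>.jpg ...
        if subdir + ".jpg" not in files:
            problems.append(f"{subdir}: missing {subdir}.jpg")
        # ... and at least one mask named after a known class.
        masks = [f for f in files if f.endswith(".png")]
        if not masks:
            problems.append(f"{subdir}: no mask files")
        for mask in masks:
            m = MASK_RE.match(mask)
            if not m or m.group("cls") not in classes:
                problems.append(f"{subdir}: unexpected mask name {mask}")
    return problems
```

For a well-formed dataset such as the one in the examples, a call like check_dataset("/home/user/Documents/crops", {"corn", "rice"}) would return an empty list.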

NOTES

If you are using initial weights (the model parameter), epochs are divided into three segments: first, layers 5+ are trained; then layers 4+ are fine-tuned; finally, the whole architecture is fine-tuned. The reported epoch count refers to the current segment, not to the whole training.

The -b flag enables training of the batch normalization layers. By default, this option is disabled, as training these layers is not recommended with small batches (the batch size is defined by the images_per_gpu parameter).

If the dataset consists of images of the same size, the user may use the -n flag to avoid resizing or padding of images. When the flag is not used, images are resized so that their longer side equals the images_max_dim parameter and their shorter side is at least the images_min_dim parameter, and are then zero-padded to the shape images_max_dim x images_max_dim. As a result, even images of different sizes may be used.
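The resizing described above can be sketched in Python. This is an illustrative reimplementation of the resize rule as commonly found in Mask R-CNN implementations, not the module's exact code, so details may differ:

```python
def resize_dims(height, width, min_dim=256, max_dim=1280):
    """Compute an image's resized shape and its zero-padded output shape.

    The shorter side is scaled up to at least min_dim; if the longer side
    would then exceed max_dim, the scale is capped so the longer side
    equals max_dim. The image is finally zero-padded to a square.
    """
    scale = max(1.0, min_dim / min(height, width))
    if round(max(height, width) * scale) > max_dim:
        scale = max_dim / max(height, width)
    new_shape = (round(height * scale), round(width * scale))
    padded_shape = (max_dim, max_dim)
    return new_shape, padded_shape

# A 500 x 2000 image: the longer side is capped at images_max_dim.
print(resize_dims(500, 2000))  # → ((320, 1280), (1280, 1280))
```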

After each epoch, the current model is saved. This allows the user to stop the training whenever the loss values are satisfactory, and to test models even during the training (and, again, stop it before the last epoch).
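Saved models follow the naming visible in the examples below (mask_rcnn_&lt;name&gt;_&lt;4-digit epoch&gt;.h5). A small helper for picking the most recent checkpoint from the logs directory might look like this; it is a sketch and assumes the checkpoints are reachable somewhere under the logs directory:

```python
import glob
import os

def latest_checkpoint(logs_dir, name):
    """Return the saved model file with the highest epoch number, or None."""
    pattern = os.path.join(logs_dir, "**", f"mask_rcnn_{name}_*.h5")
    checkpoints = glob.glob(pattern, recursive=True)
    if not checkpoints:
        return None
    # The 4-digit epoch suffix sorts lexicographically, e.g. ..._0005.h5.
    return max(checkpoints, key=os.path.basename)
```

The returned path could then be passed as the model parameter to resume training.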

EXAMPLES

Dataset for examples:

crops

  • 000000
      • 000000.jpg
      • 000000-corn-0.png
      • 000000-corn-1.png
      • ...
  • 000001
      • 000001.jpg
      • 000001-corn-0.png
      • 000001-rice-0.png
      • ...

Training from scratch

i.ann.maskrcnn.train training_dataset=/home/user/Documents/crops classes=corn,rice logs=/home/user/Documents/logs name=crops

After the default number of epochs, we will get a model where the first class is trained to detect corn fields and the second one to detect rice fields.

If we run the command with the class order reversed, we will get a model where the first class is trained to detect rice fields and the second one to detect corn fields.

i.ann.maskrcnn.train training_dataset=/home/user/Documents/crops classes=rice,corn logs=/home/user/Documents/logs name=crops

The model name does not have to match the dataset folder, but it should refer to the dataset's task. A good name for this one (also reflecting the class order) could be:

i.ann.maskrcnn.train training_dataset=/home/user/Documents/crops classes=rice,corn logs=/home/user/Documents/logs name=rice_corn

Training from a pretrained model

We can use a pretrained model to make our training faster. The model must have been trained on the same channels and similar features, but not necessarily on the same classes (e.g. a model trained on swimming pools in maps can be used to train on buildings in maps).

A model trained on different classes (use the -e flag to exclude the head weights):

i.ann.maskrcnn.train training_dataset=/home/user/Documents/crops classes=corn,rice logs=/home/user/Documents/logs name=crops model=/home/user/Documents/models/buildings.h5 -e

A model trained on the same classes.

i.ann.maskrcnn.train training_dataset=/home/user/Documents/crops classes=corn,rice logs=/home/user/Documents/logs name=crops model=/home/user/Documents/models/corn_rice.h5

Fine-tuning a model

It is also possible to stop training and continue later. To resume, use the last saved epoch as the pretrained model.

i.ann.maskrcnn.train training_dataset=/home/user/Documents/crops classes=corn,rice logs=/home/user/Documents/logs name=crops model=/home/user/Documents/models/mask_rcnn_crops_0005.h5

SEE ALSO

Mask R-CNN in GRASS GIS, i.ann.maskrcnn.detect

AUTHOR

Ondrej Pesek

SOURCE CODE

Available at: i.ann.maskrcnn.train source code (history)
Latest change: Friday Feb 21 10:10:05 2025 in commit 7d78fe3