i.ann.maskrcnn.train
Train your Mask R-CNN network
i.ann.maskrcnn.train [-esbn] training_dataset=name [model=name] classes=string [,string,...] logs=name name=string [epochs=integer] [steps_per_epoch=integer] [rois_per_image=integer] [images_per_gpu=integer] [gpu_count=integer] [mini_mask_size=integer [,integer,...]] [validation_steps=integer] [images_min_dim=integer] [images_max_dim=integer] [backbone=string] [--verbose] [--quiet] [--qq] [--ui]
Example:
i.ann.maskrcnn.train training_dataset=name classes=string logs=name name=string
grass.script.run_command("i.ann.maskrcnn.train", training_dataset, classes, logs, name, model=None, epochs=200, steps_per_epoch=3000, rois_per_image=64, images_per_gpu=1, gpu_count=1, mini_mask_size=None, validation_steps=100, images_min_dim=256, images_max_dim=1280, backbone="resnet101", flags=None, verbose=False, quiet=False, superquiet=False)
Example:
gs.run_command("i.ann.maskrcnn.train", training_dataset="name", classes="string", logs="name", name="string")
Parameters
training_dataset=name [required]
Path to the dataset with images and masks
Name of input directory
model=name
Path to the .h5 file to use as initial values
Keep empty to train from scratch
classes=string [,string,...] [required]
Names of classes separated with ","
logs=name [required]
Path to the directory where models will be saved
Name of input directory
name=string [required]
Name for output models
epochs=integer
Number of epochs
Default: 200
steps_per_epoch=integer
Steps per epoch
Default: 3000
rois_per_image=integer
Number of ROIs to train per image
Default: 64
images_per_gpu=integer
Number of images per GPU
A larger number means faster training but requires more GPU memory
Default: 1
gpu_count=integer
Number of GPUs to be used
Default: 1
mini_mask_size=integer [,integer,...]
Size of the mini mask, dimensions separated with ","
Keep empty to use full-sized masks. Mini masks save memory at the expense of precision
validation_steps=integer
Number of validation steps
A larger number means a more accurate estimate of the model's precision
Default: 100
images_min_dim=integer
Minimum length of image sides
Images will be resized so that their shortest side is at least this value (must be a multiple of 64)
Default: 256
images_max_dim=integer
Maximum length of image sides
Images will be resized so that their longest side equals this value (must be a multiple of 64)
Default: 1280
backbone=string
Backbone architecture
Allowed values: resnet50, resnet101
Default: resnet101
-e
Pretrained weights were trained on different classes / resolutions / sizes
-s
Exclude 10 % of images from training and save their list to the logs directory
-b
Train also batch normalization layers (not recommended for small batches)
-n
No resizing or padding of images (images must be of the same size)
--help
Print usage summary
--verbose
Verbose module output
--quiet
Quiet module output
--qq
Very quiet module output
--ui
Force launching GUI dialog
training_dataset : str, required
Path to the dataset with images and masks
Name of input directory
Used as: input, dir, name
model : str, optional
Path to the .h5 file to use as initial values
Keep empty to train from scratch
Used as: input, file, name
classes : str | list[str], required
Names of classes separated with ","
logs : str, required
Path to the directory where models will be saved
Name of input directory
Used as: input, dir, name
name : str, required
Name for output models
epochs : int, optional
Number of epochs
Default: 200
steps_per_epoch : int, optional
Steps per epoch
Default: 3000
rois_per_image : int, optional
Number of ROIs to train per image
Default: 64
images_per_gpu : int, optional
Number of images per GPU
A larger number means faster training but requires more GPU memory
Default: 1
gpu_count : int, optional
Number of GPUs to be used
Default: 1
mini_mask_size : int | list[int] | str, optional
Size of the mini mask, dimensions separated with ","
Keep empty to use full-sized masks. Mini masks save memory at the expense of precision
validation_steps : int, optional
Number of validation steps
A larger number means a more accurate estimate of the model's precision
Default: 100
images_min_dim : int, optional
Minimum length of image sides
Images will be resized so that their shortest side is at least this value (must be a multiple of 64)
Default: 256
images_max_dim : int, optional
Maximum length of image sides
Images will be resized so that their longest side equals this value (must be a multiple of 64)
Default: 1280
backbone : str, optional
Backbone architecture
Allowed values: resnet50, resnet101
Default: resnet101
flags : str, optional
Allowed values: e, s, b, n
e
Pretrained weights were trained on different classes / resolutions / sizes
s
Exclude 10 % of images from training and save their list to the logs directory
b
Train also batch normalization layers (not recommended for small batches)
n
No resizing or padding of images (images must be of the same size)
verbose: bool, optional
Verbose module output
Default: False
quiet: bool, optional
Quiet module output
Default: False
superquiet: bool, optional
Very quiet module output
Default: False
DESCRIPTION
i.ann.maskrcnn.train allows the user to train a Mask R-CNN model on their own dataset. The dataset has to be prepared in a predefined structure.
DATASET STRUCTURE
The training dataset must have the following structure:
dataset-directory
- imagenumber
- imagenumber.jpg (training image)
- imagenumber-class1-number.png (mask for one instance of class1)
- imagenumber-class1-number.png (mask for another instance of class1)
- ...
- imagenumber2
- imagenumber2.jpg
- imagenumber2-class1-number.png (mask for one instance of class1)
- imagenumber2-class2-number.png (mask for another class instance)
- ...
The described directory structure is required. Images must be *.jpg files with three channels (e.g. RGB); masks must be *.png files containing values between 1 and 255 where the object instance lies and 0 elsewhere. A separate mask file must be provided for each instance of an object, distinguished by the suffix number.
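Since the naming convention is strict, it can be useful to check a dataset before starting a long training run. The following sketch is a hypothetical helper (not part of the module) that flags directories violating the layout described above:

```python
import os
import re

# Hypothetical helper: verify a dataset directory against the layout
# imagenumber/imagenumber.jpg + imagenumber-class-number.png masks.
MASK_RE = re.compile(r"^(?P<image>.+)-(?P<cls>[^-]+)-(?P<num>\d+)\.png$")

def check_dataset(dataset_dir, classes):
    """Return a list of problems found in the dataset layout."""
    problems = []
    for entry in sorted(os.listdir(dataset_dir)):
        subdir = os.path.join(dataset_dir, entry)
        if not os.path.isdir(subdir):
            continue
        files = os.listdir(subdir)
        if entry + ".jpg" not in files:
            problems.append(f"{entry}: missing training image {entry}.jpg")
        masks = [f for f in files if f.endswith(".png")]
        if not masks:
            problems.append(f"{entry}: no mask files")
        for mask in masks:
            m = MASK_RE.match(mask)
            if not m or m.group("image") != entry or m.group("cls") not in classes:
                problems.append(f"{entry}: unexpected mask name {mask}")
    return problems
```

Running it with the same class names later passed to the classes parameter catches misspelled class suffixes and missing *.jpg files early.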
NOTES
If you are using initial weights (the model parameter), the epochs are divided into three segments: first, layers 5 and higher are trained; then layers 4 and higher are fine-tuned; finally, the whole architecture is fine-tuned. The ending epoch number shown refers to the current segment, not to the whole training.
The -b flag enables training of the batch normalization layers. By default, this option is off, as training them is not recommended with small batches (the batch size is defined by the images_per_gpu parameter).
If the dataset consists of images of the same size, the -n flag may be used to avoid resizing or padding. When the flag is not used, images are resized so that their longer side equals the value of the images_max_dim parameter and their shorter side is greater than or equal to the value of the images_min_dim parameter, and then zero-padded to the shape images_max_dim x images_max_dim. As a result, even images of different sizes may be used.
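The resizing step described above can be sketched as follows. This is an illustrative reimplementation under the stated defaults, not the module's actual code; the exact rounding and edge cases of the real implementation may differ:

```python
def resized_shape(height, width, min_dim=256, max_dim=1280):
    """Sketch of the resize-and-pad step.

    The image is scaled up (if needed) so that its shorter side
    reaches min_dim, scaled down so that its longer side does not
    exceed max_dim, and finally zero-padded to a max_dim x max_dim
    square.  Returns (scaled_shape, padded_shape).
    """
    scale = max(1.0, min_dim / min(height, width))
    # Cap the scale so the longer side never exceeds max_dim
    if round(max(height, width) * scale) > max_dim:
        scale = max_dim / max(height, width)
    scaled = (round(height * scale), round(width * scale))
    padded = (max_dim, max_dim)  # zero-padding fills the remainder
    return scaled, padded
```

For example, a 2000 x 1000 image is scaled down to 1280 x 640 and then padded to 1280 x 1280, while a 100 x 200 image is scaled up to 256 x 512 before padding.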
After each epoch, the current model is saved. This allows the user to stop the training once satisfied with the loss functions, and to test the saved models even while the training is still running (and, again, stop it before the last epoch).
EXAMPLES
Dataset for examples:
crops
- 000000
- 000000.jpg
- 000000-corn-0.png
- 000000-corn-1.png
- ...
- 000001
- 000001.jpg
- 000001-corn-0.png
- 000001-rice-0.png
- ...
Training from scratch
i.ann.maskrcnn.train training_dataset=/home/user/Documents/crops classes=corn,rice logs=/home/user/Documents/logs name=crops
After the default number of epochs, we get a model whose first class is trained to detect corn fields and whose second class is trained to detect rice fields.
If we run the command with the classes in reversed order, we get a model whose first class is trained to detect rice fields and whose second class is trained to detect corn fields.
i.ann.maskrcnn.train training_dataset=/home/user/Documents/crops classes=rice,corn logs=/home/user/Documents/logs name=crops
The name of the model does not have to match the dataset folder, but it should refer to the task of the dataset. A good name for this one (also reflecting the order of classes) could be:
i.ann.maskrcnn.train training_dataset=/home/user/Documents/crops classes=rice,corn logs=/home/user/Documents/logs name=rice_corn
Training from a pretrained model
We can use a pretrained model to speed up training. The pretrained model must have been trained on the same channels and on similar features, but not necessarily on the same classes (e.g., a model trained on swimming pools in maps can be used to train on buildings in maps).
A model trained on different classes (use the -e flag to exclude the head weights):
i.ann.maskrcnn.train training_dataset=/home/user/Documents/crops classes=corn,rice logs=/home/user/Documents/logs name=crops model=/home/user/Documents/models/buildings.h5 -e
A model trained on the same classes:
i.ann.maskrcnn.train training_dataset=/home/user/Documents/crops classes=corn,rice logs=/home/user/Documents/logs name=crops model=/home/user/Documents/models/corn_rice.h5
Fine-tuning a model
It is also possible to stop the training and continue it later. To continue, simply use the last saved epoch as the pretrained model.
i.ann.maskrcnn.train training_dataset=/home/user/Documents/crops classes=corn,rice logs=/home/user/Documents/logs name=crops model=/home/user/Documents/models/mask_rcnn_crops_0005.h5
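Since the checkpoints are saved under names like mask_rcnn_crops_0005.h5 in the logs directory, picking the most recent one to resume from can be automated. The following is a hypothetical helper (not part of the module), assuming checkpoints follow the mask_rcnn_<name>_<epoch>.h5 pattern seen in the example above:

```python
import os
import re

def latest_checkpoint(logs_dir, name):
    """Return the path of the highest-epoch checkpoint, or None.

    Assumes checkpoints are named mask_rcnn_<name>_<epoch>.h5,
    e.g. mask_rcnn_crops_0005.h5, possibly in subdirectories.
    """
    pattern = re.compile(rf"^mask_rcnn_{re.escape(name)}_(\d+)\.h5$")
    best, best_epoch = None, -1
    for root, _dirs, files in os.walk(logs_dir):
        for f in files:
            m = pattern.match(f)
            if m and int(m.group(1)) > best_epoch:
                best, best_epoch = os.path.join(root, f), int(m.group(1))
    return best
```

The returned path can then be passed to the model parameter to resume training.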
SEE ALSO
Mask R-CNN in GRASS GIS, i.ann.maskrcnn.detect
AUTHOR
Ondrej Pesek
SOURCE CODE
Available at: i.ann.maskrcnn.train source code
(history)
Latest change: Friday Feb 21 10:10:05 2025 in commit 7d78fe3