pygrass.modules package
Subpackages
- pygrass.modules.grid package
- pygrass.modules.interface package
- Submodules
- pygrass.modules.interface.docstring module
- pygrass.modules.interface.env module
- pygrass.modules.interface.flag module
- pygrass.modules.interface.module module
- pygrass.modules.interface.parameter module
- pygrass.modules.interface.read module
- pygrass.modules.interface.typedict module
- Module contents
Submodules
pygrass.modules.shortcuts module
class pygrass.modules.shortcuts.MetaModule(prefix, cls=None)
Bases: object
Example of how to use MetaModule:
>>> g = MetaModule('g')
>>> g_list = g.list
>>> g_list.name
'g.list'
>>> g_list.required
['type']
>>> g_list.inputs.type = 'raster'
>>> g_list.inputs.mapset = 'PERMANENT'
>>> g_list.stdout_ = -1
>>> g_list.run()
Module('g.list')
>>> g_list.outputs.stdout
'...basin...elevation...'

>>> r = MetaModule('r')
>>> what = r.what
>>> what.description
'Queries raster maps on their category values and category labels.'
>>> what.inputs.map = 'elevation'
>>> what.inputs.coordinates = [640000,220500]
>>> what.run()
>>> v = MetaModule('v')
>>> v.import
  File "<doctest grass.pygrass.modules.shortcuts.MetaModule[16]>", line 1
    v.import
           ^
SyntaxError: invalid syntax
>>> v.import_
Module('v.import')
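The `v.import_` case above hints at how attribute names are turned into module names. The following is a minimal sketch of such a translation, under the assumption (not stated in this page) that surrounding underscores are stripped and inner underscores stand for dots. `attr_to_module_name` is a hypothetical helper for illustration, not part of pygrass.

```python
import keyword


def attr_to_module_name(prefix: str, attr: str) -> str:
    """Translate a Python attribute name into a dotted module name.

    Hypothetical helper: assumes leading/trailing underscores are dropped
    and inner underscores stand for dots.
    """
    return f"{prefix}.{attr.strip('_').replace('_', '.')}"


# 'import' is a reserved word, so the attribute has to be spelled 'import_'.
assert keyword.iskeyword("import")

print(attr_to_module_name("v", "import_"))  # v.import
print(attr_to_module_name("g", "list"))     # g.list
print(attr_to_module_name("r", "in_gdal"))  # r.in.gdal
```

The trailing underscore follows the usual Python convention for names that would otherwise clash with a keyword.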
Module contents
class pygrass.modules.Module(cmd, *args, **kargs)
Bases: object
This class is designed to wrap, run, and interact with GRASS modules.
During initialization, the class reads the XML description that the module generates with the --interface-description flag in order to determine which parameters are required and which are optional.
>>> from grass.pygrass.modules import Module
>>> from subprocess import PIPE
>>> import copy
>>> region = Module("g.region")
>>> region.flags.p = True  # set flags
>>> region.flags.u = True
>>> region.flags["3"].value = True  # set numeric flags
>>> region.get_bash()
'g.region format=plain -p -3 -u'
>>> new_region = copy.deepcopy(region)
>>> new_region.inputs.res = "10"
>>> new_region.get_bash()
'g.region res=10 format=plain -p -3 -u'

>>> neighbors = Module("r.neighbors")
>>> neighbors.inputs.input = "mapA"
>>> neighbors.outputs.output = "mapB"
>>> neighbors.inputs.size = 5
>>> neighbors.inputs.quantile = 0.5
>>> neighbors.get_bash()
'r.neighbors input=mapA size=5 method=average weighting_function=none quantile=0.5 nprocs=1 memory=300 output=mapB'

>>> new_neighbors1 = copy.deepcopy(neighbors)
>>> new_neighbors1.inputs.input = "mapD"
>>> new_neighbors1.inputs.size = 3
>>> new_neighbors1.inputs.quantile = 0.5
>>> new_neighbors1.get_bash()
'r.neighbors input=mapD size=3 method=average weighting_function=none quantile=0.5 nprocs=1 memory=300 output=mapB'

>>> new_neighbors2 = copy.deepcopy(neighbors)
>>> new_neighbors2(input="mapD", size=3, run_=False)
Module('r.neighbors')
>>> new_neighbors2.get_bash()
'r.neighbors input=mapD size=3 method=average weighting_function=none quantile=0.5 nprocs=1 memory=300 output=mapB'

>>> neighbors = Module("r.neighbors")
>>> neighbors.get_bash()
'r.neighbors size=3 method=average weighting_function=none nprocs=1 memory=300'

>>> new_neighbors3 = copy.deepcopy(neighbors)
>>> new_neighbors3(input="mapA", size=3, output="mapB", run_=False)
Module('r.neighbors')
>>> new_neighbors3.get_bash()
'r.neighbors input=mapA size=3 method=average weighting_function=none nprocs=1 memory=300 output=mapB'
>>> mapcalc = Module(
...     "r.mapcalc", expression="test_a = 1", overwrite=True, run_=False
... )
>>> mapcalc.run()
Module('r.mapcalc')
>>> mapcalc.returncode
0

>>> mapcalc = Module(
...     "r.mapcalc",
...     expression="test_a = 1",
...     overwrite=True,
...     run_=False,
...     finish_=False,
... )
>>> mapcalc.run()
Module('r.mapcalc')
>>> p = mapcalc.wait()
>>> p.returncode
0
>>> mapcalc.run()
Module('r.mapcalc')
>>> p = mapcalc.wait()
>>> p.returncode
0
>>> colors = Module(
...     "r.colors",
...     map="test_a",
...     rules="-",
...     run_=False,
...     stdout_=PIPE,
...     stderr_=PIPE,
...     stdin_="1 red",
... )
>>> colors.run()
Module('r.colors')
>>> p = mapcalc.wait()
>>> p.returncode
0
>>> colors.inputs["stdin"].value
'1 red'
>>> colors.outputs["stdout"].value
''
>>> colors.outputs["stderr"].value.strip()
"Color table for raster map <test_a> set to 'rules'"

>>> colors = Module(
...     "r.colors", map="test_a", rules="-", run_=False, finish_=False, stdin_=PIPE
... )
>>> colors.inputs["stdin"].value = "1 red"
>>> colors.run()
Module('r.colors')
>>> colors.wait()
Module('r.colors')
>>> colors.returncode
0

>>> colors = Module(
...     "r.colors",
...     map="test_a",
...     rules="-",
...     run_=False,
...     finish_=False,
...     stdin_=PIPE,
...     stderr_=PIPE,
... )
>>> colors.inputs["stdin"].value = "1 red"
>>> colors.run()
Module('r.colors')
>>> colors.wait()
Module('r.colors')
>>> colors.outputs["stderr"].value.strip()
"Color table for raster map <test_a> set to 'rules'"
>>> colors.returncode
0
Run a second time
>>> colors.inputs["stdin"].value = "1 red"
>>> colors.run()
Module('r.colors')
>>> colors.wait()
Module('r.colors')
>>> colors.outputs["stderr"].value.strip()
"Color table for raster map <test_a> set to 'rules'"
>>> colors.returncode
0
Run many times and change parameters for each run
>>> colors = Module("r.colors", map="test_a", color="ryb", run_=False)
>>> colors.get_bash()
'r.colors map=test_a color=ryb offset=0.0 scale=1.0'
>>> colors.run()
Module('r.colors')
>>> colors.update(color="gyr")
>>> colors.run()
Module('r.colors')
>>> colors.update(color="ryg")
>>> colors.update(stderr_=PIPE)
>>> colors.run()
Module('r.colors')
>>> print(colors.outputs["stderr"].value.strip())
Color table for raster map <test_a> set to 'ryg'
>>> colors.update(color="byg")
>>> colors.update(stdout_=PIPE)
>>> colors.run()
Module('r.colors')
>>> print(colors.outputs["stderr"].value.strip())
Color table for raster map <test_a> set to 'byg'
>>> colors.get_bash()
'r.colors map=test_a color=byg offset=0.0 scale=1.0'
In the Module class, methods such as __call__ often take *args and **kwargs in their signatures. Python allows developers to leave the positional and keyword arguments of a method or function unspecified:
def f(*args):
    for arg in args:
        print(arg)
If we then call the function like this:
>>> f("grass", "gis", "modules")
grass
gis
modules
or unpack an existing list:
>>> words = ["grass", "gis", "modules"]
>>> f(*words)
grass
gis
modules
We can do the same with keyword arguments by rewriting the function above:
def f(*args, **kargs):
    for arg in args:
        print(arg)
    for key, value in kargs.items():
        print("%s = %r" % (key, value))
Now we can call the new function with:
>>> f("grass", "gis", "modules", os="linux", language="python")
grass
gis
modules
os = 'linux'
language = 'python'
or, as before, we can define a dictionary and pass it to the function:
>>> keywords = {"os": "linux", "language": "python"}
>>> f(*words, **keywords)
grass
gis
modules
os = 'linux'
language = 'python'
In the Module class we rely heavily on this language feature to pass arguments and keyword arguments to the GRASS module.
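To make the connection to module calls concrete, here is a small sketch of how positional and keyword arguments could be turned into a command string similar to the get_bash() output shown earlier. `build_command` is a hypothetical helper written for this illustration; it is not how pygrass assembles its commands internally.

```python
def build_command(name, *flags, **params):
    """Assemble a shell-style command string from flags and key=value pairs.

    Hypothetical helper for illustration only.
    """
    parts = [name]
    # Keyword arguments become key=value tokens, in the order given.
    parts.extend(f"{key}={value}" for key, value in params.items())
    # Positional arguments are treated as single-letter flags.
    parts.extend(f"-{flag}" for flag in flags)
    return " ".join(parts)


options = {"input": "mapA", "size": 5, "output": "mapB"}
print(build_command("r.neighbors", **options))
# r.neighbors input=mapA size=5 output=mapB
print(build_command("g.region", "p", "u", res=10))
# g.region res=10 -p -u
```

Because the options live in an ordinary dictionary, the same set of parameters can be reused or modified between calls before being unpacked with `**`.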
make_cmd()
Create the command string that can be executed in a shell.
- Returns
the command string
run()
Run the module. If finish_ is True, this method waits for the process to terminate and sets up stdout and stderr. If finish_ is False, it returns right after starting the process; use wait() to wait for the started process.
- Returns
A reference to this object
update(*args, **kargs)
Update module parameters and selected object attributes.
Valid parameters are all the module parameters plus the following additional parameters: run_, stdin_, stdout_, stderr_, env_, and finish_.
wait()
Wait for the module to finish. Call this method if run() was called with finish_ set to False.
- Returns
A reference to this object
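The interplay between run(), wait(), and the finish_ setting can be illustrated without a GRASS session. The sketch below uses the standard subprocess module; the `Job` class and its `finish` argument are hypothetical stand-ins for the pattern described above, not pygrass code.

```python
import subprocess
import sys


class Job:
    """Hypothetical stand-in for the run()/wait() pattern, not pygrass code."""

    def __init__(self, args, finish=True):
        self.args = args
        self.finish = finish
        self.returncode = None
        self.stdout = None
        self._popen = None

    def run(self):
        # Start the process; block only if finish is True.
        self._popen = subprocess.Popen(
            self.args, stdout=subprocess.PIPE, text=True
        )
        if self.finish:
            self.wait()
        return self

    def wait(self):
        # Collect the output and the return code once the process has ended.
        self.stdout, _ = self._popen.communicate()
        self.returncode = self._popen.returncode
        return self


cmd = [sys.executable, "-c", "print('done')"]

blocking = Job(cmd).run()
print(blocking.returncode)    # 0

background = Job(cmd, finish=False).run()
print(background.returncode)  # None: run() returned without waiting
background.wait()
print(background.returncode)  # 0
```

Both methods return the object itself, which is what allows calls to be chained as in `Job(cmd).run()`.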
class pygrass.modules.MultiModule(module_list, sync=True, set_temp_region=False)
Bases: object
This class is designed to run a list of modules serially, in the provided order, optionally within a temporary region environment.
Modules can be run serially in either synchronous or asynchronous mode.
- Synchronously: When run() is called, all modules run in serial order until they are finished. The run() method does not return until all modules have finished. The module objects can be accessed by calling get_modules() to check their return values.
- Asynchronously: When run() is called, all modules run in serial order in a background process. The run() method returns after starting the modules without waiting for them to finish. The user must call the wait() method to wait for the modules to finish. Asynchronously run modules can optionally be run in a temporary region environment, so that invoking g.region will not alter the current region or the region of other MultiModule runs.
Note:
Modules run in asynchronous mode can only be accessed via the wait() method. The wait() method returns all finished module objects as a list.
Objects of this class can be passed to the ParallelModuleQueue to run serial stacks of modules in parallel. This is meaningful if region settings must be applied to each parallel module run.
>>> from grass.pygrass.modules import Module
>>> from grass.pygrass.modules import MultiModule
>>> from multiprocessing import Process
>>> import copy
Synchronous module run
>>> region_1 = Module("g.region", run_=False)
>>> region_1.flags.p = True
>>> region_2 = copy.deepcopy(region_1)
>>> region_2.flags.p = True
>>> mm = MultiModule(module_list=[region_1, region_2])
>>> mm.run()
>>> m_list = mm.get_modules()
>>> m_list[0].returncode
0
>>> m_list[1].returncode
0
Asynchronous module run, setting sync=False
>>> region_1 = Module("g.region", run_=False)
>>> region_1.flags.p = True
>>> region_2 = copy.deepcopy(region_1)
>>> region_2.flags.p = True
>>> region_3 = copy.deepcopy(region_1)
>>> region_3.flags.p = True
>>> region_4 = copy.deepcopy(region_1)
>>> region_4.flags.p = True
>>> region_5 = copy.deepcopy(region_1)
>>> region_5.flags.p = True
>>> mm = MultiModule(
...     module_list=[region_1, region_2, region_3, region_4, region_5], sync=False
... )
>>> t = mm.run()
>>> isinstance(t, Process)
True
>>> m_list = mm.wait()
>>> m_list[0].returncode
0
>>> m_list[1].returncode
0
>>> m_list[2].returncode
0
>>> m_list[3].returncode
0
>>> m_list[4].returncode
0
Asynchronous module run, setting sync=False and using a temporary region
>>> mm = MultiModule(
...     module_list=[region_1, region_2, region_3, region_4, region_5],
...     sync=False,
...     set_temp_region=True,
... )
>>> str(mm)
'g.region format=plain -p ; g.region format=plain -p ; g.region format=plain -p ; g.region format=plain -p ; g.region format=plain -p'
>>> t = mm.run()
>>> isinstance(t, Process)
True
>>> m_list = mm.wait()
>>> m_list[0].returncode
0
>>> m_list[1].returncode
0
>>> m_list[2].returncode
0
>>> m_list[3].returncode
0
>>> m_list[4].returncode
0
Constructor of the MultiModule class
- Parameters
module_list – A list of pre-configured Module objects that should be run
sync – If True, the run() method waits for all processes to finish (synchronous run). If False, the run() method returns after starting the processes (asynchronous run); the wait() method must then be called to finish the modules.
set_temp_region – Set a temporary region in which the modules should be run, so that region settings in the process list do not affect the current computation region.
Note:
This flag is only available in asynchronous mode!
- Returns
get_modules()
Return the list of modules that have been run in synchronous mode.
Note: Asynchronously run modules can only be accessed via the wait() method.
- Returns
The list of modules
run()
Start the modules in the list. If self.finish_ is True, this method returns after all processes have finished.
If self.finish_ is False, this method returns after the process list has been started for execution. The processes in the list are then run one after another in a background process.
- Returns
None if self.finish_ is True, otherwise a multiprocessing.Process object that invokes the modules
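The difference between the two return values of run() can be sketched in plain Python. This is only an illustration of the control flow: it swaps in a threading.Thread for the multiprocessing.Process named above so that the example runs anywhere, and `run_chain` and `run` are hypothetical names, not pygrass functions.

```python
import subprocess
import sys
import threading


def run_chain(commands, results):
    # Run each command only after the previous one has finished.
    for cmd in commands:
        results.append(subprocess.run(cmd).returncode)


def run(commands, results, sync=True):
    """Return None when sync is True, else the started background worker."""
    if sync:
        run_chain(commands, results)
        return None
    worker = threading.Thread(target=run_chain, args=(commands, results))
    worker.start()
    return worker


cmds = [[sys.executable, "-c", "pass"]] * 3

sync_results = []
print(run(cmds, sync_results))  # None
print(sync_results)             # [0, 0, 0]

async_results = []
worker = run(cmds, async_results, sync=False)
worker.join()                   # plays the role of wait()
print(async_results)            # [0, 0, 0]
```

In both modes the commands themselves still run strictly one after another; only the caller's blocking behaviour changes.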
class pygrass.modules.ParallelModuleQueue(nprocs=1)
Bases: object
This class is designed to run an arbitrary number of pygrass Module or MultiModule processes in parallel.
Objects of type grass.pygrass.modules.Module or grass.pygrass.modules.MultiModule can be put into the queue using the put() method. When the queue is full, that is, when it holds the maximum number of parallel processes, it waits for all processes to finish, sets the stdout and stderr of each Module object, and removes it from the queue once it has finished.
To finish the queue before the maximum number of parallel processes is reached, call wait().
This class raises a GrassError if a Module process exits with a return code other than 0.
Processes that were run asynchronously with the MultiModule class do not raise a GrassError on failure. Failures must be checked manually by accessing the finished modules via get_finished_modules().
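The fill-then-drain behaviour described above can be sketched with plain subprocesses. `TinyQueue` is a hypothetical class written for this illustration, and RuntimeError stands in for GrassError; neither reflects the actual pygrass implementation.

```python
import subprocess
import sys


class TinyQueue:
    """Hypothetical sketch of the queue behaviour described above."""

    def __init__(self, nprocs=1):
        self.nprocs = nprocs
        self.running = []
        self.finished = []

    def put(self, args):
        # Start the process right away, then drain once the queue is full.
        self.running.append(subprocess.Popen(args))
        if len(self.running) >= self.nprocs:
            self.wait()

    def wait(self):
        # Wait for every running process and fail loudly on a bad exit code.
        for proc in self.running:
            proc.wait()
            if proc.returncode != 0:
                raise RuntimeError(f"process failed with {proc.returncode}")
            self.finished.append(proc)
        self.running = []


queue = TinyQueue(nprocs=3)
cmd = [sys.executable, "-c", "pass"]
counts = []
for _ in range(4):
    queue.put(cmd)
    counts.append(len(queue.running))
print(counts)                # [1, 2, 0, 1]
queue.wait()
print(len(queue.finished))   # 4
```

The counter drops back to 0 on the third put() because the queue is full at that point and waits for all of its processes before accepting more.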
Usage:
Check with a queue size of 3 and 5 processes
>>> import copy
>>> from grass.pygrass.modules import Module, MultiModule, ParallelModuleQueue
>>> mapcalc_list = []
Setting run_ to False is important; otherwise parallel processing is not possible
>>> mapcalc = Module("r.mapcalc", overwrite=True, run_=False)
>>> queue = ParallelModuleQueue(nprocs=3)
>>> for i in range(5):
...     new_mapcalc = copy.deepcopy(mapcalc)
...     mapcalc_list.append(new_mapcalc)
...     m = new_mapcalc(expression="test_pygrass_%i = %i" % (i, i))
...     queue.put(m)
...
>>> queue.wait()
>>> mapcalc_list = queue.get_finished_modules()
>>> queue.get_num_run_procs()
0
>>> queue.get_max_num_procs()
3
>>> for mapcalc in mapcalc_list:
...     print(mapcalc.returncode)
...
0
0
0
0
0
Check with a queue size of 8 and 5 processes
>>> queue = ParallelModuleQueue(nprocs=8)
>>> mapcalc_list = []
>>> for i in range(5):
...     new_mapcalc = copy.deepcopy(mapcalc)
...     mapcalc_list.append(new_mapcalc)
...     m = new_mapcalc(expression="test_pygrass_%i = %i" % (i, i))
...     queue.put(m)
...
>>> queue.wait()
>>> mapcalc_list = queue.get_finished_modules()
>>> queue.get_num_run_procs()
0
>>> queue.get_max_num_procs()
8
>>> for mapcalc in mapcalc_list:
...     print(mapcalc.returncode)
...
0
0
0
0
0
Check MultiModule approach with three by two processes running in a background process
>>> gregion = Module("g.region", flags="p", run_=False)
>>> queue = ParallelModuleQueue(nprocs=3)
>>> proc_list = []
>>> for i in range(3):
...     new_gregion = copy.deepcopy(gregion)
...     proc_list.append(new_gregion)
...     new_mapcalc = copy.deepcopy(mapcalc)
...     m = new_mapcalc(expression="test_pygrass_%i = %i" % (i, i))
...     proc_list.append(new_mapcalc)
...     mm = MultiModule(
...         module_list=[new_gregion, new_mapcalc], sync=False, set_temp_region=True
...     )
...     queue.put(mm)
...
>>> queue.wait()
>>> proc_list = queue.get_finished_modules()
>>> queue.get_num_run_procs()
0
>>> queue.get_max_num_procs()
3
>>> for proc in proc_list:
...     print(proc.returncode)
...
0
0
0
0
0
0
Check with a queue size of 8 and 4 processes
>>> queue = ParallelModuleQueue(nprocs=8)
>>> mapcalc_list = []
>>> new_mapcalc = copy.deepcopy(mapcalc)
>>> mapcalc_list.append(new_mapcalc)
>>> m = new_mapcalc(expression="test_pygrass_1 =1")
>>> queue.put(m)
>>> queue.get_num_run_procs()
1
>>> new_mapcalc = copy.deepcopy(mapcalc)
>>> mapcalc_list.append(new_mapcalc)
>>> m = new_mapcalc(expression="test_pygrass_2 =2")
>>> queue.put(m)
>>> queue.get_num_run_procs()
2
>>> new_mapcalc = copy.deepcopy(mapcalc)
>>> mapcalc_list.append(new_mapcalc)
>>> m = new_mapcalc(expression="test_pygrass_3 =3")
>>> queue.put(m)
>>> queue.get_num_run_procs()
3
>>> new_mapcalc = copy.deepcopy(mapcalc)
>>> mapcalc_list.append(new_mapcalc)
>>> m = new_mapcalc(expression="test_pygrass_4 =4")
>>> queue.put(m)
>>> queue.get_num_run_procs()
4
>>> queue.wait()
>>> mapcalc_list = queue.get_finished_modules()
>>> queue.get_num_run_procs()
0
>>> queue.get_max_num_procs()
8
>>> for mapcalc in mapcalc_list:
...     print(mapcalc.returncode)
...
0
0
0
0
Check with a queue size of 3 and 4 processes
>>> queue = ParallelModuleQueue(nprocs=3)
>>> mapcalc_list = []
>>> new_mapcalc = copy.deepcopy(mapcalc)
>>> mapcalc_list.append(new_mapcalc)
>>> m = new_mapcalc(expression="test_pygrass_1 =1")
>>> queue.put(m)
>>> queue.get_num_run_procs()
1
>>> new_mapcalc = copy.deepcopy(mapcalc)
>>> mapcalc_list.append(new_mapcalc)
>>> m = new_mapcalc(expression="test_pygrass_2 =2")
>>> queue.put(m)
>>> queue.get_num_run_procs()
2
>>> new_mapcalc = copy.deepcopy(mapcalc)
>>> mapcalc_list.append(new_mapcalc)
>>> m = new_mapcalc(expression="test_pygrass_3 =3")
>>> queue.put(
...     m
... )  # Now it will wait until all procs finish and set the counter back to 0
>>> queue.get_num_run_procs()
0
>>> new_mapcalc = copy.deepcopy(mapcalc)
>>> mapcalc_list.append(new_mapcalc)
>>> m = new_mapcalc(expression="test_pygrass_%i = %i" % (i, i))
>>> queue.put(m)
>>> queue.get_num_run_procs()
1
>>> queue.wait()
>>> mapcalc_list = queue.get_finished_modules()
>>> queue.get_num_run_procs()
0
>>> queue.get_max_num_procs()
3
>>> for mapcalc in mapcalc_list:
...     print(mapcalc.returncode)
...
0
0
0
0
Constructor
- Parameters
nprocs (int) – The maximum number of Module processes that can be run in parallel. The default is 1; if None, all available CPUs are used.
get(num)
Get a Module object or list of Module objects from the queue
- Parameters
num (int) – the number of the object in the queue
- Returns
the Module object or list of Module objects, or None if num is not in the queue
get_finished_modules()
Return all finished processes that were run by this queue
- Returns
A list of Module objects
get_max_num_procs()
Return the maximum number of parallel Module processes
- Returns
the maximum number of parallel Module processes
get_num_run_procs()
Get the number of Module processes in the queue that are running or finished
- Returns
the number of Module processes running or finished in the queue
put(module)
Put the next Module or MultiModule object in the queue
To run the Module objects in parallel, the run_ and finish_ options of the Module must be set to False.
- Parameters
module (Module or MultiModule object) – a preconfigured Module or MultiModule object that was configured with run_ and finish_ set to False