r.in.xyz

Creates a raster map from an assemblage of many coordinates using univariate statistics.

Command line | Python (grass.tools) | Python (grass.script)

r.in.xyz [-sgi] input=name output=name [method=string] [separator=character] [x=integer] [y=integer] [z=integer] [skip=integer] [zrange=min,max] [zscale=float] [value_column=integer] [vrange=min,max] [vscale=float] [type=string] [percent=integer] [pth=integer] [trim=float] [--overwrite] [--verbose] [--quiet] [--qq] [--ui]

Example:

r.in.xyz input=name output=name

grass.tools.Tools.r_in_xyz(input, output, method="mean", separator="pipe", x=1, y=2, z=3, skip=0, zrange=None, zscale=1.0, value_column=0, vrange=None, vscale=1.0, type="FCELL", percent=100, pth=None, trim=None, flags=None, overwrite=None, verbose=None, quiet=None, superquiet=None)

Example:

from grass.tools import Tools

tools = Tools()
tools.r_in_xyz(input="name", output="name")

This grass.tools API is experimental in version 8.5 and expected to be stable in version 8.6.

grass.script.run_command("r.in.xyz", input, output, method="mean", separator="pipe", x=1, y=2, z=3, skip=0, zrange=None, zscale=1.0, value_column=0, vrange=None, vscale=1.0, type="FCELL", percent=100, pth=None, trim=None, flags=None, overwrite=None, verbose=None, quiet=None, superquiet=None)

Example:

import grass.script as gs
gs.run_command("r.in.xyz", input="name", output="name")

Parameters

Command line | Python (grass.tools) | Python (grass.script)

input=name [required]
    ASCII file containing input data (or "-" to read from stdin)
output=name [required]
    Name for output raster map
method=string
    Statistic to use for raster values
    Allowed values: n, min, max, range, sum, mean, stddev, variance, coeff_var, median, percentile, skewness, trimmean
    Default: mean
    n: Number of points in cell
    min: Minimum value of point values in cell
    max: Maximum value of point values in cell
    range: Range of point values in cell
    sum: Sum of point values in cell
    mean: Mean (average) value of point values in cell
    stddev: Standard deviation of point values in cell
    variance: Variance of point values in cell
    coeff_var: Coefficient of variance of point values in cell
    median: Median value of point values in cell
    percentile: Pth (nth) percentile of point values in cell
    skewness: Skewness of point values in cell
    trimmean: Trimmed mean of point values in cell
separator=character
    Field separator
    Special characters: pipe, comma, space, tab, newline
    Default: pipe
x=integer
    Column number of x coordinates in input file (first column is 1)
    Default: 1
y=integer
    Column number of y coordinates in input file
    Default: 2
z=integer
    Column number of data values in input file
    If a separate value column is given, this option refers to the z-coordinate column to be filtered by the zrange option
    Default: 3
skip=integer
    Number of header lines to skip at top of input file
    Default: 0
zrange=min,max
    Filter range for z data (min,max)
zscale=float
    Scale to apply to z data
    Default: 1.0
value_column=integer
    Alternate column number of data values in input file
    If not given (or set to 0) the z-column data is used
    Default: 0
vrange=min,max
    Filter range for alternate value column data (min,max)
vscale=float
    Scale to apply to alternate value column data
    Default: 1.0
type=string
    Type of raster map to be created
    Storage type for resultant raster map
    Allowed values: CELL, FCELL, DCELL
    Default: FCELL
    CELL: Integer
    FCELL: Single precision floating point
    DCELL: Double precision floating point
percent=integer
    Percent of map to keep in memory
    Allowed values: 1-100
    Default: 100
pth=integer
    Pth percentile of the values
    Allowed values: 1-100
trim=float
    Discard <trim> percent of the smallest and <trim> percent of the largest observations
    Allowed values: 0-50
-s
    Scan data file for extent then exit
-g
    In scan mode, print using shell script style
-i
    Ignore broken lines
--overwrite
    Allow output files to overwrite existing files
--help
    Print usage summary
--verbose
    Verbose module output
--quiet
    Quiet module output
--qq
    Very quiet module output
--ui
    Force launching GUI dialog

input : str, required
    ASCII file containing input data (or "-" to read from stdin)
    Used as: input, file, name
output : str | type(np.ndarray) | type(np.array) | type(gs.array.array), required
    Name for output raster map
    Used as: output, raster, name
method : str, optional
    Statistic to use for raster values
    Allowed values: n, min, max, range, sum, mean, stddev, variance, coeff_var, median, percentile, skewness, trimmean
    n: Number of points in cell
    min: Minimum value of point values in cell
    max: Maximum value of point values in cell
    range: Range of point values in cell
    sum: Sum of point values in cell
    mean: Mean (average) value of point values in cell
    stddev: Standard deviation of point values in cell
    variance: Variance of point values in cell
    coeff_var: Coefficient of variance of point values in cell
    median: Median value of point values in cell
    percentile: Pth (nth) percentile of point values in cell
    skewness: Skewness of point values in cell
    trimmean: Trimmed mean of point values in cell
    Default: mean
separator : str, optional
    Field separator
    Special characters: pipe, comma, space, tab, newline
    Used as: input, separator, character
    Default: pipe
x : int, optional
    Column number of x coordinates in input file (first column is 1)
    Default: 1
y : int, optional
    Column number of y coordinates in input file
    Default: 2
z : int, optional
    Column number of data values in input file
    If a separate value column is given, this option refers to the z-coordinate column to be filtered by the zrange option
    Default: 3
skip : int, optional
    Number of header lines to skip at top of input file
    Default: 0
zrange : tuple[float, float] | list[float] | str, optional
    Filter range for z data (min,max)
    Used as: min,max
zscale : float, optional
    Scale to apply to z data
    Default: 1.0
value_column : int, optional
    Alternate column number of data values in input file
    If not given (or set to 0) the z-column data is used
    Default: 0
vrange : tuple[float, float] | list[float] | str, optional
    Filter range for alternate value column data (min,max)
    Used as: min,max
vscale : float, optional
    Scale to apply to alternate value column data
    Default: 1.0
type : str, optional
    Type of raster map to be created
    Storage type for resultant raster map
    Allowed values: CELL, FCELL, DCELL
    CELL: Integer
    FCELL: Single precision floating point
    DCELL: Double precision floating point
    Default: FCELL
percent : int, optional
    Percent of map to keep in memory
    Allowed values: 1-100
    Default: 100
pth : int, optional
    Pth percentile of the values
    Allowed values: 1-100
trim : float, optional
    Discard <trim> percent of the smallest and <trim> percent of the largest observations
    Allowed values: 0-50
flags : str, optional
    Allowed values: s, g, i
    s
        Scan data file for extent then exit
    g
        In scan mode, print using shell script style
    i
        Ignore broken lines
overwrite : bool, optional
    Allow output files to overwrite existing files
    Default: None
verbose : bool, optional
    Verbose module output
    Default: None
quiet : bool, optional
    Quiet module output
    Default: None
superquiet : bool, optional
    Very quiet module output
    Default: None

Returns:

result : grass.tools.support.ToolResult | np.ndarray | tuple[np.ndarray] | None
If the tool produces text as standard output, a ToolResult object will be returned. Otherwise, None will be returned. If an array type (e.g., np.ndarray) is used for one of the raster outputs, the result will be an array and will have the shape corresponding to the computational region. If an array type is used for more than one raster output, the result will be a tuple of arrays.

Raises:

grass.tools.ToolError: When the tool ended with an error.

input : str, required
    ASCII file containing input data (or "-" to read from stdin)
    Used as: input, file, name
output : str, required
    Name for output raster map
    Used as: output, raster, name
method : str, optional
    Statistic to use for raster values
    Allowed values: n, min, max, range, sum, mean, stddev, variance, coeff_var, median, percentile, skewness, trimmean
    n: Number of points in cell
    min: Minimum value of point values in cell
    max: Maximum value of point values in cell
    range: Range of point values in cell
    sum: Sum of point values in cell
    mean: Mean (average) value of point values in cell
    stddev: Standard deviation of point values in cell
    variance: Variance of point values in cell
    coeff_var: Coefficient of variance of point values in cell
    median: Median value of point values in cell
    percentile: Pth (nth) percentile of point values in cell
    skewness: Skewness of point values in cell
    trimmean: Trimmed mean of point values in cell
    Default: mean
separator : str, optional
    Field separator
    Special characters: pipe, comma, space, tab, newline
    Used as: input, separator, character
    Default: pipe
x : int, optional
    Column number of x coordinates in input file (first column is 1)
    Default: 1
y : int, optional
    Column number of y coordinates in input file
    Default: 2
z : int, optional
    Column number of data values in input file
    If a separate value column is given, this option refers to the z-coordinate column to be filtered by the zrange option
    Default: 3
skip : int, optional
    Number of header lines to skip at top of input file
    Default: 0
zrange : tuple[float, float] | list[float] | str, optional
    Filter range for z data (min,max)
    Used as: min,max
zscale : float, optional
    Scale to apply to z data
    Default: 1.0
value_column : int, optional
    Alternate column number of data values in input file
    If not given (or set to 0) the z-column data is used
    Default: 0
vrange : tuple[float, float] | list[float] | str, optional
    Filter range for alternate value column data (min,max)
    Used as: min,max
vscale : float, optional
    Scale to apply to alternate value column data
    Default: 1.0
type : str, optional
    Type of raster map to be created
    Storage type for resultant raster map
    Allowed values: CELL, FCELL, DCELL
    CELL: Integer
    FCELL: Single precision floating point
    DCELL: Double precision floating point
    Default: FCELL
percent : int, optional
    Percent of map to keep in memory
    Allowed values: 1-100
    Default: 100
pth : int, optional
    Pth percentile of the values
    Allowed values: 1-100
trim : float, optional
    Discard <trim> percent of the smallest and <trim> percent of the largest observations
    Allowed values: 0-50
flags : str, optional
    Allowed values: s, g, i
    s
        Scan data file for extent then exit
    g
        In scan mode, print using shell script style
    i
        Ignore broken lines
overwrite : bool, optional
    Allow output files to overwrite existing files
    Default: None
verbose : bool, optional
    Verbose module output
    Default: None
quiet : bool, optional
    Quiet module output
    Default: None
superquiet : bool, optional
    Very quiet module output
    Default: None

DESCRIPTION

The r.in.xyz module will load and bin ungridded x,y,z ASCII data into a new raster map. The user may choose from a variety of statistical methods in creating the new raster. Gridded data provided as a stream of x,y,z points may also be imported.

Please note that the current region extents and resolution are used for the import. It is therefore recommended to first use the -s flag to get the extents of the input points to be imported, then adjust the current region accordingly, and only then proceed with the actual import.

r.in.xyz is designed for processing massive point cloud datasets, for example raw LIDAR or sidescan sonar swath data. It has been tested with datasets as large as tens of billions of points (705GB in a single file).

Available statistics for populating the raster are (method):

n           number of points in cell
min         minimum value of points in cell
max         maximum value of points in cell
range       range of points in cell
sum         sum of points in cell
mean        average value of points in cell
stddev      standard deviation of points in cell
variance    variance of points in cell
coeff_var   coefficient of variance of points in cell
median      median value of points in cell
percentile  p-th percentile of points in cell
skewness    skewness of points in cell
trimmean    trimmed mean of points in cell

Variance and the statistics derived from it use the biased estimator (division by n). [subject to change]
The coefficient of variance (coeff_var, often called the coefficient of variation) is given as a percentage and defined as (stddev/mean)*100.
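
As a rough illustration of the two definitions above, the following sketch computes the biased (divide-by-n) variance and the percentage coefficient of variance for the point values of a single cell. It uses only the Python standard library; the function name `cell_stats` is hypothetical and is not part of GRASS, and the code is not the module's implementation.

```python
import math


def cell_stats(values):
    """Return (mean, variance, stddev, coeff_var) for the point values of one cell.

    The variance uses the biased estimator: the sum of squared deviations
    is divided by n, not by n - 1.
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    stddev = math.sqrt(variance)
    coeff_var = (stddev / mean) * 100  # percentage; undefined when mean is 0
    return mean, variance, stddev, coeff_var


# Example values for one cell (made up for illustration).
mean, variance, stddev, coeff_var = cell_stats([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(mean, variance, stddev, coeff_var)
```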

It is also possible to bin and store another data column (e.g. backscatter) while simultaneously filtering and scaling both the data column values and the z range.

NOTES

Gridded data

If the data are known to be on a regular grid, r.in.xyz can reconstruct the map perfectly, provided the region is set up carefully and the data's native map projection is used. A typical approach is to first determine the grid resolution, either from the data's documentation or by studying the text file. Next, scan the data with r.in.xyz's -s (or -g) flag to find the input data's bounds. GRASS uses the cell-center raster convention, in which data points fall at the center of a cell, as opposed to the grid-node convention. You therefore need to grow the region by half a cell in all directions beyond the bounds the scan found in the file. After the region bounds and resolution are set correctly with g.region, run r.in.xyz with the n method and verify that n=1 everywhere; r.univar can help. Once you are confident that the region exactly matches the data, run r.in.xyz with one of the mean, min, max, or median methods. With n=1 throughout, the result is identical regardless of which of those methods is used.
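
The half-cell adjustment described above is simple arithmetic, sketched below in plain Python. The helper name `grow_region` is hypothetical, and the bounds and the 15 m resolution are borrowed from the elevation example later on this page. The returned values could be passed to g.region as n=, s=, e=, w= and res=.

```python
def grow_region(north, south, east, west, res):
    """Grow scanned point bounds by half a cell so points sit at cell centers.

    Returns the new bounds plus the number of rows and columns they imply.
    """
    half = res / 2.0
    region = {
        "n": north + half,
        "s": south - half,
        "e": east + half,
        "w": west - half,
    }
    rows = round((region["n"] - region["s"]) / res)
    cols = round((region["e"] - region["w"]) / res)
    return region, rows, cols


region, rows, cols = grow_region(228492.5, 215007.5, 644992.5, 630007.5, 15)
print(region, rows, cols)
```

For fully gridded data, rows * cols should equal the number of input points, which is a quick sanity check before running the import with method=n.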

Memory use

While the input file can be arbitrarily large, r.in.xyz uses a large amount of system memory for large raster regions (e.g. 10000x10000 cells). If the module refuses to start, complaining that there isn't enough memory, use the percent parameter to run the module in several passes. In addition, a less precise map format (CELL [integer] or FCELL [floating point]) uses less memory than a DCELL [double precision floating point] output map. The methods n, min, max, and sum also use less memory, while stddev, variance, and coeff_var use more. The aggregate functions median, percentile, skewness, and trimmed mean use even more memory and may not be appropriate for arbitrarily large input files.

The default map type=FCELL is intended as a compromise between preserving data precision and limiting system resource consumption. If reading data from a stdin stream, the program can only run using a single pass.

Setting region bounds and resolution

You can use the -s scan flag to find the extent of the input data (and thus point density) before performing the full import. Use g.region to adjust the region bounds to match. The -g shell style flag prints the extent in a form suitable as parameters for g.region. A suitable resolution can be chosen from the point density, which is the number of input points divided by the area covered, e.g.:

wc -l inputfile.txt
g.region -p
# points_per_cell = n_points / (rows * cols)

g.region -e
# UTM project:
# points_per_sq_m = n_points / (ns_extent * ew_extent)

# Lat/Lon project:
# points_per_sq_m = n_points / (ns_extent * ew_extent*cos(lat) * (1852*60)^2)
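
The same density formulas can be written as small Python helpers. This is only a sketch of the arithmetic in the comments above; the function names are hypothetical, and the lat/lon version uses the same approximation of 1852*60 meters per degree.

```python
import math

METERS_PER_DEGREE = 1852 * 60  # one arc-minute is taken as one nautical mile


def points_per_cell(n_points, rows, cols):
    """Average number of points per raster cell."""
    return n_points / (rows * cols)


def density_projected(n_points, ns_extent, ew_extent):
    """Points per square meter for a region measured in meters (e.g. UTM)."""
    return n_points / (ns_extent * ew_extent)


def density_latlon(n_points, ns_extent_deg, ew_extent_deg, lat_deg):
    """Approximate points per square meter for a region measured in degrees."""
    ns_m = ns_extent_deg * METERS_PER_DEGREE
    ew_m = ew_extent_deg * math.cos(math.radians(lat_deg)) * METERS_PER_DEGREE
    return n_points / (ns_m * ew_m)


# Made-up numbers: one million points over a 1 km x 1 km projected region.
print(density_projected(1_000_000, 1000, 1000))
```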

If you only intend to interpolate the data with r.to.vect and v.surf.rst, there is little point in setting the region resolution so fine that each cell catches only one data point; you might as well use "v.in.ascii -zbt" directly.

Filtering

Points falling outside the current region will be skipped. This includes points falling exactly on the southern region bound. To capture those, adjust the region with "g.region s=s-0.000001" (see g.region).

Blank lines and comment lines starting with the hash symbol (#) will be skipped.
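
The skipping rules above can be pictured with the following sketch. It is not the module's code: the function `read_points` is hypothetical, it assumes x, y, z are the first three fields, and the handling of the eastern bound is an assumption that the text does not state.

```python
def read_points(lines, region, separator="|"):
    """Yield (x, y, z) tuples from text lines, applying the skipping rules.

    region is a dict with the keys n, s, e, w.
    """
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or comment line
        fields = stripped.split(separator)
        x, y, z = float(fields[0]), float(fields[1]), float(fields[2])
        # Points exactly on the southern bound count as outside the region.
        if y <= region["s"] or y > region["n"]:
            continue
        # Assumption: the eastern bound is treated like the southern one.
        if x < region["w"] or x >= region["e"]:
            continue
        yield x, y, z


region = {"n": 10.0, "s": 0.0, "e": 10.0, "w": 0.0}
lines = ["# x|y|z", "", "5|5|1.0", "5|0|2.0", "5|10|3.0", "11|5|4.0"]
print(list(read_points(lines, region)))
```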

The zrange parameter may be used for filtering the input data by vertical extent. Example uses might include preparing multiple raster sections to be combined into a 3D raster array with r.to.rast3, or for filtering outliers on relatively flat terrain.

In varied terrain, the user may find that min maps make a good noise filter, as most LIDAR noise is from premature hits. The min map may also be useful for finding the underlying topography in a forested or urban environment if the cells are oversampled.

The user can combine r.in.xyz output maps to create custom filters, e.g. use r.mapcalc to create a mean-(2*stddev) map. [In this example the user may want to include a lower bound filter in r.mapcalc to remove highly variable points (small n), or run r.neighbors to smooth the stddev map before further use.]
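
One possible way to script such a filter from Python is sketched below. The map names lidar_mean, lidar_stddev and lidar_low, the helper names, and the threshold of 3 points per cell are all assumptions made for illustration. The expression uses the r.mapcalc functions if() and null(), and `run_filter` needs an active GRASS session.

```python
def lower_bound_expression(output, mean_map, stddev_map, n_map, min_points=3):
    """Build an r.mapcalc expression for mean - 2*stddev, NULL where n is small."""
    return (
        f"{output} = if({n_map} >= {min_points}, "
        f"{mean_map} - (2 * {stddev_map}), null())"
    )


def run_filter(expression):
    """Evaluate the expression with r.mapcalc in an active GRASS session."""
    # Imported here so the expression builder above works without GRASS.
    import grass.script as gs

    gs.mapcalc(expression)


expr = lower_bound_expression("lidar_low", "lidar_mean", "lidar_stddev", "lidar_n")
print(expr)
```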

Alternate value column

The value_column parameter can be used in specialized cases when you want to filter by z-range but bin and store another column's data, for example to look at backscatter values between 1000 and 1500 meters elevation. This is particularly useful when using r.in.xyz to prepare depth slices for a 3D raster: the zrange option defines the depth slice, but the data values stored in the voxels describe an additional dimension. As with the z column, a filtering range and scaling factor may be applied.
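
Conceptually, the alternate value column works like the sketch below: z decides whether a point is kept, but a different value is binned. This is a simplified model rather than the module's implementation. The function name is hypothetical, points are assumed to lie inside the region already, and treating the zrange limits as inclusive is an assumption.

```python
from collections import defaultdict


def bin_value_column(points, region, res, zrange, vscale=1.0):
    """Mean of the value column per cell, keeping only points with z in zrange.

    points: iterable of (x, y, z, value); region: dict with the keys n and w.
    Returns a dict mapping (row, col) to the mean scaled value.
    """
    zmin, zmax = zrange
    cells = defaultdict(list)
    for x, y, z, value in points:
        if z < zmin or z > zmax:
            continue  # filtered on z, although z itself is not stored
        row = int((region["n"] - y) / res)
        col = int((x - region["w"]) / res)
        cells[(row, col)].append(value * vscale)
    return {cell: sum(vals) / len(vals) for cell, vals in cells.items()}


# Made-up backscatter values; only points between 1000 and 1500 m are binned.
points = [(1, 9, 1200, 10.0), (2, 8, 1400, 20.0), (3, 7, 900, 99.0), (7, 2, 1100, 5.0)]
print(bin_value_column(points, {"n": 10, "w": 0}, 5, (1000, 1500)))
```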

Reprojection

If the raster map is to be reprojected, it may be more appropriate to reproject the input points with m.proj or cs2cs before running r.in.xyz.

Interpolation into a DEM

The vector engine's topological abilities introduce a finite memory overhead per vector point, which will typically limit a vector map to approximately 3 million points (~ 1750^2 cells). If you want more, use the r.to.vect -b flag to skip building topology. Without topology, however, all you'll be able to do with the vector map is display it with d.vect and interpolate with v.surf.rst. Run r.univar on your raster map to check the number of non-NULL cells and adjust bounds and/or resolution as needed before proceeding.

Typical commands to create a DEM using a regularized spline fit:

r.univar lidar_min
r.to.vect -z type=point in=lidar_min out=lidar_min_pt
v.surf.rst in=lidar_min_pt elev=lidar_min.rst

Import of x,y,string data

r.in.xyz expects numeric values in the z column. To perform an occurrence count even on x,y data with non-numeric attribute(s), import the data using either the x or y coordinate as a fake z column with method=n (count number of points per grid cell); the z values are ignored by that method anyway.

EXAMPLES

Import of x,y,z ASCII into DEM

Sometimes elevation data are delivered as x,y,z ASCII files instead of a raster matrix. The import procedure consists of a few steps: calculating the map extent, setting the computational region accordingly (extended in all directions by half a raster cell so that the elevation points are registered at raster cell centers), and finally importing the data.

Note: if the z column is separated from the coordinate columns by several spaces, it may be sufficient to adjust the z column number (z option).

# Important: observe the raster spacing from the ASCII file:
# ASCII file format (example):
# 630007.5 228492.5 141.99614
# 630022.5 228492.5 141.37904
# 630037.5 228492.5 142.29822
# 630052.5 228492.5 143.97987
# ...
# In this example the distance is 15m in x and y direction.

# detect extent, print result as g.region parameters
r.in.xyz input=elevation.xyz separator=space -s -g
# ... n=228492.5 s=215007.5 e=644992.5 w=630007.5 b=55.578793 t=156.32986

# set computational region, along with the actual raster resolution
# as defined by the point spacing in the ASCII file:
g.region n=228492.5 s=215007.5 e=644992.5 w=630007.5 res=15 -p

# now enlarge computational region by half a raster cell (here 7.5m) to
# store the points as cell centers:
g.region n=n+7.5 s=s-7.5 w=w-7.5 e=e+7.5 -p

# import XYZ ASCII file, with z values as raster cell values
r.in.xyz input=elevation.xyz separator=space method=mean output=myelev

# univariate statistics for verification of raster values
r.univar myelev
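
When scripting this workflow in Python, the shell-style scan output can be turned into a dictionary before calling g.region. The sketch below parses the line shown above; the helper names are hypothetical, and `scan_extent` assumes an active GRASS session and captures the output with grass.script.read_command.

```python
def parse_scan_output(text):
    """Turn 'n=... s=... e=... w=... b=... t=...' into a dict of floats."""
    pairs = (item.split("=", 1) for item in text.split())
    return {key: float(value) for key, value in pairs}


def scan_extent(input_file, separator="space"):
    """Scan a file with r.in.xyz -s -g and return its extent as a dict."""
    # Imported here so the parser above can be used without GRASS.
    import grass.script as gs

    output = gs.read_command(
        "r.in.xyz", input=input_file, separator=separator, flags="sg"
    )
    return parse_scan_output(output)


extent = parse_scan_output(
    "n=228492.5 s=215007.5 e=644992.5 w=630007.5 b=55.578793 t=156.32986"
)
print(extent["n"], extent["s"], extent["e"], extent["w"])
```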

Import of LiDAR data and DEM creation

Import the Jockey's Ridge, NC, LIDAR dataset (compressed file "lidaratm2.txt.gz"), and process it into a clean DEM:

# scan and set region bounds
r.in.xyz -s -g separator="," in=lidaratm2.txt
g.region n=35.969493 s=35.949693 e=-75.620999 w=-75.639999
g.region res=0:00:00.075 -a

# create "n" map containing count of points per cell for checking density
r.in.xyz in=lidaratm2.txt out=lidar_n separator="," method=n zrange=-2,50

# check point density [rho = n_sum / (rows*cols)]
r.univar lidar_n
# create "min" map (elevation filtered for premature hits)
r.in.xyz in=lidaratm2.txt out=lidar_min separator="," method=min zrange=-2,50

# set computational region to area of interest
g.region n=35:57:56.25N s=35:57:13.575N w=75:38:23.7W e=75:37:15.675W

# check number of non-null cells (try and keep under a few million)
r.univar lidar_min

# convert to points
r.to.vect -z type=point in=lidar_min out=lidar_min_pt

# interpolate using a regularized spline fit
v.surf.rst in=lidar_min_pt elev=lidar_min.rst

# set color scale to something interesting
r.colors lidar_min.rst rule=bcyr -n -e

# prepare a 1:1:1 scaled version for NVIZ visualization (for lat/lon input)
r.mapcalc "lidar_min.rst_scaled = lidar_min.rst / (1852*60)"
r.colors lidar_min.rst_scaled rule=bcyr -n -e

TODO

Support for multiple map output from a single run.
method=string[,string,...] output=name[,name,...]
This can be easily handled by a wrapper script, with the added benefit of it being very simple to parallelize that way.
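
A minimal sketch of such a wrapper is shown below. The function names and the <prefix>_<method> naming scheme are assumptions, and `run_grass_import` requires an active GRASS session. Threads are enough here because each import runs as a separate process. Since the input is read once per method, this approach does not work with data streamed from stdin.

```python
from concurrent.futures import ThreadPoolExecutor


def run_grass_import(input_file, output, method, **options):
    """Run one r.in.xyz import in an active GRASS session."""
    import grass.script as gs

    gs.run_command(
        "r.in.xyz", input=input_file, output=output, method=method, **options
    )


def import_methods(input_file, prefix, methods, runner=run_grass_import,
                   max_workers=2, **options):
    """Create one raster per method, named <prefix>_<method>, in parallel."""
    outputs = {method: f"{prefix}_{method}" for method in methods}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(runner, input_file, output, method, **options)
            for method, output in outputs.items()
        ]
        for future in futures:
            future.result()  # re-raise any error from a worker
    return outputs
```

For example, `import_methods("lidaratm2.txt", "lidar", ["n", "min"], separator=",")` would request the maps lidar_n and lidar_min.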

KNOWN ISSUES

"nan" can leak into coeff_var maps.
Cause unknown. Possible work-around: "r.null setnull=nan"

If you encounter any problems (or solutions!) please contact the GRASS Development Team.

AUTHORS

Hamish Bowman, Department of Marine Science, University of Otago, New Zealand
Extended by Volker Wichmann to support the aggregate functions median, percentile, skewness and trimmed mean.

SOURCE CODE

Available at: r.in.xyz source code (history)
Latest change: Tuesday Dec 23 01:01:29 2025 in commit bd72a09

