
NAME
r.in.xyz - Create a raster map from an assemblage of many coordinates using univariate statistics.
KEYWORDS
raster
SYNOPSIS
r.in.xyz
r.in.xyz help
r.in.xyz [-sg] input=name output=name [method=string] [type=string] [fs=character] [x=integer] [y=integer] [z=integer] [zrange=min,max] [percent=integer] [--overwrite]
Flags:
- -s
- Scan data file for extent then exit
- -g
- In scan mode, print using shell script style
- --overwrite
- Force overwrite of output files
Parameters:
- input=name
- ASCII file containing input data
- output=name
- Name for output raster map
- method=string
- Statistic to use for raster values
- Options: n,min,max,range,sum,mean,stddev,variance,coeff_var
- Default: mean
- type=string
- Storage type for resultant raster map
- Options: CELL,FCELL,DCELL
- Default: FCELL
- fs=character
- Field separator
- Default: |
- x=integer
- Column number of x coordinates in input file (first column is 1)
- Default: 1
- y=integer
- Column number of y coordinates in input file
- Default: 2
- z=integer
- Column number of data values in input file
- Default: 3
- zrange=min,max
- Filter range for z data (min,max)
- percent=integer
- Percent of map to keep in memory
- Options: 1-100
- Default: 100
DESCRIPTION
The r.in.xyz module will load and bin ungridded x,y,z ASCII data
into a new raster map. The user may choose from a variety of statistical
methods in creating the new raster.
r.in.xyz is designed for processing massive point cloud datasets,
for example raw LIDAR or sidescan sonar swath data.
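The binning step can be pictured with a small awk sketch. This is only an illustration of one common cell-indexing convention, not the module's source code; the 2x2 region, the cell size of 10 and the function name bin_counts are assumptions made for the example. It reads x|y|z lines and prints the number of points per cell as "row,col count".

```shell
# Hypothetical region: 2 rows x 2 cols, cell size 10, west=0, north=20.
bin_counts() {
    awk -F'|' -v north=20 -v west=0 -v res=10 -v rows=2 -v cols=2 '
        /^#/ || NF == 0 { next }              # skip comment and blank lines
        {
            col = int(($1 - west) / res)      # column index, 0 at the west edge
            row = int((north - $2) / res)     # row index, 0 at the north edge
            # drop points that fall outside the assumed region
            if ($1 < west || $2 > north || row >= rows || col >= cols) next
            n[row "," col]++
        }
        END {
            for (r = 0; r < rows; r++)
                for (c = 0; c < cols; c++)
                    printf "%d,%d %d\n", r, c, n[r "," c] + 0
        }'
}

# Four sample points; the last one lies east of the region and is dropped.
printf '%s\n' '# x|y|z' '1|19|5.0' '2|18|6.0' '15|5|7.0' '25|5|9.0' | bin_counts
```

With the sample input, two points land in cell 0,0, one in cell 1,1, and the remaining cells stay empty.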
Available statistics for populating the raster are:
n          number of points in cell
min        minimum value of points in cell
max        maximum value of points in cell
range      range of points in cell
sum        sum of points in cell
mean       average value of points in cell
stddev     standard deviation of points in cell
variance   variance of points in cell
coeff_var  coefficient of variation of points in cell
- Variance and the statistics derived from it (stddev, coeff_var) use the
biased estimator (division by n). [subject to change]
- Coefficient of variation is given as a percentage and defined as
(stddev/mean)*100.
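As a worked example of the definitions above, the following awk sketch computes the same statistics for the z values of a single cell. It is an illustration under stated assumptions, not the module's code: the function name cell_stats and the sample values are made up, and the variance is computed as sum(z^2)/n - mean^2, i.e. the biased estimator.

```shell
# cell_stats: read one z value per line, print the per-cell statistics.
cell_stats() {
    awk '
        NR == 1 { min = $1; max = $1 }
        {
            n++; sum += $1; sumsq += $1 * $1
            if ($1 < min) min = $1
            if ($1 > max) max = $1
        }
        END {
            mean = sum / n
            var = sumsq / n - mean * mean      # biased estimator: divide by n, not n-1
            if (var < 0) var = 0               # guard against rounding below zero
            sd = sqrt(var)
            printf "n=%d min=%g max=%g range=%g sum=%g mean=%g\n", n, min, max, max - min, sum, mean
            printf "variance=%g stddev=%g coeff_var=%g\n", var, sd, sd / mean * 100
        }'
}

# Eight hypothetical elevations falling into one cell.
printf '%s\n' 2 4 4 4 5 5 7 9 | cell_stats
```

For these eight values the mean is 5, the biased variance is 4, the standard deviation is 2, and coeff_var is therefore (2/5)*100 = 40.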
NOTES
Memory use
While the input file can be arbitrarily large, r.in.xyz
will use a large amount of system memory for large raster regions
(e.g. 10000x10000 cells).
If the module refuses to start, complaining that there isn't enough memory,
use the percent parameter to run the module in several passes.
In addition, using a less precise map format (CELL [integer] or
FCELL [floating point]) will use less memory than a DCELL
[double precision floating point] output map. Methods such as n,
min, max, and sum will also use less memory, while stddev, variance,
and coeff_var will use more. The default map type=FCELL
is intended as a compromise between preserving data precision and limiting
system resource consumption.
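A rough feel for the numbers can be had with a back-of-the-envelope estimate. The sketch below is an approximation under assumptions, not a description of the module's actual allocation: it assumes 4 bytes per CELL or FCELL value and 8 bytes per DCELL value, that each pass holds about percent/100 of the region, and that the number of in-memory arrays (the optional last argument, default 1) depends on the method. The function name est_mem_mib is hypothetical.

```shell
# est_mem_mib ROWS COLS BYTES_PER_CELL PERCENT [N_ARRAYS]
# Prints an estimated memory footprint in MiB for one pass.
est_mem_mib() {
    awk -v rows="$1" -v cols="$2" -v bytes="$3" -v pct="$4" -v arrays="${5:-1}" \
        'BEGIN { printf "%.1f\n", rows * cols * bytes * arrays * pct / 100 / 1048576 }'
}

est_mem_mib 10000 10000 4 100    # 4-byte cells, whole region in one pass
est_mem_mib 10000 10000 4 25     # same region with percent=25
```

Under these assumptions a 10000x10000 region needs roughly 381.5 MiB per 4-byte array in a single pass, and about a quarter of that with percent=25.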
Setting region bounds and resolution
You can use the -s scan flag to find the extent of the input data
(and thus point density) before performing the full import. Use
g.region to adjust the region bounds to match. The -g shell
style flag prints the extent suitable as parameters for g.region.
A suitable resolution can be chosen from the point density, i.e. the number
of input points divided by the area covered, e.g.:
wc -l inputfile.txt
g.region -p
# points_per_cell = n_points / (rows * cols)
g.region -e
# UTM location:
# points_per_sq_m = n_points / (ns_extent * ew_extent)
# Lat/Lon location:
# points_per_sq_m = n_points / (ns_extent * ew_extent*cos(lat) * (1852*60)^2)
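The arithmetic in the comments above can be scripted. The following is a minimal sketch, assuming the point count comes from wc -l and the rows, columns and extents are read off the g.region output by hand; the function names and the sample figures are hypothetical. Note that awk's cos() expects radians, so the latitude is converted from degrees first.

```shell
# points_per_cell N_POINTS ROWS COLS
points_per_cell() {
    awk -v n="$1" -v rows="$2" -v cols="$3" \
        'BEGIN { printf "%.2f\n", n / (rows * cols) }'
}

# points_per_sq_m_ll N_POINTS NS_EXTENT_DEG EW_EXTENT_DEG LAT_DEG
# Lat/Lon location: one degree of latitude is taken as 1852*60 metres.
points_per_sq_m_ll() {
    awk -v n="$1" -v ns="$2" -v ew="$3" -v lat="$4" 'BEGIN {
        pi = atan2(0, -1)
        m_per_deg = 1852 * 60
        area = ns * ew * cos(lat * pi / 180) * m_per_deg * m_per_deg
        printf "%.3f\n", n / area
    }'
}

points_per_cell 2000000 1000 500                  # prints 4.00
points_per_sq_m_ll 2000000 0.0198 0.019 35.96     # density in points per square metre
```

A result of a few points per cell suggests the resolution is reasonable for binning; a value well below one suggests the cells are too small for the data.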
If you only intend to interpolate the data with r.to.vect and
v.surf.rst, then there is little point in setting the region
resolution so fine that you only catch one data point per cell -- you might
as well use "v.in.ascii -zbt" directly.
Filtering
Points falling outside the current region will be skipped. This includes
points falling exactly on the southern region bound.
(To capture those, adjust the region with "g.region s=s-0.000001";
see g.region.)
Blank lines and comment lines starting with the hash symbol (#)
will be skipped.
The zrange parameter may be used for filtering the input data by
vertical extent. Example uses might include preparing multiple raster
sections to be combined into a 3D raster array with r.to.rast3, or
for filtering outliers on relatively flat terrain.
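As an illustration of the slicing use case, the sketch below prints one r.in.xyz command per vertical slice so that the commands can be reviewed before being run in a GRASS session. The input file name points.txt, the output names slice_NNN, the choice of method=n and the function name slice_cmds are all assumptions for the example. A point lying exactly on a shared boundary value may be counted in two adjacent slices, depending on how the range ends are treated.

```shell
# slice_cmds ZMIN ZMAX STEP: print one r.in.xyz command per vertical slice.
slice_cmds() {
    awk -v zmin="$1" -v zmax="$2" -v step="$3" 'BEGIN {
        for (lo = zmin; lo < zmax; lo += step)
            printf "r.in.xyz in=points.txt out=slice_%03d method=n zrange=%g,%g\n", ++i, lo, lo + step
    }'
}

# Three slices of 10 units each between z=0 and z=30.
slice_cmds 0 30 10
```

The resulting maps could then be passed to r.to.rast3 in order from lowest to highest slice.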
In varied terrain, the user may find that min maps make for a good
noise filter, as most LIDAR noise is from premature hits. The min map
may also be useful to find the underlying topography in a forested or urban
environment if the cells are oversampled.
The user can use a combination of r.in.xyz output maps to create
custom filters. e.g. use r.mapcalc to create a mean-(2*stddev)
map. [In this example the user may want to include a lower bound filter in
r.mapcalc to remove highly variable points (small n) or run
r.neighbors to smooth the stddev map before further use.]
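A minimal sketch of such a custom filter follows. The map names lidar_n, lidar_mean, lidar_stddev and lidar_floor are hypothetical, as is the threshold of 5 points per cell; the three input maps are assumed to come from earlier r.in.xyz runs with method=n, method=mean and method=stddev. The expression is built first and only passed to r.mapcalc when the command is available, i.e. inside a GRASS session.

```shell
# Assumed threshold: cells with fewer points than this become NULL.
min_n=5

# mean - 2*stddev, restricted to cells with enough points to trust stddev.
expr="lidar_floor = if(lidar_n >= $min_n, lidar_mean - 2 * lidar_stddev, null())"
echo "$expr"

# Only run the calculation when r.mapcalc is on the PATH (GRASS session).
if command -v r.mapcalc >/dev/null 2>&1; then
    r.mapcalc "$expr"
fi
```

Raising min_n trades coverage for reliability: more cells become NULL, but the remaining values rest on better-sampled standard deviations.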
Reprojection
If the raster map is to be reprojected, it may be more appropriate to reproject
the input points with m.proj or cs2cs before running
r.in.xyz.
Interpolation into a DEM
The vector engine's topological abilities introduce a finite memory overhead
per vector point which will typically limit a vector map to approximately
3 million points (~ 1750^2 cells). If you want more, use the r.to.vect
-b flag to skip building topology. Without topology, however, all
you'll be able to do with the vector map is display it with d.vect and
interpolate with v.surf.rst.
Run r.univar on your raster map to check the number of non-NULL cells
and adjust bounds and/or resolution as needed before proceeding.
Typical commands to create a DEM using a regularized spline fit:
r.univar lidar_min
r.to.vect -z feature=point in=lidar_min out=lidar_min_pt
v.surf.rst layer=0 in=lidar_min_pt elev=lidar_min.rst
EXAMPLE
Import the Jockey's Ridge, NC, LIDAR dataset, and process into a clean DEM:
# scan and set region bounds
r.in.xyz -s fs=, in=lidaratm2.txt out=test
g.region n=35.969493 s=35.949693 e=-75.620999 w=-75.639999
g.region res=0:00:00.075 -a
# create "n" map containing count of points per cell for checking density
r.in.xyz in=lidaratm2.txt out=lidar_n fs=, method=n zrange=-2,50
# check point density [rho = n_sum / (rows*cols)]
r.univar lidar_n | grep sum
# create "min" map (elevation filtered for premature hits)
r.in.xyz in=lidaratm2.txt out=lidar_min fs=, method=min zrange=-2,50
# zoom to area of interest
g.region n=35:57:56.25N s=35:57:13.575N w=75:38:23.7W e=75:37:15.675W
# check number of non-null cells (try and keep under a few million)
r.univar lidar_min | grep '^n:'
# convert to points
r.to.vect -z feature=point in=lidar_min out=lidar_min_pt
# interpolate using a regularized spline fit
v.surf.rst layer=0 in=lidar_min_pt elev=lidar_min.rst
# set color scale to something interesting
r.colors lidar_min.rst rule=bcyr
# prepare a 1:1:1 scaled version for NVIZ visualization (for lat/lon input)
r.mapcalc "lidar_min.rst_scaled = lidar_min.rst / (1852*60)"
r.colors lidar_min.rst_scaled rule=bcyr
TODO
- Support for advanced statistics (in parallel with r.univar).
Especially useful for dealing with outliers would be median and
5-10% trimmed means.
The equivalent module from GRASS 5 (s.cellstats) contains code for
additional statistical options, claiming to use only
16 bytes per cell: skewness, kurtosis, mean of squares, mean of absolute
values, first quartile, median, third quartile.
- Support for multiple map output from a single run.
method=string[,string,...] output=name[,name,...]
BUGS
- n map sum can be ever-so-slightly more than `wc -l`
with e.g. percent=10 or less.
Cause unknown.
- n map percent=100 and percent=xx maps
differ slightly (point will fall above/below the segmentation line)
Investigate with "r.mapcalc diff=bin_n.100 - bin_n.33" etc.
Cause unknown.
- "nan" can leak into coeff_var maps.
Cause unknown. Possible work-around: "r.null setnull=nan"
If you encounter any problems (or solutions!) please contact the GRASS
Development Team.
SEE ALSO
g.region,
m.proj,
r.fillnulls,
r.in.ascii,
r.mapcalc,
r.neighbors,
r.to.rast3,
r.to.vect,
r.univar,
r.univar2,
v.in.ascii,
v.surf.rst
AUTHOR
Hamish Bowman
Department of Marine Science
University of Otago
New Zealand
Last changed: $Date: 2006/06/17 06:23:49 $