GRASS Programmer's Manual  7.0.svn(2012)-r51645
GRASS Vector Library

by GRASS Development Team (http://grass.osgeo.org)

Background

Generally, the vector data model is used to describe geographic phenomena which may be represented by geometric entities like points, lines, and areas. The GRASS vector data model includes the description of topology, where besides the coordinates describing the location of the primitives (points, lines, boundaries, centroids, faces, kernels, and volumes), their spatial relations are also stored. In general, topological GIS requires a data structure where the common boundary between two adjacent areas is stored as a single line, simplifying the vector data maintenance.

Introduction

The GRASS 6/7 vector format is very similar to the previous GRASS 4.x (5.0/5.3) vector format.

This description covers the new GRASS 6/7 vector library architecture. This architecture overcomes the vector limitations of GRASS 4.x-5.4.x by adding support for attributes stored in external relational databases and by adding new 3D capabilities. Besides the internal file-based storage, geometry may alternatively be stored in a PostGIS database (accessible via the OGR interface), which enables users to maintain large data sets with simultaneous write access. External GIS formats such as SHAPE-files may be used directly, without requiring format conversion.

The current implementation includes:

  • multi-layer: features in one vector map may represent multiple layers and may be linked to multiple external tables (see Categories and Layers)
  • 2D and 3D vector geometry with full topology support for 2D and partial topology support for 3D (see Vector library topology management)
  • multi-format: external data formats supported (SHAPE-file, OGR sources etc.)
  • portability: platform independent internal format, read- and writable on 32bit, 64bit etc. computer architectures
  • integrated DGLib (Directed Graph Library) - support for vector network analysis
  • spatial index: based on R-tree method for fast vector geometry access (see Vector library spatial index management)
  • multi-attribute: attributes saved in external Relational Database Management System (RDBMS) connected through DBMI library and drivers (see Attributes)

Vector map definition (native format)

GRASS vector maps are stored in an arc-node representation, consisting of curves called arcs. An arc is stored as a series of x,y,z coordinate tuples. The two endpoints of an arc are called nodes. Two consecutive x,y,z tuples define an arc segment. The user specifies the type of input to GRASS; GRASS doesn't decide. Feature definitions allow multiple types to co-exist in the same map. A centroid is assigned to the area it lies within (geometrically). An area is identified by an x,y,z centroid point that lies geometrically inside it and carries a category number. Such centroids are stored in the same binary 'coor' file as the other primitives. Each element may have zero, one or more categories (cats). Multiple cats are distinguished by field number (field, called "layer" at user level). Both single- and multi-category support are implemented at module level. The z-coordinate is optional and both 2D and 3D files may be written.

The following vector feature types (primitives) are defined by the vector library (and held in the coor file; see also Feature types):

  • point: a point (2D or 3D) - GV_POINT
  • line: a directed sequence of connected vertices with two endpoints called nodes (2D or 3D) - GV_LINE
  • boundary: the border line to describe an area (2D only) - GV_BOUNDARY
  • centroid: a point within a closed boundary(ies) to describe an area (2D only) - GV_CENTROID
  • face: a 3D boundary (not implemented yet) - GV_FACE
  • kernel: a 3D centroid in a volume - GV_KERNEL

The following are derived from the vector feature types mentioned above:

  • area: the topological composition of a closed ring of boundary(ies) and optionally a centroid (2D only, 3D coordinates supported but ignored) - GV_AREA
  • isle: an area within area, not touching the boundaries of the outer area (2D only, 3D coordinates supported but ignored)
  • volume: a 3D corpus, the topological composition of faces and kernel (not implemented yet) - GV_VOLUME
  • hole: a volume within volume, 3D equivalent to isle within area (not implemented yet)

Note that all lines and boundaries can consist of multiple segments.

Area topology also holds information about isles. Isles are located within an area, not touching the boundaries of the outer area. Isles consist of one or more areas and are used internally by the vector library to maintain correct topology of areas.

Levels of read access

There are two levels of read access to the vector data:

  • Level One provides simple access to the vector feature information. There is no access to topology information at this level.
  • Level Two provides full access to all the information including topology information. This level requires more from the programmer, more memory, and longer startup time.

The level of access is returned by Vect_open_old().

Note: Higher levels of access are planned, so when checking success return codes for a particular level of access (when calling Vect_open_old() for example), the programmer should use >= instead of == for compatibility with future releases.

An existing vector map can be opened for reading by Vect_open_old(). A new vector map can be created (or opened for writing) by Vect_open_new(). Vect_open_old() attempts to open a vector map at the highest possible level of access and returns the number of the level at which it opened. Vect_open_new() always opens at level 1 only. If you require that a vector map be opened at a lower level (e.g. one), call Vect_set_open_level(1) first; Vect_open_old() will then either open at level one or fail. If you instead require the highest level of access possible, do not use Vect_set_open_level(), but check the return value of Vect_open_old() to make sure it is greater than or equal to the lowest level at which you need access. This allows future levels to work without the need for module changes.
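The recommended >= comparison can be sketched without the library. The helper below is hypothetical (it is not part of the vector library); it only takes the value that an open call such as Vect_open_old() reported and the lowest level the module needs.

```c
#include <assert.h>

/* Returns 1 if the level reported by the open call is high enough.
 * Hypothetical helper, not a library function. */
int level_is_sufficient(int opened_level, int required_level)
{
    assert(required_level >= 1);   /* levels start at one */

    /* >= rather than ==, so that future higher levels still pass */
    return opened_level >= required_level;
}
```

A module that needs topology would pass 2 as the required level; a map opened at a future level 3 would still be accepted.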

Directory structure

A vector map is stored in a number of data files. The vector map directory structure and file names changed in GRASS 6 with respect to previous GRASS versions. All vector files for one vector map are stored in one directory:

$MAPSET/vector/vector_name/

This directory contains these files:

Header file format specification

The header contains meta information, such as a description of the vector map. The file is an unordered list of key/value entries. Each key is a string separated from its value by a colon and optional whitespace.

Keywords are:

  • ORGANIZATION - organization that digitized the data
  • DIGIT DATE - date the data was digitized
  • DIGIT NAME - person who digitized the data
  • MAP NAME - title of the original source map
  • MAP DATE - date of the original source map
  • MAP SCALE - scale of the original source map
  • OTHER INFO - other comments about the map
  • ZONE - zone of the map (e.g., UTM zone)
  • MAP THRESH - digitizing threshold

This information is held in the dig_head data structure.
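As an illustration of the key/value layout, the following sketch splits one header line at its first colon and skips the optional whitespace. The function name and buffer handling are assumptions made for this example; the library's own reader may work differently.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Splits one header line of the form "KEY: value" into key and value.
 * Returns 0 on success, -1 if the line has no colon or a buffer is
 * too small. Hypothetical helper, not a library function. */
int parse_head_line(const char *line, char *key, size_t key_len,
                    char *value, size_t value_len)
{
    const char *colon, *v;
    size_t n;

    assert(line != NULL && key != NULL && value != NULL);

    colon = strchr(line, ':');
    if (colon == NULL)
        return -1;

    /* the key is everything before the colon; it may contain spaces */
    n = (size_t)(colon - line);
    if (n >= key_len)
        return -1;
    memcpy(key, line, n);
    key[n] = '\0';

    /* skip the optional whitespace after the colon */
    v = colon + 1;
    while (*v != '\0' && isspace((unsigned char)*v))
        v++;

    /* copy the value without a trailing line break */
    n = strcspn(v, "\r\n");
    if (n >= value_len)
        return -1;
    memcpy(value, v, n);
    value[n] = '\0';
    return 0;
}
```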

Categories and Layers

Note: "layer" was called "field" in earlier versions.

In GRASS, a "category" or "category number" is a vector feature ID used to link geometry to attributes which are stored in one or several (external) database table(s). This category number is stored with the vector geometry as well as in a "cat" column (integer type) in each attribute database table. The category number is used to look up an attribute assigned to a vector object. At user level, category numbers can be assigned to vector objects with the v.category command.

In order to assign multiple attributes in different tables to vector objects, each map can hold multiple category numbers. This is achieved by assigning more than one "layer" to the map (v.db.connect command). The layer number determines which table is used for attribute queries. For example, a cadastral vector area map can be assigned on layer 1 to an attribute table containing land use descriptions maintained by department A, while layer 2 is assigned to an attribute table containing owner descriptions maintained by department B.

Each vector feature inside a vector map has zero, one or more <layer,category> tuple(s). A user may (but need not) create attribute tables which are referenced by the layer, with rows which are referenced by the <layer,category> pair.

Categories start with 1 (category '0' is allowed for OGR layers). Categories do not have to be continuous.

Information about categories is held in the line_cats data structure.

Attributes

The old GRASS 4.x 'dig_cats' files are no longer used; vector attributes are stored in an external database. The connection to the database is made through drivers based on the GRASS DataBase Management Interface (DBMI). Records in a table are linked to vector entities by layer and category number. The layer identifies the table and the category identifies the record. I.e., for any unique combination

map+mapset+layer+category

there exists one unique combination

driver+database+table+row

The general DBMI settings are defined in the '$MAPSET/VAR' text file (maintained with db.connect command at user level).

DB link file format specification

Each vector map has its own DBMI settings stored in the '$MAPSET/vector/vector_name/dbln' text file. For each pair of vector map and layer, the table, key column, database and driver must all be defined in a new row of this file. Each row in the 'dbln' file contains names separated by spaces in the following order ([ ] - optional):

map[@mapset] layer table [key [database [driver]]]

If key, database or driver are omitted (on the second and higher rows only), the last definition is used. When reading a vector map from another mapset (if the mapset is specified along with the map name), definitions in the related "dbln" file may overwrite the DBMI definition in the current mapset. This means that the map-wise definition is always "stronger".
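The row layout and the fallback for omitted trailing fields can be sketched as follows. The struct and function are hypothetical and are not the library's dblinks structure. The sketch assumes that "the last definition" means the values of the immediately preceding row, and that names contain no spaces.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DBLN_NAME_LEN 256

/* One parsed rule of a 'dbln' file (hypothetical struct). */
struct dbln_rule {
    char map[DBLN_NAME_LEN];      /* map[@mapset], may contain * and ? */
    int layer;
    char table[DBLN_NAME_LEN];
    char key[DBLN_NAME_LEN];
    char database[DBLN_NAME_LEN];
    char driver[DBLN_NAME_LEN];
};

/* Parses "map[@mapset] layer table [key [database [driver]]]".
 * Fields omitted on this row are copied from prev, the previous rule.
 * Returns 0 on success, -1 on a malformed row. */
int parse_dbln_row(const char *row, const struct dbln_rule *prev,
                   struct dbln_rule *out)
{
    int n;

    assert(row != NULL && out != NULL);

    memset(out, 0, sizeof(*out));
    n = sscanf(row, "%255s %d %255s %255s %255s %255s",
               out->map, &out->layer, out->table,
               out->key, out->database, out->driver);
    if (n < 3)
        return -1;      /* map, layer and table are mandatory */
    if (n < 6 && prev == NULL)
        return -1;      /* the first row has nothing to fall back on */

    if (n < 4)
        strcpy(out->key, prev->key);
    if (n < 5)
        strcpy(out->database, prev->database);
    if (n < 6)
        strcpy(out->driver, prev->driver);
    return 0;
}
```

Wildcard matching of the map name and variable substitution are left out of this sketch.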

Wild cards * and ? may be used in map and mapset names.

Variables $GISDBASE, $LOCATION_NAME, $MAPSET, and $MAP may be used in table, key, database and driver names (function Vect_subst_var()). Note that $MAPSET is not the current mapset but mapset of the map the rule is defined for.
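The substitution can be pictured with the following sketch. It is not the implementation of Vect_subst_var(); the struct and function names are invented for this example. The one subtle point it shows is that $MAPSET has to be tested before $MAP, because $MAP is a prefix of $MAPSET.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Values to substitute (hypothetical struct). */
struct subst_ctx {
    const char *gisdbase;
    const char *location;
    const char *mapset;   /* mapset of the map the rule is defined for */
    const char *map;
};

/* Expands $GISDBASE, $LOCATION_NAME, $MAPSET and $MAP from src into dst.
 * Returns 0 on success, -1 if dst is too small. */
int subst_vars(const char *src, const struct subst_ctx *ctx,
               char *dst, size_t dst_len)
{
    /* longer names first where one name is a prefix of another */
    const char *names[4] = { "$GISDBASE", "$LOCATION_NAME", "$MAPSET", "$MAP" };
    const char *values[4];
    size_t used = 0;

    assert(src != NULL && ctx != NULL && dst != NULL);
    values[0] = ctx->gisdbase;
    values[1] = ctx->location;
    values[2] = ctx->mapset;
    values[3] = ctx->map;

    while (*src != '\0') {
        const char *piece = src;          /* default: copy one character */
        size_t piece_len = 1, skip = 1, i;

        for (i = 0; i < 4; i++) {
            size_t name_len = strlen(names[i]);

            if (strncmp(src, names[i], name_len) == 0) {
                piece = values[i];
                piece_len = strlen(values[i]);
                skip = name_len;
                break;
            }
        }
        if (used + piece_len >= dst_len)
            return -1;                    /* no room for piece plus '\0' */
        memcpy(dst + used, piece, piece_len);
        used += piece_len;
        src += skip;
    }
    if (used >= dst_len)
        return -1;
    dst[used] = '\0';
    return 0;
}
```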

Note that vector features in GRASS vector maps may have attributes in different tables or may be without attributes. Boundaries form areas but it may happen that some boundaries are not closed (such boundaries would not appear in polygon layer). Boundaries may have attributes. All types may be mixed in one vector map.

The link to the table is permanent and it is stored in 'dbln' file in vector directory. Tables are considered to be a part of the vector and the command g.remove, for example, deletes linked tables of the vector. Attributes must be joined with geometry.

Information about database links is held in the dblinks data structure.

Examples:

Examples are written mostly for the DBF driver, where database is full path to the directory with dbf files and table name is the name of dbf file without .dbf extension:

* 1 mytable id $GISDBASE/$LOCATION_NAME/$MAPSET/vector/$MAP dbf

This definition says that entities with a category in layer 1 are linked to dbf tables named "mytable.dbf" saved in the vector directory of each map. The attribute column containing the category numbers is called "id".

* 1 $MAP id $GISDBASE/$LOCATION_NAME/$MAPSET/dbf dbf

Similar to the above, but all dbf files are in one directory dbf/ in the mapset and the names of the dbf files are $MAP.dbf

water* 1 rivers id /home/grass/dbf dbf
water* 2 lakes lakeid /home/guser/mydb
trans* 1 roads key basedb odbc
trans* 5 rails

These definitions define multiple layers (called "field" in the API) for one vector map, i.e. one vector map may contain features linked to several attribute tables. The definitions on the first two rows apply, for example, to maps water1, water2, ... so that several maps may share one table.

water@PERMANENT 1 myrivers id /home/guser/mydbf dbf

This definition overwrites the definition saved in PERMANENT/VAR and links the water map from PERMANENT mapset to the user's table.

Modules should be written so that connections to databases for each vector layer are independent. It should be possible to read attributes of an input vector map from one database and write to some other and even with some other driver (should not be a problem).

There are open questions, however. For one, how does one distinguish when new tables should be written and when not? For example, definitions:

river 1 river id water odbc
river.backup* 1 NONE

could be used to say that tables should not be copied for backups of map river because table is stored in a reliable RDBMS.

Vector libraries

Besides internal library functions there are two main libraries:

For historical reasons, there are two internal libraries:

  • diglib (with dig_*() functions), GRASS 3.x/4.x
  • Vlib (with V1_*(), V2_*() and Vect_*() functions), since GRASS 4.x (except for the 5.7 interim version)

The vector library was introduced in GRASS 4.0 to hide internal vector files' formats and structures. In GRASS 6/7, everything is accessed via Vect_*() functions, for example:

Old 4.x code:

    xx = Map.Att[Map.Area[area_num].att].x;

New 6.x/7.x functions:

    centroid = Vect_get_area_centroid(Map, area_num);
    Vect_read_line(Map, line_p, NULL, centroid);
    Vect_line_get_point(line_p, 0, &xx, NULL, NULL);

In GRASS 6/7, all internal, mostly non-topological vector functions are hidden from the modules' API (mainly dig_*(), V1_*() and V2_*() functions). All available Vect_*() functions are topological vector functions.

The following include file contains definitions and structures required by some of the routines in this library. The programmer should therefore include this file in any code that uses the vector library:

#include <grass/vector.h>

Note: For details please read Blazek et al. 2002 (see below) as well as the references in this document.

Historical notes

The vector library in GRASS 4.0 changed significantly from the Digit Library (diglib) used in GRASS 3.1. Below is an overview of why the changes were made.

The Digit Library was a collage of subroutines created for developing the map development programs. Few of these subroutines were actually designed as a user access library. They required individuals to assume too much responsibility and control over what happened to the data file. Thus when it came time to change vector data file formats for GRASS 4.0, many modules also required modification. The two different access levels for 3.0 vector files provided very different ways of calling the library; they offered little consistency for the user.

The Digit Library was originally designed to only have one file open for read or write at a time. Although it was possible in some cases to get around this, one restriction was the global head structure. Since there was only one instance of this, there could only be one copy of that information, and thus, only one open vector file.

The solution to these problems was to design a new user library as an interface to the vector data files. This new library was designed to provide a simple consistent interface, which hides as much of the details of the data format as possible. It also could be extended for future enhancements without the need to change existing programs.

The new vector library in GRASS 4 provided routines for opening, closing, reading, and writing vector files, as well as several support functions. The Digit Library was replaced, and all existing modules were converted to use the new library. Those routines that existed in the Digit Library and were not affected by these changes continue to exist in unmodified form, and were included in the vector library. Most of the commonly used routines have been discarded, and replaced by the new vector routines.

Instead of the global head structure, each open map used its own local version of it. The structure that replaced structure head is structure dig_head. There were still two levels of interface to the vector files (future releases may include more). Level one provided access only to arc (i.e. polyline) information and to the type of line (AREA, LINE, DOT). Level two provided access to polygons (areas), attributes, and network topology.

Vector library data structures

All data structures used by the vector library are defined in include/vect/dig_structs.h. See the list below:

Major:

Supporting:

Format-related:

DB-related:

Geometry-related:

Category-related:

Topology-related:

Misc:

Obsolete:

Vector library feature geometry

Feature types

Feature types are defined in include/vect/dig_defines.h, see the list below:

  • GV_POINT
  • GV_LINE
  • GV_BOUNDARY
  • GV_CENTROID
  • GV_FACE
  • GV_KERNEL
  • GV_AREA
  • GV_VOLUME
  • GV_POINTS (GV_POINT | GV_CENTROID)
  • GV_LINES (GV_LINE | GV_BOUNDARY)

Face and kernel are 3D equivalents of boundary and centroid, but there is no support (yet) for 3D topology (volumes). Faces are used in a couple of modules including NVIZ to visualize 3D buildings and other volumetric figures.
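Because GV_POINTS and GV_LINES are bitwise combinations of single types, a type can be tested against a mask with a bitwise AND. The sketch below defines its own EX_ constants so that it stands alone; the values 1, 2, 4 and 8 are the ones printed in the topology examples of this document, and type_matches() is a hypothetical helper.

```c
/* Local copies of the type values (type = 1, 2, 4, 8 in the topology
 * examples); the EX_ prefix marks them as belonging to this sketch. */
#define EX_GV_POINT    0x01
#define EX_GV_LINE     0x02
#define EX_GV_BOUNDARY 0x04
#define EX_GV_CENTROID 0x08

#define EX_GV_POINTS (EX_GV_POINT | EX_GV_CENTROID)
#define EX_GV_LINES  (EX_GV_LINE | EX_GV_BOUNDARY)

/* Returns 1 if type is one of the types selected by mask, else 0. */
int type_matches(int type, int mask)
{
    return (type & mask) != 0;
}
```

With real library code the same test would be written against the GV_* definitions from the include file.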

Coor file format specification

In the coor file the following is stored: 'line' (element) type, number of attributes and layer number for each category. Coordinates in binary file are stored as double (8 bytes). See Coor_info data structure.

Header

Name           Type  Number  Description
Version_Major  C     1       file version (major)
Version_Minor  C     1       file version (minor)
Back_Major     C     1       supported from GRASS version (major)
Back_Minor     C     1       supported from GRASS version (minor)
byte_order     C     1       little or big endian flag
head_size      L     1       header size of coor file
with_z         C     1       2D or 3D flag; zero for 2D
size           L     1       coor file size

Body

The body consists of line records:

Name           Type  Number  Description
record header  C     1       bit flags:
                               • 0. bit: 1 - alive, 0 - dead line
                               • 1. bit: 1 - categories, 0 - no categories
                               • 2.-3. bit: type - one of: GV_POINT, GV_LINE, GV_BOUNDARY, GV_CENTROID, GV_FACE, GV_KERNEL
                               • 4.-7. bit: reserved, not used
ncats          I     1       number of categories (written only if categories exist)
field          I     ncats   field identifier, distinguishes between multiple categories appended to one feature (written only if categories exist; field is called "layer" at user level)
cat            I     ncats   category value (written only if categories exist)
ncoor          I     1       written for GV_LINES and GV_BOUNDARIES only
x              D     ncoor   x coordinate
y              D     ncoor   y coordinate
z              D     ncoor   z coordinate; present if with_z in head is set to 1
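The record header byte described above can be unpacked with shifts and masks. The sketch below is hypothetical and not taken from the library. Since two bits cannot distinguish six types, it does not mask the type field to bits 2-3; it returns everything above the two flag bits as a stored type code and leaves the mapping of that code to GV_* values open.

```c
/* Decoded view of the one-byte record header (hypothetical struct). */
struct rec_header {
    int alive;       /* bit 0: 1 - alive, 0 - dead line */
    int has_cats;    /* bit 1: 1 - categories follow, 0 - no categories */
    int type_code;   /* stored type code taken from the bits above bit 1 */
};

/* Unpacks the record header byte of one coor file record. */
struct rec_header decode_rec_header(unsigned char rhead)
{
    struct rec_header h;

    h.alive = rhead & 0x01;
    h.has_cats = (rhead >> 1) & 0x01;
    h.type_code = rhead >> 2;   /* reserved bits are expected to be zero */
    return h;
}
```

A reader would consult has_cats to decide whether ncats, field and cat follow the header byte.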

Types used in coor file:

Type  Name    Size in Bytes
D     Double  8
L     Long    4
I     Int     4
S     Short   2
C     Char    1

Vector library topology management

Topology general characteristics:

  • geometry and attributes are stored separately (don't read both if it is not necessary - usually it is not)
  • the format is topological (areas built from boundaries)
  • currently only 2D topology is supported

Topology is written for native GRASS vector format; in case of linked OGR sources (see v.external module), only pseudo-topology (boundaries constructed from polygons) is written.

The following rules apply to the vector data:

  • Boundaries should not cross each other (i.e., boundaries which would cross must be split at their intersection to form distinct boundaries). In contrast, lines can cross each other, e.g. bridges over rivers.
  • Lines and boundaries share nodes only if their endpoints are identical. Lines or boundaries can be forced to share a common node by snapping them together. This is particularly important since nodes are not represented in the coor file, but only implicitly as endpoints of lines and boundaries.
  • Common area boundaries should appear only once (i.e., should not be double digitized).
  • Areas must be explicitly closed. This means that it must be possible to complete each area by following one or more boundaries that are connected by common nodes, and that such tracings result in closed areas.
  • It is recommended that area features and linear features be placed in separate layers. However if area features and linear features must appear in one layer, common boundaries should be digitized only once. For example, a boundary that is also a line (e.g., a road which is also a field boundary), should be digitized as a boundary to complete the area(s), and a boundary which is functionally also a line should be labeled as a line by a distinct category number.

Vector map topology can be cleaned at user level by v.clean command.

Topo file format specification

Topo file is read by Vect_open_topo().

Header

Note: plus is an instance of Plus_head data structure.

Name                                        Type  Number  Description
plus->Version_Major                         C     1       file version (major)
plus->Version_Minor                         C     1       file version (minor)
plus->Back_Major                            C     1       supported from GRASS version (major)
plus->Back_Minor                            C     1       supported from GRASS version (minor)
plus->port->byte_order                      C     1       little or big endian flag; files are written in machine native order but files in both little and big endian order may be read; zero for little endian
plus->head_size                             L     1       header size
plus->with_z                                C     1       2D or 3D flag; zero for 2D
plus->box                                   D     6       Bounding box coordinates (N,S,E,W,T,B)
plus->n_nodes, plus->n_lines, etc.          I     7       Number of nodes, edges, lines, areas, isles, volumes and holes
plus->n_plines, plus->n_llines, etc.        I     7       Number of points, lines, boundaries, centroids, faces and kernels
plus->Node_offset, plus->Edge_offset, etc.  L     7       Offset value for nodes, edges, lines, areas, isles, volumes and holes
plus->coor_size                             L     1       File size
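The byte order flag in the header lets a reader convert values to its own order. A minimal sketch of the idea follows, assuming the flag is zero for little endian as stated above; the value used for big endian and both function names are assumptions of this example, not the library's portable I/O routines.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Zero for little endian follows the header description; the value
 * used for big endian is an assumption of this sketch. */
#define EX_ENDIAN_LITTLE 0
#define EX_ENDIAN_BIG    1

/* Returns the byte order of the machine running the code. */
int native_byte_order(void)
{
    const uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);   /* lowest address holds 1 on little endian */
    return first == 1 ? EX_ENDIAN_LITTLE : EX_ENDIAN_BIG;
}

/* Assembles a 4-byte integer stored in file_order into a native value. */
int32_t read_port_int(const unsigned char buf[4], int file_order)
{
    uint32_t v;

    assert(buf != NULL);
    if (file_order == EX_ENDIAN_LITTLE)
        v = (uint32_t)buf[0] | ((uint32_t)buf[1] << 8) |
            ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
    else
        v = (uint32_t)buf[3] | ((uint32_t)buf[2] << 8) |
            ((uint32_t)buf[1] << 16) | ((uint32_t)buf[0] << 24);
    return (int32_t)v;
}
```

Building the value byte by byte works on any host, so the reader does not need to compare the file order with native_byte_order(); a writer would use that function to fill in the flag.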

Body (nodes, lines, areas, isles)

Nodes

For each node (plus->n_nodes):

Name     Type  Number   Description
n_lines  I     1        Number of lines (0 for dead node)
lines    I     n_lines  Line ids (negative id for line which ends at the node)
angles   D     n_lines  Angle value
n_edges  I     1        Reserved for edges (only for with_z)
x,y      D     2        Coordinate pair (2D)
z        D     1        Only for with_z (3D)

See P_node data structure.
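The sign convention for line ids stored with a node can be captured in two small helpers. Both are hypothetical and only illustrate the rule stated in the table: the absolute value is the line id, and a negative sign marks a line that ends at the node.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns the line id stored in a signed node reference. */
int ref_line_id(int ref)
{
    assert(ref != 0);   /* this sketch treats 0 as an invalid line id */
    return abs(ref);
}

/* Returns 1 if the referenced line starts at the node, 0 if it ends there. */
int ref_starts_at_node(int ref)
{
    assert(ref != 0);
    return ref > 0;
}
```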

Lines

For each line (plus->n_lines):

Name          Type  Number  Description
feature type  C     1       0 for dead line
offset        L     1       Line offset
N1            I     1       Start node id (only if feature type is GV_LINE or GV_BOUNDARY)
N2            I     1       End node id (only if feature type is GV_LINE or GV_BOUNDARY)
left          I     1       Left area id for feature type GV_BOUNDARY / Area id for feature type GV_CENTROID
right         I     1       Right area id (for feature type GV_BOUNDARY)
vol           I     1       Reserved for kernel (volume number, for feature type GV_KERNEL)

See P_line data structure.

Areas

For each area (plus->n_areas):

Name      Type  Number   Description
n_lines   I     1        number of boundaries
lines     I     n_lines  Line ids forming exterior boundary (clockwise order, negative id for backward direction)
n_isles   I     1        Number of isles
isles     I     n_isles  Isle ids
centroid  I     1        Centroid id

See P_area data structure.

Isles

For each isle (plus->n_isles):

Name     Type  Number   Description
n_lines  I     1        number of boundaries
lines    I     n_lines  Line ids forming exterior boundary (counter-clockwise order, negative id for backward direction)
area     I     1        Outer area id

See P_isle data structure.

Topology levels

The vector library defines several topology build levels (only for level of access 2):

  • GV_BUILD_NONE
  • GV_BUILD_BASE
  • GV_BUILD_AREAS
  • GV_BUILD_ATTACH_ISLES
  • GV_BUILD_CENTROIDS
  • GV_BUILD_ALL

Note: Only the geometry type GV_BOUNDARY is used to build areas. The geometry type GV_LINE cannot form an area.

Topology examples

Points

One point (nodes: 0, lines: 1, areas: 0, isles: 0)

    + N1/L1

Line L1 (see P_line)

line = 1, type = 1 (GV_POINT)

Lines

One line (nodes: 2, lines: 1, areas: 0, isles: 0)


   +----L1----+
   N1         N2

Node N1 (see P_node)

node = 1, n_lines = 1, xyz = 634624.746450, 223557.302231, 0.000000
  line =   1, type = 2 (GV_LINE), angle = -0.436257

Node N2 (see P_node)

node = 2, n_lines = 1, xyz = 638677.484787, 221667.849899, 0.000000
  line =  -1, type = 2 (GV_LINE), angle = 2.705335

Line L1 (see P_line)

line = 1, type = 2 (GV_LINE), n1 = 1, n2 = 2

Areas without holes

Two lines (nodes: 1, lines: 2, areas: 1, isles: 1)

          +N1
         /   \
        /     \
       /       \
      /   +L2   \
     /           \
    -------L1------

Node N1 (see P_node)

node = 1, n_lines = 2, xyz = 635720.081136, 225063.387424, 0.000000
  line =   1, type = 4 (GV_BOUNDARY), angle = -2.245537
  line =  -1, type = 4 (GV_BOUNDARY), angle = -0.842926

Line L1 (see P_line)

line = 1, type = 4 (GV_BOUNDARY), n1 = 1, n2 = 1, left = 1, right = -1

Line L2 (see P_line)

line = 2, type = 8 (GV_CENTROID), area = 1

Area A1 (see P_area)

area = 1, n_lines = 1, n_isles = 0 centroid = 2
  line =  -1

Isle I1 (see P_isle)

isle = 1, n_lines = 1 area = 0
  line =   1

Areas with holes

Three lines (nodes: 2, lines: 3, areas: 2, isles: 2)

             +N1
            / \
           /   \
          /     \
         /       \
        /    +L2  \
       /           \
      /   +N2       \
     /   /\          \
    /   /  \          \
   /   /    \          \
  /    ---L3--          \
 /                       \
------------L1-------------

Node N1 (see P_node)

node = 1, n_lines = 2, xyz = 635720.081136, 225063.387424, 0.000000
  line =   1, type = 4 (GV_BOUNDARY), angle = -2.245537
  line =  -1, type = 4 (GV_BOUNDARY), angle = -0.842926

Node N2 (see P_node)

node = 2, n_lines = 2, xyz = 636788.032454, 223173.935091, 0.000000
  line =   3, type = 4 (GV_BOUNDARY), angle = -2.245537
  line =  -3, type = 4 (GV_BOUNDARY), angle = -0.866302

Line L1 (see P_line)

line = 1, type = 4 (GV_BOUNDARY), n1 = 1, n2 = 1, left = 1, right = -1

Line L2 (see P_line)

line = 2, type = 8 (GV_CENTROID), area = 1

Line L3 (see P_line)

line = 3, type = 4 (GV_BOUNDARY), n1 = 2, n2 = 2, left = 2, right = -2

Area A1 (see P_area)

area = 1, n_lines = 1, n_isles = 1 centroid = 2
  line =  -1
  isle =   2

Area A2 (see P_area)

area = 2, n_lines = 1, n_isles = 0 centroid = 0
  line =  -3

Isle I1 (see P_isle)

isle = 1, n_lines = 1 area = 0
  line =   1

Isle I2 (see P_isle)

isle = 2, n_lines = 1 area = 1
  line =   3

Example 1

A polygon may be formed by many boundaries (several connected primitives). One boundary is shared by adjacent areas.

+--1--+--5--+
|     |     |
2  A  4  B  6
|     |     |
+--3--+--7--+

1,2,3,4,5,6,7 = 7 boundaries (primitives)
A,B = 2 areas
A+B = 1 isle

Example 2

This is handled correctly in GRASS: A can be filled, B filled differently.

+---------+
|    A    |
+-----+   |
|  B  |   |
+-----+   |
|         |
+---------+

A, B = 2 areas
A+B  = 1 isle

In GRASS, whenever an 'inner' ring touches the boundary of an outside area, even at a single point, it is no longer an 'inner' ring (isle in GRASS topology); it is simply another area. A and B above can never be exported from GRASS as polygon A with inner ring B because there are only 2 areas, A and B, and one island formed by A and B together.

Example 3

This is handled correctly in GRASS: Areas A1, A2, and A3 can be filled differently.

+---------------------+
|  A1                 |
+   +------+------+   |
|   |  A2  |  A3  |   |
+   +------+------+   |
|          I1         |
+---------------------+

A1,A2,A3 = 3 areas
A1,A2+A3 = 2 isles

In GRASS, whenever an 'inner' ring does not touch the boundary of an outside area, not even at a single point, it is an 'inner' ring (isle). The areas A2 and A3 form a single isle I1 located within area A1. The size of isle I1 is subtracted from the size of area A1 when calculating the size of area A1. Any centroids falling into isle I1 are excluded when searching for a centroid that can be attached to area A1. A1 above can be exported from GRASS as polygon A1 with inner ring I1.
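The size calculation can be sketched as follows: compute the size of the outer ring, then subtract the size of every isle inside it. The code uses the shoelace formula on plain coordinate arrays; both functions are hypothetical and independent of the library's own area routines.

```c
#include <assert.h>
#include <stddef.h>

/* Unsigned size of a ring given as n vertices (shoelace formula).
 * The ring is closed implicitly: the last vertex connects to the first,
 * so clockwise and counter-clockwise rings give the same result. */
double ring_area(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(x != NULL && y != NULL);
    for (i = 0; i < n; i++) {
        size_t j = (i + 1) % n;

        sum += x[i] * y[j] - x[j] * y[i];
    }
    return (sum < 0.0 ? -sum : sum) / 2.0;
}

/* Size of an area: outer ring minus the isles located inside it. */
double area_size(double outer_ring_area, const double *isle_areas,
                 size_t n_isles)
{
    double size = outer_ring_area;
    size_t i;

    for (i = 0; i < n_isles; i++)
        size -= isle_areas[i];
    return size;
}
```

For a 10 x 4 outer ring containing one 4 x 2 isle, the outer ring measures 40, the isle 8, and the resulting area size is 32.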

Example 4

v.in.ogr/v.clean can identify dangles and change the type from boundary to line (in TIGER data for example). The distinction between line and boundary is important not only for dangles. Example:

+-----+-----+
|     .     |
|     .     |
+.....+.....+
|     .     |
|  x  .     |
+-----+-----+

----  road + boundary of one parcel => type boundary
....  road => type line
x     parcel centroid (identifies whole area)

Because lines are not used to build areas, we have only one area/centroid, instead of 4 which would be necessary in TIGER.

Topology memory management

Topology is generated for all kinds of vector types. Memory is not released by default. The programmer can force the library to release the memory by using Vect_set_release_support(). However, Vect_set_release_support() cannot be run in mid process because all vectors are needed in the spatial index, which is needed to build topology.

Topology is also necessary for points in case of a vector network because the graph is built using topology information about lines and points.

The topology structure stores not only the topology but also the 'line' bounding box and the line offset in the coor file (index). The existing spatial index uses the line ID in the 'topology' structure to identify lines in the 'coor' file. Currently it is not possible to build the spatial index without topology.

Vector library spatial index management

Spatial index (based on R*-tree) is created with topology, see RTree data structure.

The spatial index occupies a lot of memory but it is necessary for topology building. Also, it takes some time to release the memory occupied by the spatial index (see dig_spidx_free()). The spatial index can also be built in a file to save memory by setting the environment variable GRASS_VECTOR_LOWMEM.

The function building topology - Vect_build() - is usually called at the end of modules (before Vect_close()), so it is faster to call exit() and let the operating system release all the memory. By default the memory is not released.

It is possible to call Vect_set_release_support() before Vect_close() to enforce memory release, but it takes some time on large files.

The spatial index is stored in a file and not loaded for old vectors that are not updated, saving a lot of memory. Spatial queries are done in the file.

Currently most of the modules do not release the memory occupied for spatial index and work like this (pseudocode):

int main(void)
{
     struct Map_info Map;

     Vect_open_new(&Map, "newmap", WITH_Z);  /* map name is a placeholder */
     /* writing new vector */

     Vect_build(&Map);
     Vect_close(&Map);  /* memory is not released */
     return 0;
}

In general it is possible to free the memory with Vect_set_release_support() such as:

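int main(void)
{
     struct Map_info Map;

     Vect_open_new(&Map, "newmap", WITH_Z);  /* map name is a placeholder */
     /* writing new vector */

     Vect_build(&Map);
     Vect_set_release_support(&Map);
     Vect_close(&Map);  /* memory is released */
     return 0;
}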

but it takes a bit longer.

It makes sense to release the spatial index if it is used only at the beginning of a module or in permanently running programs like QGIS. Note that this applies only when creating a new vector or updating an old vector. For example:

int main(void)
{
     struct Map_info Map;

     Vect_open_update(&Map, "oldmap", G_mapset());  /* map name is a placeholder */
     /* select features using spatial index, e.g.  Vect_select_lines_by_box() */
     Vect_set_release_support(&Map);
     Vect_close(&Map);  /* memory is released */

     /* do some processing which needs memory */
     return 0;
}

See also spatial_index data structure.

Sidx file format specification

Spatial index file ('sidx') is read by Vect_open_sidx().

Header

Note: plus is an instance of the Plus_head data structure.

Name                          Type  Number  Description
plus->spidx_Version_Major     C     1       file version (major)
plus->spidx_Version_Minor     C     1       file version (minor)
plus->spidx_Back_Major        C     1       supported from GRASS version (major)
plus->spidx_Back_Minor        C     1       supported from GRASS version (minor)
plus->spidx_port->byte_order  C     1       little or big endian flag; files are written in machine native order but files in both little and big endian order may be read; zero for little endian
plus->spidx_port.off_t_size   C     1       off_t size (LFS)
plus->spidx_head_size         L     1       header size
plus->spidx_with_z            C     1       2D/3D vector data
ndims                         C     1       Number of dimensions
nsides                        C     1       Number of sides
nodesize                      I     1       Node size
nodecard                      I     1       Node card (?)
leafcard                      I     1       Leaf card (?)
min_node_fill                 I     1       Minimum node fill (?)
min_leaf_fill                 I     1       Minimum leaf fill (?)
plus->Node_spidx->n_nodes     I     1       Number of nodes
plus->Node_spidx->n_leafs     I     1       Number of leafs
plus->Node_spidx->n_levels    I     1       Number of levels
plus->Node_spidx_offset       O     1       Node offset
plus->Line_spidx->n_nodes     I     1       Number of nodes
plus->Line_spidx->n_leafs     I     1       Number of leafs
plus->Line_spidx->n_levels    I     1       Number of levels
plus->Line_spidx_offset       O     1       Line offset
plus->Area_spidx->n_nodes     I     1       Number of nodes
plus->Area_spidx->n_leafs     I     1       Number of leafs
plus->Area_spidx->n_levels    I     1       Number of levels
plus->Area_spidx_offset       O     1       Area offset
plus->Isle_spidx->n_nodes     I     1       Number of nodes
plus->Isle_spidx->n_leafs     I     1       Number of leafs
plus->Isle_spidx->n_levels    I     1       Number of levels
plus->Isle_spidx_offset       O     1       Isle offset
plus->Face_spidx_offset       O     1       Face offset
plus->Volume_spidx_offset     O     1       Volume offset
plus->Hole_spidx_offset       O     1       Hole offset
plus->coor_size               O     1       Coor file size

Vector library category index management

The category index (stored in the cidx file) improves the performance of all selections by cats/attributes (SQL, e.g. d.vect cats=27591, v.extract list=20000-21000). Without it, every such selection would have to loop through all vector lines. The category index is also essential for the simple feature representation of GRASS vectors.

A category index is created for each field. In memory, it is stored in the Cat_index data structure.

The category index is built together with topology, but it is not updated if the vector is edited on level 2. The category index is stored in the 'cidx' file; the 'cat' array is written/read by one call of dig__fwrite_port_I() or dig__fread_port_I().

Stored values can be retrieved either by index in 'cat' array (if all features of given field are required) or by category value (one or few features), always by Vect_cidx_*() functions.

To create the category index, it will be necessary to rebuild topology for all existing vectors. This is an opportunity to make the (hopefully) last changes to the 'topo' and 'cidx' formats.

Cidx file format specification

Category index file ('cidx') is read by Vect_cidx_open().

Header

Note: plus is instance of Plus_head structure.

Name                        Type  Number            Description
--------------------------  ----  ----------------  -----------
plus->cpidx_Version_Major   C     1                 file version (major)
plus->cpidx_Version_Minor   C     1                 file version (minor)
plus->cpidx_Back_Major      C     1                 supported from GRASS version (major)
plus->cpidx_Back_Minor      C     1                 supported from GRASS version (minor)
plus->cidx_port.byte_order  C     1                 little or big endian flag; files are written in machine native order but files in both little and big endian order may be read; zero for little endian
plus->cidx_head_size        L     1                 cidx head size
plus->n_cidx                I     1                 number of fields
field                       I     n_cidx            field number
n_cats                      I     n_cidx            number of categories
n_ucats                     I     n_cidx            number of unique categories
n_types                     I     n_cidx            number of feature types
rtype                       I     n_cidx * n_types  feature type
type[t]                     I     n_cidx * n_types  number of items

Vector TIN functions

TINs are simply created as 2D/3D vector polygons consisting of 3 vertices. See Vect_tin_get_z().

OGR interface

Pseudo-topology

Reduced topology: each boundary is attached to one area only, so smoothing, simplification, removing small areas etc. will not work properly for adjacent areas or areas within areas.

Full topology is available only for native GRASS vectors; for external data it can be built only after all polygons have been converted to areas and cleaned, as done by v.in.ogr.

Frmt file format specification

Frmt is a plain text file which contains basic information about the external format of a linked vector map. Each line contains one key and its value, separated by a colon.

OGR specific format is described by:

  • FORMAT - ogr
    • DSN - OGR datasource name
    • LAYER - OGR layer name

Example:

FORMAT: ogr
DSN: /path/to/shapefiles
LAYER: cities

An OGR layer can be linked via the v.external command. When an OGR layer is linked, pseudo-topology ('topo') is built, including the spatial index file ('sidx') and the category index file ('cidx'). In addition, a feature index file is created (see Fidx file format specification).

Fidx file format specification

Note: fInfo is an instance of Format_info structure.

Name                  Type  Number      Description
--------------------  ----  ----------  -----------
Version_Major         C     1           file version (major)
Version_Minor         C     1           file version (minor)
Back_Major            C     1           supported from GRASS version (major)
Back_Minor            C     1           supported from GRASS version (minor)
byte_order            C     1           little or big endian flag; files are written in machine native order but files in both little and big endian order may be read; zero for little endian
length                L     1           header size
fInfo.ogr.offset_num  I     1           number of records
fInfo.ogr.offset      I     offset_num  offsets

DGLib (Directed Graph Library)

The Directed Graph Library or DGLib (Micarelli 2002, http://grass.osgeo.org/dglib/) provides functionality for vector network analysis. The library, released under the GPL, is hosted by the GRASS project (within the GRASS source code). As a stand-alone library it may also be used by other software projects.

The Directed Graph Library provides functionality to assign costs to lines and/or nodes, so that costs can be accumulated while traveling along polylines. The user can assign individual costs to all lines and/or nodes of a vector map and later calculate least-cost path connections based on the accumulated costs. Applications include transport analysis, connectivity and more. Implemented applications cover shortest/fastest path, traveling salesman (round trip), allocation of sources (creation of subnetworks), minimum Steiner trees (star-like connections), and iso-distances (from centers).

For details, please read Blazek et al. 2002 (see below).

Related vector functions are: Vect_graph_add_edge(), Vect_graph_init(), Vect_graph_set_node_costs(), Vect_graph_shortest_path(), Vect_net_build_graph(), Vect_net_nearest_nodes(), Vect_net_shortest_path(), and Vect_net_shortest_path_coor().

Vector ASCII Format Specifications

The GRASS ASCII vector map format may contain a mix of primitives including points, lines, boundaries, centroids, faces, and kernels. The format may also contain a header with various metadata (see example below).

A vector map can be converted to its ASCII representation at user level with the command v.out.ascii format=standard.

See Vector ASCII functions for a list of related functions.

The header is similar to the head file of the vector binary format (see Header file format specification) but also contains the bounding box. Keywords are:

ORGANIZATION
DIGIT DATE
DIGIT NAME
MAP NAME
MAP DATE
MAP SCALE
OTHER INFO
ZONE
WEST EDGE
EAST EDGE
SOUTH EDGE
NORTH EDGE
MAP THRESH

Example:

ORGANIZATION: NC OneMap
DIGIT DATE:   
DIGIT NAME:   helena
MAP NAME:     North Carolina selected bridges (points map)
MAP DATE:     Mon Nov  6 15:32:39 2006
MAP SCALE:    1
OTHER INFO:   
ZONE:         0
MAP THRESH:   0.000000

The body begins with the row:

VERTI:

followed by records of primitives:

TYPE NUMBER_OF_COORDINATES [NUMBER_OF_CATEGORIES]
 X Y [Z]
....
 X Y [Z]
[ LAYER CATEGORY]
....
[ LAYER CATEGORY]

Everything above in [] is optional.

The primitive codes are as follows:

  • 'P': point
  • 'L': line
  • 'B': boundary
  • 'C': centroid
  • 'F': face (3D boundary)
  • 'K': kernel (3D centroid)
  • 'A': area (boundary) - better use 'B'; kept only for backward compatibility

The coordinates are listed after the initial line, which contains the primitive code, the number of coordinate lines (vertices) that follow, and (optionally) the number of categories (1 for a single layer, higher for multiple layers). After the coordinates, one or more lines give the layer number and the category number (ID).

The order of coordinates is

  X Y [Z]

Note: The points are stored as x, y (i.e., east, north), which is the reverse of the way GRASS usually represents geographic coordinates.

Example:

P  1 1
 375171.4992779 317756.72097616
 1     1 
B  5
 637740       219580      
 639530       219580      
 639530       221230      
 637740       221230      
 637740       219580      
C  1 1
 638635       220405      
 1     2

In this example, the first vector feature is a point with category number 1. The second vector feature is a boundary composed of 5 vertices. The third feature is a centroid with category number 2. The boundary and the centroid form an area with category number 2. All vector features mentioned above are located in layer 1.

List of vector library functions

The vector library provides the GRASS programmer with routines to process vector data. The routines in the vector library are presented in functional groupings, rather than in alphabetical order. The order of presentation will, it is hoped, provide better understanding of how the library is to be used, as well as show the interrelationships among the various routines. Note that a good way to understand how to use these routines is to look at the source code for GRASS modules which use them.

Note: All routines start with one of the following prefixes: Vect_, V1_, V2_ or dig_. To avoid name conflicts, programmers should not create variables or routines in their own modules which use these prefixes.

The Vect_*() functions are the programmer's API for GRASS vector programming. The programmer should use only routines with this prefix.

Vector area functions

Vector array functions

Vector bounding box functions

Vector break lines functions

Vector break polygons functions

Vector bridges functions

Vector buffer functions

Vector build functions

Vector build (native) functions

Vector build (OGR) functions

Vector categories functions

Vector category index functions

(note: vector layer is historically called "field")

Vector clean nodes functions

Vector close functions

Vector constraint functions

Vector dangles functions

Vector dbcolumns functions

Vector error functions

  • Vect_get_fatal_error()
  • Vect_set_fatal_error()

Vector field functions

(note: vector layer is historically called "field")

Vector find functions

Vector graph functions

Vector header functions

Vector history functions

Vector intersection functions

Vector valid map name functions

Vector level functions

Vector topological (level 2) functions

Vector feature functions

Vector list functions

Vector map functions

Vector merge line functions

Vector network functions

Vector open functions

  • Vect__open_old()

Vector overlay functions

Vector polygon functions

Vector read functions

Level 1 and 2

Level 2 only

Vector remove areas functions

Vector remove duplicates functions

Vector rewind functions

Vector select functions

Vector spatial index functions

Vector snap functions

Vector TIN functions

Vector type option functions

Vector delete functions

Level 2 only

Vector write functions

Level 1 and 2

Level 2 only

Vector ASCII functions

Vector Simple Feature Access API

Functions from GRASS Simple Feature API (in progress, incomplete).

Vector GEOS functions

Note: The functions are available only if GRASS is compiled with the --with-geos switch.

Authors

  • Radim Blazek (vector architecture) <radim.blazek gmail.com>
  • Roberto Micarelli (DGLib) <mi.ro iol.it>

Updates for GRASS 7:

  • Markus Metz (file-based spatial index, vector topology)
  • Martin Landa (GEOS support, direct OGR read access) <landa.martin gmail.com>

References

Text based on: R. Blazek, M. Neteler, and R. Micarelli. The new GRASS 5.1 vector architecture. In Open source GIS - GRASS users conference 2002, Trento, Italy, 11-13 September 2002. University of Trento, Italy, 2002. http://www.ing.unitn.it/~grass/conferences/GRASS2002/proceedings/proceedings/pdfs/Blazek_Radim.pdf

See Also