s.kcv - Randomly partition sites into test/train sets.
       (GRASS Sites Program)


SYNOPSIS

       s.kcv
       s.kcv help
       s.kcv [-dq] kvalue sitesname


DESCRIPTION

       s.kcv randomly divides a sites list into k sets of
       test/train data (for k-fold cross validation). Test
       partitions are mutually exclusive. That is, a site will
       appear in exactly one test partition and in k-1 training
       partitions.
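
       The test/train relationship can be illustrated with a
       short Python sketch. It is not the program's own code: the
       helper name kfold_partitions and the seed argument are
       hypothetical, and a plain shuffle stands in for the
       program's site selection. Sites are assumed to be unique
       (x, y) tuples.

```python
import random

def kfold_partitions(sites, k, seed=0):
    """Return k (test, train) pairs for a list of unique sites.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    shuffled = list(sites)
    rng.shuffle(shuffled)
    # Striding by k yields test sets whose sizes differ by at most one.
    tests = [shuffled[i::k] for i in range(k)]
    pairs = []
    for test in tests:
        held_out = set(test)
        # Training data is every site that is not in this test set.
        train = [s for s in sites if s not in held_out]
        pairs.append((test, train))
    return pairs
```

       Each site ends up in one test list and in the k-1 training
       lists paired with the other test lists.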

       The program generates a random point using the selected
       random number generator and then finds the site closest to
       it. This site is removed from the candidate list (meaning
       that it will not be selected for any other test set) and
       saved in the first test partition file. This is repeated
       until enough points have been selected for the test
       partition. The number of sites chosen for each test
       partition depends upon the number of sites available and
       the number of partitions requested (partition sizes are
       kept as equal as possible while ensuring that every site
       is chosen for testing). This process of filling a test
       partition is done k times.
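
       The selection procedure described above might look roughly
       like the following Python sketch. It is an illustration,
       not the program's source. The function name, the region
       tuple (west, east, south, north) and the rule for spreading
       leftover sites over the first partitions are assumptions;
       Python's random module stands in for rand() or drand48().

```python
import math
import random

def pick_test_partitions(sites, k, region, rng=None):
    """Fill k test partitions by taking the site nearest a random point.

    sites  -- list of (x, y) tuples
    region -- (west, east, south, north); stands in for the current region
    """
    if rng is None:
        rng = random.Random()
    west, east, south, north = region
    candidates = list(sites)
    n = len(candidates)
    partitions = []
    for i in range(k):
        # Assumed sizing rule: the first n % k partitions get one extra site.
        size = n // k + (1 if i < n % k else 0)
        test = []
        for _ in range(size):
            # Random point drawn over the intervals of the region.
            px = rng.uniform(west, east)
            py = rng.uniform(south, north)
            nearest = min(
                candidates,
                key=lambda s: math.hypot(s[0] - px, s[1] - py),
            )
            # Removing the site keeps test partitions mutually exclusive.
            candidates.remove(nearest)
            test.append(nearest)
        partitions.append(test)
    return partitions
```

       Because every pick removes a site from the candidate list,
       the k partitions together contain each site exactly once.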

       Flags:

       -d                Use drand48() (default is rand()).

       -q                Quiet. Don't report progress.

       Parameters:

       kvalue            Positive integer  value  indicating  the
                         number of partitions.


       sitesname         Name of the sites file to partition; also
                         used as the basename for the output files.

       Test/train pairs are saved as sites lists using sitesname
       (name below) as a basename. Test sites are saved in
       name-test.i while training sites are saved in name-train.i,
       where i ranges from zero to k.
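
       For scripts that need to locate the output, the naming
       pattern above can be expressed as a small Python helper.
       The function name output_names is hypothetical, and the
       caller supplies the partition index i.

```python
def output_names(name, i):
    """Return the (test, train) sites list names for partition i."""
    # Pattern taken from the text: name-test.i and name-train.i
    return (f"{name}-test.{i}", f"{name}-train.{i}")
```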


NOTES

       Existing files are silently overwritten.

       An ideal random sites generator will follow a Poisson
       distribution and will only be as random as the original
       sites. This program simply divides sites up in a random
       manner.

       Be  warned  that  random number generation occurs over the
       intervals defined by the current region.

       This program may not work properly with Lat-long data.



SEE ALSO

       rand(3), drand48(3), s.rand and g.region


BUGS

       Please send all bug fixes and comments to the author.


AUTHOR

       James Darrell McCauley, Purdue University
       (mccauley@ecn.purdue.edu)