Patterns in static

Apophenia

apop_data

Detailed Description

The apop_data structure represents a data set. It joins together a gsl_vector, a gsl_matrix, an apop_name, and a table of strings. It tries to be minimally intrusive, so you can use it everywhere you would use a gsl_matrix or a gsl_vector.

For example, let us say that you are running a regression: there is a vector for the dependent variable, and a matrix for the independent variables. Think of them as a partitioned matrix, where the vector is column -1, and the first column of the matrix is column zero. Here is some code to print the entire matrix. Notice that the column counter i starts counting at -1.

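  /* Column -1 is the vector; columns 0 and up are the matrix. */
  for (size_t j = 0; j < data->matrix->size1; j++){
    printf("%s\t", data->names->row[j]);
    /* Cast size2 to int: comparing i = -1 against an unsigned size_t
       would convert -1 to a huge value and skip the loop entirely. */
    for (int i = -1; i < (int) data->matrix->size2; i++)
        printf("%g\t", apop_data_get(data, j, i));
    printf("\n");
  }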

We're generally assuming that the data vector and data matrix have the same row count: data->vector->size==data->matrix->size1 . This means that the apop_name structure doesn't have separate vector_names and row_names elements: the rownames are assumed to apply for both.


Function Documentation

apop_data* apop_data_alloc ( const size_t  vsize,
const size_t  msize1,
const int  msize2 
)

Allocate an apop_data structure, to be filled with data. The three arguments are: vector size, matrix rows, matrix cols. Any or all of these may be zero, in which case the vector or matrix will not be allocated, as appropriate.

Your best bet for allocating the categories is to produce them elsewhere, such as with apop_query_to_text, and then point an apop_data structure with a zero-sized vector to your matrix of strings.

The weights vector is set to NULL. If you need it, allocate it via

 d->weights   = gsl_vector_alloc(row_ct);

See also:
apop_data_calloc
Parameters:
vsize vector size, if any.
msize1,msize2 Row and column size for the matrix.

apop_data_alloc(0,0,0) will produce a basically blank set, with out->matrix==out->vector==NULL.

Returns:
The apop_data structure, allocated and ready.
apop_data* apop_data_calloc ( const size_t  vsize,
const size_t  msize1,
const int  msize2 
)

Allocate an apop_data structure, to be filled with data; set everything in the allocated portion to zero. See apop_data_alloc for details.

Parameters:
vsize vector size, if any.
msize1,msize2 Row and column size for the matrix. If msize2>0 this exactly follows the format of gsl_matrix_calloc. If msize2==-1, then allocate a vector.

apop_data_calloc(0, 0,0) will produce a basically blank set, with out->matrix==out->vector==NULL.

Returns:
The apop_data structure, allocated and zeroed out.
See also:
apop_data_alloc
apop_data* apop_data_copy ( const apop_data * in  ) 

Copy one apop_data structure to another. That is, all data is duplicated.

Just a front-end for apop_data_memcpy for those who prefer this sort of syntax.

Parameters:
in the input data
Returns:
a structure that this function will allocate and fill. If input is NULL, then this will be NULL.
void apop_data_free ( apop_data * freeme  ) 

Free an apop_data structure.

As with free(), it is safe to send in a NULL pointer (in which case the function does nothing).

void apop_data_memcpy ( apop_data * out,
const apop_data * in 
)

Copy one apop_data structure to another. That is, all data is duplicated.

This function does not allocate the output structure, its vector, or its matrix for you. If you want such behavior, use apop_data_copy. Both functions do allocate memory for the text.

Parameters:
out a structure that this function will fill. Must be preallocated.
in the input data
void apop_data_rm_columns ( apop_data * d,
int *  drop 
)

Remove the columns set to one in the drop vector. The data structure looks as if it were modified in place, but the data matrix and the names are duplicated before being pared down, so if your data takes up more than half of your memory, this may not work.

Parameters:
d the apop_data structure to be pared down.
drop an array of ints. If drop[7]==1, then column seven will be cut from the output. A reminder: calloc(d->matrix->size2, sizeof(int)) will fill your array with zeros on allocation, and memset(drop, 1, d->matrix->size2 * sizeof(int)) will quickly fill an array of ints with nonzero values.
apop_data* apop_data_stack ( apop_data * m1,
apop_data * m2,
char  posn,
char  inplace 
)

Put the first data set either on top of or to the left of the second data set.

The function returns a new data set (unless inplace is set), meaning that until you apop_data_free() the original data sets, you will be taking up twice as much memory. Plan accordingly.

For the opposite operation, see apop_data_split.

Parameters:
m1 the upper/leftmost data set (default = NULL)
m2 the second data set (default = NULL)
posn If 'r', stack rows of m1's matrix above rows of m2's
if 'c', stack columns of m1's matrix to left of m2's
(default = 'r')
inplace If 'i', 'y', or 1, use apop_vector_realloc to modify m1 in place; see the caveats on that function. Otherwise, allocate a new data set, leaving m1 unmodified. (default='n')
Returns:
The stacked data, either in a new apop_data set or m1
  • If m1 or m2 is NULL, this returns a copy of the other element, and if both are NULL, you get NULL back (except if m2 is NULL and inplace is 'y', where you'll get the original m1 back)
  • text is ignored
  • If stacking rows on rows, the output vector is the input vectors stacked accordingly. If stacking columns by columns, the output vector is just a copy of the vector of m1 and m2->vector doesn't appear in the output at all.
  • The same rules for dealing with the vector(s) hold for the vector(s) of weights.
  • Names are a copy of the names for m1, with the names for m2 appended to the row or column list, as appropriate.

This function uses the Designated initializers syntax for inputs.

Autogenerated by doxygen on 23 Nov 2009.