The apop_data structure represents a data set. It joins together a gsl_vector, a gsl_matrix, an apop_name, and a table of strings. It tries to be minimally intrusive, so you can use it everywhere you would use a gsl_matrix or a gsl_vector.
For example, suppose you are running a regression: there is a vector for the dependent variable and a matrix for the independent variables. Think of them as a partitioned matrix, where the vector is column -1 and the first column of the matrix is column zero. Here is some code to print the entire matrix. Notice that the column counter i starts counting at -1.
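    /* i and j are ints; the casts keep i = -1 from being converted to an unsigned value */
    for (j = 0; j < (int) data->matrix->size1; j++){
        printf("%s\t", data->names->row[j]);
        for (i = -1; i < (int) data->matrix->size2; i++)
            printf("%g\t", apop_data_get(data, j, i));
        printf("\n");
    }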
We're generally assuming that the data vector and data matrix have the same row count: data->vector->size==data->matrix->size1 . This means that the apop_name structure doesn't have separate vector_names and row_names elements: the rownames are assumed to apply for both.
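The column -1 convention can be sketched without the library. The block below is a minimal illustration, not Apophenia's implementation: `toy_data`, `toy_get`, and `toy_row_sum` are hypothetical stand-ins, and the row-major array is an assumption made only to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for apop_data: one vector plus one row-major matrix
   that share a row count. This is not the library's actual layout. */
typedef struct {
    const double *vector;  /* length rows; plays the role of column -1 */
    const double *matrix;  /* rows * cols entries, row-major */
    size_t rows, cols;
} toy_data;

/* Column -1 reads from the vector; columns 0..cols-1 read from the matrix. */
double toy_get(const toy_data *d, size_t row, int col) {
    assert(row < d->rows && col >= -1 && col < (int) d->cols);
    if (col == -1) return d->vector[row];
    return d->matrix[row * d->cols + (size_t) col];
}

/* Sum one row across the whole partitioned matrix, vector included. */
double toy_row_sum(const toy_data *d, size_t row) {
    double total = 0;
    /* The cast matters: comparing an int holding -1 against a size_t converts
       the -1 to a large unsigned value, so the loop body would never run. */
    for (int col = -1; col < (int) d->cols; col++)
        total += toy_get(d, row, col);
    return total;
}
```

With a two-row vector and a 2x3 matrix, `toy_get(&d, 0, -1)` returns the first vector element and `toy_get(&d, 1, 0)` returns the first matrix entry of the second row.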
apop_data* apop_data_alloc(const size_t vsize, const size_t msize1, const int msize2)
Allocate an apop_data structure, to be filled with data. The three arguments are: vector size, matrix rows, matrix columns. Any or all of these may be zero, in which case the vector or matrix will not be allocated, as appropriate.
Your best bet for allocating the categories is to produce them elsewhere, for example with apop_query_to_text, and then point an apop_data structure with a zero-sized vector to your matrix of strings.
The weights vector is set to NULL. If you need it, allocate it via
    d->weights = gsl_vector_alloc(row_ct);
| vsize | Vector size, if any. |
| msize1, msize2 | Row and column size for the matrix. |
apop_data_alloc(0,0,0) will produce a basically blank set, with out->matrix==out->vector==NULL.
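The allocation rule described above (zero sizes leave the corresponding part unallocated, and weights start as NULL) can be sketched in plain C. This is an illustration under stated assumptions, not the library's code: `toy_data` and `toy_data_alloc` are hypothetical names, plain `double` arrays stand in for gsl_vector and gsl_matrix, and `msize2` is taken as `size_t` here for simplicity.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for apop_data, holding plain arrays. */
typedef struct {
    double *vector;   /* NULL when vsize == 0 */
    double *matrix;   /* NULL when either matrix dimension is 0 */
    double *weights;  /* always NULL right after allocation */
    size_t vsize, msize1, msize2;
} toy_data;

/* Allocate only the parts with nonzero size. Returns NULL on failure. */
toy_data *toy_data_alloc(size_t vsize, size_t msize1, size_t msize2) {
    /* Guard the size arithmetic below against overflow. */
    assert(vsize <= SIZE_MAX / sizeof(double));
    assert(msize2 == 0 || msize1 <= SIZE_MAX / sizeof(double) / msize2);

    toy_data *out = malloc(sizeof *out);
    if (!out) return NULL;
    out->vector = out->matrix = out->weights = NULL;
    out->vsize = vsize;
    out->msize1 = msize1;
    out->msize2 = msize2;

    if (vsize)
        out->vector = malloc(vsize * sizeof(double));
    if (msize1 && msize2)
        out->matrix = malloc(msize1 * msize2 * sizeof(double));

    /* If a requested part could not be allocated, release everything. */
    if ((vsize && !out->vector) || (msize1 && msize2 && !out->matrix)) {
        free(out->vector);
        free(out->matrix);
        free(out);
        return NULL;
    }
    return out;
}
```

Calling `toy_data_alloc(0, 0, 0)` in this sketch yields a structure whose vector, matrix, and weights pointers are all NULL.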
apop_data* apop_data_calloc(const size_t vsize, const size_t msize1, const int msize2)
Allocate an apop_data structure, to be filled with data; set everything in the allocated portion to zero. See apop_data_alloc for details.
| vsize | Vector size, if any. |
| msize1, msize2 | Row and column size for the matrix. If msize2>0 this exactly follows the format of gsl_matrix_calloc. If msize2==-1, then allocate a vector. |
apop_data_calloc(0, 0,0) will produce a basically blank set, with out->matrix==out->vector==NULL.
Copy one apop_data structure to another. That is, all data is duplicated.
Just a front-end for apop_data_memcpy for those who prefer this sort of syntax.
| in | the input data |
void apop_data_free(apop_data *freeme)
Free an apop_data structure.
As with free(), it is safe to send in a NULL pointer (in which case the function does nothing).
Copy one apop_data structure to another. That is, all data is duplicated.
This function does not allocate the output for you: the overall structure, the vector, and the matrix must all be preallocated. If you want them allocated for you, use apop_data_copy. Both functions do allocate memory for the text.
| out | A structure that this function will fill. Must be preallocated. |
| in | The input data. |
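The split between a fill-in-place copy and an allocating front-end can be sketched as follows. This is a simplified illustration, assuming a structure with only a vector: `toy_data`, `toy_data_memcpy`, and `toy_data_copy` are hypothetical names, and the real functions also handle the matrix, names, and text.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in holding only a vector. */
typedef struct {
    double *vector;
    size_t size;
} toy_data;

/* Fill a preallocated structure: out->vector must already hold in->size doubles. */
void toy_data_memcpy(toy_data *out, const toy_data *in) {
    assert(out && in && out->vector && in->vector && out->size == in->size);
    memcpy(out->vector, in->vector, in->size * sizeof(double));
}

/* Allocate the output, then delegate to toy_data_memcpy. The caller frees
   both out->vector and out. Assumes in->size > 0. */
toy_data *toy_data_copy(const toy_data *in) {
    assert(in && in->size > 0);
    toy_data *out = malloc(sizeof *out);
    if (!out) return NULL;
    out->size = in->size;
    out->vector = malloc(in->size * sizeof(double));
    if (!out->vector) {
        free(out);
        return NULL;
    }
    toy_data_memcpy(out, in);
    return out;
}
```

Because the data is duplicated rather than shared, later changes to the input leave the copy untouched.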
void apop_data_rm_columns(apop_data *d, int *drop)
Remove the columns set to one in the drop vector. The returned data structure looks like it was modified in place, but the data matrix and the names are duplicated before being pared down, so if your data is taking up more than half of your memory, this may not work.
| d | the apop_data structure to be pared down. | |
| drop | An array of ints. If drop[7]==1, then column seven will be cut from the output. A reminder: calloc(d->matrix->size2, sizeof(int)) will fill your array with zeros on allocation, and memset(drop, 1, d->matrix->size2 * sizeof(int)) will quickly fill an array of ints with nonzero values (memset sets every byte to 1, so each int is nonzero but not equal to 1). |
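The calloc and memset reminder is worth seeing in code, because memset works byte by byte. The helpers below use only the C standard library; their names (`flags_none`, `flags_all_memset`, `flags_mark`) are hypothetical and not part of Apophenia.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* All flags zero: no column is marked. The caller frees the array. */
int *flags_none(size_t ncols) {
    return calloc(ncols, sizeof(int));
}

/* memset writes the byte 0x01 into every byte of the array, so with 4-byte
   ints each element becomes 0x01010101: nonzero, but not equal to 1. */
int *flags_all_memset(size_t ncols) {
    int *flags = malloc(ncols * sizeof(int));
    if (flags) memset(flags, 1, ncols * sizeof(int));
    return flags;
}

/* Set the listed columns to exactly 1, for code that tests flag == 1. */
void flags_mark(int *flags, size_t ncols, const size_t *cols, size_t n) {
    for (size_t i = 0; i < n; i++) {
        assert(cols[i] < ncols);
        flags[cols[i]] = 1;
    }
}
```

To mark only column seven of an eight-column matrix, start from `flags_none(8)` and call `flags_mark` with a one-element list containing 7. If the consumer checks for a value of exactly 1 rather than any nonzero value, prefer `flags_mark` over the memset shortcut.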
Put the first data set either on top of or to the left of the second.
The function returns a new data set, meaning that once it returns, and until you apop_data_free() the original data sets, you will be taking up twice as much memory. Plan accordingly.
For the opposite operation, see apop_data_split.
| m1 | The upper/leftmost data set (default = NULL) |
| m2 | The second data set (default = NULL) |
| posn | If 'r', stack rows of m1's matrix above rows of m2's; if 'c', stack columns of m1's matrix to the left of m2's (default = 'r') |
| inplace | If 'i', 'y', or 1, use apop_vector_realloc to modify m1 in place; see the caveats on that function. Otherwise, allocate a new data set, leaving m1 untouched. (default = 'n') |
m1 m1 is NULL and inplace is 'y', where you'll get the original m1 back) m1, with the names for m2 appended to the row or column list, as appropriate.
This function uses the Designated initializers syntax for inputs.
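Row and column stacking, together with named arguments and defaults, can be sketched in standard C99. Everything here is an assumption for illustration: `toy_matrix`, `toy_stack_args`, `toy_stack_base`, and the `toy_stack` macro are hypothetical, the macro is only one way such named-argument syntax can be built, and the sketch omits the vector, names, NULL inputs, and the inplace option.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in: a row-major matrix. Not apop_data. */
typedef struct {
    double *data;
    size_t rows, cols;
} toy_matrix;

/* Named arguments; a field left out of the initializer is zero. */
typedef struct {
    const toy_matrix *m1, *m2;
    char posn;  /* 'r' or 'c'; 0 means "not given" */
} toy_stack_args;

/* Returns a newly allocated matrix; the inputs are left untouched. */
toy_matrix *toy_stack_base(toy_stack_args a) {
    if (!a.posn) a.posn = 'r';  /* default: stack rows */
    assert(a.m1 && a.m2 && (a.posn == 'r' || a.posn == 'c'));
    const toy_matrix *m1 = a.m1, *m2 = a.m2;

    toy_matrix *out = malloc(sizeof *out);
    if (!out) return NULL;
    if (a.posn == 'r') {
        assert(m1->cols == m2->cols);
        out->rows = m1->rows + m2->rows;
        out->cols = m1->cols;
    } else {
        assert(m1->rows == m2->rows);
        out->rows = m1->rows;
        out->cols = m1->cols + m2->cols;
    }
    out->data = malloc(out->rows * out->cols * sizeof(double));
    if (!out->data) {
        free(out);
        return NULL;
    }

    if (a.posn == 'r') {
        /* m1's rows first, then m2's rows directly after them. */
        size_t n1 = m1->rows * m1->cols, n2 = m2->rows * m2->cols;
        memcpy(out->data, m1->data, n1 * sizeof(double));
        memcpy(out->data + n1, m2->data, n2 * sizeof(double));
    } else {
        /* Each output row is m1's row followed by m2's row. */
        for (size_t r = 0; r < out->rows; r++) {
            double *dst = out->data + r * out->cols;
            memcpy(dst, m1->data + r * m1->cols, m1->cols * sizeof(double));
            memcpy(dst + m1->cols, m2->data + r * m2->cols, m2->cols * sizeof(double));
        }
    }
    return out;
}

/* Wrap the arguments in a compound literal so callers can name them. */
#define toy_stack(...) toy_stack_base((toy_stack_args){__VA_ARGS__})
```

A call then reads `toy_stack(.m1 = &a, .m2 = &b, .posn = 'c')`, and leaving out `.posn` falls back to row stacking. Since the result is a fresh allocation, both inputs and the output coexist in memory until the caller frees them.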