Patterns in static

Apophenia

Conversion functions

Functions


Detailed Description

The functions to shunt data between text files, database tables, GSL matrices, and plain old arrays.

Converting from database table to <tt>gsl_matrix</tt> or \ref apop_data

Use fill_me = apop_query_to_matrix("select * from table_name;"); or fill_me = apop_query_to_data("select * from table_name;");. [See apop_query_to_matrix; apop_query_to_data.]


Function Documentation

apop_data * apop_array_to_data ( const double **  in,
const int  rows,
const int  cols 
)

Convert a double ** array to an apop_data set. It will have no names. Input data is copied.

Parameters:
in the array to read in
rows,cols the size of the array.
Returns:
the apop_data set, allocated for you and ready to use.

usage:

 apop_data *d = apop_array_to_data(indata, 34, 4); 

If you want to intialize on the allocation line, this isn't what you want. See apop_line_to_data.

gsl_matrix * apop_array_to_matrix ( const double **  in,
const int  rows,
const int  cols 
)

Convert a double ** array to a gsl_matrix

Parameters:
in the array to read in
rows,cols the size of the array.
Returns:
the gsl_matrix, allocated for you and ready to use.

usage:

 gsl_matrix *m = apop_array_to_matrix(indata, 34, 4); 

If you want to intialize on the allocation line, this isn't what you want. See apop_line_to_matrix.

gsl_vector * apop_array_to_vector ( double *  in,
int  size 
)

Just copies a one-dimensional array to a gsl_vector. The input array is undisturbed.

Parameters:
in An array of doubles. (No default. Must not be NULL);
size How long line is. If this is zero or omitted, I'll guess using the sizeof(line)/sizeof(line[0]) trick, which will work for most arrays allocated using double [] and won't work for those allocated using double *. (default = auto-guess)
Returns:
A gsl_vector (which I will allocate for you).

This function uses the Designated initializers syntax for inputs.

gsl_vector * apop_data_pack ( const apop_data in,
gsl_vector *  out 
)

Sometimes, you need to turn an apop_data set into a column of numbers. Thus, this function, that takes in an apop_data set and outputs a gsl_vector. It is valid to use the out_vector->data element as an array of doubles of size out_vector->data->size.

The complement is apop_data_unpack. I.e.,

apop_data_unpack(apop_data_pack(in_data), data_copy) 

will return the original data set (stripped of text and names).

Parameters:
in an apop_data set. (No default. If NULL, return NULL).
out If this is not NULL, then put the output here. The dimensions must match exactly. If NULL, then allocate a new data set.
Returns:
A gsl_vector with the vector data (if any), then each row of data (if any). If out is not NULL, then this is a pointer there.
void apop_data_to_db ( const apop_data set,
const char *  tabname 
)

Dump an apop_data set into the database.

This function is basically preempted by apop_data_print. Use that one; this may soon no longer be available.

Column names are inserted if there are any. If there are, all dots are converted to underscores. Otherwise, the columns will be named c1, c2, c3, &c.

If apop_opts.db_name_column is not blank (the default is "row_name"), then a so-named column is created, and the row names are placed there.

Parameters:
set The name of the matrix
tabname The name of the db table to be created
Todo:
add text names.
void apop_data_unpack ( const gsl_vector *  in,
apop_data d 
)

This is the complement to apop_data_pack, qv. It converts the gsl_vector produced by that function back to an apop_data set with the given dimensions. It overwrites the data in the vector and matrix elements (and that's it).

Parameters:
in a gsl_vector of the form produced by apop_data_pack.
d that data set to be filled. Must be allocated to the correct size.
apop_data * apop_line_to_data ( double *  in,
int  vsize,
int  rows,
int  cols 
)

A convenience function to convert a double ** array to an apop_data set. It will have no names. The input data is copied, not pointed to.

See also apop_line_to_matrix or apop_array_to_vector; this function will use exactly one of these and then wrap an apop_data struct around the output.

Parameters:
in The array to read in. If there were appropriately placed line breaks, then this would look like the eventual data set. For example,

double params[] = {0, 1, 2
                   3, 4, 5};
apop_data_alloc(params, 2, 2, 2);

will produce a vector $\left[\matrix{0 \cr 3}\right]$ and a matrix $\left[\matrix{1 & 2 \cr 4 & 5}\right]$.

vsize The vector size. If there are also rows/cols, I expect this to equal the number or rows.
rows,cols the size of the array.
Returns:
the apop_data set, allocated for you and ready to use.
gsl_matrix * apop_line_to_matrix ( double *  line,
int  rows,
int  cols 
)

Convert a double * array to a gsl_matrix. Input data is copied.

Parameters:
line the array to read in
rows,cols the size of the array.
Returns:
the gsl_matrix, allocated for you and ready to use.

usage:

 gsl_matrix *m = apop_line_to_matrix(indata, 34, 4); 
See also:
apop_arrary_to_matrix
void apop_matrix_to_db ( const gsl_matrix *  data,
const char *  tabname,
const char **  headers 
)

Dump a gsl_matrix into the database. This function is basically preempted by apop_matrix_print. Use that one; this may soon no longer be available.

Parameters:
data The name of the matrix
tabname The name of the db table to be created
headers A list of column names. If NULL, then the columns will be named c1, c2, c3, &c.
apop_data* apop_text_to_data ( char *  text_file,
int  has_row_names,
int  has_col_names,
int *  field_ends 
)

Read a delimited text file into the matrix element of an apop_data set.

See Notes on input text file formatting.

Parameters:
text_file = "-" The name of the text file to be read in. If "-" (the default), use stdin.
has_row_names = 'n'. Does the lines of data have row names?
has_col_names = 'y'. Is the top line a list of column names? If there are row names, then there should be no first entry in this line like 'row names'. That is, for a 100x100 data set with row and column names, there are 100 names in the top row, and 101 entries in each subsequent row (name plus 100 data points).
field_ends If fields have a fixed size, give the end of each field, e.g. {3, 8 11}.
Returns:
Returns an apop_data set.

example: See apop_ols.

This function uses the Designated initializers syntax for inputs.

apop_data* apop_text_to_data ( char *  text_file,
int  has_row_names,
int  has_col_names 
)

Read a delimited text file into the matrix element of an apop_data set.

See Notes on input text file formatting.

Parameters:
text_file = "-" The name of the text file to be read in. If "-" (the default), use stdin.
has_row_names = 'n'. Does the lines of data have row names?
has_col_names = 'y'. Is the top line a list of column names? If there are row names, then there should be no first entry in this line like 'row names'. That is, for a 100x100 data set with row and column names, there are 100 names in the top row, and 101 entries in each subsequent row (name plus 100 data points).
Returns:
Returns an apop_data set.

example: See apop_ols.

This function uses the Designated initializers syntax for inputs.

int apop_text_to_db ( char *  text_file,
char *  tabname,
int  has_row_names,
int  has_col_names,
char **  field_names,
int *  field_ends 
)

Read a text file into a database table.

See Notes on input text file formatting.

Using the data set from the example on the apop_ols page, here's another way to do the regression:

#include <apop.h>

int main(void){ 
  apop_data       *data; 
  apop_model   *est;
    apop_db_open(NULL);
    apop_text_to_db("data", "d",.has_row_names= 0,.has_col_names=1);
    data       = apop_query_to_data("select * from d");
    est        = apop_estimate(data, apop_ols);
    printf("The OLS coefficients:\n");
    apop_model_print(est);
} 

By the way, there is a begin/commit wrapper that bundles the process into bundles of 2000 inserts per transaction. if you want to change this to more or less frequent commits, you'll need to modify and recompile the code.

Parameters:
text_file The name of the text file to be read in. If "-", then read from STDIN. (default = "-")
tabname The name to give the table in the database (default = apop_strip_dots (text_file, 'd'); default in Python/R interfaces="t")
has_row_names Does the lines of data have row names? (default = 0)
has_col_names Is the top line a list of column names? All dots in the column names are converted to underscores, by the way. (default = 1)
field_names The list of field names, which will be the columns for the table. If has_col_names==1, read the names from the file (and just set this to NULL). If has_col_names == 1 && field_names !=NULL, I'll use the field names. (default = NULL)
field_ends If fields have a fixed size, give the end of each field, e.g. {3, 8 11}.
Returns:
Returns the number of rows. This function uses the Designated initializers syntax for inputs.
int apop_text_to_db ( char *  text_file,
char *  tabname,
int  has_row_names,
int  has_col_names,
char **  field_names 
)

Read a text file into a database table.

See Notes on input text file formatting.

Using the data set from the example on the apop_ols page, here's another way to do the regression:

#include <apop.h>

int main(void){ 
  apop_data       *data; 
  apop_model   *est;
    apop_db_open(NULL);
    apop_text_to_db("data", "d",.has_row_names= 0,.has_col_names=1);
    data       = apop_query_to_data("select * from d");
    est        = apop_estimate(data, apop_ols);
    printf("The OLS coefficients:\n");
    apop_model_print(est);
} 

By the way, there is a begin/commit wrapper that bundles the process into bundles of 2000 inserts per transaction. if you want to change this to more or less frequent commits, you'll need to modify and recompile the code.

Parameters:
text_file The name of the text file to be read in. If "-", then read from STDIN. (default = "-")
tabname The name to give the table in the database (default = apop_strip_dots (text_file, 'd'); default in Python/R interfaces="t")
has_row_names Does the lines of data have row names? (default = 0)
has_col_names Is the top line a list of column names? All dots in the column names are converted to underscores, by the way. (default = 1)
field_names The list of field names, which will be the columns for the table. If has_col_names==1, read the names from the file (and just set this to NULL). If has_col_names == 1 && field_names !=NULL, I'll use the field names. (default = NULL)
Returns:
Returns the number of rows. This function uses the Designated initializers syntax for inputs.
double * apop_vector_to_array ( const gsl_vector *  in  ) 

Converts a gsl_vector to an array.

Parameters:
in A gsl_vector
Returns:
A double*, which will be malloced inside the function.

SourceForge.net Logo

Autogenerated by doxygen on 23 Nov 2009.