Patterns in static

Apophenia

apop_regression.c File Reference

Detailed Description

Generally, if it assumes something is Normally distributed, it's here.


Function Documentation

apop_data* apop_data_to_dummies ( apop_data *d,
int  col,
char  type,
int  keep_first 
)

A utility to make a matrix of dummy variables. You give me a single column that lists the category for each item, and I'll return an apop_data set whose matrix has a single one in each row, in the column specified.

After running this, you will almost certainly want to join the output with your main data set. For example:

apop_data *dummies  = apop_data_to_dummies(main_regression_vars, .col=8, .type='t');
apop_data_stack(main_regression_vars, dummies, 'c');
Parameters:
d The data set with the column to be dummified (No default.)
col The column number to be transformed (default = 0)
type 'd' = data column (use col = -1 for the vector), 't' = text column. (default = 't')
keep_first if zero, return a matrix where each row has a one in the (column specified MINUS ONE). That is, the zeroth category is dropped, the first category has an entry in column zero, et cetera. If you don't know why this is useful, then this is what you need. If you know what you're doing and need something special, set this to one and the first category won't be dropped. (default = 0)
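
The keep_first rule can be illustrated with a minimal sketch in plain C. It assumes the categories have already been coded as integers 0 .. ncats-1 and writes into a flat row-major array; `make_dummies` and its arguments are hypothetical names for illustration, not part of Apophenia, and the library's own implementation may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Fill `out` (n rows, row-major) with dummy variables.
 * categories[i] must be an integer in 0 .. ncats-1.
 * keep_first == 0: category 0 is dropped, so there are ncats - 1 columns
 *                  and category k > 0 sets column k - 1.
 * keep_first != 0: there are ncats columns and category k sets column k.
 * Returns the number of columns per row. */
size_t make_dummies(const int *categories, size_t n, size_t ncats,
                    int keep_first, double *out)
{
    assert(ncats >= 1);
    size_t ncols = keep_first ? ncats : ncats - 1;
    for (size_t i = 0; i < n * ncols; i++)
        out[i] = 0.0;
    for (size_t i = 0; i < n; i++) {
        assert(categories[i] >= 0 && (size_t)categories[i] < ncats);
        size_t k = (size_t)categories[i];
        if (keep_first)
            out[i * ncols + k] = 1.0;
        else if (k > 0)
            out[i * ncols + (k - 1)] = 1.0;
        /* k == 0 with keep_first == 0 leaves the row all zeros. */
    }
    return ncols;
}
```

With keep_first set to zero, rows in the zeroth category are all zeros, which is what keeps the dummies from being collinear with a constant column in a regression.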

This function uses the Designated initializers syntax for inputs.
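
For readers who have not seen this calling style, here is a generic C99 sketch of how optional, named arguments with defaults can be built from designated initializers. This page does not show how Apophenia implements it, so the names below (`scaled`, `scaled_base`, `scaled_args`) are hypothetical and the mechanism is an assumption about the general technique only.

```c
#include <assert.h>

/* All inputs live in one struct; members left out of the call are zero. */
typedef struct {
    double base;   /* required; given positionally */
    int    times;  /* optional; default 1 */
    char   sign;   /* optional; default '+' */
} scaled_args;

/* The real function takes the struct and fills in defaults for zeros. */
static double scaled_base(scaled_args in)
{
    int  times = in.times ? in.times : 1;
    char sign  = in.sign  ? in.sign  : '+';
    assert(sign == '+' || sign == '-');
    double out = in.base * times;
    return sign == '-' ? -out : out;
}

/* The macro wraps the caller's arguments in a compound literal, so both
 * scaled(2.0, 3, '-') and scaled(2.0, .sign='-') are valid calls. */
#define scaled(...) scaled_base((scaled_args){__VA_ARGS__})
```

One limitation of this particular sketch: an explicit zero is indistinguishable from an omitted argument, so a default can only be overridden with a nonzero value.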

apop_model* apop_estimate_fixed_effects_OLS ( apop_data *data,
gsl_vector *  categories 
)

A fixed-effects regression. The input is a data matrix for a regression, plus a single vector giving the fixed-effect category of each observation.

The solution of a fixed-effects regression is via a partitioned regression. Given that the data set is divided into columns $\beta_1$ and $\beta_2$, then the reader may

Todo:
finish this documentation. [Was in a rush today.]
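
Until the documentation above is finished, the following may help as background. When one partition of a partitioned regression consists of category dummies, a standard result (the Frisch-Waugh-Lovell theorem) says the coefficients on the remaining columns equal those from regressing group-demeaned outcomes on group-demeaned regressors. The sketch below shows only that demeaning step in plain C; `demean_by_category` is a hypothetical helper, and whether this function works this way internally is not stated on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Subtract from each x[i] the mean of its category, in place.
 * cat[i] is an integer label in 0 .. ncats-1.
 * Returns 0 on success, -1 if memory could not be allocated. */
int demean_by_category(double *x, const int *cat, size_t n, size_t ncats)
{
    assert(ncats >= 1);
    double *sum   = calloc(ncats, sizeof *sum);
    size_t *count = calloc(ncats, sizeof *count);
    if (!sum || !count) {
        free(sum);
        free(count);
        return -1;
    }
    /* First pass: accumulate the sum and count for each category. */
    for (size_t i = 0; i < n; i++) {
        assert(cat[i] >= 0 && (size_t)cat[i] < ncats);
        sum[cat[i]] += x[i];
        count[cat[i]]++;
    }
    /* Second pass: subtract the category mean from every observation. */
    for (size_t i = 0; i < n; i++)
        x[i] -= sum[cat[i]] / (double)count[cat[i]];
    free(sum);
    free(count);
    return 0;
}
```

Applying this to the dependent variable and to every regressor column, then running ordinary least squares on the results, gives the within estimator.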

void apop_estimate_parameter_t_tests ( apop_model *est  ) 

For many, it is a knee-jerk reaction to a parameter estimation to test whether each individual parameter differs from zero. This function does that.

Parameters:
est The apop_model, which includes pre-calculated parameter estimates, var-covar matrix, and the original data set.

Returns nothing. At the end of the routine, est->parameters->matrix includes a set of t-test values: p value, confidence (= 1 - p value), t statistic, standard deviation, one-tailed p value, one-tailed confidence.
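
As a reading aid, the textbook quantities behind those columns are sketched below. This is an interpretation, not taken from the source code: it assumes the "standard deviation" column is the square root of the corresponding diagonal element of the var-covar matrix ${\bf V}$, and that the reference distribution is a t distribution with $N-K$ degrees of freedom, whose CDF is written $F_t$ here. For parameter $k$:

\[t_k = {\hat\beta_k \over s_k}, \qquad s_k = \sqrt{V_{kk}},\]

\[p_k^{\rm two\hbox{-}tailed} = 2\,[1 - F_t(|t_k|)], \qquad p_k^{\rm one\hbox{-}tailed} = 1 - F_t(|t_k|),\]

and each confidence is one minus the matching p value.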

apop_data* apop_f_test ( apop_model *est,
apop_data *contrast,
int  normalize 
)

Runs an F-test specified by ${\bf Q}$ and ${\bf c}$. Your best bet is to see the chapter on hypothesis testing in Modeling With Data, p. 309. It will tell you that:

\[{N-K\over q} {({\bf Q}'\hat\beta - {\bf c})' [{\bf Q}' ({\bf X}'{\bf X})^{-1} {\bf Q}]^{-1} ({\bf Q}' \hat\beta - {\bf c}) \over {\bf u}' {\bf u} } \sim F_{q,N-K},\]

and that's what this function is based on.
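
To make the formula concrete, here is a minimal plain-C sketch for the special case of a single hypothesis ($q=1$), where ${\bf Q}$ is one vector and the bracketed term is a scalar, so no matrix inversion is needed beyond $({\bf X}'{\bf X})^{-1}$ itself. `f_stat_single` and its arguments are hypothetical; the caller is assumed to supply $({\bf X}'{\bf X})^{-1}$ and ${\bf u}'{\bf u}$ already computed.

```c
#include <assert.h>
#include <stddef.h>

/* F statistic for one linear hypothesis Q'beta = c (so q = 1).
 * xtx_inv is (X'X)^{-1}, K by K, row-major; ssr is u'u;
 * N is the number of observations, K the number of parameters. */
double f_stat_single(const double *xtx_inv, const double *qvec,
                     const double *beta, double c,
                     size_t K, size_t N, double ssr)
{
    assert(N > K && ssr > 0.0);
    double diff = -c;   /* accumulates Q'beta - c */
    double quad = 0.0;  /* accumulates Q'(X'X)^{-1}Q */
    for (size_t i = 0; i < K; i++) {
        diff += qvec[i] * beta[i];
        for (size_t j = 0; j < K; j++)
            quad += qvec[i] * xtx_inv[i * K + j] * qvec[j];
    }
    assert(quad > 0.0);
    /* (N-K)/q times diff * quad^{-1} * diff over u'u, with q = 1. */
    return (double)(N - K) * diff * diff / (quad * ssr);
}
```

The result would then be compared against an $F_{1,N-K}$ distribution; that lookup is left out because it needs a CDF routine that the C standard library does not provide.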

Parameters:
est an apop_model that you have already calculated. (No default)
contrast The matrix ${\bf Q}$ and the vector ${\bf c}$, where each row represents a hypothesis. (Defaults: if matrix is NULL, it is set to the identity matrix; if the vector is NULL, it is set to zero; if the entire apop_data set is NULL or omitted, both of these settings are made.)
normalize If 1, then I will normalize the data set at est->data in place so that each column has mean zero (that is, I run apop_matrix_normalize (data, 'c', 'm');). If zero, then I will copy off the entire data set and do the normalization on my copy, leaving the input data as-is. (Default: 0)
Returns:
An apop_data set with a few variants on the confidence with which we can reject the joint hypothesis.
Todo:
There should be a way to get OLS and GLS to store $(X'X)^{-1}$. In fact, if you did GLS, this is invalid, because you need $(X'\Sigma X)^{-1}$, and I didn't ask for $\Sigma$.

This function uses the Designated initializers syntax for inputs.

apop_data* apop_text_to_factors ( apop_data *d,
size_t  textcol,
int  datacol 
)

Convert a column of text in the text portion of an apop_data set into a column of numeric elements, which you can use for a multinomial probit, for example.

Parameters:
d The data set to be modified in place.
datacol The column in the data set where the numeric factors will be written (-1 means the vector, which I will allocate for you if it is NULL)
textcol The column in the text that will be converted.

For example:

apop_data *d  = apop_query_to_mixed_data("mmt", "select 1, year, color from data");
apop_text_to_factors(d, 0, 0);

Notice that the query pulled a column of ones for the sake of saving room for the factors.

Returns:
A table of the factors used in the code. This is an apop_data set with only one column of text.
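
The idea of the conversion can be sketched in plain C as follows. This is an illustration under assumptions, not the library's code: it assumes each factor is the index of the string in the sorted list of distinct strings, and `text_to_codes` and `cmp_str` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for arrays of C strings. */
static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Write into codes[i] the index of text[i] in the sorted list of distinct
 * strings, and store that list (pointers into `text`, not copies) in
 * labels[0 .. return value - 1]. `labels` must have room for n pointers. */
size_t text_to_codes(const char *const *text, size_t n,
                     double *codes, const char **labels)
{
    if (n == 0)
        return 0;
    memcpy(labels, text, n * sizeof *labels);
    qsort(labels, n, sizeof *labels, cmp_str);
    /* Compact the sorted list so each distinct string appears once. */
    size_t nuniq = 1;
    for (size_t i = 1; i < n; i++)
        if (strcmp(labels[i], labels[nuniq - 1]) != 0)
            labels[nuniq++] = labels[i];
    /* Look up each input string to get its numeric factor. */
    for (size_t i = 0; i < n; i++) {
        const char **hit = bsearch(&text[i], labels, nuniq,
                                   sizeof *labels, cmp_str);
        assert(hit != NULL);
        codes[i] = (double)(hit - labels);
    }
    return nuniq;
}
```

Here `labels` plays the role of the returned table of factors, and `codes` the role of the numeric column written into the data set.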

apop_data* apop_text_unique_elements ( const apop_data *d,
size_t  col 
)

Give me a column of text, and I'll give you a sorted list of the unique elements. This is basically running "select distinct * from datacolumn", but without the aid of the database.

Parameters:
d An apop_data set with a text component
col The text column you want me to use.
Returns:
An apop_data set with a single sorted column of text, where each unique text input appears once.
See also:
apop_vector_unique_elements

gsl_vector* apop_vector_unique_elements ( const gsl_vector *  v  ) 

Give me a vector of numbers, and I'll give you a sorted list of the unique elements. This is basically running "select distinct * from datacolumn", but without the aid of the database.

Parameters:
v a vector of items
Returns:
a sorted vector of the distinct elements that appear in the input.
See also:
apop_text_unique_elements
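
For comparison, the same "select distinct" operation on a plain array of doubles can be sketched with the C standard library alone. `unique_sorted` and `cmp_double` are hypothetical names; unlike the function documented above, this sketch works in place on a caller-owned array rather than returning a new gsl_vector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for doubles; avoids the overflow and truncation
 * problems of returning a difference. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Sort v in place and move its distinct values to the front.
 * Returns the number of distinct values. NaNs are not handled. */
size_t unique_sorted(double *v, size_t n)
{
    if (n == 0)
        return 0;
    assert(v != NULL);
    qsort(v, n, sizeof *v, cmp_double);
    size_t nuniq = 1;
    for (size_t i = 1; i < n; i++)
        if (v[i] != v[nuniq - 1])
            v[nuniq++] = v[i];
    return nuniq;
}
```

Sorting first means duplicates end up adjacent, so a single pass is enough to drop them.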

Autogenerated by doxygen on 23 Nov 2009.