Generally, if it assumes something is Normally distributed, it's here.
A utility to make a matrix of dummy variables. You give me a single column that lists the category for each item, and I'll return an apop_data set whose matrix has a single one in each row, in the column corresponding to that row's category.
After running this, you will almost certainly want to join the output with your main data set. For example:
apop_data *dummies = apop_data_to_dummies(main_regression_vars, .col=8, .type='t');
apop_data_stack(main_regression_vars, dummies, 'c');
| d | The data set with the column to be dummified. (No default.) |
| col | The column number to be transformed. (Default: 0) |
| type | 'd'==data column (-1==vector), 't'==text column. (Default: 't') |
| keep_first | If zero, return a matrix where each row has a one in the (column specified MINUS ONE). That is, the zeroth category is dropped, the first category has an entry in column zero, et cetera. If you don't know why this is useful, then this is what you need. If you know what you're doing and need something special, set this to one and the first category won't be dropped. (Default: 0) |
This function uses the Designated initializers syntax for inputs.
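To make the keep_first rule concrete, here is a minimal sketch in plain C, with no Apophenia or GSL calls. The helper make_dummies and its signature are hypothetical, and it assumes the categories are already numbered 0, 1, 2, and so on; it is not the library's implementation.

```c
#include <stdlib.h>

/* Hypothetical helper, not part of Apophenia. Builds a row-major n-by-k
   matrix of 0/1 dummies from category numbers in 0..n_categories-1.
   Assumes n_categories >= 2. If keep_first is zero, category 0 is dropped
   and category j gets its one in column j-1. */
double *make_dummies(const int *categories, size_t n, int n_categories,
                     int keep_first, size_t *n_cols_out)
{
    size_t n_cols = keep_first ? (size_t)n_categories
                               : (size_t)n_categories - 1;
    *n_cols_out = n_cols;
    double *out = calloc(n * n_cols, sizeof(double));
    if (!out) return NULL;
    for (size_t i = 0; i < n; i++) {
        int col = keep_first ? categories[i] : categories[i] - 1;
        if (col >= 0)              /* col == -1 is the dropped category */
            out[i * n_cols + (size_t)col] = 1.0;
    }
    return out;
}
```

With keep_first set to zero, a row in the dropped category is all zeros, so the dummies stay linearly independent of a constant column in a regression.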
apop_model* apop_estimate_fixed_effects_OLS(apop_data *data, gsl_vector *categories)
A fixed-effects regression. The input is a data matrix for a regression, plus a single vector giving the fixed-effect category of each observation.
The solution of a fixed-effects regression is via a partitioned regression, in which the columns of the data set are divided into two blocks.
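One standard way to carry out such a partitioned regression is the within transformation. By the Frisch-Waugh-Lovell result, regressing on the category dummies plus the other regressors gives the same slope coefficients as regressing group-demeaned outcomes on group-demeaned regressors. The sketch below shows only the demeaning step in plain C. The function name is hypothetical, and whether the library takes exactly this route is an assumption.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not part of Apophenia. Subtracts each group's mean
   from x[0..n-1] in place. group[i] must lie in 0..n_groups-1.
   Returns 0 on success, -1 on allocation failure. */
int demean_within_groups(double *x, const int *group, size_t n, int n_groups)
{
    double *sum = calloc((size_t)n_groups, sizeof(double));
    size_t *count = calloc((size_t)n_groups, sizeof(size_t));
    if (!sum || !count) {
        free(sum);
        free(count);
        return -1;
    }
    for (size_t i = 0; i < n; i++) {   /* accumulate group totals */
        sum[group[i]] += x[i];
        count[group[i]]++;
    }
    for (size_t i = 0; i < n; i++)     /* subtract the group mean */
        x[i] -= sum[group[i]] / (double)count[group[i]];
    free(sum);
    free(count);
    return 0;
}
```

Applying this to the dependent variable and to every regressor column, then running ordinary least squares on the result, avoids building one dummy column per category.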
void apop_estimate_parameter_t_tests(apop_model *est)
For many, it is a knee-jerk reaction to a parameter estimation to test whether each individual parameter differs from zero. This function does that.
| est | The apop_estimate, which includes pre-calculated parameter estimates, var-covar matrix, and the original data set. |
Returns nothing. At the end of the routine, the est->parameters->matrix includes a set of t-test values: p value, confidence (=1-pval), t statistic, standard deviation, one-tailed Pval, one-tailed confidence.
apop_data* apop_f_test(apop_model *est, apop_data *contrast, int normalize)
Runs an F-test specified by q and c. Your best bet is to see the chapter on hypothesis testing in Modeling With Data, p 309. It will tell you that:
and that's what this function is based on.
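For reference, the textbook F statistic for a joint linear hypothesis can be written as below. The notation is an assumption and may differ from the book's: each of the q rows of Q is one hypothesis, so the joint null is Qβ = c; X is the N-by-K regressor matrix, β̂ the OLS estimate, and u the vector of residuals.

```latex
\frac{N-K}{q}\cdot
\frac{(\mathbf{Q}\hat\beta - \mathbf{c})'
      \left[\mathbf{Q}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{Q}'\right]^{-1}
      (\mathbf{Q}\hat\beta - \mathbf{c})}
     {\mathbf{u}'\mathbf{u}}
\;\sim\; F_{q,\,N-K}
```

With the defaults described for the contrast argument (Q the identity, c zero), this tests whether all parameters are jointly zero.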
| est | An apop_model that you have already calculated. (No default.) |
| contrast | The matrix Q and the vector c, where each row represents a hypothesis. (Defaults: if the matrix is NULL, it is set to the identity matrix; if the vector is NULL, it is set to zero; if the entire apop_data set is NULL or omitted, both of these settings are made.) |
| normalize | If 1, then I will normalize the data set at est->data so that each column has mean zero (that is, I run apop_matrix_normalize (data, 'c', 'm');). If zero, then I will copy off the entire data set and do the normalization on my copy, leaving the input data as-is. (Default: 0) |
Returns an apop_data set with a few variants on the confidence with which we can reject the joint hypothesis.
In fact, if you did GLS, this is invalid, because you need a term that depends on the error covariance matrix, and I didn't ask for that matrix.
This function uses the Designated initializers syntax for inputs.
Convert a column of text in the text portion of an apop_data set into a column of numeric elements, which you can use for a multinomial probit, for example.
| d | The data set to be modified in place. |
| datacol | The column in the data set where the numeric factors will be written (-1 means the vector, which I will allocate for you if it is NULL). |
| textcol | The column in the text that will be converted. |
For example:
apop_data *d = apop_query_to_mixed_data("mmt", "select 1, year, color from data");
apop_text_to_factors(d, 0, 0);
Notice that the query pulled a column of ones for the sake of saving room for the factors.
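The conversion itself can be sketched in plain C as a lookup against the sorted list of distinct strings. The helper below is hypothetical. Numbering the factors in sorted order is an assumption made for illustration, not something the text above guarantees about the library.

```c
#include <stdlib.h>
#include <string.h>

/* Compare two elements of an array of C strings. */
static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Hypothetical helper, not part of Apophenia. Writes into factors[i] the
   index of text[i] in the sorted list of distinct strings. Returns the
   number of distinct strings, or -1 on allocation failure. */
int text_to_factors(const char *const *text, size_t n, double *factors)
{
    if (n == 0) return 0;
    const char **sorted = malloc(n * sizeof *sorted);
    if (!sorted) return -1;
    memcpy(sorted, text, n * sizeof *sorted);
    qsort(sorted, n, sizeof *sorted, cmp_str);

    size_t n_unique = 1;               /* compact duplicates in place */
    for (size_t i = 1; i < n; i++)
        if (strcmp(sorted[i], sorted[n_unique - 1]) != 0)
            sorted[n_unique++] = sorted[i];

    for (size_t i = 0; i < n; i++) {   /* every text[i] is in the list */
        const char **hit = bsearch(&text[i], sorted, n_unique,
                                   sizeof *sorted, cmp_str);
        factors[i] = (double)(hit - sorted);
    }
    free(sorted);
    return (int)n_unique;
}
```

The factors are stored as doubles because the destination in the description above is a numeric data column or vector.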
Give me a column of text, and I'll give you a sorted list of the unique elements. This is basically running "select distinct * from datacolumn", but without the aid of the database. Returns an apop_data set with only one column of text.
| d | An apop_data set with a text component. |
| col | The text column you want me to use. |
gsl_vector* apop_vector_unique_elements(const gsl_vector *v)
Give me a vector of numbers, and I'll give you a sorted list of the unique elements. This is basically running "select distinct * from datacolumn", but without the aid of the database.
| v | a vector of items |
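The "select distinct" behavior can be sketched without a database as a sort followed by an in-place pass that drops repeats. The function below works on a plain array of doubles rather than a gsl_vector, and its name is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Three-way comparison of two doubles, for qsort. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper, not part of Apophenia. Returns a newly allocated,
   ascending array of the distinct values in v[0..n-1], and writes how many
   there are to *n_out. Returns NULL if n is zero or allocation fails.
   The caller frees the result. */
double *unique_sorted(const double *v, size_t n, size_t *n_out)
{
    *n_out = 0;
    if (n == 0) return NULL;
    double *out = malloc(n * sizeof *out);
    if (!out) return NULL;
    memcpy(out, v, n * sizeof *out);
    qsort(out, n, sizeof *out, cmp_double);

    size_t k = 1;                      /* out[0..k-1] holds distinct values */
    for (size_t i = 1; i < n; i++)
        if (out[i] != out[k - 1])
            out[k++] = out[i];
    *n_out = k;
    return out;
}
```

Sorting first means duplicates end up adjacent, so a single linear pass is enough to remove them.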