| #define Apop_assert | ( | test, | |||
| returnval, | |||||
| level, | |||||
| stop, | |||||
| ... | ) |
do \
    if (!(test)) { \
        if (apop_opts.verbose >= level) { fprintf(stderr, "%s: ", __func__); fprintf(stderr, __VA_ARGS__); fprintf(stderr, "\n");} \
        if (stop == 's' || stop == 'h') assert(test); \
        return returnval; \
    } while (0);
A convenient front end for apop_error: it tests its first argument and, if that is false, does essentially what apop_error does. See also Apop_assert_void and apop_error.
Following the tradition regarding assert functions, this is a macro but is not in all caps.
| test | The expression that you are asserting is nonzero. | |
| returnval | If the assertion fails, return this. If you want to halt on error, this is irrelevant, but still has to match your function's return type. | |
| level | Print the warning message only if apop_opts.verbose is greater than or equal to this. Zero usually works, but for minor infractions use one. | |
| stop | If 's' (or its synonym 'h'), halt the program (using the standard C assert); if 'c', continue by printing an error message if appropriate and returning the return value. | |
| ... | The error message in printf form, plus any arguments to be inserted into the printf string. I'll provide the function name and a trailing newline. |
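The report-then-return pattern can be tried without the library. The following is a minimal sketch, not the library's code: `demo_opts`, `Demo_assert`, and `safe_divide` are hypothetical names, and `demo_opts.verbose` merely stands in for `apop_opts.verbose`. Unlike the definition above, the sketch wraps its body in braces and omits the trailing semicolon, so that a call followed by `;` behaves as a single statement.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for apop_opts; only the verbose field is sketched. */
static struct { int verbose; } demo_opts = { 0 };

/* Same shape as Apop_assert: report, optionally halt, otherwise return. */
#define Demo_assert(test, returnval, level, stop, ...)            \
    do {                                                          \
        if (!(test)) {                                            \
            if (demo_opts.verbose >= (level)) {                   \
                fprintf(stderr, "%s: ", __func__);                \
                fprintf(stderr, __VA_ARGS__);                     \
                fprintf(stderr, "\n");                            \
            }                                                     \
            if ((stop) == 's' || (stop) == 'h') assert(test);     \
            return returnval;                                     \
        }                                                         \
    } while (0)

/* Returns x/y, or -1 after printing a warning when y is zero. */
static double safe_divide(double x, double y) {
    Demo_assert(y != 0, -1, 0, 'c', "cannot divide %g by zero", x);
    return x / y;
}
```

With `'c'` as the stop argument, `safe_divide(1, 0)` prints a message prefixed with the function name and returns -1; with `'s'` it would halt in `assert` instead (unless `NDEBUG` is defined).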
| #define Apop_assert_void | ( | test, | |||
| level, | |||||
| stop, | |||||
| ... | ) |
do \
    if (!(test)) { \
        if (apop_opts.verbose >= level) { fprintf(stderr, "%s: ", __func__); fprintf(stderr, __VA_ARGS__); fprintf(stderr, "\n");} \
        if (stop == 's' || stop == 'h') assert(test); \
    } while (0);
Like Apop_assert, but no return step. It is thus useful in void functions.
Following the tradition regarding assert functions, this is a macro but is not in all caps.
| test | The expression that you are asserting is nonzero. | |
| level | Print the warning message only if apop_opts.verbose is greater than or equal to this. Zero usually works, but for minor infractions use one. | |
| stop | If 's' (or its synonym 'h'), halt the program (using the standard C assert); if 'c', print an error message if appropriate and continue. Nothing is returned. | |
| ... | The error message in printf form, plus any arguments to be inserted into the printf string. I'll provide the function name and a trailing newline. |
| #define Apop_settings_alloc | ( | type, | |||
| out, | |||||
| ... | ) | apop_ ##type ##_settings *out = apop_ ##type ##_settings_alloc(__VA_ARGS__); |
This is obsolete. Use Apop_model_add_group.
For what it's worth, this is a convenience macro. Expands:
Apop_settings_alloc(mle, ms, data, model);
to:
apop_mle_settings *ms = apop_mle_settings_alloc(data, model);
As of this writing, options for the first argument include mle, histogram, and update. See the documentation for each to find the arguments its allocation function takes. Because this macro is obsolete, that list may shrink.
If there is an NaN anywhere in the row of data (including the matrix, the vector, and the weights) then delete the row from the data set.
The function returns a new data set with the NaNs removed, so the original data set is left unmolested. You may want to apop_data_free the original immediately after this function.
NULL.
If inplace = 'y', then I'll free each element of the input data set and refill it with the pruned elements. Again, I'll take up (up to) twice the size of the data set in memory during the function. If every row has an NaN, then your apop_data set will have a lot of NULL elements.
| d | The data, with NaNs | |
| inplace | If 'y', clear out the pointer-to-apop_data that you sent in and refill with the pruned data. If 'n', leave the set alone and return a new data set. |
inplace=='y', redundant with the input.
This function sorts the whole of an apop_data set based on one column. It sorts in place, with little additional memory used.
Uses the gsl_sort_vector_index function internally, and that function just ignores NaNs; therefore this function just leaves NaNs exactly where they lie.
| data | The input set to be modified. (No default, must not be NULL.) | |
| sortby | The column of data by which the sorting will take place. As usual, -1 indicates the vector element. (default: column zero of the matrix) | |
| asc | If 'd' or 'D', sort in descending order; else sort in ascending order. (Default: ascending) |
apop_data_show(apop_data_sort(d, -1)).
This function uses the Designated initializers syntax for inputs.
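To make "sort the whole data set by one column" concrete, here is a minimal sketch on a plain row-major array of doubles. It is an illustration only: `sort_rows_by_column` is a hypothetical name, and it uses a simple in-place insertion sort rather than the index-based GSL sort mentioned above. NaN handling is not addressed.

```c
#include <stddef.h>

/* Sort the rows of a row-major rows x cols matrix by the column sortby.
   asc == 'd' or 'D' sorts descending; anything else sorts ascending. */
static void sort_rows_by_column(double *m, size_t rows, size_t cols,
                                size_t sortby, char asc) {
    int descending = (asc == 'd' || asc == 'D');
    for (size_t i = 1; i < rows; i++) {
        for (size_t j = i; j > 0; j--) {
            double a = m[(j - 1) * cols + sortby];
            double b = m[j * cols + sortby];
            int out_of_order = descending ? (a < b) : (a > b);
            if (!out_of_order) break;
            /* Swap the two whole rows, element by element. */
            for (size_t k = 0; k < cols; k++) {
                double tmp = m[(j - 1) * cols + k];
                m[(j - 1) * cols + k] = m[j * cols + k];
                m[j * cols + k] = tmp;
            }
        }
    }
}
```

Every column of a row moves together, so the rows stay intact while their order follows the chosen column.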
| void apop_error | ( | int | level, | |
| char | stop, | |||
| char * | msg, | |||
| ... | ||||
| ) |
Inform the user of a faux pas. See also Apop_assert, which allows the function to return a value.
| level | At what verbosity level should the user be warned? E.g., if level==2, then print iff apop_opts.verbose >= 2. You can set apop_opts.verbose = -1 to turn off virtually all messages, but this is probably ill-advised. | |
| stop | Either 's' or 'c', indicating whether the program should stop or continue. If stopping, uses assert(0) for easy debugging. You can use 'h' (halt) as a synonym for 's'. | |
| msg | The message to write to stderr (presuming the verbosity level is high enough). This can be a printf-style format with following arguments. You can produce much more informative error messages this way, e.g., apop_error(0, 's', "Beta is %g but should be greater than zero.", beta);. |
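A printf-style reporting function of this kind is usually built on `va_list` and `vfprintf`. The following is a sketch of that technique under stated assumptions, not the library's implementation: `demo_error` and `demo_verbose` are hypothetical names, and the sketch returns whether it printed (the documented function returns void) only so that the behavior is easy to check.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

static int demo_verbose = 0;  /* hypothetical stand-in for apop_opts.verbose */

/* Print a printf-style message to stderr if the verbosity allows it.
   Returns 1 if the message was printed, 0 otherwise.
   Halts via assert(0) when stop is 's' or 'h'. */
static int demo_error(int level, char stop, const char *msg, ...) {
    int printed = 0;
    if (demo_verbose >= level) {
        va_list args;
        va_start(args, msg);
        vfprintf(stderr, msg, args);   /* forward the variadic arguments */
        va_end(args);
        fputc('\n', stderr);
        printed = 1;
    }
    if (stop == 's' || stop == 'h') assert(0);
    return printed;
}
```

For example, with `demo_verbose` at zero, `demo_error(0, 'c', "Beta is %g but should be greater than zero.", beta)` prints, while a call with level 2 stays silent.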
| double apop_generalized_harmonic | ( | int | N, | |
| double | s | |||
| ) |
Calculate the generalized harmonic number $\sum_{n=1}^{N} \frac{1}{n^{s}}$.
| apop_model* apop_histogram_model_reset | ( | apop_model * | base, | |
| apop_model * | m, | |||
| long int | draws, | |||
| gsl_rng * | rng | |||
| ) |
Give me an existing histogram (i.e., an apop_model) and I'll create a new histogram with the same bins, but with data from draws random draws from the parametrized model you provide.
Unlike most other histogram-generating functions, this one will normalize the output to integrate to one. It uses the Designated initializers syntax for inputs.
| base | An apop_model produced using a form like apop_estimate(yourdata, apop_histogram). I.e. a histogram model to be used as a template. (No default) | |
| m | The model to be drawn from. Because this function works via random draws, the model needs to have a draw method. (No default) | |
| draws | The number of random draws to make. (arbitrary default = 1e5) | |
| rng | The gsl_rng used to make random draws. (default: see note on Auto-allocated RNGs) |
| apop_model* apop_histogram_moving_average | ( | apop_model * | m, | |
| size_t | bandwidth | |||
| ) |
Return a new histogram that is the moving average of the input histogram.
| m | A histogram, in apop_model form. | |
| bandwidth | The number of elements to be smoothed. |
| void apop_histogram_normalize | ( | apop_model * | m | ) |
Scale a histogram so it integrates to one (and is thus a proper PMF).
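The scaling itself is a single pass over the bins. A minimal sketch on a plain array of bin weights follows; `normalize_bins` is a hypothetical name, and the sketch assumes unit-width bins so that "integrates to one" and "sums to one" coincide.

```c
#include <stddef.h>

/* Scale bin weights in place so they sum to one.
   Returns 0 on success, -1 if the total weight is not positive. */
static int normalize_bins(double *bins, size_t n) {
    double total = 0;
    for (size_t i = 0; i < n; i++) total += bins[i];
    if (total <= 0) return -1;
    for (size_t i = 0; i < n; i++) bins[i] /= total;
    return 0;
}
```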
| apop_model* apop_histogram_vector_reset | ( | apop_model * | template, | |
| gsl_vector * | indata | |||
| ) |
Give me an existing histogram (i.e., an apop_model) and I'll create a new histogram with the same bins, but with data from the vector you provide.
| template | An apop_model produced using a form like apop_estimate(yourdata, apop_histogram). | |
| indata | The new data to be binned. |
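Re-using a template's bins for new data amounts to counting the new values against the old bin edges. The sketch below shows that idea on plain arrays; `rebin` is a hypothetical name, and the half-open bin convention and the skipping of out-of-range values are assumptions, not statements about the library.

```c
#include <stddef.h>

/* Count how many values fall into each bin. edges has nbins + 1 ascending
   entries; bin i covers [edges[i], edges[i+1]). Values outside every bin
   (including NaNs) are skipped. */
static void rebin(const double *edges, size_t nbins,
                  const double *data, size_t ndata, double *counts) {
    for (size_t i = 0; i < nbins; i++) counts[i] = 0;
    for (size_t d = 0; d < ndata; d++)
        for (size_t i = 0; i < nbins; i++)
            if (data[d] >= edges[i] && data[d] < edges[i + 1]) {
                counts[i] += 1;
                break;
            }
}
```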
| apop_data* apop_histograms_test_goodness_of_fit | ( | apop_model * | m0, | |
| apop_model * | m1 | |||
| ) |
Test the goodness-of-fit between two histograms (in apop_model form). I assume that the histograms are aligned.
| apop_model* apop_ml_imputation | ( | apop_data * | d, | |
| apop_model * | mvn | |||
| ) |
Impute the most likely data points to replace NaNs in the data, and insert them into the given data. That is, the data set is modified in place.
| d | The data set. It comes in with NaNs and leaves entirely filled in. | |
| mvn | A parametrized apop_model from which you expect the data was derived. If NULL, then I'll use the Multivariate Normal that best fits the data after listwise deletion. |
apop_ml_imputation_model. Also, the data input will be filled in and ready to use.
| double apop_test | ( | double | statistic, | |
| char * | distribution, | |||
| double | p1, | |||
| double | p2, | |||
| char | tail | |||
| ) |
This is a convenience function to do the lookup of a given statistic along a given distribution. You give me a statistic, its (hypothesized) distribution, and whether to use the upper tail, lower tail, or both. I will return the odds of a Type I error given the model: in statistician jargon, the p-value. [Type I error: odds of rejecting the null hypothesis when it is true.]
For example,
apop_test(1.3);
will return the probability mass of the standard Normal distribution that lies more than 1.3 from zero. If this function returns a small value, we can be confident that the statistic is significant. Or,
apop_test(1.3, "t", 10, .tail='u');
will give the appropriate odds for an upper-tailed test using the t-distribution with 10 degrees of freedom (e.g., a t-test of the null hypothesis that the statistic is less than or equal to zero).
Several more distributions are supported; see below.
| statistic | The scalar value to be tested. | |
| distribution | The name of the distribution; see below. | |
| p1 | The first parameter for the distribution; see below. | |
| p2 | The second parameter for the distribution; see below. | |
| tail | 'u' = upper tail; 'l' = lower tail; anything else = two-tailed. (default = two-tailed) |
p-value).
Here is a list of distributions you can use, and their parameters.
"normal" or "gaussian"
"lognormal"
"uniform"
"t"
"chi squared", "chi", "chisq":
p-value for typical cases)
"f"
| apop_data* apop_test_kolmogorov | ( | apop_model * | m1, | |
| apop_model * | m2 | |||
| ) |
Run the Kolmogorov test to determine whether two distributions are identical.
| m1,m2 | Two matching apop_histograms, probably produced via apop_histogram_vector_reset or apop_histogram_model_reset. |
p-value from the Kolmogorov test that the two distributions are equal.
| apop_model* apop_update | ( | apop_data * | data, | |
| apop_model * | prior, | |||
| apop_model * | likelihood, | |||
| gsl_rng * | rng | |||
| ) |
Take in a prior and likelihood distribution, and output a posterior distribution.
This function first checks a table of conjugate distributions for the pair you sent in. If the names match the table, then the function returns a closed-form model with updated parameters. If the pair isn't in the table of conjugate priors/likelihoods, then it uses Markov Chain Monte Carlo to sample from the posterior distribution, and then outputs a histogram model for further analysis. Notably, the histogram can be used as the input to this function, so you can chain Bayesian updating procedures.
To change the default settings (MCMC starting point, periods, burnin...), add an apop_update_settings struct to the prior.
Here are the conjugate distributions currently defined:
Prior | Likelihood | Notes |
Gamma likelihood represents the distribution of | ||
Assumes prior with fixed | ||
Uses sum and size of the data |
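To show what a closed-form update looks like, here is a sketch using the textbook Beta prior with a Binomial likelihood. The pair is chosen for illustration only and is not a statement about which rows the table above contains; `beta_params` and `beta_binomial_update` are hypothetical names.

```c
#include <assert.h>

/* Hypothetical parameter holder for a Beta(alpha, beta) distribution. */
typedef struct { double alpha, beta; } beta_params;

/* Closed-form update: observing `successes` out of `trials` draws turns
   Beta(a, b) into Beta(a + successes, b + trials - successes). */
static beta_params beta_binomial_update(beta_params prior,
                                        unsigned successes, unsigned trials) {
    assert(successes <= trials);
    beta_params post = { prior.alpha + successes,
                         prior.beta + (trials - successes) };
    return post;
}
```

Because the posterior has the same form as the prior, it can be fed back in as the prior for the next batch of data, which is the chaining idea described above.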
| data | The input data, that will be used by the likelihood function (default = NULL.) | |
| prior | The prior apop_model (No default, must not be NULL.) | |
| likelihood | The likelihood apop_model. If the system needs to estimate the posterior via MCMC, this needs to have a draw method. (No default, must not be NULL.) | |
| rng | A gsl_rng, already initialized (e.g., via apop_rng_alloc). (default: see Auto-allocated RNGs) |
check_conjugacy subfunction), is a little short, and can always be longer.
| gsl_vector* apop_vector_moving_average | ( | gsl_vector * | v, | |
| size_t | bandwidth | |||
| ) |
Return a new vector that is the moving average of the input vector.
| v | The input vector, unsmoothed | |
| bandwidth | The number of elements to be smoothed. |
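A moving average over a fixed window can be computed in one pass with a running sum. The sketch below works on a plain array; `moving_average` is a hypothetical name, and the choice to return only the n - bandwidth + 1 full windows (no padding at the edges) is an assumption, since the edge handling is not specified here.

```c
#include <stddef.h>
#include <stdlib.h>

/* Returns a newly allocated array of n - bandwidth + 1 window means, or NULL
   if bandwidth is zero, exceeds n, or allocation fails. The caller frees it. */
static double *moving_average(const double *v, size_t n, size_t bandwidth) {
    if (bandwidth == 0 || bandwidth > n) return NULL;
    size_t out_len = n - bandwidth + 1;
    double *out = malloc(out_len * sizeof *out);
    if (!out) return NULL;
    double window = 0;
    for (size_t i = 0; i < bandwidth; i++) window += v[i];
    out[0] = window / bandwidth;
    for (size_t i = 1; i < out_len; i++) {
        window += v[i + bandwidth - 1] - v[i - 1];  /* slide the window by one */
        out[i] = window / bandwidth;
    }
    return out;
}
```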
| double* apop_vector_percentiles | ( | gsl_vector * | data, | |
| char | rounding | |||
| ) |
Returns an array of size 101, where returned_vector[95] gives the value of the 95th percentile, for example. The entry returned_vector[100] is always the maximum value, and returned_vector[0] is always the minimum (regardless of rounding rule).
| data | a gsl_vector of data. (No default, must not be NULL.) | |
| rounding | This will be either 'u', 'd', or 'a'. Unless the number of data points is an exact multiple of 101, some percentiles will be ambiguous. If 'u', then round up (use the next highest value); if 'a', take the mean of the two nearest points; if 'd' (or anything else), round down to the next lowest value. If 'u' or 'a', then you can say "5% or more of the sample is below returned_vector[5]"; if 'd' or 'a', then you can say "5% or more of the sample is above returned_vector[5]". (Default = 'd'.) |
This function uses the Designated initializers syntax for inputs.