% Grubbs tests for one or two outliers in a data sample
%
% Description:
%
%   Performs Grubbs' test for one outlier, two outliers on one tail,
%   or two outliers on opposite tails, in a small sample.
%
% Usage:
%
%   [pval,G,U] = grubbstest(x,type,opposite,twosided)
%
% Arguments:
%
%   x:        a numeric vector or matrix of data values. Matrices are
%             treated columnwise (each column as an independent set).
%
%   type:     integer value indicating the test variant. 10 is a test
%             for one outlier (the side is detected automatically and
%             can be reversed with the 'opposite' parameter). 11 is a
%             test for two outliers on opposite tails. 20 is a test for
%             two outliers on one tail. Default 10.
%
%   opposite: logical value (default 0). If true, the test checks not
%             the value with the largest difference from the mean but
%             the one on the opposite tail (the lowest value if the
%             most suspicious one is the highest, and so on).
%
%   twosided: logical value indicating whether the test should be
%             treated as two-sided. Default 0.
%
% Details:
%
%   The function can perform three tests given and discussed by Grubbs
%   (1950).
%
%   The first test (10) detects whether the sample contains one
%   outlier, statistically different from the other values. The test
%   is based on calculating the score G of this outlier (outlier minus
%   mean, divided by sd) and comparing it to the appropriate critical
%   values. An alternative method is to calculate the ratio of
%   variances of two datasets: the full dataset and the dataset
%   without the outlier. The resulting value, called U, is related to
%   G by a simple formula.
%
%   The second test (11) checks whether the lowest and highest values
%   are two outliers on opposite tails of the sample. It is based on
%   the ratio of the range to the standard deviation of the sample.
%
%   The third test (20) calculates the ratio of the variances of the
%   full sample and of the sample without the two extreme
%   observations. It is used to detect whether the dataset contains
%   two outliers on the same tail.
%
%   The p-values are calculated using the 'grubbscdf' function.
%
% Value:
%
%   G,U:  the test statistics. For type 10, G is the difference between
%         the outlier and the mean divided by the standard deviation;
%         for type 11, G is the sample range divided by the standard
%         deviation. U is the ratio of sample variances with and
%         without the suspicious outlier(s). According to Grubbs (1950),
%         for type 10 these values are related by a simple formula and
%         only one of them is needed, but the function returns both.
%         For type 20, G is the same as U.
%
%   pval: the p-value for the test.
%
% Author(s):
%
%   Lukasz Komsta, ported from R package 'outliers'.
%   See R News, 6(2):10-13, May 2006
%
% References:
%
%   Grubbs, F.E. (1950). Sample Criteria for testing outlying
%   observations. Ann. Math. Stat. 21, 1, 27-58.
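%
% Example:
%
%   A minimal sketch of the one-outlier test (type 10). The data vector
%   is made up for illustration. The hand computation assumes that this
%   port keeps the definitions used by the R package 'outliers' (sd
%   with an n-1 denominator, U rescaled by (n-2)/(n-1)); if the port
%   differs, the returned G and U may not match these values.
%
%     x = [1 2 3 4 10]';
%     [pval, G, U] = grubbstest(x, 10, 0, 0);
%
%     % Type 10 statistics computed by hand, for comparison:
%     n = numel(x);
%     [d, i] = max(abs(x - mean(x)));    % most suspicious value (x = 10)
%     Ghand = d / std(x);                % 6/sqrt(12.5), about 1.697
%     Uhand = var(x([1:i-1, i+1:n])) / var(x) * (n-2)/(n-1);   % 0.1
%
%   Under the same assumptions, the formula relating the two statistics
%   is U = 1 - n*G^2/(n-1)^2, which gives 1 - 5*2.88/16 = 0.1 here.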