Fractional factorial design

In statistics, fractional factorial designs are experimental designs consisting of a carefully chosen subset (fraction) of the experimental runs of a full factorial design. The subset is chosen so as to exploit the sparsity-of-effects principle to expose information about the most important features of the problem studied, while using a fraction of the effort of a full factorial design in terms of experimental runs and resources.

Notation

Fractional designs are expressed using the notation l^k − p, where l is the number of levels of each factor investigated, k is the number of factors investigated, and p describes the size of the fraction of the full factorial used. Formally, p is the number of generators, assignments as to which effects or interactions are confounded, i.e., cannot be estimated independently of each other (see below). A design with p such generators is a 1/(l^p) fraction of the full factorial design.

For example, a 2^5 − 2 design is 1/4 of a two level, five factor factorial design. Rather than the 32 runs that would be required for the full 2⁵ factorial experiment, this experiment requires only eight runs.

In practice, one rarely encounters l > 2 levels in fractional factorial designs, since response surface methodology is a much more experimentally efficient way to determine the relationship between the experimental response and factors at multiple levels. In addition, the methodology to generate such designs for more than two levels is much more cumbersome.

The levels of a factor are commonly coded as +1 for the higher level, and −1 for the lower level. For a three-level factor, the intermediate value is coded as 0.

To save space, the points in a two-level factorial experiment are often abbreviated with strings of plus and minus signs. The strings have as many symbols as factors, and their values dictate the level of each factor: conventionally, $-$ for the first (or low) level, and $+$ for the second (or high) level. The points in this experiment can thus be represented as $--$ , $+-$ , $-+$ , and $++$ .

The factorial points can also be abbreviated by (1), a, b, and ab, where the presence of a letter indicates that the specified factor is at its high (or second) level and the absence of a letter indicates that the specified factor is at its low (or first) level (for example, "a" indicates that factor A is on its high setting, while all other factors are at their low (or first) setting). (1) is used to indicate that all factors are at their lowest (or first) values.

Generation

In practice, experimenters typically rely on statistical reference books to supply the "standard" fractional factorial designs, consisting of the principal fraction. The principal fraction is the set of treatment combinations for which the generators evaluate to + under the treatment combination algebra. However, in some situations, experimenters may take it upon themselves to generate their own fractional design.

A fractional factorial experiment is generated from a full factorial experiment by choosing an alias structure. The alias structure determines which effects are confounded with each other. For example, the five factor 2^5 − 2 can be generated by using a full three factor factorial experiment involving three factors (say A, B, and C) and then choosing to confound the two remaining factors D and E with interactions generated by D = A*B and E = A*C. These two expressions are called the generators of the design. So for example, when the experiment is run and the experimenter estimates the effects for factor D, what is really being estimated is a combination of the main effect of D and the two-factor interaction involving A and B.

An important characteristic of a fractional design is the defining relation, which gives the set of interaction columns equal in the design matrix to a column of plus signs, denoted by I. For the above example, since D = AB and E = AC, then ABD and ACE are both columns of plus signs, and consequently so is BDCE. In this case the defining relation of the fractional design is I = ABD = ACE = BCDE. The defining relation allows the alias pattern of the design to be determined.

Treatment combinations for a 2^5 − 2 design
Treatment combination	I	A	B	C	D = AB	E = AC
de	+	−	−	−	+	+
a	+	+	−	−	−	−
be	+	−	+	−	−	+
abd	+	+	+	−	+	−
cd	+	−	−	+	+	−
ace	+	+	−	+	−	+
bc	+	−	+	+	−	−
abcde	+	+	+	+	+	+

Resolution

An important property of a fractional design is its resolution or ability to separate main effects and low-order interactions from one another. Formally, the resolution of the design is the minimum word length in the defining relation excluding (1). The most important fractional designs are those of resolution III, IV, and V: Resolutions below III are not useful and resolutions above V are wasteful in that the expanded experimentation has no practical benefit in most cases—the bulk of the additional effort goes into the estimation of very high-order interactions which rarely occur in practice. The 2^5 − 2 design above is resolution III since its defining relation is I = ABD = ACE = BCDE.

Resolution	Ability	Example
I	Not useful: an experiment of exactly one run only tests one level of a factor and hence can't even distinguish between the high and low levels of that factor	2^1 − 1 with defining relation I = A
II	Not useful: main effects are confounded with other main effects	2^2 − 1 with defining relation I = AB
III	Estimate main effects, but these may be confounded with two-factor interactions	2^3 − 1 with defining relation I = ABC
IV	Estimate main effects unconfounded by two-factor interactions Estimate two-factor interaction effects, but these may be confounded with other two-factor interactions	2^4 − 1 with defining relation I = ABCD
V	Estimate main effects unconfounded by three-factor (or less) interactions Estimate two-factor interaction effects unconfounded by two-factor interactions Estimate three-factor interaction effects, but these may be confounded with other two-factor interactions	2^5 − 1 with defining relation I = ABCDE
VI	Estimate main effects unconfounded by four-factor (or less) interactions Estimate two-factor interaction effects unconfounded by three-factor (or less) interactions Estimate three-factor interaction effects, but these may be confounded with other three-factor interactions	2^6 − 1 with defining relation I = ABCDEF

The resolution described is only used for regular designs. Regular designs have run size that equal a power of two, and only full aliasing is present. Nonregular designs are designs where run size is a multiple of 4; these designs introduce partial aliasing, and generalized resolution is used as design criteria instead of the resolution described previously.

References

Box, G.E.; Hunter, J.S.; Hunter,W.G. (2005). Statistics for Experimenters: Design, Innovation, and Discovery, 2nd Edition. Wiley. ISBN 0-471-71813-0.

External links

Design of experiments

Scientific method	Scientific experiment Statistical design Control Internal and external validity Experimental unit Blinding Optimal design: Bayesian Random assignment Randomization Restricted randomization Replication versus subsampling Sample size

Treatment and blocking	Treatment Effect size Contrast Interaction Confounding Orthogonality Blocking Covariate Nuisance variable

Models and inference	Linear regression Ordinary least squares Bayesian Random effect Mixed model Hierarchical model: Bayesian Analysis of variance (Anova) Cochran's theorem Manova (multivariate) Ancova (covariance) Compare means Multiple comparison

Designs Completely randomized	Factorial Fractional factorial Plackett-Burman Taguchi Response surface methodology Polynomial and rational modeling Box-Behnken Central composite Block Generalized randomized block design (GRBD) Latin square Graeco-Latin square Orthogonal array Latin hypercube Repeated measures design Crossover study Randomized controlled trial Sequential analysis Sequential probability ratio test

Glossary Category Statistics portal Statistical outline Statistical topics

Statistics

Descriptive statistics

Continuous data

Center	Mean arithmetic geometric harmonic Median Mode

Dispersion	Variance Standard deviation Coefficient of variation Percentile Range Interquartile range

Shape	Moments Skewness Kurtosis L-moments

Count data

Index of dispersion

Summary tables

Dependence

Graphics

Data collection

Study design	Population Statistic Effect size Statistical power Sample size determination Missing data

Survey methodology	Sampling Standard error stratified cluster Opinion poll Questionnaire

Controlled experiments	Design control optimal Controlled trial Randomized Random assignment Replication Blocking Interaction Factorial experiment

Uncontrolled studies	Observational study Natural experiment Quasi-experiment

Statistical inference

Statistical theory

Frequentist inference

Point estimation	Estimating equations Maximum likelihood Method of moments M-estimator Minimum distance Unbiased estimators Mean-unbiased minimum-variance Rao–Blackwellization Lehmann–Scheffé theorem Median unbiased Plug-in

Interval estimation	Confidence interval Pivot Likelihood interval Prediction interval Tolerance interval Resampling Bootstrap Jackknife

Testing hypotheses	1- & 2-tails Power Uniformly most powerful test Permutation test Randomization test Multiple comparisons

Parametric tests	Likelihood-ratio Wald Score

Specific tests

Z (normal) Student's t-test F

Goodness of fit	Chi-squared Kolmogorov–Smirnov Anderson–Darling Normality (Shapiro–Wilk) Likelihood-ratio test Model selection Cross validation AIC BIC

Rank statistics	Sign Sample median Signed rank (Wilcoxon) Hodges–Lehmann estimator Rank sum (Mann–Whitney) Nonparametric anova 1-way (Kruskal–Wallis) 2-way (Friedman) Ordered alternative (Jonckheere–Terpstra)

Bayesian inference

Correlation	Pearson product–moment Partial correlation Confounding variable Coefficient of determination

Regression analysis	Errors and residuals Regression model validation Mixed effects models Simultaneous equations models Multivariate adaptive regression splines (MARS)

Linear regression	Simple linear regression Ordinary least squares General linear model Bayesian regression

Non-standard predictors	Nonlinear regression Nonparametric Semiparametric Isotonic Robust Heteroscedasticity Homoscedasticity

Generalized linear model	Exponential families Logistic (Bernoulli) / Binomial / Poisson regressions

Partition of variance	Analysis of variance (ANOVA, anova) Analysis of covariance Multivariate ANOVA Degrees of freedom

Categorical / Multivariate / Time-series / Survival analysis

Categorical

Multivariate

Time-series

General	Decomposition Trend Stationarity Seasonal adjustment Exponential smoothing Cointegration Structural break Granger causality

Specific tests	Dickey–Fuller Johansen Q-statistic (Ljung–Box) Durbin–Watson Breusch–Godfrey

Time domain	Autocorrelation (ACF) partial (PACF) Cross-correlation (XCF) ARMA model ARIMA model (Box–Jenkins) Autoregressive conditional heteroskedasticity (ARCH) Vector autoregression (VAR)

Frequency domain	Spectral density estimation Fourier analysis Wavelet

Survival

Survival function	Kaplan–Meier estimator (product limit) Proportional hazards models Accelerated failure time (AFT) model First hitting time

Hazard function	Nelson–Aalen estimator

Test	Log-rank test

Applications

Biostatistics	Bioinformatics Clinical trials / studies Epidemiology Medical statistics

Engineering statistics	Chemometrics Methods engineering Probabilistic design Process / quality control Reliability System identification

Social statistics	Actuarial science Census Crime statistics Demography Econometrics National accounts Official statistics Population statistics Psychometrics

Spatial statistics	Cartography Environmental statistics Geographic information system Geostatistics Kriging

Category
Portal
Commons
WikiProject

This article is issued from Wikipedia - version of the 11/19/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.