OPTIMAL DESIGNS FOR THE RESTRICTED MAXIMUM LIKELIHOOD ESTIMATORS IN A RANDOM SPLIT-PLOT MODEL

The design effect for the restricted maximum likelihood (REML) estimators of variance components in a completely randomized split-plot model is studied. The model represents the response from an experimental scenario in which both the whole-plot and split-plot factors are random. Groups of balanced designs were generated from the same number of experimental runs and compared for optimality using the derived Fisher information matrix of the REML estimators. The measure of optimality is the D-optimality criterion; the resulting optimal designs depend on the relative magnitudes of the true values of the variance components. The results show that when the factor variances are larger than the error variances, designs in which the absolute difference between the number of whole-plots and the number of levels of the split-plot factor is relatively small show a substantial gain in statistical efficiency over other designs.


Introduction
The importance of split-plot designs in industrial experiments has long been recognized. This is because it is typically infeasible to perform a complete randomization of experimental runs. Split-plot designs are used when one or more experimental factors are hard to change and the other experimental factors are easy to change; the factors are distinguished by the ease with which their levels can be changed from one experimental run to the next. For example, Box et al. (2005) described an experiment to study the corrosion resistance of steel bars treated with four coatings at three furnace temperatures. Furnace temperature is the hard-to-change factor because of the time it takes to reset the furnace and reach a new equilibrium temperature.
Many works in the literature on designing split-plot experiments assume fixed experimental factor effects. Bingham and Sitter (2001) used a split-plot model for experiments in the wood industry. Goos and Vandebroek (2001) computed D-optimal split-plot designs for an autonomously determined number and size of whole-plots. Goos and Vandebroek (2003) later extended the work by developing an algorithm to construct D-optimal designs for the number and size of whole-plots. Goos and Donev (2007) presented an algorithmic approach to construct tailor-made split-plot experiments without having to specify a candidate set. The results of these works cannot be applied to situations where the levels of one or more experimental factors are random variables.
This article is devoted to obtaining an optimal design for a balanced completely randomized split-plot experiment with random whole-plot and split-plot factors, where the ANOVA sums of squares are orthogonal. The main aim is to allocate experimental resources for a fixed number of runs. Consider an experiment to study the variation in the intensity of radiation from a furnace at different temperatures and locations. Because of the time it would take to reset the furnace temperature for each run, temperature is the hard-to-change (whole-plot) factor and location is the easy-to-change (split-plot) factor. The levels of temperature and the locations are randomly chosen from well-defined ranges, and the sizes of their variances indicate the variability in the intensity of radiation across locations and temperatures.
In the next section we describe the split-plot model used to represent the response from such experiments, the variance structure, and the steps to derive the Fisher information matrix of the restricted maximum likelihood (REML) estimators of the variance components. In Section 3 we describe the procedure for generating the design space and present the computer algorithm written to implement the methodology. In Section 4 the generated designs are compared for D-optimality, with the aid of the algorithm, using selected configurations. Conclusions are presented in Section 5.

Split-plot model
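The model for the split-plot design with one whole-plot factor (Factor A) and one split-plot factor (Factor B) expresses each response as the sum of an overall mean, the random effects of the two factors and their interaction, a whole-plot error and a split-plot error.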
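For orientation, a model of this kind is commonly written as in the sketch below. The symbols and the index convention are assumptions made here for illustration (i indexes the levels of A, k the whole-plots within a level of A, j the levels of B) and need not match the authors' notation.

```latex
y_{ijk} = \mu + \alpha_i + \delta_{k(i)} + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},
\qquad i = 1,\dots,a,\quad k = 1,\dots,r,\quad j = 1,\dots,b .
```

In this sketch $\alpha_i$, $\beta_j$ and $(\alpha\beta)_{ij}$ are the random effects of the whole-plot factor, the split-plot factor and their interaction, $\delta_{k(i)}$ is the whole-plot error and $\varepsilon_{ijk}$ is the split-plot error, each with mean zero and its own variance component.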

Matrix Formulation of the Random Effects Model
In matrix notation, the full random model is written in terms of Y, a vector of N observations; $\mu 1_N$, a vector of means; and the Z's, indicator matrices (matrices of 0s and 1s) associated with the individual random effects and the two error terms. The random effects are assumed to be mutually and completely uncorrelated.
Under these assumptions, the variance-covariance matrix of the observations, $V = \operatorname{var}(y)$, can be written in terms of the variance components and the Z matrices. The structure of the covariance matrix depends on the design. The inverse of V can be obtained by following the general procedure given by Searle (1979), who also gives the general form for V.
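As an illustration, the matrix model and its covariance can be sketched with Kronecker products. The notation is assumed, not taken from the authors: $\alpha$, $\beta$, $(\alpha\beta)$ are the factor and interaction effects, $\delta$ is the whole-plot error, $\varepsilon$ is the split-plot error, and the observations are ordered with the level of A varying slowest and the level of B varying fastest.

```latex
\begin{aligned}
Y &= \mu 1_N + Z_1\alpha + Z_2\delta + Z_3\beta + Z_4(\alpha\beta) + Z_5\varepsilon, \\
Z_1 &= I_a \otimes 1_r \otimes 1_b, \quad
Z_2 = I_a \otimes I_r \otimes 1_b, \quad
Z_3 = 1_a \otimes 1_r \otimes I_b, \quad
Z_4 = I_a \otimes 1_r \otimes I_b, \quad
Z_5 = I_N, \\
V &= \sigma^2_{\alpha}\,(I_a \otimes J_r \otimes J_b)
   + \sigma^2_{\delta}\,(I_a \otimes I_r \otimes J_b)
   + \sigma^2_{\beta}\,(J_a \otimes J_r \otimes I_b)
   + \sigma^2_{\alpha\beta}\,(I_a \otimes J_r \otimes I_b)
   + \sigma^2_{\varepsilon}\, I_N .
\end{aligned}
```

Each term of V is $\sigma^2_i Z_i Z_i'$, which follows directly from the assumption that the random effects are uncorrelated; $1_n$ denotes a column of n ones and $J_n = 1_n 1_n'$.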

Following the procedure, (5) is obtained explicitly in terms of $J_a$, the matrix of order a with all elements unity, and $I_a$, the identity matrix of order a. Simplifying (5) by joining similar terms and carrying out the Kronecker products gives the final expression.

Method of Estimation
In estimating variance components, the restricted maximum likelihood (REML) estimators are derived by maximizing the part of the likelihood function that is location invariant. This is a useful property, since it allows the variance components to be estimated while taking into account the degrees of freedom used in estimating the fixed effects. The resulting estimators are the same as the ANOVA estimators, which are minimum variance unbiased estimators. The Fisher information (FI) matrix of the REML estimators, as given by Searle et al., is $I(\sigma^2) = \tfrac{1}{2}\{\operatorname{sesq}(Z_i' P Z_j)\}$ (8), where $\operatorname{sesq}(Z_i' P Z_j)$ is the sum of squares of the elements of the matrix $Z_i' P Z_j$. The matrix in (8) has fifteen distinct elements. In this work, we seek the design that maximizes the determinant of the derived Fisher information matrix.
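For a balanced design, the REML information matrix can be evaluated without building any N × N matrix. The ANOVA strata are eigenspaces of V, so the trace form $\tfrac{1}{2}\operatorname{tr}(P Z_i Z_i' P Z_j Z_j')$ reduces to $\tfrac{1}{2}\sum_k f_k c_{ki} c_{kj} / \lambda_k^2$, where $f_k$ is the degrees of freedom of stratum k, $\lambda_k$ its expected mean square, and $c_{ki}$ the coefficient of component i in $\lambda_k$. The Python sketch below takes that route. It is not the authors' R program; the function names and the component order (A, B, A×B, whole-plot error, split-plot error) are assumptions made for illustration.

```python
def strata(a, r, b):
    """Degrees of freedom and expected-mean-square coefficients per stratum.

    Coefficients follow the ASSUMED component order
    (A, B, AxB, whole-plot error, split-plot error).
    """
    return [
        (a - 1,                 (r * b, 0,     r, b, 1)),  # factor A
        (a * (r - 1),           (0,     0,     0, b, 1)),  # whole-plots within A
        (b - 1,                 (0,     a * r, r, 0, 1)),  # factor B
        ((a - 1) * (b - 1),     (0,     0,     r, 0, 1)),  # A x B interaction
        (a * (r - 1) * (b - 1), (0,     0,     0, 0, 1)),  # split-plot error
    ]


def reml_information(a, r, b, sigma2):
    """5 x 5 REML information matrix for the five variance components."""
    info = [[0.0] * 5 for _ in range(5)]
    for df, coef in strata(a, r, b):
        # Expected mean square of this stratum.
        lam = sum(c * s for c, s in zip(coef, sigma2))
        weight = df / (2.0 * lam * lam)
        for i in range(5):
            for j in range(5):
                info[i][j] += weight * coef[i] * coef[j]
    return info


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n):
                m[i][j] -= factor * m[col][j]
    return det


def d_value(a, r, b, sigma2):
    """D-criterion: determinant of the REML information matrix."""
    return determinant(reml_information(a, r, b, sigma2))
```

A call such as `d_value(2, 3, 5, [2.8, 2.7, 2.5, 1, 1])` would evaluate one N = 30 candidate, provided the vector is read in the assumed component order.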

Design Space Generation
The steps to generate the candidate designs corresponding to a given number of experimental runs for the random effects model are discussed in this section.
For a fixed number of experimental runs $N = Rb = arb$, where $R = ar$ is the number of whole-plots, the design space consists of all distinct combinations of a, b and r satisfying the following conditions:
(i) R takes every non-prime value in the range 4 ≤ R ≤ N/2 that is a factor of N.
(ii) b = N/R for each R generated in (i).
(iii) 2 ≤ a < R, where a is a factor of R.
Here a = the number of levels of the whole-plot factor, b = the number of levels of the split-plot factor, and r = the number of whole-plots within each level of the whole-plot factor A.
For each R, the number of possible designs equals the number of ways of distributing R equally into a groups, i.e. the number of ways of distributing the whole-plots among the levels of the whole-plot factor so as to form a balanced one-way design.
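The conditions above can be sketched as a short enumeration. This is an illustrative Python version, not the authors' R code; the function names are hypothetical, and condition (ii) is taken to be b = N/R, which is an assumption based on N = Rb.

```python
from math import isqrt


def is_prime(n):
    """True if n is a prime number."""
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))


def design_space(n_runs):
    """All balanced (a, r, b) designs for a fixed number of runs N = a*r*b.

    R = a*r is the number of whole-plots.
    """
    designs = []
    for big_r in range(4, n_runs // 2 + 1):
        # (i) R is a non-prime factor of N with 4 <= R <= N/2.
        if n_runs % big_r != 0 or is_prime(big_r):
            continue
        # (ii) ASSUMED: b is fixed by N = R*b.
        b = n_runs // big_r
        # (iii) 2 <= a < R and a divides R, so r = R/a >= 2.
        for a in range(2, big_r):
            if big_r % a == 0:
                designs.append((a, big_r // a, b))
    return designs
```

Under these assumptions, `design_space(30)` yields six candidates (a, r, b): (2, 3, 5), (3, 2, 5), (2, 5, 3), (5, 2, 3), (3, 5, 2) and (5, 3, 2).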

The Algorithm
A computer program was written in the R programming language to implement the methodology presented above. The general algorithm is as follows:
Step 1: Choose the number of experimental runs, e.g. N = 30 (the design space is generated).
Step 2: Specify the available information about each variance component.
Step 3: Calculate the criterion of optimality, i.e. the determinant (D-value), for every candidate design in the design space, and identify the optimal designs according to the criterion.
The R program implementing this algorithm will be made available upon request.
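The R program itself is not reproduced here. As an illustration only, the three steps can be sketched end to end in Python. The sketch makes three assumptions that are not taken from the source: the names are hypothetical, the variance components are ordered (A, B, A×B, whole-plot error, split-plot error), and the split-plot levels are fixed by b = N/R. For a balanced design the determinant has the closed form $(a r^3 b^2)^2 \prod_k f_k / (2\lambda_k^2)$ over the five ANOVA strata, which the code uses directly.

```python
from math import isqrt, prod


def candidate_designs(n_runs):
    """Step 1: enumerate balanced (a, r, b) designs with a*r*b = N."""
    out = []
    for big_r in range(4, n_runs // 2 + 1):
        composite = any(big_r % d == 0 for d in range(2, isqrt(big_r) + 1))
        if n_runs % big_r == 0 and composite:
            b = n_runs // big_r  # ASSUMED condition (ii)
            out += [(a, big_r // a, b) for a in range(2, big_r) if big_r % a == 0]
    return out


def d_value(a, r, b, sigma2):
    """Step 3: determinant of the REML information matrix (closed form)."""
    s_a, s_b, s_ab, s_wp, s_sp = sigma2  # ASSUMED component order
    # (degrees of freedom, expected mean square) for each ANOVA stratum.
    strata = [
        (a - 1,                 s_sp + b * s_wp + r * s_ab + r * b * s_a),
        (a * (r - 1),           s_sp + b * s_wp),
        (b - 1,                 s_sp + r * s_ab + a * r * s_b),
        ((a - 1) * (b - 1),     s_sp + r * s_ab),
        (a * (r - 1) * (b - 1), s_sp),
    ]
    # Jacobian of the map from variance components to expected mean squares.
    jacobian = a * r ** 3 * b ** 2
    return jacobian ** 2 * prod(df / (2.0 * lam ** 2) for df, lam in strata)


def rank_candidates(n_runs, sigma2):
    """Steps 1-3: all candidates with their D-values, best first."""
    scored = [(d, d_value(*d, sigma2)) for d in candidate_designs(n_runs)]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Step 2 corresponds to the `sigma2` argument: the first entry of `rank_candidates(30, sigma2)` is the D-optimal candidate for that preliminary vector and the last entry is the D-worst.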

Comparison and Optimality
In this section, comparisons are made using selected configurations of the magnitudes of the variance components. In many studies on variance components, the factor variance components are larger than the error variance components. Goos (2002) recommended that, whenever there is no prior information about the ratio of the error variances, this ratio be set equal to one. The work therefore compares the designs under a set of configurations of the variance components.

The idea is to allow each factor variance component in turn to be the dominating variance component.

Results when the whole-plot variance is largest
Two configurations {(i) and (ii) above} and three preliminary vectors were used to compare designs when the whole-plot variance is the largest variance. The three preliminary vectors are [2.8, 2.7, 2.5, 1, 1], [6.5, 1, 0.9, 0.8, 0.8] and [3.8, 0.9, 3.7, 0.8, 0.8]. Tables 1, 2 and 3 show the D-optimal designs (designs that have the largest determinant of the derived Fisher information matrix), the D-worst designs (designs that have the lowest determinant of the derived information matrix) and the D-efficiency corresponding to N = 24, 30, 36 for each of the three preliminary vectors. The D-efficiency of a given design compares the determinant of that design with the determinant of another comparable design; the D in the terms D-optimal, D-worst, D-value and D-efficiency stands for determinant.
The relative D-efficiency of design 1 compared to design 2 is given as $D_{\text{eff}} = (D_1 / D_2)^{1/v}$, where $D_1$ is the determinant of the Fisher information matrix of design 1, $D_2$ is the determinant of the Fisher information matrix of design 2, and v is the number of parameters corresponding to the factors and interaction effect. For the random effects model, v = 3. D-efficiency is used to evaluate the performance of the D-optimal designs relative to the D-worst designs.
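The efficiency formula is simple enough to sketch directly. The Python helper below is illustrative only: the function names are hypothetical, and expressing every candidate relative to the best design is one possible convention assumed here, not necessarily the one used for the tables.

```python
def d_efficiency(d1, d2, v=3):
    """Relative D-efficiency of design 1 versus design 2: (D1 / D2) ** (1 / v)."""
    if d1 < 0 or d2 <= 0:
        raise ValueError("determinants must be non-negative, with D2 > 0")
    return (d1 / d2) ** (1.0 / v)


def rank_by_efficiency(d_values, v=3):
    """Efficiency of each design relative to the best one, best first.

    d_values maps a design label to the determinant of its information matrix.
    """
    best = max(d_values.values())
    ranked = [(label, d_efficiency(d, best, v)) for label, d in d_values.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```

With v = 3, a design whose determinant is one eighth of the best design's determinant has a relative efficiency of about 0.5; equivalently, the best design has an efficiency of about 2 relative to it.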

Discussion of Results
The optimal designs and the worst designs corresponding to each N are unchanged in seven of the nine preliminary vectors used. The changes are observed in the last configuration of sub-sections 4.1 and 4.3. In general, designs where the absolute difference between the number of whole-plots and the number of levels of the split-plot factor, $|R - b|$, is relatively small are the D-optimal designs, while designs where this difference is large are the worst designs. The designs generated for every N can be classified using the structure of their master design (the design at the first level of randomization): designs are classified as Group-A or Group-B according to the relation between a and r. It is observed that many designs in Group-A have higher efficiency values than designs in Group-B.

Summary and Conclusions
This work embarks on a search for D-optimal designs in a completely randomized split-plot experiment. The whole-plot factor and the split-plot factor are assumed to be random factors. A design construction algorithm was written to generate a group of balanced designs and to compare them for D-optimality using the derived asymptotic variances of the REML estimators. The designs generated were classified based on the structure of their master design. Comparisons were made using nine different preliminary vectors of magnitudes of the variance components, in which the factor variances are larger than the error variances. The results show that designs in Group-A with relatively small $|R - b|$ show a substantial gain in statistical efficiency over other designs. Although three preliminary vectors were used in each case for comparisons in this work, the results of several other preliminary vectors where the factor variances were larger than
Here a = the number of levels of the whole-plot factor, b = the number of levels of the split-plot factor, and r = the number of whole-plots within each level of the whole-plot factor A.

Table 1: D-optimal designs and D-worst designs for different numbers of experimental runs

Table 2: D-optimal designs and D-worst designs for different numbers of experimental runs

Table 3: D-optimal designs and D-worst designs for different numbers of experimental runs

Table 4: D-optimal designs and D-worst designs for different numbers of experimental runs

Table 5: D-optimal designs and D-worst designs for different numbers of experimental runs

Table 6: D-optimal designs and D-worst designs for different numbers of experimental runs

Table 7: D-optimal designs and D-worst designs for different numbers of experimental runs

Table 8: D-optimal designs and D-worst designs for different numbers of experimental runs

Table 10 below shows the efficiency values for all candidate designs corresponding to N=56 using [2.5, 2.7, 2.8, 1, 1]. Figure 1 also shows the efficiency values for all candidate designs corresponding to N=30 using [2.8, 2.5, 2.7, 1, 1].

Table 10: D-efficiency (REML) of candidate designs for N=56 using [2.5, 2.7, 2.8, 1, 1]