Effect of anisotropy coefficient on error rates of linear discriminant functions

The paper deals with statistical classification of spatial data as part of a widely applicable statistical approach to pattern recognition. Error rates in supervised classification of a Gaussian random field observation into one of two populations, specified by different constant means and a common stationary geometrically anisotropic covariance, are considered. A formula for the exact Bayesian error rate is derived. The influence of the anisotropy ratio on the error rates is evaluated numerically for the case of complete parametric certainty.


Introduction
The extension of classical statistical classification techniques to spatial data analysis is a problem of practical interest. Supervised statistical classification, sometimes called discriminant analysis (DA) (see, e.g., McLachlan [4]), traditionally assumes that the observation to be classified and the classified observations in the training sample are independent. However, in practical situations with spatially distributed data this is usually not the case: data that are close together in space are likely to be correlated. It is therefore important to include spatial dependence in the classification problem.
When the populations are completely specified (the case of complete parametric certainty), the optimal classification rule in the sense of minimum misclassification probability is the Bayesian classification rule (BCR) (Anderson [1]). Switzer [5] was the first to treat classification of spatial data, and his work was extended by Mardia [3]. However, neither of these authors analysed the error rate of classification.
In this paper we derive a formula for the Bayes error rate, assuming that the training sample observations and the observation to be classified are spatially correlated.

Statistical models for spatial population
In this paper, we consider the performance of the BCR and of the plug-in BCR with parameter estimators obtained from the training sample.
Suppose that the observation to be classified and the training sample are observations of a Gaussian random field $\{Z(s): s \in D \subset \mathbb{R}^m\}$.
The model of $Z(s)$ in population $l$ is $Z(s) = \mu_l + \varepsilon(s)$, where $\mu_l$ is an unknown constant mean ($l = 1, 2$). The error term is generated by a zero-mean stationary Gaussian random field $\{\varepsilon(s): s \in D\}$ with covariance function defined, for all $s, u \in D$, by $\operatorname{cov}\{\varepsilon(s), \varepsilon(u)\} = C(s - u)$, where $C(\cdot)$ is a covariance function of known parametric structure.
Geostatisticians usually use the semivariogram, rather than the covariance, as the measure of spatial variation. Under the assumption of stationarity the semivariogram is defined as $\gamma(h) = \frac{1}{2}\operatorname{var}\{Z(s + h) - Z(s)\}$. There are three commonly used parameters of a stationary and isotropic semivariogram. If $\lim_{|h| \to \infty} \gamma(|h|) = \gamma_\infty < \infty$, then $\gamma_\infty$ is called the sill of the semivariogram. If $\gamma(|h|) \to \theta_0 > 0$ as $|h| \to 0$, then $\theta_0$ is called the nugget effect; it reflects purely random variation in population density, or it may be associated with sampling error. The third parameter is the range of influence, which defines the distance within which observations are correlated.
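To make the three parameters concrete, the following minimal sketch evaluates one common parametric form, the exponential semivariogram with a nugget. The choice of the exponential model here, and the names `exponential_semivariogram`, `nugget`, `sill` and `range_param`, are assumptions made for illustration and are not taken from the paper.

```python
import math


def exponential_semivariogram(h, nugget, sill, range_param):
    """Exponential semivariogram with a nugget effect (illustrative form).

    h           -- separation distance |h| >= 0
    nugget      -- theta_0, the limit of gamma as |h| -> 0
    sill        -- gamma_inf, the limit of gamma as |h| -> infinity
    range_param -- scale of the distance within which observations are correlated
    """
    if h == 0:
        return 0.0  # gamma(0) = 0 by definition of the semivariogram
    partial_sill = sill - nugget
    return nugget + partial_sill * (1.0 - math.exp(-h / range_param))


# Hypothetical parameter values, chosen only to show the behaviour.
for distance in (0.0, 0.5, 2.0, 10.0):
    print(distance, exponential_semivariogram(distance, 0.2, 1.0, 2.0))
```

Near the origin the function jumps from 0 to roughly the nugget, and for large distances it levels off at the sill, which is the behaviour described in the paragraph above.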
In many applications there is empirical evidence of directional effects in the covariance structure. The simplest way to deal with this is to introduce geometric anisotropy into the assumed covariance structure. Geometric anisotropy means that the models of the covariances (or semivariograms) have the same nugget and sill but different ranges in two perpendicular directions (Wackernagel [6]).
In other words, the correlation is stronger in one direction than in the others. If one plots the directional ranges in the 2D case, they fall on an ellipse whose major and minor axes correspond to the longest and shortest ranges of the directional semivariograms.
Algebraically, it adds two more parameters to the isotropic model: the anisotropy angle $\varphi$ and the anisotropy ratio $\lambda$. In this paper we restrict our attention to the nuggetless covariance model, i.e., $C(h) = \sigma^2 r(h)$, where $\sigma^2$ is the variance (sill) and $r(h)$ is the spatial correlation function.
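As an illustration of how the angle and the ratio enter a nuggetless covariance, the sketch below computes an exponential correlation under geometric anisotropy. The angle convention (measured clockwise from the positive y axis) is an assumption, chosen so that an angle of π/2 gives a distance built from h_x² + λ² h_y², as in the numerical example; the function names and the scale parameter `alpha` are likewise hypothetical.

```python
import math


def anisotropic_distance(hx, hy, angle, ratio):
    """Distance after rotating to the anisotropy axes and rescaling one axis.

    Assumed convention: `angle` is measured clockwise from the positive y axis
    and `ratio` stretches the coordinate perpendicular to that direction.
    """
    along = hx * math.sin(angle) + hy * math.cos(angle)
    across = hx * math.cos(angle) - hy * math.sin(angle)
    return math.sqrt(along ** 2 + (ratio * across) ** 2)


def correlation(hx, hy, angle, ratio, alpha):
    """Exponential correlation r(h) evaluated at the anisotropic distance."""
    return math.exp(-anisotropic_distance(hx, hy, angle, ratio) / alpha)


def covariance(hx, hy, sigma2, angle, ratio, alpha):
    """Nuggetless covariance C(h) = sigma^2 * r(h)."""
    return sigma2 * correlation(hx, hy, angle, ratio, alpha)


# With ratio = 1 the model is isotropic; with ratio = 2 and angle = pi/2,
# a unit step in y counts as two units of distance.
print(correlation(0.0, 1.0, math.pi / 2, 1.0, 2.0))
print(correlation(0.0, 1.0, math.pi / 2, 2.0, 2.0))
```

Setting the ratio to 1 recovers the isotropic model for any angle, which is a quick sanity check on the convention.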
Fitting geometrically anisotropic semivariogram models to environmental data can easily be carried out with the R package gstat (see [2]).

Bayesian error rate in complete parametric certainty
Consider the problem of classifying the observation $Z_0 = Z(s_0)$, with $s_0 \in D$, into one of the two populations specified above, given the training sample $T$. The training sample is specified by $T = (T_1, T_2)$, where $T_l$ is the vector formed by the $n_l$ observations of $Z(s)$ from population $l$, $l = 1, 2$. Let $n_1 + n_2 = n$.
Denote by $R$ the $n \times n$ correlation matrix of the vector $T$ and by $r_0$ the vector of correlations between $Z_0$ and $T$. Since the observation $Z_0$ is correlated with the training sample, we have to deal with the conditional means $\mu^0_{lT}$ ($l = 1, 2$) and the conditional variance $\sigma^2_{0T}$ of $Z_0$ given $T$. These are expressed through $r_0$, $R$ and $1_n$, the $n$-vector of ones.
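For orientation, standard conditioning of jointly Gaussian variables gives expressions of the following form. This is a sketch under the assumptions that $Z(s) = \mu_l + \varepsilon(s)$ in population $l$ and that the covariance is nuggetless, $C(h) = \sigma^2 r(h)$; the mean vector $M$ and the identification of the factor $K$ with $1 - r_0' R^{-1} r_0$ are assumptions introduced here and may differ from the paper's own notation.

```latex
\begin{aligned}
\mu^{0}_{lT} &= \mu_l + r_0' R^{-1} (T - M), \qquad l = 1, 2, \\
\sigma^{2}_{0T} &= \sigma^{2} K, \qquad K = 1 - r_0' R^{-1} r_0, \\
M &= \operatorname{E}(T) = \bigl(\mu_1 1_{n_1}',\; \mu_2 1_{n_2}'\bigr)'.
\end{aligned}
```

Under these expressions the two conditional means differ by $\mu_1 - \mu_2$, exactly as the unconditional means do, while the conditional variance is reduced by the factor $K \le 1$.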
In this section we assume that all covariance parameters are fixed except the anisotropy ratio $\lambda$ defined in (1).
Then the spatial correlation function $r(h)$ may be considered as a function of $\lambda$, and it follows that $K$ is also a function of $\lambda$, i.e., $K = K(\lambda)$.
So we can consider $P_B$ as a function of the anisotropy ratio $\lambda$.
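A sketch of how this dependence can be written out, assuming that the conditional variance of $Z_0$ given $T$ has the form $\sigma^2 K(\lambda)$ and that the conditional means differ by $\mu_1 - \mu_2$. The symbols $\Delta = |\mu_1 - \mu_2|/\sigma$, $\Delta_0$ and $\gamma$ are notation assumed here, and $\pi_1, \pi_2$ are the prior probabilities of the two populations. For two Gaussian populations with a common variance, the Bayes error rate takes the standard form:

```latex
\begin{aligned}
\Delta_0(\lambda) &= \frac{\Delta}{\sqrt{K(\lambda)}}, \qquad \gamma = \ln(\pi_1 / \pi_2), \\
P_B(\lambda) &= \sum_{l=1}^{2} \pi_l \,
  \Phi\!\left( -\frac{\Delta_0(\lambda)}{2} + (-1)^{l} \frac{\gamma}{\Delta_0(\lambda)} \right),
\end{aligned}
```

where $\Phi$ is the standard normal distribution function. With equal priors, $\gamma = 0$ and the expression reduces to $\Phi(-\Delta_0(\lambda)/2)$.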

Numerical example
As an example we consider the case where $D$ is a regular two-dimensional integer lattice. We assume that the location of the observation to be classified is $s_0 = (3, 3)$ and that the training locations include $(1, 1), (2, 1), \ldots$ Here $D_i$ is the set of points in $D$ where the training sample $T_i$ is taken, $i = 1, 2$, so $n_1 = n_2 = 4$. Let $h = (h_x, h_y)$. We consider the exponential geometrically anisotropic correlation function with anisotropy angle $\varphi = \pi/2$, specified by $r(h) = \exp\{-\sqrt{h_x^2 + \lambda^2 h_y^2}/\alpha\}$.
The example was used to evaluate numerically the effect of the anisotropy ratio on the error rates of classification in the case of complete parametric certainty. With an insignificant loss of generality, the case with $\pi_1 = \pi_2 = 0.5$ and $\alpha = 2$ was considered. Then the Bayesian error rate is $P_B = \Phi(-\Delta_0/2)$, where $\Phi$ is the standard normal distribution function. Denote by $\Phi(-\Delta_0^*/2)$ the Bayes error rate for the given sampling design with $\lambda = 1$. Three values of $\Delta$, namely 0.5, 1 and 2, were chosen to represent weak, moderate and strong separation between the populations. Table 1 presents the values of $\Phi(-\Delta_0/2)$ and $\eta = \Phi(-\Delta_0/2)/\Phi(-\Delta_0^*/2)$. The ratio $\eta$ can be used as a relative measure of the effect of $\lambda$ on the Bayesian error rate.
The figures in Table 1 show that the effect of the anisotropy ratio on the Bayesian error rate is greater when the separation between classes is stronger.
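A computation of this kind can be sketched as follows. The training locations in `TRAIN` are hypothetical (the full design is not listed in the text), so the printed numbers are not the entries of Table 1. The sketch also assumes equal priors, the factor K = 1 − r_0' R⁻¹ r_0, Δ_0 = Δ/√K with Δ = |µ_1 − µ_2|/σ, and uses hypothetical function names throughout.

```python
import math

S0 = (3, 3)  # location of the observation to be classified
# Hypothetical training locations on the integer lattice (not the paper's design).
TRAIN = [(1, 1), (2, 1), (1, 2), (2, 2), (4, 4), (5, 4), (4, 5), (5, 5)]
ALPHA = 2.0


def corr(s, u, lam, alpha=ALPHA):
    """Exponential correlation with anisotropy ratio lam and angle pi/2."""
    hx, hy = s[0] - u[0], s[1] - u[1]
    return math.exp(-math.sqrt(hx ** 2 + (lam * hy) ** 2) / alpha)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def k_factor(lam):
    """Assumed conditional variance factor K(lam) = 1 - r0' R^{-1} r0."""
    big_r = [[corr(s, u, lam) for u in TRAIN] for s in TRAIN]
    r0 = [corr(S0, s, lam) for s in TRAIN]
    weights = solve(big_r, r0)
    return 1.0 - sum(a * b for a, b in zip(r0, weights))


def phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bayes_error(delta, lam):
    """Equal-prior Bayes error rate Phi(-Delta_0 / 2), Delta_0 = delta / sqrt(K)."""
    return phi(-delta / (2.0 * math.sqrt(k_factor(lam))))


def eta(delta, lam):
    """Error rate relative to the isotropic case lam = 1."""
    return bayes_error(delta, lam) / bayes_error(delta, 1.0)


for delta in (0.5, 1.0, 2.0):
    for lam in (0.5, 1.0, 2.0):  # hypothetical anisotropy ratios
        print(delta, lam, bayes_error(delta, lam), eta(delta, lam))
```

Because the design enters only through R and r_0, replacing `TRAIN` with the actual training locations is the only change needed to evaluate a different sampling design under the same assumptions.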