Robust Feature Extraction for Hyperspectral Imagery Using Both Spatial and Spectral Redundancies

Jorge E. Pinzón1, Susan L. Ustin2, John F. Pierce3
1Dept. of Applied Mathematics
University of California Davis, CA, 95616
2Dept. of Land, Air, and Water Resources
University of California Davis, CA, 95616
3KT-Tech, Inc.
9801 Greenbelt Road, Suite 314, Lanham MD 20706

1 Introduction

As we move into the next century, a wide range of new satellite and airborne sensors will become available, bringing a variety of interesting problems for data analysis and signal processing. In particular, hyperspectral sensors, which provide both a large set of spatially contiguous spectra and a large set of spectrally contiguous images, will require new techniques that ideally treat the spatial and spectral patterns in the data simultaneously. Resolving the significant spectral and spatial properties associated with ecological processes and interactions is critical to successful interpretation of remote sensing data.

In hyperspectral imaging it is desirable to classify images within the conventional frame of reference of field and laboratory observations, using methods that avoid intrinsic singular problems. In this respect, spectral mixture analysis (SMA) has become a well-established procedure for analyzing imaging spectrometry data [17, 16, 11, 12, 3]. SMA is a structured and integrated framework that simultaneously addresses the mixed-pixel problem, calibration, and variations in lighting geometry, and it displays the results in terms of proportions of endmembers that can be related easily to standard ecological observational units (e.g., cover). The general form of the SMA equation for each band is expressed as:
 
$$R_b = \sum_{em} F_{em}\, R_{em,b} + E_b \qquad (1)$$
where Rb is the radiance at band b, Fem is the fraction coefficient of endmember em, which weights the endmember radiance Rem at band b, and Eb is an error term accounting for the unmodeled radiance in band b. Endmembers are chosen to explain the spectrally distinct materials that form the convex hull of the spectral volume. This approach works best when describing a few spectral types that, in various mixtures, can account for most of the variance in an image data set. It does not mean, however, that it is possible to identify any specific material. SMA works less well when the spectral features of interest are minor components of the total variance. In fact, SMA has the disadvantage, at least for this application, of approximating linearly the natural (non-linear) complexity of materials represented by the mixture of endmembers. This produces a non-unique mixing model for identifying and quantifying materials that occur at the sub-pixel scale [13]. In summary, the technique is relatively insensitive to subtle absorption features, and it produces significant quantification errors due to endmember variability from linear and nonlinear mixtures (e.g., from scattering and lighting geometry) in a pixel.
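The mixing equation can be illustrated with a minimal sketch, assuming NumPy and a plain unconstrained least-squares fit (practical SMA implementations usually add sum-to-one or non-negativity constraints, which are omitted here). The endmember spectra, the band count, and the function name `unmix` are hypothetical and chosen only for illustration.

```python
import numpy as np

def unmix(pixel, endmembers):
    """Least-squares fractions F_em for one pixel spectrum.

    pixel:      shape (n_bands,)
    endmembers: shape (n_endmembers, n_bands)
    Returns (fractions, residual); the residual plays the role of E_b.
    """
    A = endmembers.T                          # (n_bands, n_endmembers)
    fractions, *_ = np.linalg.lstsq(A, pixel, rcond=None)
    residual = pixel - A @ fractions
    return fractions, residual

# Hypothetical endmember spectra over 5 bands (illustrative numbers only).
endmembers = np.array([
    [0.05, 0.08, 0.45, 0.50, 0.30],   # "vegetation"
    [0.20, 0.25, 0.30, 0.35, 0.40],   # "soil"
    [0.02, 0.02, 0.02, 0.02, 0.02],   # "shade"
])
true_fractions = np.array([0.6, 0.3, 0.1])
pixel = true_fractions @ endmembers   # noise-free linear mixture

fractions, residual = unmix(pixel, endmembers)
print(np.round(fractions, 3))
```

Because the synthetic pixel is an exact linear mixture, the fit recovers the fractions and leaves a negligible residual; with real spectra the residual absorbs the nonlinear and unmodeled effects discussed above.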

Boardman [1] used a geometric approach based on the convex hull of the spectra projected into the mixing space to find a solution that minimizes spectral variation for some features while accentuating others. His technique is still an SMA approach that automatically derives the number of endmembers and estimates their pure spectral composition [1], but it is suboptimal in the presence of multiple mixing. More recently, Harsanyi and Chang [5] developed a mixture technique that rejects undesired interference by performing an orthogonal subspace projection (OSP). This technique simultaneously reduces data volume and emphasizes the presence of a signature of interest. Bolster et al. [2], seeking the same goals, instead use first-difference partial least squares (PLS) regression, which is based on a singular value decomposition (SVD) of the whole spectral data set. SVD reduces the noise-related interference common in a first-difference analysis and reduces the analysis to a smaller set of independent variables. Both OSP and PLS achieve good performance in detecting low-level material abundances for a particular scenario by incorporating the variability of the material abundance into the more important independent variables (factors), but they do not extend to other scenarios.

In order to develop a directed search methodology to locate the desired robustness (analytic) property, Smith et al. [14] proposed a revised SMA technique that they termed Foreground/Background Analysis (FBA). Harsanyi's approach shares the orthogonal subspace projection properties and a similar rationale with the FBA technique. In FBA, spectral measurements are divided into two groups, foreground and background spectra, which comprise a selected subset of spectra that emphasizes the presence of a signature of interest. Intermediate mixtures between foreground and background are not included in either group. In that way, FBA vectors should be sensitive to minor sources of foreground spectral variation and insensitive to background spectral variation. The goal of FBA is to project spectral variation along the most relevant axis of variance, the one that maximizes the spectral differences between the foreground and background while minimizing spectral variation within each group. The FBA approach defines a weighting vector w = (w1, w2, ..., wNb), with components wb at each channel b = 1, ..., Nb, such that all foreground spectral vectors, Rf = (Rf,1, Rf,2, ..., Rf,Nb), are projected to 1 while background spectral vectors, Rb, are projected to 0. This property is defined by the FBA system of equations:
 
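$$\mathbf{w} \cdot \mathbf{R}_f + T = 1 \quad \text{(foreground)} \qquad (2)$$
and
$$\mathbf{w} \cdot \mathbf{R}_b + T = 0 \quad \text{(background)}$$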
where T provides a translation that is typically required to optimize the FBA system. As stated, FBA is in essence another linear classifier of the spectra that can be applied to identify low and high material abundances. Pinzón et al. [10, 9] modified the FBA linear system to project a subset of spectra onto the relevant axis of variation of a continuous property, such as chemical content.
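A minimal sketch of how such a weighting vector could be computed, assuming NumPy, assuming the translation T enters additively (w · R + T), and assuming a least-squares solution; the synthetic spectra and the function name `fba_vector` are hypothetical and are not taken from Smith et al. [14].

```python
import numpy as np

def fba_vector(foreground, background):
    """Least-squares solution of w . R + T = 1 (foreground rows)
    and w . R + T = 0 (background rows). Rows are spectra."""
    R = np.vstack([foreground, background])
    target = np.concatenate([np.ones(len(foreground)),
                             np.zeros(len(background))])
    A = np.hstack([R, np.ones((len(R), 1))])   # last column carries T
    solution, *_ = np.linalg.lstsq(A, target, rcond=None)
    return solution[:-1], solution[-1]

rng = np.random.default_rng(0)
n_bands = 10
background = 0.30 + 0.02 * rng.standard_normal((3, n_bands))
foreground = 0.30 + 0.02 * rng.standard_normal((3, n_bands))
foreground[:, 4] -= 0.10          # hypothetical absorption feature in band 4

w, T = fba_vector(foreground, background)
foreground_scores = foreground @ w + T
background_scores = background @ w + T
print(np.round(foreground_scores, 6), np.round(background_scores, 6))
```

With more bands than training spectra, as is typical for hyperspectral data, the system is underdetermined and the minimum-norm solution reproduces the 1/0 targets on the training spectra; how well the vector generalizes to unseen spectra is a separate question.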

In this paper, we present a supervised classification technique that discriminates broad categories of surface materials in terms of ground truth features, such as vegetation characteristics and soil properties. The actual relationships between these two ecological units are often difficult to resolve, because it is hard to know which of many potentially interacting factors is significant in a particular locality. We decompose the interaction between the spatial and spectral domains associated with these units by using wavelet tools and a hierarchical foreground/background analysis (HFBA). Wavelets provide spatial coherence information that should allow us to generalize the results from the spectral features extracted by HFBA.

2 Methods

For most purposes the problem of supervised classification can be formulated as follows: given an input space X and a desired property in an output space Y, there is an unknown (functional) relationship, F, between X and Y that is represented by a set of m samples, from which one wants to guess the X-Y relationship. In general, F takes the form of a deterministic function plus noise. Given the training set of m samples $(x_i, y_i)$ and a guess functional $\hat{F}$, the problem is to guess, using $\hat{F}$, which output space value, $\hat{y} = \hat{F}(x)$, is the most appropriate for a given input x. The precise meaning of "appropriate" can be difficult to pin down, and it is measured through loss functions. A popular choice is the quadratic error function:

$$E(\hat{F}) = \sum_{i=1}^{m} \left( y_i - \hat{F}(x_i) \right)^2 \qquad (3)$$
The loss function and the way it is minimized determine the method used and its ability to generalize. Under this definition the problem of supervised classification has been identified by many other names, such as inductive inference, regression, statistical inference, model inversion, etc.
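As a small illustration of a quadratic loss, the sketch below (assuming NumPy, and assuming the loss is a plain sum of squared errors rather than a mean) compares two hypothetical guess functions on a toy training set. The names `quadratic_loss`, `good_guess`, and `poor_guess` are illustrative only.

```python
import numpy as np

def quadratic_loss(predict, X, y):
    """Sum of squared differences between predictions and known outputs."""
    predictions = np.array([predict(x) for x in X])
    return float(np.sum((np.asarray(y) - predictions) ** 2))

X = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])      # generated by y = 2x + 1, no noise

def good_guess(x):
    return 2.0 * x + 1.0                # matches the generating rule

def poor_guess(x):
    return 2.0 * x                      # misses the offset by 1 everywhere

loss_good = quadratic_loss(good_guess, X, y)
loss_poor = quadratic_loss(poor_guess, X, y)
print(loss_good, loss_poor)
```

The guess with the smaller loss is preferred on the training set; whether it also generalizes depends on how the minimization is constrained.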
 

2.1 HFBA

Pinzón et al. [8, 7] modified the FBA system to project a subset of spectra into the most relevant axis of variation of a desired property. In this case, the system of equations is given by
 
$$R\,w = C \qquad (4)$$
That is, the reflectance matrix R times the FBA weighting vector w is equal to the desired ground characteristic C. The goal of this system is to relate spectral and ground variation along the most relevant axis of spectral variance. The FBA system has the general form of a generic Finite Impulse Response (FIR) filter equation in the time domain [15]:
 
where fs is the sampling frequency and h(k) are the Fourier series coefficients of the frequency response of the FIR filter, H(f). Therefore, solving the FBA system plays a role similar to that of the design process of an FIR filter.
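The analogy can be made concrete with a short sketch, assuming NumPy and the textbook time-domain FIR form y[n] = sum_k h[k] x[n-k], which may differ in notation from the equation used by the authors. Stacking lagged signal windows as rows turns the filter into a matrix-vector product of the same shape as R w = C. The filter taps and signal values are hypothetical.

```python
import numpy as np

def fir_as_matrix_product(x, h):
    """Apply FIR taps h to signal x by stacking lagged windows as rows,
    so the output is X @ h, mirroring the form R @ w = C."""
    n_taps = len(h)
    padded = np.concatenate([np.zeros(n_taps - 1), x])
    # Row n holds x[n], x[n-1], ..., x[n-n_taps+1] (zeros before the start)
    X = np.array([padded[n:n + n_taps][::-1] for n in range(len(x))])
    return X @ h

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
h = np.array([0.5, 0.3, 0.2])             # hypothetical filter taps
y_matrix = fir_as_matrix_product(x, h)
y_direct = np.convolve(x, h)[:len(x)]     # y[n] = sum_k h[k] x[n-k]
print(y_matrix)
```

In this view, choosing the taps h so that the output matches a desired response is the counterpart of choosing the weighting vector w so that R w matches the ground characteristic C.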

To improve the detection of minor sources of spectral variation, we can apply the process iteratively, obtaining a system of equations that works at different levels of accuracy. We stop at the level of the system noise. Solving each equation in the iterated system is the so-called hierarchical FBA technique (HFBA), which sequentially derives a series of FBA vectors with different general discriminating features. In essence, the HFBA system is an iterative decimation process that extracts details at each level.
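One possible reading of the hierarchy, guided by the levels reported in the Results section (a group discrimination followed by property prediction within each group), is sketched below. It assumes NumPy, a least-squares solution at each level, two groups, and a 0.5 decision threshold; all names and the synthetic data are hypothetical, and the authors' actual iteration and stopping rule may differ.

```python
import numpy as np

def fit_level(R, c):
    """Least-squares FBA-style vector (weights plus offset) for target c."""
    A = np.hstack([R, np.ones((len(R), 1))])
    solution, *_ = np.linalg.lstsq(A, c, rcond=None)
    return solution

def apply_level(solution, R):
    return R @ solution[:-1] + solution[-1]

def fit_hfba(R, group, prop):
    """Level 1: vector separating two groups (labels 0/1).
    Level 2: one vector per group predicting a continuous property."""
    level1 = fit_level(R, group.astype(float))
    level2 = {g: fit_level(R[group == g], prop[group == g]) for g in (0, 1)}
    return level1, level2

def predict_hfba(model, R):
    level1, level2 = model
    assigned = (apply_level(level1, R) > 0.5).astype(int)
    predicted = np.empty(len(R))
    for g in (0, 1):
        mask = assigned == g
        if mask.any():
            predicted[mask] = apply_level(level2[g], R[mask])
    return assigned, predicted

rng = np.random.default_rng(2)
R = rng.random((8, 12))                        # 8 hypothetical spectra, 12 bands
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])     # e.g. two leaf types
prop = rng.random(8)                           # e.g. a chemical content
model = fit_hfba(R, group, prop)
assigned, predicted = predict_hfba(model, R)
print(assigned)
```

The example only checks the mechanics on the training spectra themselves, where an underdetermined system fits exactly; it says nothing about validation performance, which requires held-out samples.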

The power of the HFBA method becomes apparent as we catalogue more precisely the performance of the SVD in energy packing and in avoiding overfitting problems, owing to its stability properties. First, r, the rank of the matrix R, can be estimated by counting its non-zero singular values [4]. Second, the decomposition $R = U \Sigma V^{*}$ provides an approximation of the matrix R by a sum of rank-one matrices [4]. That is,

$$R = \sum_{j=1}^{r} \sigma_j\, u_j v_j^{*} \qquad (5)$$

Here, r is the rank of the matrix R, $\sigma_j$ are its singular values, and $u_j$ and $v_j$ are the left and right singular column vectors, respectively. This is easily shown by noticing that $\Sigma$ can be written as a sum of r matrices, $\Sigma = \sum_{j=1}^{r} \Sigma_j$, where each $\Sigma_j$ has just one nonzero entry, $\sigma_j$. Equation 5 then follows. One can find a very large number of different representations of R as a sum of rank-one matrices. However, Equation 5 represents the best approximation of R. Geometrically, the hyper-ellipsoid with principal axes of lengths $\sigma_j$ (the image of the unit sphere under R) provides a very important property: the q-th partial sum captures as much of the detail of R as possible [4]. That is, the best least-squares approximation of a matrix R by matrices of lower rank q (q < r) is given by $R_q = \sum_{j=1}^{q} \sigma_j\, u_j v_j^{*}$. Third, when the FBA equations are solved at each level with spectral matrices R that are close to rank-deficient, most of the standard algorithms used to solve such systems have poor stability properties. In such cases, SVD is a good, stable alternative [4]. Computationally, SVD is more expensive than the standard methods, but it is more accurate and stable. This is the principal advantage of using SVD in the solution of the FBA equations: it is a stable method for processing hyperspectral (rank-deficient) matrices R.
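The three SVD properties above (rank estimation, best rank-q approximation, and stable solution of a rank-deficient system) can be sketched as follows, assuming NumPy. The matrix sizes, the relative tolerance 1e-10, and the function names are illustrative assumptions, not values from the paper.

```python
import numpy as np

def truncated_svd_solve(R, c, q):
    """Solve R w = c using only the q largest singular triplets:
    w = sum_j (u_j^* c / sigma_j) v_j for j = 1..q."""
    U, s, Vh = np.linalg.svd(R, full_matrices=False)
    return Vh[:q].conj().T @ ((U[:, :q].conj().T @ c) / s[:q])

def rank_q_approximation(R, q):
    """Partial sum of the first q rank-one terms sigma_j u_j v_j^*."""
    U, s, Vh = np.linalg.svd(R, full_matrices=False)
    return (U[:, :q] * s[:q]) @ Vh[:q]

rng = np.random.default_rng(1)
# Rank-2 "spectral" matrix: 5 spectra, 8 bands, built from 2 hidden components
R = rng.random((5, 2)) @ rng.random((2, 8))
c = R @ rng.random(8)                     # a consistent right-hand side

s = np.linalg.svd(R, compute_uv=False)
numerical_rank = int(np.sum(s > 1e-10 * s[0]))   # count "non-zero" values
w = truncated_svd_solve(R, c, numerical_rank)

rank1_error = np.linalg.norm(R - rank_q_approximation(R, 1), "fro")
print(numerical_rank, np.allclose(R @ w, c))
```

Dropping the singular values below the tolerance avoids dividing by near-zero quantities, which is what makes the truncated solution stable for nearly rank-deficient matrices. The Frobenius error of the rank-1 approximation equals the root sum of squares of the discarded singular values, consistent with the best-approximation property.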

2.2 Wavelets

Wavelets are mathematical functions that split data (an image or signal) into different scale components, providing the best approximation at each scale. Wavelet analysis starts with a function ψ(x), called the mother wavelet, that is well localized and oscillating. By localization, we mean that ψ(x) decreases rapidly to zero as |x| tends to infinity. Oscillation requires that ψ behave as a wave, that is, that the integrals of ψ and of its first k moments be zero.
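Writing ψ for the mother wavelet (a conventional symbol, assumed here), the oscillation requirement can be stated compactly as a set of vanishing-moment conditions; this is a standard formulation rather than one quoted from the paper:

$$\int_{-\infty}^{\infty} x^{m}\, \psi(x)\, dx = 0, \qquad m = 0, 1, \dots, k.$$

The case m = 0 says that ψ has zero mean, and the higher values of m make the wavelet coefficients of any polynomial of degree at most k vanish, which is what drives compression of smooth data.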

In summary, a wavelet decomposition can be seen as the application of a pair of complementary low-pass and high-pass filters, H and G respectively. A generic wavelet transform is depicted in Figure 1.

The properties of the wavelets are determined by the properties of the filters H and G, and by the properties of the signal being analyzed. The construction of wavelets therefore begins by designing filters that can form a basis of the space we want to transform. To achieve high compression and obtain coherent information, we use coiflets with 4 vanishing moments, while avoiding the inclusion of noise in the estimated generalization functional.
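The filter-pair view can be illustrated with a minimal one-level filter bank, assuming NumPy. The sketch uses the Haar filters as a simple stand-in for H and G, because their coefficients are short and exact; it is not the coiflet filter used in the paper. The hard threshold and the test signal, whose noise is deliberately confined to the detail branch, are hypothetical.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def analysis(signal):
    """One level of a Haar filter bank: low-pass H and high-pass G,
    each followed by downsampling by 2. Signal length must be even."""
    pairs = np.asarray(signal, dtype=float).reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / SQRT2   # H branch (coarse scale)
    detail = (pairs[:, 0] - pairs[:, 1]) / SQRT2   # G branch (fine detail)
    return approx, detail

def synthesis(approx, detail):
    """Invert analysis(): interleave the reconstructed sample pairs."""
    out = np.empty(2 * len(approx))
    out[0::2] = (approx + detail) / SQRT2
    out[1::2] = (approx - detail) / SQRT2
    return out

def denoise(signal, threshold):
    """Zero out small detail coefficients (hard threshold), then rebuild."""
    approx, detail = analysis(signal)
    detail = np.where(np.abs(detail) > threshold, detail, 0.0)
    return synthesis(approx, detail)

signal = np.array([4.0, 4.0, 6.0, 6.0, 2.0, 2.0, 8.0, 8.0])
noisy = signal + np.array([0.1, -0.1, 0.05, -0.05, -0.1, 0.1, 0.05, -0.05])
approx, detail = analysis(noisy)
restored = synthesis(approx, detail)
cleaned = denoise(noisy, threshold=0.5)
print(np.round(cleaned, 3))
```

Without thresholding the transform is perfectly invertible; with thresholding, the small detail coefficients that carry the noise are discarded while the coarse, spatially coherent structure is kept. Real noise would also leak into the approximation branch, so the exact recovery seen here is a property of the contrived example.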

3 Results

We present two applications of HFBA: 1) retrieval of biochemical properties using laboratory spectra and the chemical content of fresh leaf samples, and 2) discrimination of soils in the Santa Monica Mountains.
 

3.1 Retrieval of biochemical properties

For this application, we have fresh leaf samples from three different sites: the Santa Monica Mountains, CA; the Joint Research Center (JRC), Ispra, Italy; and the Jasper Ridge Biological Preserve at Stanford University. The samples are botanically very heterogeneous, especially those from JRC. We trained each HFBA vector with 20% of the samples from JRC and validated the results with the remaining data set. Three levels of detection were obtained: the first discriminates monocots from dicots, the second discriminates low water content from high water content, and the third predicts the actual chemical content (here we present nitrogen and water results). Monocot and dicot samples are identified by their spectral features in the visible region, where monocots are brighter due to their higher chlorophyll (a and b) content. That property is precisely the characteristic manifested in the HFBA vector, Figure 2(a). Similarly, low and high water contents are spectrally discriminated by the main water absorption features at 1400 nm and 1900 nm and their interaction in the blue visible region, Figure 2(b).

The statistics of the prediction indicate the good performance of HFBA at the laboratory level: regressions of 0.71 and 0.75, with a good fit to the distribution of the actual data.
 

3.2 Discrimination of Soil in the Santa Monica Mountains

We have used two levels of HFBA to discriminate soils and soil properties from two valleys in the Santa Monica Mountains (Serrano and La Jolla) using AVIRIS data. The region is highly susceptible to erosion and wildfires due to the xeric soil moisture regime typical of Mediterranean climates, as well as the steep terrain. The combination of all these factors markedly increases heterogeneity in the distribution of soil properties. Large coverage and sufficient spatial resolution are required to understand soil pattern differences. AVIRIS satisfies these two requirements. The classification vector discriminates soils with high organic matter from those with low organic matter (see Figure 4). Ninety-four percent of the samples were correctly classified. The other six percent show intermediate organic matter contents.

It can be observed that the two spectral areas most important for discrimination lie between 1000 nm and 2200 nm (Al-OH and Mg-OH absorptions). The characteristic of the vector between 600 and 800 nm could also be used to detect vegetation, and it would work like NDVI for this purpose. The first image in Figure 5 shows the HFBA spatial distribution. After applying coiflet wavelets (Figure 5, second row), the spatial coherence is manifested, which allows noise reduction and improves the performance of the HFBA vectors. The final classification allows a better interpretation of the ecological processes involved. Image classification follows known spatial characteristics. Finally, Figure 6 shows the organic matter spatial distribution from AVIRIS data predicted by HFBA and coiflet noise reduction. High values are concentrated near the ridges of the mountains, as expected. It can be observed that the pixels mapped as La Jolla soils in the classification image also show a high content of organic matter, which agrees with our laboratory data.

4 Conclusions

A new robust approach for the detection and classification of materials was developed and tested. The technique combines an iterative hierarchical application of a modified FBA technique with coiflet noise reduction to detect functional relationships between spectra and ground truth features at different levels of accuracy.

The power of the HFBA technique is based on the attractive properties of the SVD transform in information packing and avoidance of overfitting problems by minimizing extraneous noise in the analysis. The technique was trained over laboratory data and applied to AVIRIS images. It is clear from the above experiments that the proposed approach is promising.

Through the iterative hierarchical procedure we force the system to account for important non-linear dependencies directly related to spectral scaling. In that respect, one of the strong points of the proposed method is that we can group together samples with similar anatomical properties that are manifested spectrally. However, if the distribution of these properties is continuous, samples near the boundaries of the discriminant regions could be misclassified, weakening the usefulness of the classification step. In particular, because the spatial variation of vegetation is high, the selection of a training set that explains the mixing present at different spatial scales is critical. This selection seems to be a key factor in the good performance of HFBA when dealing with sub-pixel scaling issues in this application, although HFBA is not itself equipped to deal directly with these spatial issues. There are more appropriate image analysis methodologies for spatial scaling problems, such as wavelet transforms. The wavelet decomposition gives a better representation of the spatial distribution of the data at different scales, and especially a better description of the properties of samples near discriminant boundaries. Clearly, these points have to be investigated further to identify the relationship between spatial and spectral scales.

In conclusion, we consider that a combination of HFBA and wavelets or other spatial scaling transforms has significant potential and certainly deserves further investigation. Many aspects of the discrimination among materials still need investigation. The aspects we have in mind are aptly illustrated by Yves Meyer in his book Wavelets: Algorithms and Applications [6]: "It is notable that Mandelbrot used the word describe and not explain or interpret. We are going to follow him in this, ostensibly, very modest approach. This is our answer to the problem about the objectives of the choices: Wavelets, whether they are of the time-scale or time-frequency type, will not help us to explain scientific facts, but they will serve to describe the reality around us, whether or not it is scientific. Our task is to optimize the description. This means that we must make the best use of the resources allocated to us to obtain the most precise possible description."

References

[1] J. W. Boardman. Geometric mixture analysis of imaging spectrometry data. In IGARSS 94: Proceedings International Geosciences Remote Sensing Symposium, volume 4, pages 2369-2371, 1994.

[2] K. L. Bolster, M. E. Martin, and J. D. Aber. Determination of Carbon fraction and Nitrogen concentration in tree foliage by near infrared reflectance: a comparison of statistical methods. Can. J. For. Res., 26:590-600, 1996.

[3] J. A. Gamon, C. B. Field, D. A. Roberts, S. L. Ustin, and R. Valentini. Functional patterns in an annual grassland during an AVIRIS overflight. Remote Sensing of Environment, 44(2):239-253, 1993.

[4] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press, Baltimore, Maryland, 1989.

[5] J. C. Harsanyi and C. I. Chang. Hyperspectral image classification and dimensionality reduction: an orthogonal subspace projection approach. IEEE Transactions on Geoscience and Remote Sensing, 32(4):779-785, 1994.

[6] Y. Meyer. Wavelets: Algorithms and Applications. SIAM Press, Philadelphia, 1993.

[7] J. E. Pinzón. Applications of spectrometry to vegetation studies using hierarchical foreground/background analysis. Master's thesis, University of California, Davis, June 1996.

[8] J. E. Pinzón, S. L. Ustin, C. M. Castañeda, and M. O. Smith. Investigation of leaf biochemistry by hierarchical foreground/background analysis. IEEE Transactions on Geoscience and Remote Sensing, 36:1-15.

[9] J. E. Pinzón, S. L. Ustin, Q. L. Hart, S. Jacquemoud, and M. O. Smith. Using foreground/background analysis to determine leaf and canopy chemistry. In R. O. Green, editor, Proc. 5th. annual JPL Airborne Earth Science Workshop: AVIRIS Workshop, Jan 23-27, 1995, vol. 95-1, pp. 129-132, 1995.

[10] J. E. Pinzón, S. L. Ustin, Q. L. Hart, S. Jacquemoud, and M. O. Smith. Comparison of multivariate statistical techniques for estimating vegetation parameter. In Spectral Analysis Workshop: The Use of Vegetation as an Indicator of Environmental Contamination, Reno, Nevada, Nov 9-10, 1994.

[11] D. A. Roberts, J. B. Adams, and M. O. Smith. Predicted distribution of visible and near infrared radiant flux above and below a transmittant leaf. Remote Sensing of Environment, 34:1-17, 1990.

[12] D. E. Sabol, J. B. Adams, and M. O. Smith. Predicting the spectral detectability of surface materials using spectral mixture analysis. In Proceedings of the IEEE International Geoscience Remote Sensing Symposium 1990, volume 2, pages 967-970, 1990.

[13] D. E. Sabol, J. B. Adams, and M. O. Smith. Quantitative subpixel spectral detection of targets in multispectral images. Journal of Geophysical Research, 97(E2):2659-2672, 1992.

[14] M. O. Smith, D. A. Roberts, J. Hill, W. Mehl, B. Hosgood, J. Venderbout, G. Schmuck, C. Koechler, and J. Adams. A new approach to quantifying abundancies of materials in multispectral images. In IGARSS 94: Proceedings International Geosciences Remote Sensing Symposium, volume 4, pages 2372-2374, 1994.

[15] M. O. Smith, R. Weeks, and A. Gillespie. Using background factors to optimize roughness estimates from multipolarized SAR images. In ERIM 96: Second International Airborne Remote Sensing Conference and Exhibition, San Francisco, volume 1, 24-27 June, 1996.

[16] S. L. Ustin, Q. J. Hart, G. Scheer, and L. Duan. Estimating dry grass biomass residues using AVIRIS image analysis. In IGARSS 94: Proceedings International Geosciences Remote Sensing Symposium, volume 2, pages 1211-1212, 1994.

[17] S. L. Ustin, M. O. Smith, and J. B. Adams. Remote sensing of ecological processes: A strategy for developing and testing ecological models using spectral mixture analysis. In J. R. Ehleringer and C. B. Field, editors, Scaling Physiological Processes: Leaf to Globe, pages 339-357, San Diego, 1993. Academic Press.

[18] M. V. Wickerhauser. Adapted Wavelet Analysis from Theory to Software. A. K. Peters, Ltd., Wellesley, MA, 1994.

1998, Center for Spatial Technologies and Remote Sensing (CSTARS)
University of California, Davis