Vegetation Spectral Feature Extraction Model

Qian Tan, Hui Lin
Dept. of Geography & Joint Lab. for Geoinformation Science, The Chinese University of Hong Kong, Hong Kong
Tel: (+852)-26098105 E-mail: tanqian@cuhk.edu.hk

Yongchao Zhao, Tong Qingxi, Zhen Lanfeng
Lab. of Remote Sensing Information Sciences, Institute of Remote Sensing Applications, CAS, Beijing, 100101, China

Keywords: vegetation, hyperspectral, spectral feature extraction

Abstract

A new spectral feature selection and extraction model designed specifically for vegetation, the Vegetation Spectral Feature Extraction Model (VSFEM), is presented. From the analysis of a large number of vegetation field spectra, 8 vegetation spectral feature positions are identified, from which a series of feature parameters is derived.

1. Introduction

Spectral feature extraction has mostly been developed from, or for, target classification. Principal Component Analysis (PCA) is the most widely used method. It produces a new series of images ordered by information content (variance), and the correlation between the images is essentially removed. The leading principal components carry most of the information content, which is the optimal result in the minimum mean square error sense. Green (1988) extended PCA with the MNF (Minimum Noise Fraction) transform, which orders the components by signal-to-noise ratio (S/N) from large to small instead of by variance. Jia (1999) extended PCA to segmented PCA for feature extraction, which brought some improvement in classification and display results. Although PCA can compress and extract information with minimum mean square error, the information in the principal component images is hard to interpret directly in terms of the spectrum; moreover, because the sample distribution is not considered, an optimal classification result is not guaranteed. Recognizing this problem, Fukunaga (1970) proposed a new transform method: find a transform matrix $T$ that satisfies $T (S_1 + S_2) T^{T} = I$, in which $S_i$ is the correlation matrix of class $i$.
This method is effective when the differences between the class means are small and the covariances play the key role, but it is ineffective in the common case (Foley, 1975). Kazakos (1978) put forward a feature extraction algorithm, linear scalar extraction, which minimizes the classification error probability for two classes with multi-dimensional normal distributions. This method finds an optimal vector that gives the minimum classification error probability, but it cannot be used when more than one feature is required. Taking within-class and between-class distances into account, Richard (1986) proposed a feature extraction method called Canonical Analysis (CA), and Lee (1993) proposed a feature extraction method based on the decision boundary.

This paper puts forward VSFEM (Vegetation Spectral Feature Extraction Model), which differs substantially from the above methods. VSFEM is aimed at vegetation spectral feature extraction, from or for the control of vegetation spectral curves, and is based on regularities found through the analysis of a large number of field vegetation spectral curves. Compared with the above methods, VSFEM pays much more attention to the biological and physical features behind the spectral reflectance of the target, rather than treating feature extraction merely as pattern recognition or information compression, that is, as a branch of mathematics.

2. Study Area and Data Collection

Study Area

A study area near Changzhou city, Jiangsu Province, China was chosen for collecting ground data. The study focused on vegetation, including the principal types of agricultural crops and trees in the study area, such as rice, maize, peanut, sweet potato, cotton, soybean, cabbage and carrot.

Data Collection

During the period from late August (mid-season of the vegetation growing cycle) to mid-October (late season of the vegetation growing cycle), 55 field vegetation spectra were acquired at 20 sites in Changzhou with an SE-590 portable field spectroradiometer.
At the same time, some biochemical parameters, such as chlorophyll concentration and leaf area index (LAI), were measured. Data were collected with the radiometer in nadir orientation at a solar zenith angle of about 45°. Four scans were averaged to give the final spectrum of each measurement. All data were collected between 11:00 and 13:00. When trees were measured, branches with leaves were picked and laid on the ground.

The specific parameters of the SE-590 portable field spectroradiometer are as follows:

Wavelength: 400 - 1100 nm
Spectral resolution: 4.0 nm
Sample channels: 252
Field of view: 15°

3. Methodology

3.1 Vegetation Spectral Feature Extraction Model

The reflectance curve of vegetation has some special features, such as the "green peak", "red valley" and "NIR platform", and the reflectance at these feature positions varies markedly or regularly with species and growth period. It is therefore possible to design special parameters that characterize the curve shape of different species or growth stages. Moreover, to discuss the correlation between spectra and vegetation biochemical properties, we also need suitable spectral parameters. To this end, we define eight special positions (feature positions) and design a number of parameters (feature parameters), similar to NDVI, to discuss the differences in species and properties (including growth stage) of typical vegetation in Changzhou. The eight feature positions, M, B, G, Y, R, V, I1 and I, and some feature parameters are shown in Fig. 1, which shows a typical spectral curve $R(\lambda)$. The definitions of the 8 feature positions shown in Fig. 1 and their algorithms are as follows:
$\lambda_{G'} = \lambda\big(R'(\lambda \in 500\text{-}600\ \text{nm}) = 0\big)$
$\lambda_{R'} = \lambda\big(R'(\lambda \in 600\text{-}720\ \text{nm}) = 0\big)$

Fig. 1. Sketch of the 8 feature positions and some feature parameters for green vegetation in Changzhou. The reflectance shown is that of rice measured on Aug. 31 at Changzhou. The parameters and positions labeled in the sketch are defined in the text.

As shown in Fig. 1, the eight special positions determine, on the whole, the shape and spectral features of vegetation reflectance spectra in the visible to near-infrared range. The polyline MBGYR clearly determines the shape of the green peak, while the polyline GYRVI1 determines the general shape of the red absorption peak. Line I1I can be regarded as a representation of the NIR platform. As shown in Table 1, these 8 positions remain almost constant under changing external conditions, while the corresponding reflectance values vary greatly. It is therefore possible to use the variation of these 8 special positions and their relations to represent the spectral changes of different vegetation. To characterize the correlated variation of these special positions, and thus to show how reflectance spectra change with vegetation species, we designed some parameters (feature parameters) based on the 8 special positions according to the spectral features in Fig. 1. They are:
It should be pointed out that parameters commonly used in vegetation studies, such as NDVI, the red edge position $\lambda_{re}$ and the red edge slope $dr_{re}$, can also be obtained from these parameters as follows:

$NDVI = (R_{I1} - R_R)/(R_{I1} + R_R)$, $\lambda_{re} = \lambda_V$, $dr_{re} \approx S_V$

Therefore, the definition of eight feature positions not only reflects the general shape of the reflectance data, but also yields many feature parameters with high information content that may relate well to vegetation properties such as chlorophyll concentration.

3.2 Analysis for Effectiveness of VSFEM: Relative Stability of Position

The aim here is to study the characteristics of the above feature positions and parameters, in particular the relationships between feature positions and vegetation types, and the ability of the feature parameters to reflect vegetation properties. Based on the above definitions and algorithms, more than 100 spectra of about 20 vegetation types in Changzhou were analyzed to obtain the spectral feature positions and parameters. Table 1 gives the principal results. Under the conditions of this experiment, all feature positions, and especially the 8 main feature positions, are stable. These positions are M: 404, B: 525, G: 556, Y: 573, R: 671, V: 723, I1: 758 and I: 900 nm. Position I has the largest range of variation: for 62 samples measured at different times and places, its confidence width is 6.4 nm, while that of the other positions is less than 3 nm; even the standard deviations are generally less than 6 nm. Such confidence intervals are narrower than the spectral resolution of many instruments.
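The zero-derivative definitions of the green peak and red valley and the NDVI relation from Section 3.1 can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the synthetic `toy_spectrum` curve and the linear interpolation of the zero crossing are assumptions made for illustration, and only the search windows (500-600 nm, 600-720 nm) and the positions R: 671 nm and I1: 758 nm are taken from the text.

```python
import numpy as np

def extremum_position(wl, refl, lo, hi, kind):
    """Wavelength in [lo, hi] nm where R'(lambda) = 0 (kind: 'peak' or 'valley')."""
    mask = (wl >= lo) & (wl <= hi)
    w, r = wl[mask], refl[mask]
    d = np.gradient(r, w)  # numerical first derivative R'(lambda)
    before, after = (1, -1) if kind == "peak" else (-1, 1)
    for k in range(len(d) - 1):
        if np.sign(d[k]) == before and np.sign(d[k + 1]) == after:
            # Linear interpolation of the zero of R' between two samples
            return w[k] - d[k] * (w[k + 1] - w[k]) / (d[k + 1] - d[k])
    # No sign change found: fall back to the sampled extremum
    return w[np.argmax(r)] if kind == "peak" else w[np.argmin(r)]

def ndvi(wl, refl, lambda_i1=758.0, lambda_r=671.0):
    """NDVI = (R_I1 - R_R) / (R_I1 + R_R), reflectance read at positions I1 and R."""
    r_i1 = np.interp(lambda_i1, wl, refl)
    r_r = np.interp(lambda_r, wl, refl)
    return (r_i1 - r_r) / (r_i1 + r_r)

def toy_spectrum():
    """Hypothetical vegetation-like curve on a 4 nm grid (not measured data)."""
    wl = np.arange(400.0, 1001.0, 4.0)
    refl = (0.05
            + 0.08 * np.exp(-((wl - 556.0) / 30.0) ** 2)     # green peak
            - 0.03 * np.exp(-((wl - 671.0) / 25.0) ** 2)     # red absorption valley
            + 0.45 / (1.0 + np.exp(-(wl - 723.0) / 8.0)))    # red edge and NIR platform
    return wl, refl

wl, refl = toy_spectrum()
print("green peak G [nm]:", extremum_position(wl, refl, 500, 600, "peak"))
print("red valley R [nm]:", extremum_position(wl, refl, 600, 720, "valley"))
print("NDVI:", ndvi(wl, refl))
```

On noisy field spectra the derivative would normally be smoothed first, since the loop returns the first sign change it meets inside the window.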
3.3 Rediscussion for VSFEM: Reduction of Some Feature Positions

In addition, we calculated the correlation coefficients between bands for all vegetation spectra measured with the SE-590, in order to show the independence of the bands and to select the most informative band group for distinguishing vegetation species more efficiently. This provides a further check on the effectiveness of VSFEM. Here S is the correlation coefficient matrix of all bands, and $r_{ij}$ is the absolute value of the correlation coefficient between band i and band j. Obviously:

$r_{ij} = r_{ji} = \left| L_{ij} / \sqrt{L_{ii} L_{jj}} \right|$

where $L_{ij}$ is the covariance between band i and band j.

Fig. 2 is a simulated image of the correlation coefficient matrix among 187 bands, calculated from 71 vegetation samples. The curves in Fig. 2 are isolines; from the diagonal outwards, their values are 0.9999, 0.999, 0.99, 0.95, 0.9, 0.5, 0.3 and 0.1, in that order. Fig. 2 shows that, except for bands 100 to 126 (SE-590 band numbers), the correlation coefficients for vegetation are all very high (generally more than 99.99%), which means that band reduction is practical. The figure shows two highly correlated platforms, at 400-670 nm and 760-950 nm; within these two regions a few bands are enough to extract vegetation information. Moreover, around 550 nm and 670 nm there are relatively wide areas, and it is unnecessary to subdivide the bands in these regions. Between 675 nm and 775 nm, and around 522 nm and 573 nm, however, the correlation coefficients are generally low. These regions carry more information, and the bands there should be subdivided. Comparing with the 8 feature positions described above, we find that, for vegetation research, the regions that should be subdivided are exactly the blue edge B, yellow edge Y and red edge V, while the regions that need not be subdivided are the blue absorption valley M, green peak G, red absorption valley R and NIR platform I. In Fig. 2, positions B, Y and V are located at the centers of narrow isoline regions, while M, G, R and I are at the centers of broad regions. The arrows in Fig. 2 show this clearly.

Table 2 lists the correlation matrix of the 8 feature positions. From Table 2 we can obtain a priority order for the 8 feature positions: first, select the two positions with the minimum correlation coefficient (absolute value); remove these two bands, together with the bands most highly correlated with them; then repeat the selection on the remaining bands until the predetermined number of bands is reached.
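The correlation matrix and the selection procedure described above can be sketched as follows. This is an illustrative reading rather than the authors' code: the function names are hypothetical, and because the text does not say how many highly correlated bands are dropped at each step, the sketch assumes that exactly one band (the most correlated one) is dropped for each selected band.

```python
import numpy as np

def abs_correlation_matrix(spectra):
    """r_ij = |L_ij / sqrt(L_ii * L_jj)| for spectra of shape (n_samples, n_bands)."""
    cov = np.cov(spectra, rowvar=False)   # L_ij: covariance between bands i and j
    std = np.sqrt(np.diag(cov))           # sqrt(L_ii); assumes no constant band
    return np.abs(cov / np.outer(std, std))

def rank_bands(r, n_select):
    """Greedy priority order: repeatedly pick the least correlated pair of bands."""
    remaining = list(range(r.shape[0]))
    selected = []
    while len(selected) < n_select and len(remaining) >= 2:
        sub = r[np.ix_(remaining, remaining)].copy()
        np.fill_diagonal(sub, np.inf)     # ignore self-correlation
        a, b = np.unravel_index(np.argmin(sub), sub.shape)
        pair = [remaining[a], remaining[b]]
        selected.extend(pair)
        remaining = [k for k in remaining if k not in pair]
        # Assumption: drop the single band most correlated with each selected band
        for p in pair:
            if remaining:
                redundant = max(remaining, key=lambda k: r[p, k])
                remaining.remove(redundant)
    if len(selected) < n_select:
        selected.extend(remaining)        # at most one leftover band
    return selected[:n_select]

# Synthetic demonstration (not measured data): two pairs of near-duplicate bands
rng = np.random.default_rng(0)
x, y = rng.normal(size=200), rng.normal(size=200)
spectra = np.column_stack([x, x + 0.01 * rng.normal(size=200),
                           y, y + 0.01 * rng.normal(size=200)])
r = abs_correlation_matrix(spectra)
print(rank_bands(r, 2))
```

Applied to an 8 x 8 matrix such as Table 2, `rank_bands` would return the feature positions in priority order as row indices, under the stated assumption.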
Fig. 2 The simulated image of the correlation coefficient matrix among 187 bands (400-950 nm) of the SE-590

4. Conclusion

According to the above analysis, we find that:
References
The authors wish to thank the Laboratory of Remote Sensing Information Sciences, Institute of Remote Sensing Applications, for providing the data and for their sincere help.