ACRS 1999, Poster Session 3

Band Selection Using Hyperspectral Data of Subtropical Tree Species

Tung FUNG, Fung Yan MA, Wai Lok SIU
Department of Geography, The Chinese University of Hong Kong
Shatin, New Territories, Hong Kong, China
Tel: 852-2609-6535 Fax: 852-2603-5006
Email: tungfung@cuhk.edu.hk

Keywords: hyperspectral, band selection

Abstract
Band selection was performed on hyperspectral data acquired over the 400-900 nm spectral range for 25 subtropical tree species in Hong Kong. Stepwise discriminant analysis and hierarchical clustering were used to select bands with high discriminatory power. In addition to the blue, green, red and near-infrared bands, spectral bands along the blue-green edge, the green-red edge and the red edge were found to carry important information for discriminating the 25 tree species, which could be identified with 89% overall accuracy.

Introduction
The increasing availability of hyperspectral data and imagery has provided better data for environmental monitoring, forest tree species studies, geological exploration and many other applications. Currently, most studies use either airborne hyperspectral sensors such as the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) or field spectra collected in situ. With hyperspectral scanners on board satellites (e.g. OrbView-4), hyperspectral image analysis will certainly become a major arena of research and applications.

The use of hyperspectral data, however, brings its own research issues. Price (1994) noted three major ones: (1) the trade-off between the number of spectral intervals (bands) and the spatial resolution of the imagery; (2) the trade-off between higher spectral resolution and reduced signal-to-noise ratio; and (3) the appropriate positioning of the selected spectral bands so that they provide information about the objects of interest. Hyperspectral data undoubtedly carry a rich amount of information. Nevertheless, redundancy among the hundreds of bands opens an area of research into the optimal selection of bands for analysis. Price (1994) indicated that 15-25 spectral bands were sufficient for natural materials, whereas the number of bands had to be doubled for studies of minerals. Petrie and Heasler (1998) examined optimal band selection strategies using spatial autocorrelation, spectral autocorrelation and optimization with distance metrics. Warner and Shank (1997) also examined the use of spatial autocorrelation for narrow band selection.

The objective of this study is to investigate how many bands, and in which spectral ranges, are needed to identify subtropical tree species from their hyperspectral reflectance. Tropical and subtropical environments are endowed with a rich diversity of flora, which also makes them difficult but challenging environments to monitor. Under this project, we have earlier examined the effectiveness of subtropical tree identification using hyperspectral data (Fung et al., 1999). Classification with a back-propagation feed-forward neural network and with linear discriminant analysis revealed that about 80% overall accuracy could be attained, with autumn being the best season for hyperspectral data. The present study builds on that work to suggest suitable bands for tree species recognition.

Hyperspectral data collection
A high spectral resolution spectrometer PSD2000 was used for taking hyperspectral data. The spectrometer was linked to a notebook computer for data acquisition and analysis. It recorded data from 210 nm to 1050 nm with a spectral resolution of approximately 2.6 nm and had a field of view of 22°. During data collection, two references, a white and a dark, were used for calibration. Based on the illumination condition, an integration time for collecting photons was selected to adjust the master/slave sampling frequency and so avoid saturation or an insufficient signal. Sample spectra were then obtained by dividing the sample radiance by that of the standard white reference under the same illumination condition. In this study, we only examined data from 400-900 nm to avoid bands with too much noise.
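As a minimal sketch of the calibration described above, relative reflectance can be computed band by band from the sample, white-reference and dark-reference readings taken under the same illumination. The text states only that the sample radiance is divided by that of the white reference; subtracting the dark reading first is an assumed (though common) convention, and the function name `relative_reflectance` and the example counts are hypothetical.

```python
def relative_reflectance(sample, white, dark):
    """Return per-band reflectance relative to the white reference.

    All three inputs are equal-length sequences of raw spectrometer counts.
    Subtracting the dark reading first is an assumption, not taken from the paper.
    """
    if not (len(sample) == len(white) == len(dark)):
        raise ValueError("spectra must have the same number of bands")
    reflectance = []
    for s, w, d in zip(sample, white, dark):
        denom = w - d
        # Guard against bands where the white reference carries no signal.
        reflectance.append((s - d) / denom if denom > 0 else float("nan"))
    return reflectance


# Hypothetical raw counts for three bands.
print(relative_reflectance([600, 1100, 2100], [2100, 2100, 4100], [100, 100, 100]))
```

Bands where the white reference is no brighter than the dark reference are returned as NaN, which is one simple way to flag the noisy ends of the spectrum that the study excluded.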

Subtropical trees were the prime focus of this study. Twenty-five tree species within the campus of the Chinese University of Hong Kong were selected for the study. They included both native and exotic, needleleaf and broadleaf species. These trees are also commonly planted in the country parks and urban areas of Hong Kong. They are listed below:

Acacia confusa              Ficus variegata
Araucaria heterophylla      Hibiscus tiliaceus
Acacia mangium              Lophostemon confertus
Bauhinia variegata          Liquidambar formosana
Cinnamomum camphora         Lagerstroemia speciosa
Casuarina equisetifolia     Melaleuca quinquenervia
Castanopsis fissa           Macaranga tanarius
Cratoxylum ligustrinum      Pinus elliottii
Aleurites moluccana         Thuja orientalis
Dimocarpus longan           Schima superba
Delonix regia               Sapium sebiferum
Ficus microcarpa            Taxodium distichum
Firmiana simplex

Owing to the difficulty of acquiring in situ ground measurements in the subtropical environment (Fung et al., 1999), the experiment was undertaken in a controlled environment. The following procedures were adopted:
  1. A dark room was used with constant illumination from two 500 W tungsten lamps at a distance of 1. m apart. Both lamps were mounted at a height of 0.8 m. The intensity was 190 W/m² as read from a pyranometer.
  2. On a sunny day, five to six branches of leaves of the sampled trees were collected in the field and brought to the dark room for immediate measurement.
  3. The leaves were laid on a black cloth on the ground.
  4. An integration time of 30 milliseconds was used to take the reference and sample data.
  5. One branch of leaves was laid on the black cloth for spectral measurement, with the fiber optic sensor positioned 1 m above the ground.
  6. The field of view of the sensor should be a 1 m circle on the ground. A digital photograph was taken of each sample. The proportion of leaves was measured later with image processing techniques.
  7. Procedures 5 and 6 were repeated with additional branches simulating an increasingly dense tree canopy.
For each type of tree, 3 different levels of density were used, and 12 samples were taken at each density level. A total of 36 samples was thus taken for each tree. A Fast Fourier transform was applied to each sample spectrum to smooth the raw data and reduce noise.
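The smoothing step can be illustrated with a simple low-pass filter in the Fourier domain: transform the spectrum, zero the high-frequency coefficients, and transform back. The paper does not say which filter or cut-off was used, so the cut-off parameter `keep`, the function names and the synthetic test spectrum below are assumptions. A naive discrete Fourier transform is written out to keep the sketch dependency-free; an FFT library routine would normally be used instead.

```python
import cmath
import math


def dft(values):
    """Naive discrete Fourier transform, O(n^2); adequate for a few hundred bands."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(coeffs):
    """Inverse of dft(); returns complex values."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def fft_smooth(spectrum, keep=10):
    """Low-pass filter: keep the `keep` lowest frequencies and zero the rest."""
    n = len(spectrum)
    coeffs = dft(spectrum)
    for k in range(n):
        # Index k and its mirror n - k describe the same frequency.
        if min(k, n - k) > keep:
            coeffs[k] = 0
    return [c.real for c in inverse_dft(coeffs)]


# Synthetic example: a slowly varying curve plus band-to-band alternating noise.
n = 64
clean = [0.5 + 0.3 * math.sin(2 * math.pi * t / n) for t in range(n)]
noisy = [c + 0.05 * (-1) ** t for t, c in enumerate(clean)]
smoothed = fft_smooth(noisy, keep=5)
print(max(abs(s - c) for s, c in zip(smoothed, clean)))  # residual after filtering
```

Real reflectance spectra are not periodic, so a filter of this kind can distort the first and last few bands; the choice of cut-off trades noise suppression against blurring of narrow features such as the red edge.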

Data Analysis
Two algorithms were adopted for band selection in this study. A preliminary band selection was first done using stepwise discriminant analysis. By adjusting the F ratio required for bands to enter the discriminant model, the number of bands was reduced substantially to include only those with the highest discriminant power. Two criteria were used. Under the first, a band entered if the significance level of its F value was smaller than 0.05 and was removed if the significance level was greater than 0.1. Under the second, a band entered if its F value was greater than 3.84 and was removed if its F value was less than 2.71. The tree identification results from these two criteria were compared with those using all 138 bands, based on a significance test of Kappa. In the second analysis, hierarchical clustering was performed using the autumn data set. The 400-900 nm spectral data were grouped into clusters. Based on analysis of the dendrogram, bands within each cluster were selected for investigation. Different combinations of bands were then tested for their accuracy in tree identification so as to find the most informative combinations.
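To make the F-based entry rule concrete, the sketch below computes a one-way ANOVA F ratio for each band and keeps the bands that exceed the entry threshold of 3.84 quoted above. This covers only the first step of a stepwise procedure: later steps use a partial F conditioned on the bands already in the model, and the removal threshold of 2.71 is applied to those partial values, neither of which is implemented here. The function names and the toy data are hypothetical.

```python
F_ENTER = 3.84   # entry threshold of the second criterion in the text
F_REMOVE = 2.71  # removal threshold, listed for reference; not used in this first-step sketch


def f_ratio(groups):
    """One-way ANOVA F statistic for a single band.

    `groups` is a list of lists: the reflectance values of that band, per species.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def candidate_bands(band_groups, f_enter=F_ENTER):
    """Indices of bands whose univariate F exceeds `f_enter`, best first."""
    scores = [(f_ratio(groups), i) for i, groups in enumerate(band_groups)]
    return [i for f, i in sorted(scores, reverse=True) if f > f_enter]


# Toy data: two bands, three species, three samples per species.
toy = [
    [[0.10, 0.11, 0.12], [0.30, 0.31, 0.32], [0.50, 0.51, 0.52]],  # separates species well
    [[0.20, 0.30, 0.40], [0.21, 0.29, 0.41], [0.19, 0.31, 0.40]],  # species overlap
]
print(candidate_bands(toy))
```

In this toy case only the first band passes the entry threshold, because its between-species variance is large relative to its within-species variance.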

Results and discussion
The results of band selection using stepwise linear discriminant analysis are shown in Table 1. Under the first criterion, the number of bands selected ranged from 9 to 18. Fewer bands, 7-9, were selected when the second criterion was adopted. The classification results of the two criteria did not differ significantly when the corresponding Kappa values were compared. However, adopting either criterion significantly improved the classification results compared with those using all 138 spectral bands. The improvement was particularly marked for the winter data, where the overall accuracy using 9 bands was 91.27% whilst only 70.16% was achieved when all the bands were used. The selected bands lay mainly in the green peak and the red edge, showing that bands in these regions had stronger discriminating power.

                  All bands                       Criterion 1                     Criterion 2
Season            No. of bands  OA (%)  K (x100)  No. of bands  OA (%)  K (x100)  No. of bands  OA (%)  K (x100)
Spring            138           74.27   73.19     10            79.89   78.89     7             77.60   76.67
Summer            138           68.93   67.64     11            81.20   80.42     7             81.73   80.97
Autumn            138           80.69   79.86     18            90.28   89.28     8             87.78   87.25
Winter            138           70.16   68.67     9             91.27   90.83     9             91.27   90.83
Average accuracy                73.51   72.67                   85.62   85.00                   84.58   83.93
Table 1 Classification results using stepwise linear discriminant analysis for band selection
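For readers unfamiliar with the two measures reported in the tables, the following sketch computes overall accuracy (OA) and the Kappa coefficient from a confusion matrix. The matrix shown is a made-up three-class example, not data from this study, and the function name is hypothetical.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, kappa) for a square confusion matrix.

    confusion[i][j] = number of samples of true class i assigned to class j.
    """
    total = sum(sum(row) for row in confusion)
    diagonal = sum(confusion[i][i] for i in range(len(confusion)))
    observed = diagonal / total
    # Agreement expected by chance, from the row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Made-up example: 3 classes, 12 samples each.
example = [
    [10, 1, 1],
    [2, 9, 1],
    [0, 2, 10],
]
oa, kappa = overall_accuracy_and_kappa(example)
print(f"OA = {100 * oa:.2f}%, K (x100) = {100 * kappa:.2f}")
```

Kappa is lower than OA because it discounts the agreement that would be expected by chance; with 25 equally sized classes, as in this study, chance agreement is small and the two values stay close.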

The results of hierarchical clustering showed that clusters appeared not only in the typical spectral regions such as blue (400-504 nm), green (526-582 nm), red (654-686 nm) and near infrared (766-900 nm); many of them were also generated along the edges. These include the blue-green edge (504-526 nm), the green-red edge (582-654 nm) and the red edge (697-755 nm).
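One way to group bands hierarchically is agglomerative clustering on a band-to-band dissimilarity. The paper does not state the distance measure or linkage rule it used, so the sketch below assumes a distance of 1 minus the Pearson correlation between bands and average linkage; the function names and toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_bands(bands, n_clusters):
    """Average-linkage agglomerative clustering of bands.

    bands[i] holds the reflectance of band i across all samples.
    Returns a sorted list of clusters, each a sorted list of band indices.
    """
    n = len(bands)
    dist = [[1.0 - pearson(bands[i], bands[j]) for j in range(n)] for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters.
                d = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]  # b > a, so index a is unaffected
    return sorted(clusters)


# Toy data: bands 0 and 1 vary together across samples, as do bands 2 and 3.
toy_bands = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 2.0, 3.2, 3.9, 5.1],
    [5.0, 1.0, 4.0, 2.0, 3.0],
    [5.2, 0.9, 4.1, 2.2, 2.9],
]
print(cluster_bands(toy_bands, n_clusters=2))
```

Stopping the merging at a chosen number of clusters corresponds to cutting the dendrogram at one level; representative bands can then be picked from within each cluster.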

Seven different combinations of bands were selected to test the discriminating power of different spectral regions (Table 2). The results are shown in Table 3. The conventional 4-band combination (B, G, R, NIR) gave an average overall accuracy of 46.11%. This is not significantly different from the 44.66% obtained using only 3 edges (blue-green, green-red and red edge). Both results are inferior to that obtained using 5 bands along the red edge (55.62%). The results improved substantially when more bands and appropriate combinations were tested. The combination of 7 edges yielded an average overall accuracy of 72.97%, whilst the conventional 4 bands plus 3 edges gave 77.16%. Adding further bands along the edges improved the result to 87.26% (11 bands) and 89.30% (13 bands). These results were also superior to those produced using discriminant analysis.

Conclusion
This study shows that band selection helps improve classification accuracy for tree species recognition. The bands selected by stepwise discriminant analysis lay mainly in the spectral regions around the green peak and the red edge. Hierarchical clustering provided a more detailed analysis for band selection. It showed the value of the spectral regions along the blue-green, green-red and red edges, which have received less attention in many studies.

Spectral band sets  Description                         Selected bands

4 bands             Represents the four traditional     450.28 nm (blue)
                    blue, green, red (RGB) and          550.00 nm (green)
                    near-infrared (NIR) broad bands.    670.37 nm (red)
                                                        835.39 nm (infrared)

3 edges             Includes the two edges before       516.89 nm (edge before green peak)
                    and after the green peak and        619.51 nm (edge after green peak)
                    one red edge.                       728.14 nm (red edge)

7 bands             Includes the RGB and NIR            450.28 nm (blue)
                    bands and 3 edges.                  516.89 nm (edge before green peak)
                                                        550.00 nm (green peak)
                                                        619.51 nm (edge after green peak)
                                                        670.37 nm (red)
                                                        728.14 nm (red edge)
                                                        835.39 nm (infrared)

5 red edges         Includes five red edges to          695.69 nm (red edge)
                    determine the discriminating        710.13 nm (red edge)
                    power of the red edge only.         728.14 nm (red edge)
                                                        742.52 nm (red edge)
                                                        760.46 nm (red edge)

7 edges             Includes the two edges before       516.89 nm (edge before green peak)
                    and after the green peak and        619.51 nm (edge after green peak)
                    five red edges to determine the     695.69 nm (red edge)
                    discriminating power of the         710.13 nm (red edge)
                    edges.                              728.14 nm (red edge)

11 bands            Includes the RGB and NIR            450 nm (blue)
                    bands and the seven edges.          516.89 nm (edge before green peak)
                                                        550.00 nm (green peak)
                                                        619.51 nm (edge after green peak)
                                                        670.37 nm (red)
                                                        695.69 nm (red edge)
                                                        710.13 nm (red edge)
                                                        728.14 nm (red edge)
                                                        742.52 nm (red edge)
                                                        760.46 nm (red edge)
                                                        835.39 nm (infrared)

13 bands            Includes bands from the "11 bands"  427.29 nm (blue)
                    set, with the blue region and the   476.24 nm (blue)
                    edge after green peak region each   516.89 nm (edge before green peak)
                    divided into two bands.             550.00 nm (green)
                                                        597.62 nm (edge after green peak)
                                                        630.43 nm (edge after green peak)
                                                        670.37 nm (red)
                                                        710.13 nm (red edge)
                                                        728.14 nm (red edge)
                                                        742.52 nm (red edge)
                                                        760.46 nm (red edge)
                                                        835.39 nm (infrared)
Table 2 Spectral band sets selected from hierarchical clustering


Band set      Measure     Spring   Summer   Autumn   Winter   Average
4 bands       OA (%)      48.27    39.20    50.00    46.98    46.11
              K (x100)    46.11    36.67    47.83    44.33    43.74
3 edges       OA (%)      42.40    47.73    39.44    49.05    44.66
              K (x100)    40.00    45.56    36.81    46.50    42.22
7 bands       OA (%)      76.00    76.27    79.86    76.51    77.16
              K (x100)    75.00    75.28    78.99    75.33    76.15
5 red edges   OA (%)      54.53    55.20    54.17    58.57    55.62
              K (x100)    52.64    53.33    52.17    56.50    53.66
7 edges       OA (%)      72.13    73.60    69.31    76.83    72.97
              K (x100)    70.97    72.50    67.97    75.67    71.78
11 bands      OA (%)      86.80    87.33    88.89    86.03    87.26
              K (x100)    86.25    86.81    88.41    85.33    86.70
13 bands      OA (%)      87.47    88.40    91.94    89.37    89.30
              K (x100)    86.94    87.92    91.59    88.83    88.82
Table 3 Classification results of the selected band sets generated from hierarchical clustering

Acknowledgement
This research is supported by an RGC Earmarked Grant (CU96223/UPG).

References
  • Fung, T., F.Y. Ma and W.L. Siu, 1999, Hyperspectral data analysis for subtropical tree species identification, Proceedings of the 1999 ASPRS Annual Conference (CD-ROM), 108-119.
  • Price, J.C., 1994, Band selection procedure for multispectral scanners, Applied Optics, 33 (15), 3281-3288.
  • Petrie, G.M. and P.G. Heasler, 1998, Optimal band selection strategies for hyperspectral data sets, Proceedings of IGARSS'98, 1582-1584.
  • Warner, T.C. and M.C. Shank, 1997, Spatial autocorrelation analysis of hyperspectral imagery for feature selection, Remote Sensing of Environment, 60, 58-70.