Assessment of mapping accuracy of Landslides using image classification techniques

Scott L. Huang, Been K. Chen, Robert C. Speck
Department of Mining and Geological Engineering
University of Alaska Fairbanks, Fairbanks, Alaska 99775, U.S.A.

Abstract
Landsat-5 TM scene images of Healy, Alaska and terrain information (i.e. elevations, drainage system, bedrock formation, and geological structures) were processed using minimum distance, parallelepiped, and Bayesian classifiers. Among the three methods, the Bayesian classification with a threshold value of 10^-1 gave the best mapping accuracy of 16.60%.

Introduction
In the past two decades, remote sensing technologies, including the use of aerial photographs and satellite imagery, have been applied widely to regional landslide investigations. Simonett et al. (1970), Scully (1973), McDonald and Grubbs (1975), Anderson et al. (1976), and Sauchyn and Trench (1978) relied on visual interpretation to classify landslide phenomena on either aerial photographs or satellite images. Remotely sensed data along with image interpretation can provide terrain information pertaining to landsliding (Gagon, 1975). The important factors for assessing landslide potential include regional physiography, geomorphology, and geological structures.

With improvements in computer technology, Heath and Dowling (1980) and Stephens (1988) applied digital image processing to delineate landslide areas on satellite images. Certain terrain information such as elevations, drainage patterns, bedrock formations, and geological structures is valuable for predicting landslides; this information, however, cannot be obtained directly from image interpretation. Therefore, in this study the authors attempted to identify areas in the Lignite Creek coal basin, Healy, Alaska (Figure 1) where landslides are likely to occur by taking advantage of the terrain information while processing digital satellite images. The intention of this research was to assess the reliability of image classification techniques in the study of landslides. The large number of landslides that have occurred in the Lignite Creek basin influences the surface mining operation of a coal mine in the vicinity (Corser and Parker, 1987). Prior knowledge of potential landslides can often permit more flexible and accurate design of a mining method to minimize financial risk and unnecessary engineering problems associated with slope movements.

Landsat TM-Images
A Thematic Mapper (TM) scene image (Figure 1) was acquired by Landsat-5 on September 22, 1984 (scene ID Y5020520430x0). The digital image was later loaded on the ADVAL VAX 11/750 computer at the University of Alaska Fairbanks. The scene's geographic center is at latitude N64°14'00" and longitude W147°58'00". The entire coverage includes about 4,300 mi² (10,500 km²) in the interior of Alaska.

The Land Analysis System (LAS) modules Cooredt, Trancoord-II2utm, Tiemerage, Nullcorr, Tiefit, and Geom were run to register the TM images to a common Universal Transverse Mercator (UTM) grid. In the registration process, twelve control points were chosen from a topographic map. The pixel size of the images was changed from 30 meters to 25 meters to increase geographic precision (Goodenough, 1988), and pixel values were resampled using the nearest neighbor (NN) method.
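
The nearest neighbor step can be illustrated with a minimal sketch. It assumes the band is already aligned to the target grid and covers only the change of pixel size from 30 m to 25 m, not the control-point registration done by the LAS modules. The function name and its parameters are hypothetical and are not part of LAS.

```python
import numpy as np

def resample_nearest(image, src_size=30.0, dst_size=25.0):
    """Resample a 2-D band from src_size to dst_size pixels (same units)
    by looking up the source pixel that contains each output pixel centre.
    Hypothetical helper; assumes both grids share the same upper-left corner."""
    rows, cols = image.shape
    out_rows = int(np.floor(rows * src_size / dst_size))
    out_cols = int(np.floor(cols * src_size / dst_size))
    # Centre of each output pixel, expressed in source-pixel index units.
    r_src = ((np.arange(out_rows) + 0.5) * dst_size / src_size).astype(int)
    c_src = ((np.arange(out_cols) + 0.5) * dst_size / src_size).astype(int)
    r_src = np.clip(r_src, 0, rows - 1)
    c_src = np.clip(c_src, 0, cols - 1)
    # No new digital values are created: every output pixel copies an input pixel.
    return image[np.ix_(r_src, c_src)]
```

Because nearest neighbor resampling only copies existing digital values, the radiometry of the TM bands is preserved, which matters when the values are later compared with class statistics.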

Figure 1. TM images with existing landslide deposits

Terrain Information
A Digital Elevation Model (DEM) image of the Healy quadrangle was used as the original spatial data to generate elevation contour (DN_con), percent slope (DN_slp), and slope aspect (DN_asp) images through LAS's Topo modules.
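
As an illustration of how slope and aspect images can be derived from a DEM, the sketch below uses finite differences. It is not the LAS Topo implementation. It assumes a 25 m cell size, row 0 at the northern edge of the image, and aspect reported as the downslope azimuth in degrees clockwise from north; the function name is hypothetical.

```python
import numpy as np

def slope_aspect(dem, cell_size=25.0):
    """Return (percent slope, aspect in degrees) for a 2-D elevation array.
    Assumptions: square cells, row 0 is north, column 0 is west."""
    # np.gradient returns the derivative along rows (axis 0), then columns (axis 1).
    dz_drow, dz_dcol = np.gradient(dem.astype(float), cell_size)
    dz_dx = dz_dcol    # east is the direction of increasing column index
    dz_dy = -dz_drow   # north is the direction of decreasing row index
    slope_pct = 100.0 * np.hypot(dz_dx, dz_dy)
    # Azimuth of the downslope direction, clockwise from north.
    aspect_deg = np.mod(np.degrees(np.arctan2(-dz_dx, -dz_dy)), 360.0)
    return slope_pct, aspect_deg
```

On flat cells the aspect is undefined; this sketch returns 0 there, which a production workflow would normally flag separately.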

The existing landslide deposits, bedrock lithologies, drainage system, and faults were digitized from a geologic map through AMS to generate spatial terrain images, designated as DN_eld, DN_grp, DN_drn, and DN_flt, respectively. The images were then referred to the same UTM grid coordinates as the TM images and the elevation-derived images. Both the drainage system (DN_drn) and fault (DN_flt) images were binary images composed of two digital values (i.e. 50 and 200, for better contrast). These binary images were created by running LAS's Filter module to dilate the linear boundary to a distance of 5 to 45 pixels on either side of the trace. The existing landslide deposits (DN_eld) image was a binary image as well, with digital value 50 representing non-landslide deposits and digital value 200 indicating existing landslide deposits. The bedrock lithology image (DN_grp) comprised eight rock units based on the bedrock formations in the area.
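
The dilation of a digitized trace into a two-valued image can be sketched as follows. This is an illustration only: the behavior of the LAS Filter module is not documented here, so a square neighborhood is assumed, and the function name is hypothetical. The values 50 and 200 follow the text.

```python
import numpy as np

def dilate_trace(trace, n_pixels, background=50, foreground=200):
    """Grow a rasterised line by n_pixels on every side (square window,
    an assumption) and return a two-valued uint8 image."""
    mask = np.asarray(trace, dtype=bool)
    # A square window is separable: spread along rows, then along columns.
    for axis in (0, 1):
        pad = [(0, 0), (0, 0)]
        pad[axis] = (n_pixels, n_pixels)
        padded = np.pad(mask, pad)  # pads with False
        spread = np.zeros_like(mask)
        for shift in range(2 * n_pixels + 1):
            window = [slice(None), slice(None)]
            window[axis] = slice(shift, shift + mask.shape[axis])
            spread |= padded[tuple(window)]
        mask = spread
    return np.where(mask, foreground, background).astype(np.uint8)
```

Running the function for several values of n_pixels between 5 and 45 would give the family of dilated images from which the most discriminating distance can later be selected.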

Image Classifications

Minimum Distance Classifier

In this classification, the existing landslide deposits (Figure 1) were considered as the training area. In other words, existing landslide deposits were used to compile a numerical "interpretation key" that described the digital values of landslides in each input parameter image. Each pixel in the image was then compared numerically to the interpretation key and labeled as either the landslide or the non-landslide class. The minimum distance classifier was employed to make this comparison between unknown pixels and the interpretation key pixels.

In this task, the authors first took the six Landsat TM images as input for minimum distance classification; the six terrain images were then added to the TM images to create the second set of input images. This allowed an evaluation of the improvement in landslide classification over the initial TM images.

Parallelepiped Classifier

In the parallelepiped classification, the range of values in each class may be defined by the lowest and highest pixel values in each image. In this study, the range of the landslide class was defined by dividing the values (0 to 255) of the TM images into 3, 5, and 7 intervals, although other intervals could be chosen depending on the algorithm and computer capacity. The optimal range of the landslide class for each input parameter image and its mapping accuracy were obtained through computer search.

Bayesian Classifier

Unlike the parallelepiped and minimum distance classifications, which weight all input images equally, the Bayesian classifier assumes that each input parameter image contributes unequally to an event (i.e. landslide event B). Bayes' theorem, introduced by Thomas Bayes in the 1700s, is a statistical approach that uses conditional, prior, and posterior probabilities for inferential and decision-making procedures. It was applied in the study to calculate the probabilities of landslides.

In this study, the landslide deposits shown on the geological map were considered as the existing occurrences of landslides for later prediction of potential landslide areas. The rationale for applying Bayes' theorem here was to revise the prior probabilities P(A₁) (i.e. existing landslides) to posterior probabilities P(A₁|B) (i.e. predicted landslides) using the available information, in order to predict new or undetected landslides.
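
The paper does not print the formula it used. For reference, the standard form of Bayes' theorem in the notation above is given below, where the index k runs over the mutually exclusive states A_k (an assumption about how the states are partitioned, for example landslide and non-landslide).

```latex
P(A_1 \mid B) = \frac{P(B \mid A_1)\, P(A_1)}{\sum_{k} P(B \mid A_k)\, P(A_k)}
```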

Results and Discussion

Minimum Distance Classification

Twenty training areas consisting of the digitized landslide deposits were categorized as the potential landslide class, the only class defined in this study. Pixels whose values did not lie in the range of the potential landslide class were assigned to the non-landslide class. Prior to executing the classification, the mean vectors of the six TM images and the six terrain images were computed in order to calculate the minimum distances (i.e. Euclidean distances) to the class means for those input images.
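
A minimal sketch of this single-class procedure is given below. It assumes the input images are stacked into one array and that a pixel is labeled as potential landslide when its Euclidean distance to the training-class mean vector does not exceed a chosen limit. The function and argument names are hypothetical.

```python
import numpy as np

def min_distance_classify(stack, training_mask, max_distance):
    """stack: array of shape (bands, rows, cols).
    training_mask: boolean (rows, cols), True on digitized landslide deposits.
    Returns a boolean image, True where a pixel falls in the landslide class."""
    bands = stack.shape[0]
    pixels = stack.reshape(bands, -1).astype(float)
    # Mean vector of the training pixels, one mean per input image.
    class_mean = pixels[:, training_mask.ravel()].mean(axis=1)
    # Euclidean distance of every pixel to the class mean.
    dist = np.sqrt(((pixels - class_mean[:, None]) ** 2).sum(axis=0))
    return (dist <= max_distance).reshape(stack.shape[1:])
```

With only one class defined, the distance limit is the sole decision parameter, so the result depends strongly on how it is chosen.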

Two sets of input images were chosen, although thousands of combinations of the six TM images and six terrain images are possible. One set consisted of the six TM images (dataset 1); the other consisted of all twelve TM and terrain images (dataset 2). Defining a proper Euclidean distance, ED_j, was the pre-classification task. Different values of ED_j resulted in different classifications with differing accuracy of prediction. Among the accuracy indices commonly used, mapping accuracy was applied to evaluate the accuracy of classification. Mapping accuracy, MA, defined by Short (1982) and Piper (1983), is usually applied to evaluate results of land cover classification. The advantage of this index is that MA possesses the following characteristics: it equals zero if there is no positive match, equals one if the match is perfect, takes into account both user's accuracy and producer's accuracy, and is not affected by sample size.
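
The paper does not reproduce the formula for MA. The sketch below assumes the commonly cited form, in which the number of correctly classified pixels is divided by the sum of correct, omitted, and committed pixels. This form has the properties listed above, but it should be checked against Short (1982) before use. The function name is hypothetical.

```python
import numpy as np

def mapping_accuracy(predicted, reference):
    """MA = correct / (correct + omission + commission), an assumed form.
    predicted, reference: boolean arrays, True for the landslide class."""
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    correct = np.count_nonzero(predicted & reference)
    commission = np.count_nonzero(predicted & ~reference)  # false alarms
    omission = np.count_nonzero(~predicted & reference)    # missed landslides
    total = correct + commission + omission
    return correct / total if total else 0.0
```

Pixels that are non-landslide in both images do not enter the ratio, which is why a large non-landslide background does not inflate the index.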

Figure 2 shows the results of minimum distance classification for the six TM images alone and for the six TM images and six terrain images combined. As noted in the diagram, the mapping accuracy of dataset 2 reaches its highest value of 12.19% when ED_j is 120. The highest mapping accuracy of dataset 1 was, however, much lower (i.e. MA = 3.65%) than that of dataset 2. Figure 3 shows the classified image of dataset 2 with ED_j equal to 120, which was the optimal result for both of the input datasets.

Figure 2. Euclidean distance vs. mapping accuracy for datasets 1 and 2

Figure 3. The classified image of dataset 2 with ED_j of 120
(black: predicted landslides, white polygons: existing landslides)

Parallelepiped Classification

The main task in applying this classification was to define the ranges and the logical operators between images. As a result of the classification, output binary images showed the predicted landslide and non-landslide areas. The classification was performed empirically on the basis of the visual quality of the processed images and the statistical characteristics of the training areas. The optimal combination of TM, processed, and terrain images, i.e. the one with the highest mapping accuracy, was again obtained through computer search. The following is the resulting algorithm from the search. Figure 4 shows the output image from this equation, with a mapping accuracy of 9.25%.

{ (28 ≤ TM2 ≤ 37) .AND. (650 m ≤ CONTOUR INTERVAL ≤ 850 m) .AND. (DRAINAGE = 1250 m dilation) .AND. (LITHOLOGY = coal-bearing) }
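
The rule above can be read as a pixel-wise logical AND of four conditions. A minimal sketch follows, assuming that the drainage and lithology conditions have already been rasterized as boolean masks and that elevation is available in meters. All names are hypothetical.

```python
import numpy as np

def parallelepiped_rule(tm2, elevation_m, near_drainage, coal_bearing):
    """Return True where every condition of the searched rule holds.
    tm2: TM band 2 digital values; elevation_m: elevation in meters;
    near_drainage: True inside the 1250 m dilated drainage zone (assumed mask);
    coal_bearing: True on the coal-bearing lithology unit (assumed mask)."""
    return (
        (tm2 >= 28) & (tm2 <= 37)
        & (elevation_m >= 650) & (elevation_m <= 850)
        & near_drainage
        & coal_bearing
    )
```

Each condition is a closed interval or a class membership test, so the accepted region is a box in feature space, which is what gives the parallelepiped method its name.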

Bayesian Classification

Table 1 lists the range of pixel values of each input parameter image having the maximum weighting factor (i.e. W⁺ - W⁻). A larger value of (W⁺ - W⁻) indicated a higher capability for distinguishing landslide and non-landslide areas. DN_flt, which was dilated by 35 pixels, and DN_grp had the two highest values of (W⁺ - W⁻). This meant that the fault image, DN_flt, and the bedrock lithology image, DN_grp, were the two most important factors for distinguishing landslide and non-landslide areas in Healy, Alaska.

Figure 4. Landslide image predicted using parallelepiped classification
(black: predicted landslides, white polygons: existing landslides)

Table 1. Summary of the optimal weighting factors for each input image

Pattern	W⁺	W⁻	W⁺ - W⁻	Pattern	W⁺	W⁻	W⁺ - W⁻
TM1[60,68]	0.6511	-0.1780	0.8291	DN_slp[00,01]	0.2569	-0.1326	0.3895
TM2[22,31]	0.5310	-0.3660	0.8970	DN_asp[68,82]	0.8026	-0.0318	0.8380
TM3[25,36]	0.5018	-0.2735	0.7753	DN_con[14,14]	1.3133	-0.4166	1.7300
TM4[37,50]	0.3345	-0.1918	0.5263	DN_flt[35,35]	1.4677	-3.4357	4.9034
TM5[21,24]	0.0911	-0.0140	0.1051	DN_drn[35,35]	0.5345	-1.9501	2.4846
TM7[19,19]	0.1550	-0.0031	0.1581	DN_grp[06,06]	0.7273	-4.1137	4.8410

The pixel values of each image lying in the ranges listed in Table 1 were replaced by W⁺, and all other values by W⁻, to form a weighted image. Then, by applying the Bayesian formula, which integrated the posterior probability of each input parameter image, the posterior probability image was obtained. The pixel values of the posterior probability image ranged from 0 to 1. The two datasets that had been used in the parallelepiped classification were also chosen for creating posterior probability images.
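
The paper does not state how the weighted images were combined. One common way, assumed here, is the weights-of-evidence formulation: W⁺ and W⁻ are treated as natural-log likelihood ratios, added to the prior log-odds under an assumption of conditional independence, and converted back to a probability. The function name, the dictionary layout, and the prior value used in any call are hypothetical.

```python
import numpy as np

def posterior_probability(images, ranges, weights, prior):
    """images: dict name -> 2-D array of digital values.
    ranges: dict name -> (low, high) inclusive range from Table 1.
    weights: dict name -> (w_plus, w_minus).
    prior: prior landslide probability, strictly between 0 and 1 (assumed known)."""
    logit = np.log(prior / (1.0 - prior))
    for name, img in images.items():
        low, high = ranges[name]
        w_plus, w_minus = weights[name]
        in_range = (img >= low) & (img <= high)
        # Weighted image: W+ inside the range, W- elsewhere.
        logit = logit + np.where(in_range, w_plus, w_minus)
    # Convert log-odds back to a probability between 0 and 1.
    return 1.0 / (1.0 + np.exp(-logit))
```

Under this assumption, a binary landslide image is obtained by comparing the returned array with a threshold, for example `posterior >= 0.1`.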

Figure 5 shows the posterior probability image of dataset 2. Brighter areas on the image indicate a higher probability of landsliding. Based on the histograms of the posterior probability images, various threshold values were chosen to create binary images showing landslide and non-landslide classes. Figure 6 shows the variation of mapping accuracy with the threshold values selected for both sets of input data. The best result of Bayesian classification was obtained by processing both terrain and TM images with a threshold value of 10^-1 (Figure 7).

Conclusions
Bayesian classification, which applies prior and posterior probabilities to weight the input parameter images unequally, is more accurate than minimum distance classification and parallelepiped classification, which weight the input images equally. With the terrain information added, the accuracy of all three classifications can be much improved over that obtained from the TM images alone. A higher mapping accuracy indicates a more satisfactory method for landslide prediction. Bayesian classification with a threshold value of 10^-1 has the highest mapping accuracy (i.e. 16.60%) and is the best result of this study.

Figure 5. Posterior probability images of TM and terrain data

Figure 6. Variation of mapping accuracy vs. threshold values used in Bayesian classification

Figure 7. Posterior probability binary image of TM and terrain data
(black: predicted landslides, white polygons: existing landslides)

Acknowledgements
The authors wish to express their sincere gratitude to the Generic Mineral Technology Center in Mine Systems Design and Ground Control, U.S. Bureau of Mines' Office of Mineral Institutes, for the financial support of the study.

References

Anderson, A.T., Schulz, D., and Nock, 1976, Satellite Data for Subsurface Mine Inventory, Report X-923-76-199, Goddard Space Flight Center, Greenbelt, Maryland, 13 p.
Corser, P. and Parker, W., 1987, Coal Mining in Alaska's Interior, Problems and Solutions, Proc. Int. Sym. on Cold Regions Engineering, Anchorage, Alaska, pp. 619-633.
Gagon, H., 1975, Remote Sensing of Landslide Hazards on Quick Clays of Eastern Canada, Proc. 10th Int. Sym. Remote Sensing of Environment, October 6-10, 1975, ERIM, Ann Arbor, Michigan, pp. 803-810.
Goodenough, D.G., 1988, Thematic Mapper and SPOT Integration with a Geographic Information System, P.E. & R.S., Vol. 54, No. 2, pp. 167-176.
Heath, W. and Dowling, J.W., 1980, Examples of the Use of Terrestrial Photogrammetry in Highway Engineering, Transport and Road Research Lab, Crowthorne, England, 19 p.
McDonald, H.C. and Grubbs, R.C., 1975, Landsat Imagery Analysis: An Aid for Predicting Landslide Prone Areas for Highway Construction, NASA Earth Resource Survey Sym., Houston, Texas, June 9-12, 1975, Vol. 1-b, pp. 769-778.
Piper, S.E., 1983, The Evaluation of the Spatial Accuracy of Computer Classification, Sym. Machine Processing of Remotely Sensed Data, pp. 303-310.
Sauchyn, D.J. and Trench, N.R., 1978, Landsat Applied to Landslide Mapping, P.E. & R.S., Vol. 44, No. 6, pp. 735-741.
Scully, J., 1973, Landslides in the Pierre Shale in Central South Dakota, South Dakota Department of Highway, 737 p.
Short, N.M., 1982, The Landsat Tutorial Workbook, Reference Publication 1078, NASA, Washington, DC, pp. 327-389.
Simonett, D.S., Schuman, R.L., and Williams, D.L., 1970, The Use of Air Photos in a Study of Landslides in New Guinea, Kansas University, 62 p.
Stephens, P.R., 1988, Use of Satellite Data to Map Landslides, 9th Asian Conf. on Remote Sensing, Bangkok, Thailand, November 23-29, 1988, pp. J11.1-J11.7.