A spatial analysis and modeling system to solve environmental problems

C. H. Vermillion, F. L. Stetina and J. Hill
NASA (USA)

Background
Through our economic and technological activity, we are now contributing to significant global changes on the Earth within the span of a few human generations. We have become a part of the Earth System and one of the forces for Earth change.

Research holds the key to a deeper understanding of the Earth as an integrated system of interacting components, and of the consequences of global change for humanity. To achieve this understanding, we need a new approach to Earth studies, Earth System Science, which builds upon the traditional disciplines but promises to provide a deeper understanding of the interactions that bind the Earth's components into a unified, dynamical system. Fundamental to this new approach is a view of the Earth System as a related set of interacting processes operating on a wide range of spatial and temporal scales, rather than as a collection of individual components. The goal of this new Earth System Science is to obtain a scientific understanding of the entire Earth System on a global scale by describing how its component parts and their interactions have evolved, how they function, and how they may be expected to continue to evolve on all time scales.

The challenge to Earth System Science is to develop the capability to predict those changes that will occur in the next decade to century, both naturally and in response to human activity. Complementing our innate curiosity about our planet, the search for practical benefits to improve the quality of human life continues to provide an important motivation for Earth science. The problem is that the global changes caused by human activity cannot readily be distinguished from the results of natural change on the same time scale. We require a set of Earth observations that will permit us to disentangle the complex interactions among the Earth's components and to document their effects over extended time periods. Such observations will allow us to establish causal relationships among the processes involved and therefore to distinguish between the consequences of human economic and technological activity, on the one hand, and the results of natural change, on the other. With this knowledge, we will then be able to take timely action to ensure an abundant Earth for future generations.

We can begin to meet this challenge today:

Programs of global observations relevant to a number of Earth system properties have already been carried out with great success.
- Global Vegetation Index
- Sea Surface Temperature
- Ocean Color
- Global Weather/Global Cloud Types
- Earth Radiation Budget
- Global Weather Experiment
Future Missions
- Ocean State/Currents
- Tropical Rain Measurements
- Earth Observing System
Information systems specifically constructed to process individual sets of global data are already in operation. New developments in computing technology have now made feasible an advanced information system to provide worldwide access to the more extensive global data to be obtained in the future, and to facilitate data analysis and interpretation by the scientific community.

One such new network, WETNET, is discussed in another presentation at this conference.
A worldwide political awareness of the necessity for a coordinated, international approach to the global study of the Earth has been created, and cooperative research efforts by many nations across the globe are underway.

To facilitate this cooperation, NASA has developed a Spatial Analysis and Modeling System which allows easy exchange of data. The software also allows the assembly of information essential for effective decision-making in economic development, emergency preparedness, and natural resources planning and management. Thus the software which facilitates problem solving on local and national levels also extends to regional and global scales without design changes, while utilizing a realistic multidisciplinary approach.

Introduction to the NASA spatial and modeling system
Spatial information is needed, and is useful, in virtually all applications.

Experts estimate that almost 70 percent of all information is location-based. Spatial data thus have a unique role in the design and analysis of development policies and options. Since verifiable information is scarce and expensive, it is important to have a spatial information system which effectively handles all relevant spatial data. It should allow for the comparison and integration of data from a variety of satellite and ancillary sources, varying map scales, and fields of inquiry. Outputs of spatial data analyses may be fed back to the system for further use. In their applications, spatial simulation models need to be integrated with the various data types. The complexity of the entire system is greatly increased by the varying data types, scales, models and data structures. Finally, expert systems are needed to relate the rich fabric of the comprehensive spatial information and modeling system to the needs of decision makers in various departments and disciplines (discipline experts).

Spatial data come from many sources. Data sources include remote sensing systems, analog maps, digital maps and ancillary (textual) data. One should note the efficacy of remotely sensed data in largely unmapped, rapidly growing and/or inaccessible regions. The launching of satellites and the development of various imaging sensors (e.g., multispectral scanners, thermal radiometers, radar) have added a wealth of additional economically acquired and qualitatively improved spatial, spectral, and temporal data. Ancillary, non-spatial data provide attributes of a spatial entity such as the function of a plant, the permits issued, the population of a city or the type of materials in a dump. Ancillary information can readily be processed by standard data base management systems in current use; the spatial data are not so readily handled.

The term geographic information system (GIS) is often used to describe these systems. Most geographic information systems, however, are designed for map data. The most common data source for geographic information systems has been the analog map. For this reason most of these systems are organized to process analog map data effectively and not to link to spatial simulation models.

Hence, we use the term Spatial Analysis and Modeling System (SAMS) for our system, which includes remote sensing and image analysis functions and simulation models as well as the mapping functions ascribed to geographic information systems. It must be understood that all the components (image analysis, modeling, and information storage and retrieval) must function effectively as a system. SAMS tools are made accessible by recent advances in the acquisition, interpretation and synthesis of data.

Since data are the primary factor in determining the structure and functions of varied spatial analysis systems, the discussion below examines systems particularly with respect to spatial data needs. SAMS must integrate the characteristics of the two major types of spatial systems: largely vector-based Geographic Information Systems (GIS) and largely raster-based image processing systems. The discussion below highlights the pertinent features of both systems that are critical in the design of the SAMS. The discussion appropriately begins by listing the important attributes of GIS.

Geographic Information and Image Processing Systems
GIS have come into wide use in recent years. This has occurred because of advances in computing technology and GIS software. Present systems provide:
1. Map Digitizing and Editing: This includes changing scales and projections, joining separate map sheets and correcting for map distortions.
2. Ancillary Data Entry and Management: The map features may have attributes such as ownership, stream flow, etc. This allows one to manage nonspatial data.
3. Map Production: This provides for output on plotters and displays. Scale changes and various projections are supported.
4. Analysis Functions: Substantial analyses are supported, such as area, perimeter and distance calculations. Vehicle routing and facilities siting are also supported.
5. Statistics: Statistical functions such as means, histograms and multivariate analysis are supported.
6. Data Management: This supports storage and retrieval of spatial and associated data.
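As an illustration of the analysis functions listed above, the following minimal Python sketch computes the area and perimeter of a polygon and the distance between two points. It assumes planar (projected) coordinates and a simple, non-self-intersecting polygon. The function names and the sample parcel are hypothetical and are not taken from any particular GIS.

```python
import math

def polygon_area(vertices):
    """Area of a simple polygon given as [(x, y), ...], by the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]   # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def polygon_perimeter(vertices):
    """Sum of the lengths of all edges, including the closing edge."""
    n = len(vertices)
    return sum(math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n))

# Hypothetical rectangular parcel, 4 units by 3 units.
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
print(polygon_area(parcel), polygon_perimeter(parcel))
print(math.dist((0.0, 0.0), (4.0, 3.0)))   # straight-line distance between two points
```

Distances measured on geographic (latitude and longitude) coordinates would need a different formula; the sketch covers only the planar case.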
Functions of SAMS
Various functions or capabilities need to be integrated for the development of an operational Spatial Analysis and Modeling System (SAMS).
1. Database Structures
  The interaction of algorithms and data structures in spatial systems is a complex issue and an important area of research. Spatial databases tend to be very large. The queries made of the database include those common to other databases, but also queries that relate to distance, containment or connectivity. There may be a large number of possible relationships, and the system must handle these spatial relationships effectively. The data in a database are composed of entities and relationships between the entities. The complexity in the system comes from the manner in which the relationships are represented. Present systems use either tree, network or relational schemes for organizing databases. Relational systems seem the most promising. In these systems, relationships are given by tables. A query language is used to describe relationships which are used in retrieval functions.
  
  Most GISs utilize vector data structures. A GIS will have algorithms suitable for its data structure, which means that most systems have vector-based algorithms. Image processing is very rich in algorithm development, but has been oriented toward the raster data structure.
  
  Remotely sensed data are collected in raster format. The output of complicated processing of pixel information might be the input to SAMS. A problem is that the image processing classification algorithms are currently not accurate enough for input to a SAMS. It is of course true that the SAMS can be used as a knowledge base to improve classification, and can in turn benefit from the updating capabilities of remotely sensed data. Significant benefits will be derived from an integration of SAMS with remote sensing and image processing systems.
2. Vector-Raster Capabilities
  SAMS effectively handles the two major data formats, raster and vector, amongst others. The data entities commonly utilized in spatial systems are points, lines, polygons and pixels. Points, lines and polygons are readily represented by sequences of x-y pairs. These are known as vector data and are easily processed by graphic output devices and digitizing table input devices. This format also provides a compact way to store the data. Vector format is used to encode a wealth of available data including many geographic features, like political boundaries, that cannot be remotely sensed. A vector is a straight line usually defined by specifying the geographic coordinates of its endpoints. Many short vectors can be laid end to end to define curves, or they can be arranged to form closed polygons. Such vector groups are usually assigned a number representing some attribute of the area enclosed by a polygon or of the path defined by a vector line tracing. Vector images such as contour maps or highway maps consist of sets of named vector groups organized for rapid access or storage efficiency. The resolution of a vector image is defined by the accuracy of the vector endpoints as well as the accuracy with which curved boundaries are approximated by straight line segments. Vector data sets typically range in size from several kilobytes up to a few megabytes per image.
  
  Pixels or grid data are image data and are normally collected by remote sensing systems (e.g., multispectral scanners, radar, video cameras). These data are called raster data and are readily processed by raster-based image processing systems. A raster image divides the imaged area into a regular grid whose resolution may vary from cell sizes of several square kilometers down to less than one square meter each. Each grid square is represented by a number that signifies some characteristic of the imaged piece of land. The collection of numbers from all grid squares forms a rectangular array known as a digital image. Raster images can have multiple channels, where each channel has a separate array of numbers, with their sampling grids assumed to be perfectly matched. Raster data sets are typically quite large, with image sizes in excess of 24 megabytes becoming increasingly commonplace.
  
  The raster and vector representations of spatial data each have strengths and weaknesses, such that neither data type can adequately replace the other. Raster data are easily acquired by remote sensing techniques and are available in prolific quantity on a near real-time basis. The raster format has the advantage that the data are stored in the computer in a manner that preserves spatial geometry: neighboring pixels are neighbors on the earth's surface. This format is convenient for many tasks, especially image analysis and some simulation tasks. Raster images, however, require large amounts of storage and cannot reliably resolve objects that are smaller than the image grid cell size. Vector images, on the other hand, can afford a very high spatial resolution due to the greater storage efficiency of the format. Most GIS systems process vector data and most image analysis systems process raster data.
  1. Raster-Vector Conversions
    The potential of merging the two technologies, GIS and remote sensing, has been recognized for several years. From the GIS side, remote sensing data represent an important source of data which are easier to input, relatively up-to-date, and available on a continuous basis. This helps to relieve the major bottleneck in developing a large-scale GIS: the problem of digitizing data and inputting them into the computer. From the remote sensing side, GIS represents an excellent approach to handling and analyzing the large amount of spatial data. In addition, the conventional map data contained in most GIS provide another dimension of information which remote sensing data lack, such as county or census tract boundaries and socio-economic data.
    
    Despite its great potential, research and development in the integration of these two technologies is still limited due to the many technical as well as theoretical problems involved. Among the technical problems are different computer hardware and software, different data storage formats and different procedures for handling and analyzing the data. These technical problems are further complicated by theoretical ones. The accuracy of raster-vector conversions, conversions between different data structures, and the reliability of the map overlay and interpolation procedures must be addressed before the results from the integrated GIS can be interpreted accurately.
  2. Vector and Raster Processing Systems
    Integration of vector and raster displays involves more than simply drawing vectors on the raster display device. The software must keep track of the data residing on the screen. Physically, the screen tracking data must be centrally located and readily accessible to all application programs. A library of data management routines must also be provided to assist applications in navigating and updating the complex tracking structures.
    
    Many of the standard vector graphics systems define textual equivalents of graphic device commands, called graphics metacode. These metacode instructions allow drawings to be saved in ordinary text files and later redrawn using a standard metacode interpreter program. As such, the metacode description of a drawing contains all information necessary to perform display screen tracking of vector images. To accommodate raster image tracking, the metacode could be augmented with special image description codes which contain the information necessary to regenerate the image on the screen. Such augmented metacode would provide a uniform data structure capable of tracking both raster and vector data in a given screen presentation. An added advantage of such a metacode-based tracking system is that the textual metacodes are human readable and thus easy to debug.
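The vector-to-raster direction of the conversions discussed in this section can be sketched as follows. This is a minimal illustration and not the SAMS implementation: it assumes planar coordinates, a single simple polygon, and a rule that a cell takes the polygon's attribute when the cell center falls inside the polygon. All names (point_in_polygon, rasterize, the sample shapes and attribute values) are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if the point (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def rasterize(polygon, attribute, n_rows, n_cols, cell_size, background=0):
    """Give `attribute` to every cell whose center lies inside the polygon.

    Row 0 is the southernmost row (an assumption made for simplicity).
    """
    grid = []
    for row in range(n_rows):
        y = (row + 0.5) * cell_size
        grid.append([
            attribute if point_in_polygon((col + 0.5) * cell_size, y, polygon)
            else background
            for col in range(n_cols)
        ])
    return grid

# A vector polygon: four coordinate pairs plus one attribute value.
field_outline = [(1.0, 1.0), (4.0, 1.0), (4.0, 3.0), (1.0, 3.0)]
field_grid = rasterize(field_outline, attribute=7, n_rows=4, n_cols=5, cell_size=1.0)
for row in reversed(field_grid):   # print with north at the top
    print(row)

# An object smaller than one grid cell is lost in the conversion.
pond_outline = [(0.1, 0.1), (0.4, 0.1), (0.4, 0.4), (0.1, 0.4)]
pond_grid = rasterize(pond_outline, attribute=9, n_rows=4, n_cols=5, cell_size=1.0)
```

The second example shows the resolution limit noted earlier: the small polygon leaves no trace in the grid because no cell center falls inside it, while the vector form stores it exactly in four coordinate pairs.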
Simulation Models
The running of predictive models is one of the main attractions offered by SAMS. These models are often quite complex, each representing many man-years of development, and will usually be garnered from a wide variety of sources. Each research site of interest will generally have a different format for the required data. The data format will often fundamentally affect the processing strategy of the modeling software. When importing a new model from an outside site, it is preferable to modify the in-house data to fit the model's requirements rather than to modify the model to fit the data. The most widespread of such data format conversions will occur between raster and vector format data.

Data in the spatial data bank will be acquired from a wide variety of sources, all offering very little choice in the format of their data. To ensure consistency in the data bank, all data should be converted to a uniform standard format that is strictly enforced throughout the data archive. The definition of such a standard format places demanding requirements on the data storage strategy. The standard format must handle raster and vector data types independently, as the fidelity of either data type is ill served by permanent conversion to the other type. The data management software must be capable of converting between data types as required to suit each application. Provisions must also be made to handle non-image data such as surveyor's notes or tables of toxic agent half-lives.

The working spatial data bank may contain hundreds of separate images. To keep track of such a large volume of data, a management system must be included to support queries on data bank coverage based on the requested locations or time frames. New images can be ingested into the data bank by simply identifying them to the data bank manager, which would also maintain a complete account of the history of each image in the data bank. Separate data sources within the data bank would be integrated into the configuration requested for a given application. This involves automatic registration of images having disparate resolutions and data types. Such a scheme for automatic data integration could be greatly enhanced by the enormous storage capacity of laser disk media, offering the possibility of maintaining the entire data bank in a constantly on-line mode.
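The data bank manager described above can be pictured with a small sketch. It is only an assumption about how such a catalog might look: each image is summarized by a bounding box and an acquisition date, ingestion appends to a per-image history, and a coverage query filters on location and time frame. The class names, record names, coordinates and dates are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class ImageRecord:
    name: str
    data_type: str        # "raster" or "vector"
    bbox: tuple           # (min_x, min_y, max_x, max_y)
    acquired: date
    history: list = field(default_factory=list)

def boxes_overlap(a, b):
    """True if two (min_x, min_y, max_x, max_y) boxes share any area or edge."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

class DataBankManager:
    def __init__(self):
        self.records = []

    def ingest(self, record, source):
        """Register a new image and start its history."""
        record.history.append(f"ingested from {source}")
        self.records.append(record)

    def query(self, bbox, start, end):
        """Names of images covering the requested location and time frame."""
        return [r.name for r in self.records
                if boxes_overlap(r.bbox, bbox) and start <= r.acquired <= end]

manager = DataBankManager()
manager.ingest(ImageRecord("coastal_scene", "raster",
                           (88.0, 20.0, 93.0, 27.0), date(1988, 5, 1)),
               "satellite receiver")
manager.ingest(ImageRecord("district_boundaries", "vector",
                           (89.0, 22.0, 91.0, 24.0), date(1985, 1, 15)),
               "digitized map")
manager.ingest(ImageRecord("island_scene", "raster",
                           (177.0, -19.0, 180.0, -16.0), date(1988, 6, 10)),
               "satellite receiver")

hits = manager.query(bbox=(90.0, 23.0, 92.0, 25.0),
                     start=date(1988, 1, 1), end=date(1988, 12, 31))
print(hits)
```

Automatic registration of the selected images to a common grid, which the text also calls for, is outside the scope of this sketch.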
Expert Systems
Expert systems are needed to integrate image processing systems and GISs. Both systems need tools for integration and for utilization in an effective interactive mode with human operators. Effective vision systems in the future will have expert system shells in their implementation. In addition, the effective analysis of spatial data requires discipline experts. Their knowledge must be available to the system if effective results are to be achieved. Expert system tools are needed to extract knowledge from discipline experts and incorporate it into the system. The overall system will thus necessitate expert system tools in system building.

Expert systems are generally defined to be programs that perform intellectual tasks. A system of this type can give advice or make decisions based on the expert knowledge available to it. Expert systems have been successfully applied to such areas as medical diagnosis, geological exploration, petroleum production, circuit design and computer vision. Expert systems have an appeal over other forms of analysis because they can explain and justify their results. This is done by translating the rules and assertions used to draw a conclusion into a line of reasoning. This allows the personnel developing and evaluating the system to constantly monitor and evaluate its performance. A system must have good communication capabilities so that experts, through simple dialogues, can examine the reasoning and improve the system as needed. The advantages of developing an expert system are that such a system can be: 1) continually improved as more knowledge is obtained, 2) reproduced easily for other machines, and 3) combined with other expert systems to build a single system with increased capabilities.

Expert systems have matured to the point that they can be applied to spatial analysis problems. Tools are available as a basis for developing these systems. The variables in our spatial system are so complex and interwoven that it is unlikely that any other approach will predict the behavior of these systems.

Expert systems are frequently constructed from a class of programming languages commonly referred to as rule-based production systems.

A rule-based system can be used to integrate the information contained in maps, aerial images, and demographic data bases with the textual information produced by various spatial models. The incorporation of symbolic processing capabilities with the vector and raster processing capabilities will be an aspect of this project, and will be important in enhancing the SAMS. These capabilities will be utilized interactively with human operators, and with image processing operators as the reliability of the vision operations warrants.
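A rule-based production system of the kind referred to above can be sketched in a few lines. The example is a minimal forward-chaining loop under stated assumptions: facts are plain strings, each rule has a list of conditions and one conclusion, and the rule names, facts and conclusions are invented for illustration rather than drawn from SAMS. The trace it returns shows how rules and assertions can be translated into a line of reasoning.

```python
def forward_chain(facts, rules):
    """Fire rules until no new conclusions appear; return facts and a trace."""
    known = set(facts)
    trace = []
    changed = True
    while changed:
        changed = False
        for name, conditions, conclusion in rules:
            # A rule fires when all of its conditions are known facts.
            if conclusion not in known and set(conditions) <= known:
                known.add(conclusion)
                trace.append(f"{name}: {' and '.join(conditions)} -> {conclusion}")
                changed = True
    return known, trace

# Hypothetical rules combining map, image and model information.
rules = [
    ("R1", ["steep_slope", "saturated_soil"], "landslide_hazard"),
    ("R2", ["landslide_hazard", "road_within_1km"], "notify_road_agency"),
]
# Hypothetical facts, e.g. from a contour map, a classified image and a rainfall model.
facts = ["steep_slope", "saturated_soil", "road_within_1km"]

conclusions, trace = forward_chain(facts, rules)
for step in trace:
    print(step)
```

Because the knowledge lives in the rule list rather than in the control loop, rules can be added or corrected by discipline experts without changing the program, which is the property the text emphasizes.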
Applications Software Modules
Numerous application-specific software packages have been developed by scientists at GSFC, government agencies, and universities. These include the following:
1. Atmospheric and Oceanic Software
  There are three major programs for this area: GEMPAK, SEAPAK, and the International TOVS Processing Package (ITPP). GEMPAK and SEAPAK are products of NASA; they read data from a variety of sources and perform analysis and image display. They output plots on monitors such as those on the IIS systems or on standard graphics terminals. GEMPAK and SEAPAK can also be run under the NASA-developed Transportable Applications Executive (TAE). TAE provides the user with a friendly interface to applications programs. For example, under TAE, GEMPAK and SEAPAK can be easily used on a menu-driven basis, and the user can conveniently and interactively choose a variety of options.
  
  GEMPAK is meant for atmospheric and meteorological applications. It can process, analyze and display data. It can perform objective analysis using the Barnes algorithm and derive standard meteorological parameters such as winds from appropriate data. It can evaluate and plot pressure-temperature profiles for sounding data, and can draw Stuve, Skew-T, Log P, or vers T graphs. Geopotential heights, potential temperature and equivalent potential temperature may be obtained and displayed. Variables can be contoured, and wind barbs and streamlines can be drawn. Derivatives of data can also be obtained.
  
  These operations are performed in an interactive and menu-driven manner, and the user can conveniently and flexibly choose options. GEMPAK can also run on the existing system and is compatible with IIS and a variety of DEC terminals.
  
  SEAPAK is similar to GEMPAK, but is oriented towards oceanic and coastal applications. Image data such as those from AVHRR and the Coastal Zone Color Scanner (CZCS) are processed, analyzed and displayed. SEAPAK can also work in conjunction with the IIS image processing system. Parameters such as sea surface temperatures, oceanic primary production and sedimentation can be measured. Dynamical and biological processes can be inferred. Physical processes such as currents, eddies and instabilities can be studied.
  
  The ITPP is a product of the National Oceanic and Atmospheric Administration (NOAA) and the Cooperative Institute for Meteorological Satellite Studies (CIMSS), and retrieves geophysical parameters from the TOVS data. The TOVS instrument makes measurements in 27 spectral channels in the visible, infrared and microwave regions to sound the atmosphere. From these, atmospheric temperatures and water vapor content at selected altitudes can be retrieved by 'inverting' the radiative transfer equation. In addition, total ozone amounts are also available. These results can be input to GEMPAK for analysis and display. The availability of temperatures allows for a quantitative analysis of the atmosphere. Additional applications software will also be written for TOVS analysis.
2. Land Analysis
  The NASA-developed Land Analysis System (LAS) software will be provided. As with GEMPAK and SEAPAK, this software can also run under TAE and consists of a comprehensive set of routines to display and analyze LANDSAT and SPOT data. Currently, the software is being enhanced to work in conjunction with a geographic information system. LAS is currently also able to work compatibly with the IIS system.
  
  LAS can perform a wide variety of data manipulation, mathematical and geometrical functions on images. These include all commonly required analysis functions.
3. Mathematical and Statistical Packages
  A numerical analysis package used by us is the International Mathematical and Statistical Library (IMSL). This package includes a subroutine library containing all of the commonly needed numerical and statistical subroutines. Examples are spectral analysis, least squares fitting, numerical quadrature, solution of differential equations, correlative analysis and interpolation. The user can write a driver which reads data and calls the subroutine package to perform the required functions.
  
  There are also packages which are more interactive and user-oriented. An example is the Interactive Data Language (IDL), which can also perform some image analysis and graphics.
4. Graphics
  Many comprehensive commercial and government graphics packages are available. This software provides for a wide variety of displays, including simple graphs, three-dimensional drawings, animation and contouring. It is proposed that, at a minimum, some user-oriented, interactive package such as IDL be implemented. IDL is compatible with the IIS system and can also perform much numerical analysis and image processing. If needed, more sophisticated graphics packages can be obtained and installed; these require more software development. It may be best to implement both types of capability.
5. Data Base Management Systems (DBMS)
  It is important to have data base management and archiving capability to handle all the data. Among the popular commercial systems are the products from Ingres, Oracle and DEC. These are relational databases and provide for automated catalogs and storage of data for convenient usage.
SAMS data flow
It is clear from the above discussion that a flexible spatial data analysis system should include five main components:
1. A data input subsystem for collecting data from maps, images, textual sources, DCPs and other data sources;
2. A data storage and retrieval subsystem for organizing and quickly retrieving the data;
3. A data manipulation subsystem that allows user-selected data to be aggregated and modified;
4. A data reporting subsystem for outputting manipulated data in map, tabular or image form (Marble 1984); and
5. Simulation model subsystems for predictive purposes.
The data input subsystem will have digitizing capabilities that will allow maps to be input using an X-Y digitizing table. The digitizing software will allow maps of various projections and scales to be mathematically tied to the digitizer so that point, line, and polygon data can be manually traced and entered into the existing data bank. The software will also have facilities for correcting digitizing errors. Data in digital formats such as the U.S. Geological Survey Digital Line Graph (DLG), or obtained from other mapping systems such as Intergraph or Arc-Info, will be capable of entry via phone lines or magnetic tapes. The system will allow geometrically corrected digital images obtained from aircraft, satellites and digitized photographs to be entered via magnetic tape, and textual data can be entered via the keyboard, phone line, or magnetic tapes.
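Tying a map to the digitizer amounts to finding a transformation from table coordinates to map coordinates. The sketch below is one simple way to do this, under the assumption that a similarity transformation (uniform scale, rotation and translation) is adequate and that exactly two control points are digitized. Operational systems typically use more control points and projection-specific equations. The function names and coordinates are hypothetical.

```python
def fit_similarity(table_pts, map_pts):
    """Return (a, b) with map = a * table + b, treating points as complex numbers.

    The complex factor `a` carries scale and rotation; `b` is the translation.
    """
    t1, t2 = (complex(*p) for p in table_pts)
    m1, m2 = (complex(*p) for p in map_pts)
    if t1 == t2:
        raise ValueError("control points must be distinct")
    a = (m2 - m1) / (t2 - t1)
    b = m1 - a * t1
    return a, b

def table_to_map(point, transform):
    """Convert one digitizer-table point to map coordinates."""
    a, b = transform
    z = a * complex(*point) + b
    return (z.real, z.imag)

# Two control points: positions on the table and their known map coordinates.
transform = fit_similarity(
    table_pts=[(1.0, 1.0), (3.0, 1.0)],
    map_pts=[(1000.0, 2000.0), (3000.0, 2000.0)],
)
traced_point = table_to_map((2.0, 2.0), transform)
print(traced_point)
```

With more than two control points the same model would be fitted by least squares, which also gives a residual that can flag a badly digitized control point.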
The data storage and retrieval subsystem will provide a means of storing data in an organized and efficient manner for later retrieval and manipulation. Retrieval options will include quicklook capabilities for on-line maps, images, and text, which allow these data to be input into other systems and external programs for further analysis.

To be an effective management and planning tool, a spatial data system must provide user-friendly and efficient means of selectively retrieving data. A typical query of the data bank would be: "locate within a user-defined polygon all soils of a particular type within five miles of a specific highway and not currently under cultivation." This request would require boolean, polygon, and proximity queries of the data bank. Data query options to be included in the proposed system will include boolean operations, distance and proximity calculations, area measurements, and polygon, point, and line retrieval. These operations can be done for areas defined by irregular polygons, circles and rectangles.
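The typical query quoted above can be broken into its boolean, region and proximity parts. The sketch below is illustrative only: it assumes planar coordinates in miles, represents the soil data as point samples, and uses a rectangle as the user-defined region to keep the example short (an irregular polygon would need a point-in-polygon test instead). All data values and names are hypothetical.

```python
import math

def distance_to_segment(p, a, b):
    """Shortest distance from point p to the line segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the projection of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_polyline(p, line):
    """Proximity query: distance from p to the nearest segment of a line."""
    return min(distance_to_segment(p, line[i], line[i + 1])
               for i in range(len(line) - 1))

def in_rectangle(p, rect):
    """Region query for a (min_x, min_y, max_x, max_y) rectangle."""
    min_x, min_y, max_x, max_y = rect
    return min_x <= p[0] <= max_x and min_y <= p[1] <= max_y

soil_sites = [
    {"id": 1, "location": (2.0, 3.0), "soil": "loam", "cultivated": False},
    {"id": 2, "location": (2.0, 9.0), "soil": "loam", "cultivated": False},   # too far
    {"id": 3, "location": (5.0, 1.0), "soil": "clay", "cultivated": False},   # wrong soil
    {"id": 4, "location": (8.0, 4.0), "soil": "loam", "cultivated": True},    # cultivated
    {"id": 5, "location": (30.0, 2.0), "soil": "loam", "cultivated": False},  # outside region
    {"id": 6, "location": (12.0, 4.0), "soil": "loam", "cultivated": False},
]
highway = [(0.0, 0.0), (10.0, 0.0), (40.0, 0.0)]
region = (0.0, 0.0, 20.0, 20.0)

matches = [
    site["id"] for site in soil_sites
    if in_rectangle(site["location"], region)                   # region query
    and site["soil"] == "loam" and not site["cultivated"]       # boolean query
    and distance_to_polyline(site["location"], highway) <= 5.0  # proximity query
]
print(matches)
```

In a production system the cheap attribute tests would normally be evaluated first, or served from an index, so that the geometric tests run on as few candidates as possible.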

The data reporting subsystem will provide a means of outputting processed data in a clear and concise manner. Data output will include textual data in report form or formatted for input into other external statistical or graphics programs, business graphics such as bar charts, line graphs and pie charts, and maps output to graphics terminals or pen plotters. Typical map generalization routines for coordinate thinning, edge matching and projection changes will be available.
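Coordinate thinning, one of the map generalization routines mentioned above, can be illustrated with the Douglas-Peucker line simplification algorithm. The source does not say which thinning method SAMS uses, so this choice is an assumption made for illustration; the function names and sample line are hypothetical.

```python
import math

def perpendicular_distance(p, a, b):
    """Distance from point p to the infinite line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length

def thin_line(points, tolerance):
    """Drop vertices that deviate from the simplified line by at most `tolerance`."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [first, last]            # every interior vertex is within tolerance
    # Keep the farthest vertex and thin the two halves separately.
    left = thin_line(points[: index + 1], tolerance)
    right = thin_line(points[index:], tolerance)
    return left[:-1] + right            # avoid repeating the shared vertex

digitized = [(0.0, 0.0), (1.0, 0.05), (2.0, -0.05), (3.0, 0.0), (4.0, 3.0), (5.0, 0.0)]
thinned = thin_line(digitized, tolerance=0.1)
print(thinned)
```

The tolerance would normally be chosen from the output map scale, so that vertices closer together than the plotter can resolve are removed.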

The analysis subsystem will have comprehensive functions for georeferencing, classification, distance calculation, statistical analysis, and image processing. The modeling subsystem will have transportation and plume models. The INGRES data manager runs on many types of computers. It is the data base system chosen by the U.S. Federal Emergency Management Agency (FEMA) as a part of the Integrated Emergency Management Information System (IEMIS). INGRES has the ability to take in properly formatted text and transform it into a data base, from which the data may be retrieved using the INGRES query language. The SAMS is modeled on the IEMIS software.
Hardware and Software
SAMS is a software package which was developed on Digital Equipment Corporation's (DEC) VAX line of computers, using DEC's proprietary operating software VMS. It is written in the FORTRAN language and consists of hundreds of thousands of lines of program code. Color, high-resolution graphics terminals are needed to output the images to a screen. These terminals should be Tektronix terminals or emulations thereof. PC-based terminals are available as output devices. Digitizers, plotters, cameras and other peripheral devices can be linked to the system.
Summary of NASA - SAMS
The Spatial Analysis and Modeling System has evolved over a number of years from two important software packages: NASA's direct satellite reception and analysis systems, as exemplified by the ground processing system developed for SPARRSO-Bangladesh and the Regional Severe Storm Warning System developed for the Fiji Island Meteorological Service; and the U.S. Federal Emergency Management Agency's Integrated Emergency Management Information System (IEMIS), developed to facilitate U.S. disaster response and planning. Thus, the system is presently suited for applications in emergency management. Simulation models available include plume generation for nuclear power plants (adaptable for chemical plume plotting), evacuation time estimation (for point-source and regional disasters), a siren propagation model and the TYAN cyclone track model; projected models include dam break, storm surge and hazard mitigation models. The need for such a system has already been expressed by numerous agencies and industries. The Bhopal disaster of 1984 and the series of spills on the Rhine River in recent years have created a latent demand for cost-effective industrial accident management systems. The spatial analysis system can serve both as a simulation and learning tool, and as an actual management tool during accidents. The base capabilities available from the system and the projected capabilities of the near future form a combination that could allow the use of the system, with modifications, for planning for economic development, coastal management, resource management and areawide planning.