The Relative Merits of the Raster and Vector G.I.S. Data Models

The Relative Merits of the Raster and Vector Data Models in Relation to Environmental and Socio-economic Applications of G.I.S.

John J Bark
Geographic Information Systems use two main types of geographical model to represent data, and a long-running debate exists over whether the raster or the vector model is the better approach. It is not only a question of each model's ability to represent different geographical features but also of how suitable each model is for the spatial operations GIS users wish to perform. Both data structures can handle the three basic geographic features used in GIS (points, lines and polygons) and permit analysis of them with varying ease and success. In this essay I will briefly explain each structure, discuss some of the advantages of using them and investigate their use in a number of real-life environmental and socio-economic GIS applications. Appendix 1 contains a comparison of the advantages and disadvantages of each data model, compiled from several respected GIS texts. A third model, using an object-oriented approach, has recently been applied to GIS. This model places real-world entities with a specific theme in hierarchical order, with each entity inheriting attribute values from the levels above. As this model is at a relatively early stage of development, I will not consider it in this essay.

Considered the simpler model, the raster approach comprises a matrix of grid cells in which each cell is allocated a value indicating which entity it belongs to (Cadoux-Hudson & Heywood 1991). A single cell represents a point, whilst lines and polygons are represented by collections of cells.
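As a minimal sketch of this idea, the grid below stores one entity code per cell. The codes and the layout are hypothetical and chosen only for illustration; a real raster layer would also carry a cell size and a geographic origin.

```python
# A tiny raster layer: each cell holds the code of the entity it belongs to.
# The codes are hypothetical: 0 = background, 1 = well, 2 = road, 3 = lake.
BACKGROUND, WELL, ROAD, LAKE = 0, 1, 2, 3

grid = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 3, 3],
    [2, 2, 2, 0, 3, 3],
    [0, 0, 2, 2, 0, 0],
]


def cells_of(grid, code):
    """Return the (row, column) position of every cell carrying `code`."""
    return [
        (r, c)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value == code
    ]


point_cells = cells_of(grid, WELL)    # a point: a single cell
line_cells = cells_of(grid, ROAD)     # a line: a chain of cells
polygon_cells = cells_of(grid, LAKE)  # a polygon: a block of cells
```

Note that the position of a feature is implied by its row and column, so no co-ordinates are stored explicitly.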

The vector model attempts to reproduce objects as exactly as possible (Burrough 1991). Feature information about points, lines and polygons is stored as a series of x, y co-ordinates. The location of a point is described by a single co-ordinate pair, a line by a string of co-ordinate pairs and a polygon by a closed loop of co-ordinate pairs.
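The three feature types can be sketched as plain co-ordinate lists. The co-ordinates and the helper names `line_length` and `polygon_area` are assumptions made for illustration; the area calculation uses the standard shoelace formula, which is one common way of measuring a polygon from its co-ordinates.

```python
import math

# Hypothetical features expressed as x, y co-ordinates.
point = (2.0, 3.0)
line = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
# A polygon is a closed loop: the first and last co-ordinates are the same.
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]


def line_length(coords):
    """Sum the straight-line distance between consecutive co-ordinates."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def polygon_area(ring):
    """Area of a closed, non-self-intersecting ring (shoelace formula)."""
    twice_area = sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )
    return abs(twice_area) / 2


length = line_length(line)    # 5 + 6 = 11 units
area = polygon_area(polygon)  # 4 x 3 rectangle = 12 square units
```

Because the co-ordinates are stored directly, lengths and areas are limited only by the precision of the co-ordinates themselves.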

Computers find raster structures easier to handle because of their row and column arrangement, which makes them simpler to store, analyse and display, but the creation of many layers requires large volumes of storage space. One of the easiest analytical functions to perform with raster data is overlay analysis. Because each grid is two-dimensional, each cell can represent only one geographical attribute, which reduces overlay to a direct comparison of the same grid squares in different layers. Overlay analysis is possible with vector data but is more complex because of its structure. A possible problem with the raster model is that features are represented on layers that are "not continuous, but quantized" (Burrough 1991). The larger the grid squares used, the more difficult it becomes to measure the length and area of features accurately, because precision is increasingly lost. If, however, the surface is treated as continuous, many mathematical formulas and simulation and modelling techniques can be applied easily, which is a major advantage over vector methods.
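The cell-by-cell nature of raster overlay can be sketched as follows. The function name `overlay` and the two Boolean layers are hypothetical; the sketch assumes both layers share the same grid dimensions and cell size.

```python
def overlay(layer_a, layer_b, combine):
    """Combine two raster layers by applying `combine` to matching cells."""
    same_shape = len(layer_a) == len(layer_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(layer_a, layer_b)
    )
    if not same_shape:
        raise ValueError("layers must share the same grid dimensions")
    return [
        [combine(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]


# Hypothetical layers: 1 where the condition is met, 0 where it is not.
soil_ok = [
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]
slope_ok = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]

# A cell is suitable only where both conditions hold.
suitable = overlay(soil_ok, slope_ok, lambda a, b: a & b)
```

The same function could take a different `combine` rule, such as addition or a maximum, without any change to the way cells are matched.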

Within the vector model, space is considered to be "continuous, not quantized" (Burrough 1991), which allows the calculation of precise lengths and dimensions and reduces the volume of data requiring computer storage. The output produced by this method is considered more elegant than what many consider to be crude raster maps.

Most early GIS developments were done with vector structures "simply because vector structures were the most familiar form of geographical expression until alternatives developed" (Burrough 1991). Raster alternatives then developed and, owing to their regular grid structure, are more efficient at some functions such as overlay analysis and the retrieval of items on the basis of their location. Other functions are only possible with the vector model, owing to its topological structure. The data are structured to include relationships such as adjacency and connectivity between nodes, lines and polygons. These linkages allow networks to be created and analysed.
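One way of holding such relationships is an arc-node table, sketched below under stated assumptions. The arc, node and polygon names are hypothetical: two polygons, A and B, share the arc a2, and each arc records its start node, end node and the polygons on its left and right.

```python
from collections import defaultdict

# Hypothetical arc table: arc id -> (from_node, to_node, left_poly, right_poly).
# Arc a2 runs between polygons A and B; a1 and a3 are their outer boundaries.
arcs = {
    "a1": ("n1", "n2", "outside", "A"),
    "a2": ("n1", "n2", "A", "B"),
    "a3": ("n1", "n2", "B", "outside"),
}


def polygon_adjacency(arcs):
    """Two polygons are adjacent if some arc has one on each side."""
    adjacent = defaultdict(set)
    for _from, _to, left, right in arcs.values():
        if left != right:
            adjacent[left].add(right)
            adjacent[right].add(left)
    return adjacent


def node_connectivity(arcs):
    """List the arcs that meet at each node."""
    meeting = defaultdict(set)
    for arc_id, (from_node, to_node, _left, _right) in arcs.items():
        meeting[from_node].add(arc_id)
        meeting[to_node].add(arc_id)
    return meeting


adjacent = polygon_adjacency(arcs)
meeting = node_connectivity(arcs)
```

Adjacency and connectivity are read straight from the table here, without any geometric calculation on the co-ordinates.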

GIS are often used in the management of natural resources such as forests, which involves the integration of a number of different data types. For example, creating a GIS to investigate suitable sites for the planting of new trees involves integrating information about the existing vegetation cover, the climate, soil types, land usage, possible water sources, etc. The best source for much of this data is satellite imagery, owing to its consistency and resolution, high positional accuracy and the low level of human interpretation needed (Maguire & Dangermond 1991). To reduce resolution loss, most natural resource management GIS applications store satellite images in a raster structure. Unfortunately this requires a large volume of computer storage space, and if storage were the only consideration many users with small storage capacity might change to a vector structure. Using raster data has other benefits, however. Much of the information on satellite images is continuous in nature, such as vegetation cover or soil types, and as mentioned above, treating the raster surface as continuous brings analytical benefits. Much of the data analysis in a natural resource GIS is done using overlay analysis, for example overlaying remotely sensed data layers to find the optimum site for planting the trees. Another possible use of the overlay process is to monitor continuous surface processes and detect changes. With advancing technology, intermittent high-magnitude events such as fires and floods will be detected and modelled more quickly (Maguire, Goodchild & Rhind 1991), allowing better emergency resource management.
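Change detection by overlay can be sketched as a comparison of two classified layers of the same area taken at different dates. The land-cover codes and the function name `changed_cells` are hypothetical, and the sketch assumes the two layers are already registered to the same grid.

```python
# Hypothetical land-cover codes for two dates over the same grid.
FOREST, BURNT = 1, 2

cover_before = [
    [FOREST, FOREST, FOREST],
    [FOREST, FOREST, FOREST],
]
cover_after = [
    [FOREST, BURNT, BURNT],
    [FOREST, FOREST, BURNT],
]


def changed_cells(before, after):
    """Return (row, column, old value, new value) for every cell that differs."""
    return [
        (r, c, old, new)
        for r, (row_before, row_after) in enumerate(zip(before, after))
        for c, (old, new) in enumerate(zip(row_before, row_after))
        if old != new
    ]


changes = changed_cells(cover_before, cover_after)
burnt_cells = sum(1 for _r, _c, _old, new in changes if new == BURNT)
```

Multiplying `burnt_cells` by the area of one cell would give an estimate of the area affected, to the precision the cell size allows.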

The first GIS, the Canadian Land Information System, was designed to monitor land ownership, and this remains a common use of GIS. A modern system requires an "accurate description of the boundary of a large number of land parcels, together with fast access to a relatively small quantity of attribute data about the owner" (Maguire & Dangermond 1991). It must also be able to deal with a large number of ownership transactions. Holding the data in a raster structure can present problems, depending upon the size of the grid squares used. If the grid squares are too large, the smaller land parcels are lost because a single square cannot distinguish each individual parcel, and accurate searches of the database by land size become impossible. Decreasing the grid size so that small land parcels can be represented increases the volume of data, necessitating larger and more expensive storage. Storing the data in a vector structure allows every individual parcel to be recorded regardless of size, and also allows the attribute information to be updated quickly and easily.
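The trade-off between cell size and data volume can be put into rough numbers. The district size, the plot size and the rule that a parcel smaller than one cell cannot be shown as a distinct feature are all assumptions made for this sketch.

```python
import math


def cell_count(width_m, height_m, cell_size_m):
    """Number of square cells needed to cover a rectangular study area."""
    columns = math.ceil(width_m / cell_size_m)
    rows = math.ceil(height_m / cell_size_m)
    return rows * columns


def parcel_resolvable(parcel_area_m2, cell_size_m):
    """Rule-of-thumb assumption: a parcel smaller than one cell is lost."""
    return parcel_area_m2 >= cell_size_m ** 2


# Hypothetical 10 km x 10 km district containing a 400 square metre plot.
coarse_cells = cell_count(10_000, 10_000, 100)  # 100 m cells
fine_cells = cell_count(10_000, 10_000, 10)     # 10 m cells

plot_kept_coarse = parcel_resolvable(400, 100)
plot_kept_fine = parcel_resolvable(400, 10)
```

Reducing the cell side by a factor of ten multiplies the number of cells by a hundred, which is why finer grids quickly become expensive to store.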

Another good example of the vector data model within GIS is road route finding. The GIS holds a geographical description of the road network as line features, allowing various routes to be calculated and analysed. Network analysis calculates the time taken along different roads and then devises routes that take the shortest distance, take the quickest time or avoid certain features. Many of these systems are PC-based and include attribute information about services available along the route, whereas commercial applications can include impedances such as the time taken to unload at each delivery point. A recent development is the in-vehicle system, which uses the latest positioning and telecommunication technology to offer the same route-finding service but also considers impedances that develop whilst en route, such as traffic congestion. Details are transmitted to the system in the vehicle, which can then alter the planned route to avoid any trouble spots ahead. These systems use vector models, which allow quick and easy updating, often within 20 minutes of the delay developing, and permit the network analysis needed to devise the possible routes.
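Route finding of this kind can be sketched with Dijkstra's shortest-path algorithm, which is one well-known technique for network analysis and not necessarily the one used by the systems described. The road network, place names and travel times in minutes are hypothetical.

```python
import heapq


def quickest_route(network, start, goal):
    """Dijkstra's algorithm over a dict of {node: {neighbour: minutes}}."""
    queue = [(0.0, start, [start])]
    best = {start: 0.0}
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # a quicker way to this node was already found
        for neighbour, minutes in network.get(node, {}).items():
            new_cost = cost + minutes
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour, path + [neighbour]))
    return float("inf"), []


# Hypothetical road network with travel times in minutes.
roads = {
    "depot": {"A": 10, "B": 15},
    "A": {"depot": 10, "customer": 12},
    "B": {"depot": 15, "customer": 10},
    "customer": {"A": 12, "B": 10},
}

time_before, route_before = quickest_route(roads, "depot", "customer")

# Congestion is reported on the road between A and the customer, so only
# the impedance attribute of that one link is updated.
roads["A"]["customer"] = 30
roads["customer"]["A"] = 30

time_after, route_after = quickest_route(roads, "depot", "customer")
```

Only the attribute of the affected link changes when congestion is reported; the geometry and connections of the network stay as they were, which is what makes frequent updating practical.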

Until recently, a problem with using vector models to produce maps of high feature density was the sizing and positioning of labels. For maps containing only moderate feature density a vector-based approach is fast and efficient, but as feature density increases so does the likelihood of overlapping labels. A raster approach, although inherently more complex, was better suited to the intricate construction needed to fit a large number of labels into a small area. Recently ESRI (Environmental Systems Research Institute) has introduced a software package called ‘Maplex’ to make label placement easier on vector data structures. The package is able to reposition or re-size labels automatically, quickly and easily, preventing any conflict.

The choice of which data model to use in a GIS often does not take into account the advantages or disadvantages offered by either structure. Often both are suitable, but in many fields such as surveying, geology and hydrology, general agreements have developed about which data model to use. This is often due to the way their data has historically been represented (Burrough & Frank 1996). For example, digital soil maps are predominantly vector based because this kind of presentation is traditional. This may not necessarily be the best method to use. In vector models, sharply defined polygons represent different soil types, which does not reflect reality: they impose a distinct boundary and omit any variation in soil type along it. Representing the fuzziness of soil boundaries in a raster format is possible, as complex formulas now exist that can take variance into account, but it uses a great amount of computer memory.

The need to choose between vector and raster data structures for a GIS is diminishing. Traditionally users faced a choice between raster GIS, which allowed easy analysis but produced unattractive maps, and vector structures, which provided manageable databases with high quality graphics but in which spatial analysis was more difficult (Burrough 1991). This was a technological problem, as most early GIS developments used vector data because this was the most familiar form of representation, e.g. soil maps. As both models are valid, GIS have been developed that use both, and it is possible to convert data from one format to the other. It is even possible to program analysis routines that choose the most efficient structure for solving a given problem. Often spatial analysis is present "in both raster and vector form, particularly when lines of boundary data need to be represented by connecting networks or drawn in a particular style and the spaces between must be filled with a print raster of a given symbolism or colour" (Burrough 1991).
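Conversion from vector to raster can be sketched by testing the centre of each cell against the polygon. This is a simple illustrative approach using the standard ray-casting point-in-polygon test, not a description of how any particular GIS performs the conversion; the triangle, grid size and function names are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: count the ring edges crossed by a ray going right."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):  # the edge spans the ray's y value
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(ring, n_rows, n_cols, cell_size):
    """Mark a cell 1 if its centre lies inside the polygon, else 0.

    Row 0 is placed at y = 0 (the bottom) to keep the arithmetic simple.
    """
    return [
        [
            1 if point_in_polygon((c + 0.5) * cell_size,
                                  (r + 0.5) * cell_size, ring) else 0
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]


# Hypothetical right-angled triangle with a true area of 8 square units.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (0.0, 0.0)]
grid = rasterize(triangle, n_rows=4, n_cols=4, cell_size=1.0)
raster_area = sum(sum(row) for row in grid) * 1.0 ** 2
```

With cells of one unit the rasterized triangle covers six cells, against a true area of eight square units, which illustrates the loss of precision that conversion to a coarse grid can bring.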

A major consideration for many firms choosing a GIS that uses one model or the other is cost. Raster GIS technology has become the cheaper option and is developing rapidly, whilst vector technology is more sophisticated and thus more expensive. Many firms require only a simple mapping tool and so choose the cheaper raster systems.

As modern GIS can use both models and conversion methods exist, the question arises of whether we need to choose an approach to represent the data at all. The answer may appear to be no, but in reality the choice between vector and raster structures should still be considered because, as we have seen, certain data types are better represented, or certain analyses better accomplished, by one structure than the other.
Bibliography

Aronoff, S. (1991) Geographic Information Systems: A Management Perspective. WDL Publications
Burrough, P. A. (1991) Principles of Geographic Information Systems for Land Resource Assessment. Clarendon
Burrough, P. A. & Frank, A. U. (Eds.) (1996) Geographic Objects With Indeterminate Boundaries. Taylor & Francis
Mitchell, D. & Stuart, N. (1991) The Raster Vector Issue Revisited. Geographic Information 1991: The Yearbook of the Association for Geographic Information, Ed. J. Cadoux-Hudson & I. Heywood. Taylor & Francis
Grimshaw, D. J. (1994) Bringing Geographical Information Systems into Business. Longman Scientific & Technical
Maguire, D. J. & Dangermond, J. (1991) The Functionality of GIS. Geographic Information Systems: Principles & Applications Vol. 1 Principles, Ed. D. J. Maguire, M. F. Goodchild & D. W. Rhind. Longman Scientific & Technical
McDonnell, R. & Kemp, K. (1995) International GIS Dictionary. Geoinformation International
Burnett, G. & Ross, T. (1997) Navigating the Long and Winding Road. GIS Asian Pacific October/November 1997
Maplex (1997) ESRI. http://www.esri.com/
Appendix 1. A Comparison of Vector And Raster Models

These advantages and disadvantages of raster and vector data models are compiled from Geographic Information Systems: A Management Perspective (Aronoff, S. 1991), Principles of Geographic Information Systems for Land Resource Assessment (Burrough, P. 1991) & Bringing Geographical Information Systems into Business (Grimshaw, D. 1994).

Vector Methods

Advantages

Good representation of phenomenological data structure
Compact Data Structure
Less Storage Space Required
Topology can be completely described with network linkages
Accurate graphics, that closely approximate hand-drawn maps
Retrieval, updating & generalisation of graphics & attributes are possible
Efficient for topology
Map output is of good quality

Disadvantages

It is a more complex data structure than a simple raster
Combination of several vector polygon maps or polygon & raster maps through overlay creates difficulties
Simulation is difficult because each unit has a different topological form
Display & plotting can be expensive, particularly for high quality, colour and cross-hatching
The technology is expensive, particularly for the more sophisticated software and hardware
Spatial analysis and filtering within polygons are impossible
Manipulation and enhancement of digital images cannot be effectively done in the vector domain
The representation of high spatial variability is inefficient

Raster Methods

Advantages

Simple data structures
The overlay and combination of mapped data with remotely sensed data is easy
Various kinds of spatial analysis are easy, especially overlay
Simulation is easy because each spatial unit has the same size and shape
The technology is cheap and is being energetically developed
High spatial variability is effectively represented
The raster format is required for efficient manipulation and enhancement of digital images

Disadvantages

The raster data structure is less compact, although data compression techniques can often overcome this problem
The use of large cells to reduce data volumes means that phenomenologically recognisable structures can be lost and there can be a serious loss of information
Map output does not have a smooth appearance because boundaries tend to look blocky. This can be overcome by using a very large number of cells, but may result in unacceptably large files
Topology is difficult to represent, so network linkages are difficult to establish
Projection transformations are time consuming unless special algorithms or hardware are used
Data redundancy
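
The data compression mentioned in the first disadvantage can be sketched with run-length encoding, a simple technique chosen here for illustration only; the source texts do not say which compression methods they have in mind. The sample row of cell values is hypothetical.

```python
from itertools import groupby


def rle_encode(row):
    """Replace each run of identical cell values with (value, run length)."""
    return [(value, len(list(run))) for value, run in groupby(row)]


def rle_decode(pairs):
    """Expand (value, run length) pairs back into the original row."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical raster row: background (0) with a lake (3) in the middle.
row = [0, 0, 0, 0, 3, 3, 3, 0, 0, 0]
encoded = rle_encode(row)
restored = rle_decode(encoded)
```

The ten cells are stored as three pairs. The saving depends on how uniform the layer is, so a layer with high spatial variability would compress far less well.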


Last updated 05/03/99