The Relative Merits of the Raster and Vector Data Models -
in Relation to Environmental and Socio-economic Applications of G.I.S.
John J Bark
Geographic Information Systems (GIS) use two main types of geographical model to represent data, and a long-running debate exists over whether the raster or the vector model is the better approach. It is not only a question of each model's ability to represent different geographical features but also of how suitable each model is for performing the different spatial operations GIS users wish to carry out. Both data structures can handle the three basic geographic features used in GIS (points, lines and polygons), and permit analysis on them with varying ease and success. In this essay I will briefly explain each structure, discuss some of the advantages of using them and investigate their use in a number of real-life environmental and socio-economic GIS applications. Appendix 1 contains a comparison of the advantages and disadvantages of each data model, compiled from several respected GIS texts. There is a third model, recently applied to GIS, that uses an object-oriented approach. This model places real-world entities with a specific theme in hierarchical order, with each level inheriting attribute values from the levels above. It is at a relatively early stage of development, so I will not consider it in this essay.
Considered the simpler model, the raster approach comprises a grid cell matrix in which each cell is allocated a value indicating which entity it belongs to (Cadoux-Hudson & Heywood 1991). A single cell represents a point, whilst lines and polygons are represented by collections of cells.
The vector model attempts to reproduce objects as faithfully to reality as possible (Burrough 1991). Feature information about points, lines and polygons is stored as a series of x, y co-ordinates. The location of a point is described by a single co-ordinate pair, a line by a string of co-ordinate points and a polygon by a closed loop of co-ordinates.
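The two structures can be illustrated with a minimal sketch. The feature codes, grid size and co-ordinates below are invented for illustration, and the grid is assumed to have cells of size 1 with its origin at the lower-left corner (row 0 is the top row).

```python
# Raster model: a grid of cells, each holding the code of the entity it belongs to.
# Hypothetical codes: 0 = background, 1 = point, 2 = line, 3 = polygon.
raster = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 3, 3, 0],
    [0, 0, 0, 3, 3, 0],
    [2, 2, 2, 2, 0, 0],
]

# Vector model: roughly the same three features stored as x, y co-ordinates.
point = (1.5, 2.5)                                  # a single co-ordinate pair
line = [(0.0, 0.5), (2.0, 0.5), (4.0, 0.5)]         # a string of co-ordinates
polygon = [(3.0, 1.0), (5.0, 1.0), (5.0, 3.0),
           (3.0, 3.0), (3.0, 1.0)]                  # a closed loop of co-ordinates


def cells_of(grid, code):
    """Return the (row, column) positions of all cells carrying a given code."""
    return [(r, c)
            for r, row in enumerate(grid)
            for c, value in enumerate(row)
            if value == code]
```

In the raster version the point occupies one cell and the line and polygon each occupy a collection of cells, whereas the vector version records only the co-ordinates that define each feature.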
Computers find raster structures easier to handle because of their row and column layout, which makes them simpler to store, analyse and display, but the creation of many layers requires large volumes of storage space. One of the easiest analytical functions to perform with raster data is overlay analysis. Because each cell in a two-dimensional grid can represent only one geographical attribute, overlay is reduced to a direct comparison of the same grid squares in different layers. Overlay analysis is possible with vector data but is more complex owing to its structure. A possible problem with the raster model is that features are represented on layers that are "not continuous, but quantized" (Burrough 1991). The larger the grid squares used, the more difficult it becomes to measure the length and area of features accurately, owing to an increasing loss of precision. If, however, the surface is treated as continuous, many mathematical formulas and simulation and modelling techniques can be applied easily. This presents a major advantage over vector methods.
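Raster overlay can be sketched in a few lines. The two layers and their meanings below are hypothetical; the only assumption that matters is that both layers cover the same area with the same cell size, so that matching positions refer to the same piece of ground.

```python
# Two hypothetical raster layers with identical dimensions.
# soil_ok:  1 where the soil is suitable, 0 elsewhere.
# slope_ok: 1 where the slope is acceptable, 0 elsewhere.
soil_ok = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
]
slope_ok = [
    [0, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
]


def overlay(layer_a, layer_b, combine):
    """Combine two equally sized layers by comparing the same cell in each."""
    same_shape = len(layer_a) == len(layer_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(layer_a, layer_b)
    )
    if not same_shape:
        raise ValueError("layers must have the same dimensions")
    return [
        [combine(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]


# A cell is suitable only where both conditions hold (logical AND).
suitable = overlay(soil_ok, slope_ok, lambda a, b: a & b)
```

Here `suitable` is `[[0, 1, 0], [1, 0, 0], [1, 0, 1]]`. No geometry has to be intersected, which is why the operation is so much simpler than its vector equivalent.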
Within the vector model, space is considered to be "continuous, not quantized" (Burrough 1991), allowing the calculation of precise lengths and dimensions and reducing the volume of data requiring computer storage. The output produced by this method is considered more elegant than what many consider to be crude raster maps.
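The difference in measurement precision can be shown with a small sketch. The triangle, the cell size and the cell-centre rule used to decide whether a cell belongs to the shape are all assumptions made for illustration; real systems may assign cells differently.

```python
import math

# Hypothetical triangular parcel stored as a closed loop of co-ordinates.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (0.0, 0.0)]


def vector_area(ring):
    """Exact polygon area from its co-ordinates (shoelace formula)."""
    total = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(total) / 2


def vector_length(ring):
    """Exact boundary length: the sum of the straight segment lengths."""
    return sum(math.dist(p, q) for p, q in zip(ring, ring[1:]))


def point_in_polygon(x, y, ring):
    """Ray-casting test: count boundary crossings to the right of the point."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def raster_area(ring, cell):
    """Approximate area: count cells whose centre lies inside the polygon."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    cols = math.ceil((max(xs) - min(xs)) / cell)
    rows = math.ceil((max(ys) - min(ys)) / cell)
    count = 0
    for r in range(rows):
        for c in range(cols):
            centre_x = min(xs) + (c + 0.5) * cell
            centre_y = min(ys) + (r + 0.5) * cell
            if point_in_polygon(centre_x, centre_y, ring):
                count += 1
    return count * cell * cell
```

For this triangle `vector_area(triangle)` gives 6.0 and `vector_length(triangle)` gives 12.0, both exact. With a cell size of 2, `raster_area(triangle, 2.0)` gives 4.0, because only one cell centre falls inside the shape. The raster estimate depends on both the cell size and how the grid happens to align with the feature.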
Most early GIS developments were done with vector structures "simply because vector structures were the most familiar form of geographical expression until alternatives developed" (Burrough 1991). Raster alternatives then developed and, owing to their row and column structure, are more efficient at some functions such as overlay analysis and the retrieval of items on the basis of their location. Other functions are only possible with the vector model, owing to its topological construction. The data are structured to include relationships such as adjacency and connectivity between nodes, lines and polygons. These linkages allow networks to be created and analysed.
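A minimal sketch of such a topological structure follows. It assumes a simple arc-node layout in which each arc records its start node, end node and the polygons on its left and right; the arc, node and polygon names are hypothetical.

```python
from collections import defaultdict

# Each arc: (arc id, from node, to node, polygon on the left, polygon on the right).
# Two hypothetical polygons, A and B, share arc a2; "outside" is the area
# beyond the map.
arcs = [
    ("a1", "n1", "n2", "outside", "A"),
    ("a2", "n1", "n2", "A", "B"),
    ("a3", "n1", "n2", "B", "outside"),
]


def polygon_adjacency(arc_table):
    """Adjacency: two polygons are neighbours if they lie either side of an arc."""
    neighbours = defaultdict(set)
    for _, _, _, left, right in arc_table:
        neighbours[left].add(right)
        neighbours[right].add(left)
    return neighbours


def arcs_at_node(arc_table):
    """Connectivity: the arcs that meet at each node."""
    meeting = defaultdict(set)
    for arc_id, start, end, _, _ in arc_table:
        meeting[start].add(arc_id)
        meeting[end].add(arc_id)
    return meeting
```

Because the relationships are stored explicitly, a question such as "which polygons border A?" is answered by a table look-up (`polygon_adjacency(arcs)["A"]`) rather than by comparing geometry.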
GIS are often used in the management of natural resources such as forests, which involves the integration of a number of different data types. For example, creating a GIS to investigate suitable sites for planting new trees involves integrating information about the existing vegetation cover, the climate, soil types, land usage, possible water sources, etc. The best source for much of this data is satellite imagery, owing to its consistency and resolution, high positional accuracy and the low level of human interpretation needed (Maguire & Dangermond 1991). To reduce resolution loss, most natural resource management GIS applications store satellite images in a raster structure. Unfortunately this requires a large volume of computer storage space, and if smaller storage were the only advantage that mattered, many users with small storage capacity might change to a vector structure. Using raster data has other benefits, however. Much of the information on satellite images is of a continuous nature, such as vegetation cover or soil types, and, as mentioned above, if the raster structure is assumed to be continuous, analysis benefits present themselves. Much of the data analysis in a natural resource GIS is done using overlay analysis, for example overlaying remotely sensed data layers to identify the optimum site for planting the trees. Another possible use of the overlay process is to monitor continuous surface processes and detect changes. As technology advances, the detection and modelling of intermittent high-magnitude events such as fires and floods will become quicker (Maguire, Goodchild & Rhind 1991), allowing better emergency resource management.
The first GIS, the Canada Geographic Information System, was designed to manage land inventory data, and recording land ownership remains a common GIS use. A modern system requires an "accurate description of the boundary of a large number of land parcels, together with fast access to a relatively small quantity of attribute data about the owner" (Maguire & Dangermond 1991). It must also be able to deal with the large number of ownership transactions. Holding the data in a raster structure can present problems, depending upon the size of the grid squares used. If the grid squares are too large, the smaller land parcels are lost because a single square can no longer represent each individual parcel, and accurate land size searches of the database become impossible. Decreasing the grid size to allow the representation of small land parcels increases the volume of data, necessitating larger and more expensive storage. Storing data in a vector structure allows every individual parcel to be recorded regardless of size, and also allows quick and easy updating of the attribute information.
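The storage cost of a finer grid is easy to work out. The sketch below uses a hypothetical 10 km by 10 km study area; the figures are illustrative and not taken from any real land register.

```python
import math


def cell_count(width_m, height_m, cell_m):
    """Number of cells needed to cover a rectangular area at a given cell size."""
    return math.ceil(width_m / cell_m) * math.ceil(height_m / cell_m)


# Hypothetical study area of 10 km by 10 km.
for cell_m in (100, 50, 10):
    cells = cell_count(10_000, 10_000, cell_m)
    print(f"{cell_m:>3} m cells: {cells:,} cells per layer")
```

Halving the cell size from 100 m to 50 m quadruples the count from 10,000 to 40,000 cells per layer, and 10 m cells need 1,000,000. A parcel of 20 m by 30 m (600 square metres) cannot be distinguished at all within a 100 m cell of 10,000 square metres, so the finer and more expensive grid is unavoidable if such parcels must be represented.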
Another good example of the use of the vector data model within GIS is road route finding. The GIS contains a geographical description of the road network as line features, allowing the calculation and analysis of various routes. Network analysis allows the time taken using different roads to be calculated, and routes to be devised that cover the shortest distance, take the quickest time or avoid certain features. Many of these systems are PC-based and include attribute information about services available along the route, whereas commercial applications can include impedance times such as the time taken to unload at each delivery point. A recent development is the in-vehicle system that uses the latest positioning and telecommunication technology to offer the same route-finding service but also considers impedances that develop en route, such as traffic congestion. Details are transmitted to the system in your vehicle, which can then alter your planned route to avoid any future trouble spots. These systems use vector models, which allow quick and easy updating, often within 20 minutes of the delay developing, and permit the network analysis needed to devise the possible routes.
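Route finding of this kind can be sketched with Dijkstra's shortest-path algorithm, a standard technique for network analysis; the essay's sources do not say which algorithm any particular product uses. The road network, travel times and the `delays` parameter below are hypothetical.

```python
import heapq

# Hypothetical road network: junction -> list of (neighbour, minutes of travel).
roads = {
    "A": [("B", 10), ("C", 4)],
    "B": [("A", 10), ("C", 3), ("D", 5)],
    "C": [("A", 4), ("B", 3), ("D", 12)],
    "D": [("B", 5), ("C", 12)],
}


def quickest_route(network, start, goal, delays=None):
    """Dijkstra's algorithm: return (total minutes, list of junctions) or None.

    delays maps a directed link (from, to) to extra minutes of impedance,
    for example congestion reported while the vehicle is en route.
    """
    delays = delays or {}
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, minutes in network.get(node, []):
            if neighbour not in visited:
                extra = delays.get((node, neighbour), 0)
                heapq.heappush(
                    queue, (cost + minutes + extra, neighbour, path + [neighbour])
                )
    return None


normal = quickest_route(roads, "A", "D")
congested = quickest_route(roads, "A", "D", delays={("C", "B"): 10})
```

With no delays the quickest route from A to D is A, C, B, D at 12 minutes. Adding a 10 minute delay on the link from C to B changes the answer to A, B, D at 15 minutes, which is the kind of re-routing an in-vehicle system would perform when a new impedance is reported.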
Until recently, a problem with using vector models in the production of high-density feature maps was the sizing and positioning of labels. For maps containing only moderate feature density a vector-based approach is fast and efficient, but with increasing feature density there is a greater likelihood of overlapping labels. A raster approach, although inherently more complex, was better suited to solving the complex construction needed to fit a large number of labels into a small area. Recently ESRI (Environmental Systems Research Institute) has introduced a software package called ‘Maplex’ to make label placement easier on vector data structures. The package is able to reposition or re-size labels automatically, quickly and easily, preventing any conflict.
The choice of which data model to use in a GIS often does not consider any advantage or disadvantage offered by either structure. Often both models are suitable, but in many fields such as surveying, geology and hydrology, general agreements have developed about which data model to use. This is often due to the way their data has historically been represented (Burrough & Frank 1996). For example, digital soil maps are predominantly vector based because this kind of presentation is traditional. This may not necessarily be the best method to use. In vector models, sharply defined polygons represent different soil types, which does not reflect reality: they impose a distinct boundary and omit any variation in soil type along it. Representing the fuzziness of soil boundaries in a raster format uses a great amount of computer memory, and complex formulas now exist which can take variance into account.
The reasons for choosing either vector or raster data structures for a GIS are decreasing. Traditionally users faced a choice between raster GIS, allowing easy analysis but producing unattractive maps, and vector structures, providing manageable databases with high-quality graphics but in which spatial analysis was more difficult (Burrough 1991). This was a technological problem, as most early GIS developments used vector data because this was the most familiar form of representation, e.g. soil maps. As both models are valid, GIS have developed that use both, and it is possible to convert data from one format to the other. It is even possible to program analysis routines that choose the most efficient structure for problem solving. Often spatial analysis is present "in both raster and vector form, particularly when lines of boundary data need to be represented by connecting networks or drawn in a particular style and the spaces between must be filled with a print raster of a given symbolism or colour" (Burrough 1991).
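Conversion between the two formats can be illustrated with a deliberately simple sketch. It rasterises a line by sampling points along it and converts cells back to vector points at their centres; this is an assumed approach chosen for brevity, not the method of any particular GIS package. Row 0 is taken to be the row at y = 0, and all co-ordinates are assumed to be non-negative and within the grid extent.

```python
import math


def rasterise_line(points, cell, rows, cols):
    """Vector to raster: mark every cell that a sampled point on the line falls in."""
    grid = [[0] * cols for _ in range(rows)]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Sample at roughly half-cell spacing so that no cell is skipped
        # on horizontal, vertical or diagonal segments.
        steps = max(1, math.ceil(2 * max(abs(x1 - x0), abs(y1 - y0)) / cell))
        for i in range(steps + 1):
            t = i / steps
            x = x0 + t * (x1 - x0)
            y = y0 + t * (y1 - y0)
            col = min(int(x // cell), cols - 1)
            row = min(int(y // cell), rows - 1)
            grid[row][col] = 1
    return grid


def cells_to_points(grid, cell):
    """Raster to vector: return the centre co-ordinate of every marked cell."""
    return [
        ((c + 0.5) * cell, (r + 0.5) * cell)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value
    ]


# Hypothetical diagonal line on a 3 by 3 grid with cells of size 1.
grid = rasterise_line([(0.5, 0.5), (2.5, 2.5)], cell=1.0, rows=3, cols=3)
recovered = cells_to_points(grid, cell=1.0)
```

The straight diagonal becomes a staircase of three cells, and converting back yields only the three cell centres, not the original end points. Conversion is possible in both directions, but each pass through the raster form loses the positional precision of the vector original.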
A major consideration for many firms choosing a GIS that uses one or the other model is cost. The technology for raster GIS has become the cheaper of the two and is developing rapidly, whilst vector technology is more sophisticated and thus more expensive. Many firms require only a simple mapping tool and so choose the cheaper raster systems.
As modern GIS can use both models and conversion methods exist, the question arises of whether we need to choose an approach to represent the data at all. The answer may appear to be no, but in reality vector and raster structures should both be considered because, as we have seen, certain data types are better represented, or analysis is better accomplished, by one structure than by the other.
Bibliography
Appendix 1. A Comparison of Vector And Raster Models
These advantages and disadvantages of raster and vector data models are compiled from Geographic Information Systems: A Management Perspective (Aronoff, S. 1991), Principles of Geographic Information Systems for Land Resource Assessment (Burrough, P. 1991) and Bringing Geographical Information Systems into Business (Grimshaw, D. 1994).
Vector Methods
Advantages
Disadvantages
Raster Methods
Advantages
Disadvantages
Last updated 05/03/99