GIS Data Input and Editing Methods
Syllabus:
Methods of data input into GIS, Data editing, spatial data models and structures, Attribute data
management, integrating data (map overlay) in GIS, Application of remote sensing and GIS for
the management of land and water resources.
Data input is the process of encoding data in a computer-readable form and capturing it in a
GIS database. Data entry is usually the main hurdle in applying GIS, and the initial cost of
creating a database is usually high. Generally, two types of data are entered into GIS: spatial
data and non-spatial (attribute) data.
Four types of data entry methods are commonly used in GIS. These are
1. Keyboard entry.
2. Manual digitizing.
3. Automatic digitizing.
4. Conversion of digital data files.
1. KEYBOARD ENTRY: This method is also known as the keyboard encoding method. In most
cases attribute data is input by keyboard, while spatial data is rarely entered this way. Both
spatial and attribute data can be inserted into a GIS using the keyboard terminal of the
computer.
Advantages: i) Very easy to use.
ii) Both spatial and attribute data can be input.
iii) Precise and accurate compared to other techniques.
Disadvantages: i) This method becomes difficult to perform when the number of entries is huge.
2. MANUAL DIGITIZING: Because spatial data is traced from a hard copy map, this method is
also known as hard copy digitizing. The purpose of digitization is to capture good quality
images or information on maps in digital form.
During digitizing or scanning it is easy to encode the wrong line, and folds and stains can
easily be scanned and mistaken for real geographical features. During data transfer,
conversion of data between the different formats required by different packages may lead to a
loss of data. Errors in attribute data are relatively
easy to spot and may be identified by manual comparison with the original data. For example,
if a forest area has been wrongly labelled as agricultural land, or a railway line has been
erroneously digitized as a road, the attribute database may be corrected accordingly.
Various methods, in addition to manual comparison, exist for the correction of attribute errors.
Errors in spatial data are often more difficult to identify and correct than errors in attribute data.
These errors take many forms, depending on the data model being used (vector or raster) and
the method of data capture. There is a possibility that certain types of error can help to identify
other problems with encoded data. For example, in a vector data set, dangling nodes may
indicate missing lines, overshoots or undershoots.
The user can look for these features to direct editing rather than having to examine the whole
map. Most GIS packages will provide a suite of editing tools for the identification and removal
of errors in vector data.
Table 5.1 Common spatial errors
Automatic corrections can save many hours of work but need to be used with care, as incorrectly
specified tolerances may miss some errors.
Errors will also be present in raster data. In common with vector data, missing entities and noise
are particular problems. Data for some areas may be difficult to collect, owing to environmental
or cultural obstacles. Similarly, it may be difficult to get clear images of vegetation cover in an
area during a rainy season using certain sensors. Noise may be inadvertently added to the data,
either when they were first collected or during processing. This noise often shows up as
scattered pixels whose attributes do not conform to those of neighboring pixels. For example,
an individual pixel representing water may be seen in a large area of forest. While this may be
correct, it could also be the result of noise and needs to be checked. This form of error may be
removed by filtering. Filtering involves passing a filter (a small grid of pixels specified by the
user; often a 3 × 3 pixel square is used) over the noisy data set and recalculating the value of the
central (target) pixel as a function of all the pixel values within the filter. This technique needs
to be used with care as genuine features in the data can be lost if too large a filter is used.
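The filtering step described above can be sketched as a simple majority (mode) filter; this is an illustrative stand-alone example, not the implementation of any particular GIS package:

```python
def mode_filter(grid, size=3):
    """Recalculate each cell as the most common value in a size x size window."""
    rows, cols = len(grid), len(grid[0])
    half = size // 2
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            window = []
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        window.append(grid[rr][cc])
            # the most frequent value in the window becomes the new cell value
            out[r][c] = max(set(window), key=window.count)
    return out

# A 5 x 5 "forest" raster (class 1) with a single stray "water" pixel (class 2).
noisy = [[1] * 5 for _ in range(5)]
noisy[2][2] = 2
cleaned = mode_filter(noisy)
```

Applied to the raster above, the filter replaces the stray value with the surrounding class; as noted in the text, too large a window would also smooth away genuine features.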
5.2.2 Re-projection, transformation and generalization: Once spatial and attribute data have
been encoded and edited, it may be necessary to process the data geometrically in order to
provide a common framework of reference. The scale and resolution of the source data are also
important and need to be taken into account when combining data from a range of sources into
a final integrated database.
Data derived from maps drawn on different projections will need to be converted to a common
projection system before they can be combined or analyzed. If not re-projected, data derived
from a source map drawn using one projection will not plot in the same location as data derived
from another source map using a different projection system. For example, if a coastline is
digitized from a navigation chart drawn in the Mercator projection (cylindrical) and the internal
state boundaries of the country are digitized from a map drawn using a conic
projection, then the state boundaries along the coast will not plot directly on top of the
coastline. In this case they will be offset and will need to be re-projected into a common
projection system before being combined.
Data derived from different sources may also be referenced using different co-ordinate systems.
The grid systems used may have different origins, different units of measurement or different
orientation. If so, it will be necessary to transform the co-ordinates of each of the input data sets
onto a common grid system. This is quite easily done and involves linear mathematical
transformations.
Some of the other methods commonly used are:
Translation and scaling: One data set may be referenced in 1-metre co-ordinates while another
is referenced in 10-metre co-ordinates. If a common grid system of 1-metre co-ordinates is
required, then this is simply a case of multiplying the co-ordinates in the 10-metre data set by a
factor of 10.
Creating a common origin: If two data sets use the same co-ordinate resolution but do not
share the same origin, then the origin of one of the data sets may be shifted in line with the other
simply by adding the difference between the two origins (dx, dy) to its co-ordinates.
Rotation: Map co-ordinates may be rotated using simple trigonometry to fit one or more data
sets onto a grid of common orientation.
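The three transformations above amount to simple arithmetic on coordinate pairs; a minimal sketch (the function names are our own):

```python
import math

def scale(points, factor):
    """Unit conversion, e.g. 10-metre grid units to 1-metre units (factor 10)."""
    return [(x * factor, y * factor) for x, y in points]

def shift_origin(points, dx, dy):
    """Move one data set onto another's origin by adding the origin difference."""
    return [(x + dx, y + dy) for x, y in points]

def rotate(points, degrees):
    """Rotate coordinates about the origin using simple trigonometry."""
    a = math.radians(degrees)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [(x * cos_a - y * sin_a, x * sin_a + y * cos_a) for x, y in points]
```

For example, scaling (3, 4) by 10 gives (30, 40), and rotating it by 90 degrees gives approximately (-4, 3).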
Data may be derived from maps of different scales. The accuracy of the output from a GIS
analysis can only be as good as the worst input data. Thus, if source maps of widely differing
scales are to be used together, data derived from larger-scale mapping should be generalized to
be comparable with the data derived from smaller-scale maps. This will also save processing
time and disk space by avoiding the storage of unnecessary detail. Data derived from large-scale
sources can be generalized once they have been input to the GIS. Routines exist in most vector
GIS packages for weeding out unnecessary points from digitized lines such that the basic shape
of the line is preserved. The simplest techniques for generalization delete points along a line at
a fixed interval (for example, every third point).
These techniques have the disadvantage that the shape of features may not be preserved. Most
other methods are based on the Douglas-Peucker algorithm. This involves the following stages:
i. Joining the start and end nodes of a line with a straight line.
ii. Examining the perpendicular distance from this straight line to individual vertices along the
digitized line.
iii. Discarding points within a certain threshold distance of the straight line.
iv. Moving the straight line to join the start node with the point on the digitized line that was
the greatest distance away from the straight line.
v. Repeating the process until there are no points left which are closer than the threshold
distance.
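The stages above can be sketched as the standard recursive form of the Douglas-Peucker algorithm; a minimal illustration on plain coordinate tuples:

```python
def douglas_peucker(points, tolerance):
    """Simplify a digitized line, keeping only vertices farther than
    `tolerance` from the straight line joining the current start and end nodes."""
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]

    def dist(p):
        # perpendicular distance of a vertex from the start-end chord
        px, py = p
        dx, dy = x2 - x1, y2 - y1
        if dx == dy == 0:
            return ((px - x1) ** 2 + (py - y1) ** 2) ** 0.5
        return abs(dy * px - dx * py + x2 * y1 - y2 * x1) / (dx * dx + dy * dy) ** 0.5

    dists = [dist(p) for p in points[1:-1]]
    index, dmax = max(enumerate(dists, start=1), key=lambda t: t[1])
    if dmax <= tolerance:
        # every intermediate vertex is within the threshold: keep only the nodes
        return [points[0], points[-1]]
    # keep the farthest vertex and repeat the process on both halves
    left = douglas_peucker(points[:index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right
```

A nearly straight digitized line collapses to its two end nodes, while a vertex lying farther than the threshold from the chord is retained.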
Fig. 5.2 Different forms of generalization
When it is necessary to generalize raster data the most common method employed is to
aggregate or amalgamate cells with the same attribute values. This approach results in a loss of
detail which is often very severe. A more sympathetic approach is to use a filtering algorithm.
If the main motivation for generalization is to save storage space, then, rather than resorting to
one of the two techniques outlined above, it may be better to use an appropriate data compaction
technique as this will result in a volume reduction without any loss in detail.
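Cell aggregation, the first raster generalization technique mentioned above, can be sketched as follows. Blocks of cells are merged into one coarser cell; the rule used here (the block's most common value) is one common choice among several:

```python
def aggregate(grid, factor):
    """Generalize a raster by merging factor x factor blocks of cells into a
    single cell holding the block's most common value (detail is lost)."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows, factor):
        new_row = []
        for c in range(0, cols, factor):
            block = [grid[rr][cc]
                     for rr in range(r, min(r + factor, rows))
                     for cc in range(c, min(c + factor, cols))]
            new_row.append(max(set(block), key=block.count))
        out.append(new_row)
    return out
```

Aggregating a 4 × 4 grid with factor 2 yields a 2 × 2 grid, illustrating the severe loss of detail the text warns about.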
5.2.3 Edge Matching and Rubber Sheeting: When a study area extends across two or more
map sheets small differences or mismatches between adjacent map sheets may need to be
resolved.
Normally, each map sheet would be digitized separately and then the adjacent sheets joined
after editing, re-projection, transformation and generalization. The joining process is known as
edge matching and involves three basic steps.
First, mismatches at sheet boundaries must be resolved. Commonly, lines and polygon
boundaries that straddle the edges of adjacent map sheets do not meet up when the maps are
joined together. These must be joined together to complete features and ensure topologically
correct data. More serious problems can occur when classification methods vary between map
sheets. For example, different soil scientists may interpret the pattern and type of soils
differently, leading to serious differences on adjacent map sheets. This may require quite radical
reclassification and reinterpretation to attempt a smooth join between sheets. This problem may
also be seen in maps derived from multiple satellite images. If the satellite images were taken
at different times of the day and under different weather and seasonal conditions then the
classification of the composite image may produce artificial differences where images meet.
These can be seen as clear straight lines at the sheet edges.
5.2.4 Geocoding address data: Geocoding is the process of converting an address into a point
location. Since addresses are an important component of many spatial data sets, geocoding
techniques have wide applicability during the encoding and preparation of data for analysis.
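One common geocoding technique, address-range interpolation, places a house number proportionally along a street segment. The segment record and house numbering below are hypothetical, for illustration only:

```python
def geocode(house_number, street_segment):
    """Toy address-range geocoding: interpolate a house number's position
    along a street segment from the segment's address range."""
    lo, hi = street_segment["from_number"], street_segment["to_number"]
    (x1, y1), (x2, y2) = street_segment["start"], street_segment["end"]
    t = (house_number - lo) / (hi - lo)   # fractional position along segment
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))

# Hypothetical street segment: numbers 100-200 run along a 100 m stretch.
segment = {"from_number": 100, "to_number": 200,
           "start": (0.0, 0.0), "end": (100.0, 0.0)}
```

House number 150 would be placed halfway along the segment; real geocoders add parity (odd/even side) handling and fall-back matching, omitted here.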
In other words, a data model represents a set of rules or guidelines used to convert real
world features into digitally and logically represented spatial objects. In GIS, data models
comprise the rules essential to define what is included in an operational GIS and its supporting
system. The data model is the core of any GIS: it gives a set of constructs for describing and
representing selected aspects of the real world in a computer.
You have already read that in GIS data models, all real world features are represented as points,
lines or arcs, and polygons. Data modellers often use multiple models during the representation
of the real world in a GIS environment (Fig. 5.5). First is reality, which consists of real world
phenomena such as natural and man-made features. The other three stages are the conceptual,
logical and physical models. The conceptual model is the process of developing a graphical
representation from the real world. It determines which aspects of the real world to include in
or exclude from the model, and the level of detail at which to model each aspect. It is
human-oriented and partially structured. The logical model is the representation of reality in
the form of diagrams and lists; it has an implementation-oriented approach. The physical model
presents the actual implementation in a GIS environment and comprises tables which are stored
as databases; it has a specific implementation approach.
Geospatial data is a numerical representation that describes real world features in GIS. A
geospatial database is dynamic rather than static and supports a range of functions such as
organizing, storing, processing, analyzing and visualizing spatial data. Geospatial data depicts
the real world in two basic models, the object-based model and the field-based model, as
shown in Fig. 5.6.
Fig. 5.6 Illustration representing an outline model
Object-Based Model: An object is a spatial feature with characteristics such as a spatial
boundary, application relevance and a feature description (attributes). Spatial objects represent
discrete features with well-defined or identifiable boundaries, for example buildings, parks,
forest lands, geomorphological boundaries, soil types, etc. In this model, data can be obtained
by field surveying methods (chain-tape, theodolite and total station surveying, GPS/DGPS
survey) or laboratory methods (aerial photo interpretation, remote sensing image analysis and
onscreen digitization). Depending on the nature of the spatial objects we may represent them as
graphical elements of points, lines and polygons.
Field-Based Model: Spatial phenomena are real world features that vary continuously over
space with no specific boundary. Data for spatial phenomena may be organized as fields which
are obtained by direct or indirect sources. Source of direct data is from aerial photos, remote
sensing imagery, scanning of hard copy maps, and field investigations made at selected sample
locations. We can obtain or generate the data by using mathematical functions such as
interpolation, sampling or reclassification from selected sample locations. This approach comes
under indirect data source. For example, Digital Elevation Model (DEM) can be generated from
topographic data such as spot heights and contours that are usually obtained by indirect
measurements.
Note: The Digital Elevation Model (DEM) consists of an array of uniformly spaced elevation data. A DEM is
point based but it can be easily converted to raster data by placing each elevation point at the center of a cell.
A spatial database may be organized using either the object-based or the field-based model. In
object-based databases the spatial units are discrete objects, which can be obtained from field-
based data by means of object recognition and mathematical interpolation. When an object-based
database is structured in the form of points, lines and polygons, the representation is generally
called the vector data model. When a database is structured on the field-based model in the form
of a grid of square or rectangular cells, the representation is generally called the raster data
model. A geospatial database possesses two distinct components: locations and attributes.
Geographical features in the real world are very difficult to capture completely and may require
a large database. GIS can organize reality through the
data models. Each model tends to fit certain types of data and applications better than others.
All spatial data models fall into two basic categories: raster and vector.
Let us now discuss in brief about these two types of models.
5.3.1.1 Raster Data Models:
The raster data model is composed of a regular grid of cells in specific sequence and each cell
within a grid holds data. The conventional sequence is row by row which may start from the
top left corner. In this model the basic building block is the cell. Geographic features are
represented by cell coordinates: every location corresponds to a cell. Each cell contains a
single value and is independently addressed with the value of an attribute. One set of cells and
their associated values is a layer, and cells are arranged in layers.
composed of many layers covering the same geographical areas e.g., water, paddy, forest,
cashew (Fig. 5.7). Points, lines and polygons representation in grid format is presented in Fig.
5.8. The raster model, which is most often used to represent continuously varying phenomena
such as elevation or climate, is also used to store pictures, satellite images and aerial
(plane-based) images. A raster image comprises a collection of grid cells, rather like a scanned
map or photo.
5.3.1.2 Vector Data Models:
Vector data model comprises discrete features. Features can be discrete locations or events
(points), lines, or areas (polygons). This model uses the geometric objects of point, line and
polygon (Fig. 5.9). In vector model, the point is the fundamental object. Point represents
anything that can be described as a discrete x, y location (e.g., hospital, temple, well, etc.). Line
or polyline (a sequence of connected line segments) is created by joining a sequence of points.
End points are called nodes and the intermediate points are termed vertices. If we know the
coordinates of the nodes and vertices of a line or polyline, we can compute its length. Lines
are used to represent features that are linear in nature, e.g., streams, railways, roads. A
polygon is defined in this model by a closed set of lines or polylines.
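The length computation mentioned above, summing the lengths of the segments between successive nodes and vertices, can be sketched as:

```python
def polyline_length(points):
    """Length of a line or polyline: the sum of its straight segment lengths,
    computed from the node and vertex coordinates."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        total += ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
    return total
```

For instance, the polyline through (0, 0), (3, 4) and (3, 10) has length 5 + 6 = 11 map units.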
Fig. 5.7 Illustration of raster data; (a) raster grid matrix with their cell location and
coordinates, and (b) raster grid and its attribute table
Fig. 5.8 Representation of raster gird format; (a) point (cell), line (sequence of cells), and
polygon (zone of cells) features and (b) no data cells (black in color)
Areas are often referred to as polygons. A polygon can be represented by a sequence of nodes
where the last node is equal to the first node. Polygons or areas identified as closed set of lines
are used to define features such as rock type, land use, administration boundaries, etc.
Fig. 5.9 Vector model represents point, line and polygon features
Points, lines and polygons are features which can be designated as a feature class in a geospatial
database. Each feature class pertains to a particular theme such as habitation, transportation,
forest, etc. Feature classes can be structured as layers or themes in the database (Fig. 5.10).
Feature class may be linked to an attribute table. Every individual geographic feature
corresponds to a record (row) in the attribute table (Fig. 5.10).
5.3.1.3 Spaghetti Data Model: In the spaghetti data model, lines in the database overlap but
do not intersect, just like spaghetti on a plate. Polygon features are defined by lines which
have no concept of start, end or intersection nodes. The polygons are simply hatched or colored
manually to represent something. There is no data attached to the features and, therefore, no
data analysis is possible in the spaghetti model (Fig. 5.11).
Fig. 5.11 Vector spaghetti data model; (a) Spaghetti data, (b) cleaned spaghetti data
and (c) polygons in spaghetti data
5.3.1.4 Advantages and Disadvantages of Raster and Vector Data Models:
The raster and vector models represent two divergent views of the real world, in data
representation as well as in processing and analysis. Either model can be used to solve
different geospatial problems, but the data requirements of the intended GIS application should
be determined first. In order to understand the relationships between data representation and
analysis in GIS, it is necessary to know the relative advantages and disadvantages of the
raster and vector models. Both models have unique advantages and disadvantages. It is
generally agreed that the raster model is best suited for integrated GIS analysis in various
resource applications. Nowadays most GIS packages are able to handle both models.
Advantages of Raster Data:
i) The data structure is simple: a regular grid of cells with attribute values.
ii) Collecting data (for example from scanned maps or remote sensing imagery) and converting it
into a collection is easy.
iii) Overlay and combination of layers is straightforward due to the
inherent nature of raster images.
Disadvantages of Raster Data:
i) Data volumes are large, and storage requirements grow rapidly with finer cell size
(resolution).
ii) Topological relationships are difficult to represent; hence, the
network analysis is difficult to establish.
iii) Output can look blocky because of the cell shape.
Advantages of Vector Data:
i) Compact data structure with good representation of discrete features.
ii) Topology can be described explicitly, so network and adjacency analysis are possible.
iii) Coordinate transformations such as linear transformation, similarity transformation
and affine transformation could be done easily.
Disadvantages of Vector Data:
i) The data structure is complex, and overlay operations offer limited functionality for
large data sets, e.g., a large number of features.
ii) Coloring, shading and also displaying may be time consuming and
unbearable for complex maps.
iii) Software and hardware for vector processing tend to be expensive.
Fig. 5.12 Cell-by-cell encoding data structure
When raster data contains many missing cells, the cell-by-cell encoding method is not
recommended. In the run-length encoding (RLE) method, adjacent cells along a row with the same
value are treated as a group called a run. If a whole row contains only one class, it is stored
as a single run. Instead of repeatedly storing the same value for each cell, the value is
stored once together with the number of cells that make up the run. Fig. 5.13 explains the
run-length encoding structure of a polygon: in the figure, the starting cell and the end cell
of each row define the length of the group, generally called a run. RLE data compression is
used in many GIS packages and in standard image formats.
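Run-length encoding of one raster row can be sketched in a few lines; a minimal illustration of the idea, not any package's actual format:

```python
def rle_encode(row):
    """Run-length encode one raster row as (value, run_length) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [tuple(r) for r in runs]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original row."""
    row = []
    for value, length in runs:
        row.extend([value] * length)
    return row
```

A row such as 0 0 0 1 1 0 compresses to three runs instead of six cell values; decoding reverses the process without any loss.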
Fig. 5.13 Quadtree data structure
5.3.2 Vector Data Structure:
As you know, the description of geographical phenomena in the form of points, lines or
polygons is called a vector data structure. Vector data structures are now widely used in GIS
and computer cartography. This data structure has an advantage in deriving information from
digitization, and is more exact in representing complex features such as administrative
boundaries, land parcels, etc. In early GIS, vector files were simply lines with only starting
and ending points. A vector file may consist of a few long lines, many short lines, or a mix of
the two. The files are generally written in binary or ASCII (American Standard Code for
Information Interchange) code, which refers to a set of codes used to represent alphanumeric
characters in computer data processing. A computer programmer therefore needs to follow each
line from one place to another in the file to enter the data into the system. This unstructured
vector data is called cartographic spaghetti. Vector data in the spaghetti data model may not
be directly usable for GIS analysis. However, many systems still use this basic data structure
because of its simplicity.
To express the spatial relationships between features more accurately, the concept of topology
has evolved. Topology can describe the spatial relationships of adjacency, connectivity and
containment between spatial features. Topological data are useful for detecting and correcting
digitizing errors, e.g., two streams that do not connect perfectly at an intersection point.
Topology is also necessary for carrying out some types of spatial analysis, such as network and
proximity analysis. There are commonly two data structures used in vector GIS data storage,
viz. topological and non-topological structures.
Let us now discuss about the two types of data structure.
a) Topological Data Structure:
A topological data structure is often referred to as an intelligent data structure because
spatial relationships between geographic features are easily derived when using it. For this
reason the topological vector data structure is important in undertaking complex data analysis.
In a topological data structure, lines cannot overlap without a node, whereas lines can overlap
without nodes in a non-topological data structure (e.g., spaghetti).
The arc-node topological data structure is now used in most systems. In the arc-node data
structure, the arc is the unit of data storage, and it is also used when a polygon needs to be
reconstructed. Point data is stored and linked to the arc file. An arc is a line segment, and
its structure is given in Fig. 5.14. A node refers to an end point of the line segment. Each
arc carries information not only about that particular arc but also about its neighbors in the
network.
ii) Coverage Data Structure:
The coverage data structure was introduced by GIS companies such as ESRI in their software
packages in the 1980s to distinguish GIS from CAD (Computer Aided Design). A coverage is a
topology-based vector data structure that can be a point, line or polygon coverage. A point is
a simple spatial entity which can be represented with topology. The point coverage data
structure contains feature identification numbers (IDs) and pairs of x, y coordinates, for
example A (2, 4) (Fig. 5.16). The data structure of a line coverage is represented in Fig.
5.17. The point where an arc starts is called the from node (F-Node) and the point where it
ends the to node (T-Node). The arc-node list gives the x, y coordinates of the nodes and of the
other points (vertices) that make up each arc. For example, arc C runs from its F-Node at
(7, 2) through a vertex at (5, 2) to its T-Node at (2, 6). Fig. 5.18 shows the relationship between polygons
and arcs (polygon/arc list), arcs and their left and right polygons (left poly/right poly
list), and the nodes and vertices (arc-coordinate list). The common boundary between two
polygons (o and a) is stored in the arc-coordinate list only once, and is not duplicated.
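The left-poly/right-poly idea can be illustrated with a toy arc table (the arcs and polygon labels below are hypothetical, not those of Fig. 5.18). Because each shared boundary is stored once with its left and right polygons, an adjacency query needs only a scan of the arc list:

```python
# Each arc stores its from-node, to-node, and the polygons on its left and
# right side; a shared boundary is stored only once. "o" is the outside polygon.
arcs = {
    "a": {"f_node": 1, "t_node": 2, "left_poly": "A", "right_poly": "o"},
    "b": {"f_node": 2, "t_node": 1, "left_poly": "A", "right_poly": "B"},
    "c": {"f_node": 2, "t_node": 1, "left_poly": "B", "right_poly": "o"},
}

def neighbours(poly):
    """Polygons sharing a boundary arc with `poly`, derived from the topology."""
    result = set()
    for arc in arcs.values():
        if arc["left_poly"] == poly:
            result.add(arc["right_poly"])
        elif arc["right_poly"] == poly:
            result.add(arc["left_poly"])
    result.discard("o")  # exclude the outside polygon
    return result
```

Here polygons A and B share arc b, so each is the other's only neighbour; no geometric computation is needed.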
Fig. 5.17 Line coverage data structure
b) Non-topological Data Structure:
In a non-topological data structure, the shared boundaries of adjacent polygons will be stored
twice, once for each feature. This format allows the user to draw each layer using different
line symbols, colors and texts. In this structure polygons are independent, and it is difficult
to answer questions about the adjacency of features. The CAD vector model lacks the definition
of spatial relationships between features that the topological data model provides.
Since the 1990s almost all commercial GIS packages, such as ArcGIS, MapInfo and Geomedia, have
adopted non-topological data structures. The shape file (.shp) is a standard non-topological
data format used in GIS packages. Unlike an ArcInfo coverage, the geometry of a shape file is
stored in two files with the extensions .shp and .shx: the .shp file stores the feature
geometry, and the .shx file maintains the spatial index of the feature geometry. The advantage
of a non-topological data structure such as the shape file is quicker display than topological
data. Many software packages, such as ArcGIS and MapInfo, use the .shp file format.
Note: Shape file comprises points (a pair of x, y coordinates), lines (series of points), polygons (series of lines).
There are no files to describe the spatial relationship between geometric objects and polygon boundaries have
duplicate in shape file.
5.4 ATTRIBUTE DATA MANAGEMENT:
The attribute data management system (ADMS) should fulfil the following requirements:
1. Attribute data input and management: Attribute data should be stored in the form of tables.
The ADMS should be able to accept, process and sort various types of data automatically,
as well as guarantee data security at all times.
2. Attribute data query: The ADMS should provide multiple query schemes so that users can
quickly find what they require in various applications.
3. Attribute data dynamic updating: The ADMS should support different types of attribute
data updating, such as editing, erasing, deleting and adding, so that the data remains
current, accurate and reliable.
4. Statistics and analysis: The ADMS should provide statistical analysis and trend
prediction functions for various attribute data.
5. Data(base) operation: The ADMS should be simple to learn and convenient to operate,
because users vary from beginners to sophisticated operators.
6. User interface: A friendly user interface, with a menu bar, pull-down menus, dialog
boxes, pop-up menus, toolbars, and so on.
5.4.1 Architecture of ADMS in GIS:
Effective: the software to be developed should save memory and be less time-consuming,
and its algorithms should be optimal.
Adaptive: the data to be organized should be shareable and communicable with other GIS,
and callable by other GIS software.
High quality: the input data should be reliable, renewable, current and accurate.
5.4.4 The Flowchart of ADMS in GeoStar:
5.5 INTEGRATING DATA (MAP OVERLAY) IN GIS:
Overlay is a GIS operation that superimposes multiple data sets (representing different themes)
together for the purpose of identifying relationships between them. An overlay creates a
composite map by combining the geometry and attributes of the input data sets. Tools are
available in most GIS software for overlaying both vector and raster data.
Before the use of computers, a similar effect was developed by Ian McHarg and others by
drawing maps of the same area at the same scale on clear plastic and actually laying them on
top of each other.
5.5.1 Overlay with Vector Data:
Feature overlays from vector data are created when one vector layer (points, lines, or polygons)
is merged with one or more other vector layers covering the same area with points, lines, and/or
polygons. A resultant new layer is created that combines the geometry and the attributes of the
input layers.
An example of overlay with vector data would be taking a watershed layer and laying over it a
layer of counties. The result would show which parts of each watershed are in each county.
5.5.1.1 Polygon Overlay Functions:
Various GIS software packages offer a variety of polygon overlay tools, often with differing
names. Of these, the following three are used most commonly for the widest variety of purposes:
Intersection, where the result includes all those polygon parts that occur in both input layers
and all other parts are excluded. It is roughly analogous to AND in logic and multiplication
in arithmetic.
Union, where the result includes all those polygon parts that occur in either A or B (or
both), and so is the sum of all the parts of A and B. It differs from Identity in that the
individual layers are no longer identifiable. It is roughly analogous to OR in logic and
addition in arithmetic.
Subtract, also known as Difference or Erase, where the result includes only those polygon
parts that occur in one layer but not in another. It is roughly analogous to AND NOT in
logic and subtraction in arithmetic.
The remainder are used less often, and in a narrower range of applications. If a tool is not
available, all of these could be derived from the first three in two or three steps.
Symmetric Difference, also known as Exclusive Or, which includes polygons that occur
in one of the layers but not both. It can be derived as either (A union B) subtract (A intersect
B), or (A subtract B) union (B subtract A). It is roughly analogous to XOR in logic.
Identity covers the extent of one of the two layers, with the geometry and attributes merged
in the area where they overlap. It can be derived as (A subtract B) union (A intersect B).
Cover, also known as Update, is similar to union in extent, but in the area where the two
layers overlap, only the geometry and attributes of one of the layers is retained. It is called
"cover" because it looks like one layer is covering the other; it is called "update" because
its most common usage is when the covering layer represents recent changes that need to
replace polygons in the original layer, such as new zoning districts. It can be derived as A
union (B subtract A).
Clip contains the same overall extent as the intersection, but only retains the geometry and
attributes of one of the input layers. It is most commonly used to trim one layer by a polygon
representing an area of interest for the task. It can be derived as A subtract (A subtract B).
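Treating each layer as the set of grid cells it covers, the derivations given above can be checked directly with set algebra; this is a sketch of the logic, not a polygon-geometry implementation:

```python
# Two overlapping 2 x 2 "layers" modelled as sets of (row, col) cells.
A = {(0, 0), (0, 1), (1, 0), (1, 1)}
B = {(1, 1), (1, 2), (2, 1), (2, 2)}

intersection = A & B                   # Intersection (AND)
union = A | B                          # Union (OR)
subtract = A - B                       # Subtract / Erase (AND NOT)
symmetric = (A | B) - (A & B)          # Symmetric Difference via the derivation
identity = (A - B) | (A & B)           # Identity: extent of A
cover = A | (B - A)                    # Cover / Update
clip = A - (A - B)                     # Clip: geometry of A within B
```

The derived forms match their direct set-operation counterparts: the symmetric difference equals A XOR B, identity covers exactly A, cover spans A union B, and clip equals A intersect B.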
It is important to note that these functions can change the original polygons and lines into new
polygons and lines and their attributes.
5.5.2 Overlay with Raster Data:
Raster overlay involves two or more different sets of data that share a common grid. The
separate sets of data are usually given numerical values. These values then are mathematically
merged together to create a new set of values for a single output layer. Raster overlay is often
used to create risk surfaces, sustainability assessments, value assessments, and other
procedures. An example of raster overlay would be to divide the habitat of an endangered
species into a grid, gather data for the multiple factors that affect the habitat, and then
create a risk surface illustrating which sections of the habitat most need protection.
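The cell-by-cell merging of values can be sketched as a weighted sum of co-registered grids; the layer names and weights below are illustrative, not drawn from any real assessment:

```python
def weighted_overlay(layers, weights):
    """Combine co-registered raster layers cell by cell into a single
    output surface using a weighted sum."""
    rows, cols = len(layers[0]), len(layers[0][0])
    out = [[0.0] * cols for _ in range(rows)]
    for layer, w in zip(layers, weights):
        for r in range(rows):
            for c in range(cols):
                out[r][c] += w * layer[r][c]
    return out

# Two hypothetical 2 x 2 factor grids, equally weighted into a risk surface.
slope = [[1, 2], [3, 4]]
distance_to_road = [[4, 3], [2, 1]]
risk = weighted_overlay([slope, distance_to_road], [0.5, 0.5])
```

Each output cell is computed independently, which is why raster overlay requires the input layers to share the same grid extent and resolution.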
are least developed and are in a transformation stage. The city extends geographically from
9° N to 10° N latitude at around 78° E longitude, at approximately 100 m above MSL. The terrain
of the city slopes gradually from north to south and from west to east.
The River Vaigai is the prominent physical feature; it bisects the city into north and south
zones, with the north zone sloping towards the river and the south zone sloping away from it.
The city became a municipality in 1867 and was upgraded to a corporation in 1971, 104 years
later. The corporation limit was extended from 52.18 km2 to 151 km2 in 2011. As per the 2011
census the population of the city is 15.35 lakhs. In recent years the area has been experiencing
remarkable land cover changes due to urban expansion, population pressure and various
economic activities.
5.6.1.2 Methodology:
5.6.1.2.1 Data:
For this study, Landsat ETM+ (path 143, row 53) images were used (Table 5.3). Landsat images
were downloaded from the USGS Earth Resources Observation Systems data center. A base map
of Madurai city was provided by the Local Planning Authority of Madurai. The Landsat ETM+
image data consist of eight spectral bands, with the same spatial resolution as the first five
bands of the Landsat TM image. Its 6th (thermal) and 8th (panchromatic) bands have resolutions
of 60 m and 15 m, respectively. All visible and infrared bands (except the thermal infrared) were
included in the analysis. Remote sensing image processing was performed using ERDAS
Imagine 9.1 software. Landsat data of 1999 and 2006 and the SOI toposheet were selected and
used to find the spatial and temporal changes in the study area during the period of study.
Table 5.3 LANDSAT Satellite Data used in the study
5.6.1.3 LULC change analysis:
The LU/LC classification results from 1999 to 2006 are summarized in Table 5.5. Over this
period the built-up area increased by 17.09 %, while open land decreased by 11.82 %.
Fluctuations were observed in the vegetation and water areas due to the seasonal variation
found in the study area. All these land use changes are closely related to the development of
the regional economy and the population growth of the city. The trend of LU/LC and urban
change in the city is shown in Fig. 5.24.
Table 5.5 Summary of Areas for LU/LC Classes from 1999 to 2006
At the beginning of the design period of a water project, if no hydrological data are available,
a hydrological network (at least one stream gauge) is installed which collects data in the form
of "ground truth" during a short planning period, e.g. between one and three years only.
A mathematical model is then developed which connects this ground truth to data obtained from
satellite (Meteosat) imagery. The parameters of the nonlinear mathematical model are calibrated
on the basis of the short-term simultaneous satellite and ground-truth data. The principle of the technique
is illustrated in Fig. 5.25.
Fig. 5.25 Design of a water supply reservoir with the aid of satellite imagery (Meteosat)
The model works in two consecutive steps:
(a) estimation of monthly precipitation values with the aid of Meteosat infrared data and
(b) transformation of the monthly rainfall volumes into the corresponding runoff values with
the aid of the calibrated model.
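The two steps can be sketched as follows. The calibrated model itself is not given here; as a stand-in, step (a) uses a GPI-style linear relation between cold-cloud duration in the IR imagery and monthly rainfall, and step (b) a simple nonlinear rainfall-runoff transformation. All function names and coefficients are illustrative:

```python
def monthly_rainfall_mm(cold_cloud_hours, rate_mm_per_h=3.0):
    """Step (a): rainfall estimated from the hours per month with cloud tops
    colder than a threshold (e.g. 235 K) in the Meteosat IR imagery."""
    return rate_mm_per_h * cold_cloud_hours

def monthly_runoff_mm(rain_mm, a=0.004, b=1.5):
    """Step (b): nonlinear rainfall-runoff transformation; a and b stand for
    the parameters calibrated against the short ground-truth record."""
    return a * rain_mm ** b

cold_cloud_hours = [20.0, 45.0, 10.0]                    # one value per month
rain = [monthly_rainfall_mm(h) for h in cold_cloud_hours]
runoff = [monthly_runoff_mm(r) for r in rain]
```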
The model was applied to the Tano River basin (16 000 km2) in Ghana, West Africa. Fig. 5.26
shows two consecutive infrared Meteosat images of the region of the Tano River in Ghana. The
spatial resolution of Meteosat is 5 km x 5 km, the temporal resolution is 30 minutes. The
relatively coarse spatial resolution allows the application of this technique only to larger
drainage basins, i.e. larger than 5000 km2. The high temporal resolution of Meteosat provides
48 images per day in three spectral channels.
Fig. 5.26 Cloud development, Tano River basin, Ghana, West Africa. Two successive
Meteosat IR images
An example of the model performance is given in Fig. 5.27, which shows the monthly runoff
of the Tano River as three different curves: one represents the observed runoff, the second
shows the runoff computed with the rainfall-runoff model from observed rainfall data (ground
truth), and the third represents monthly runoff values calculated from remote sensing
information (Meteosat IR data).
5.6.2.2 Reservoir sediment monitoring and control:
Dams built to store water for various purposes (e.g. water supply, irrigation, hydropower)
usually undergo sediment processes, i.e. erosion or silting up. It is very important to know the
state of sedimentation within a reservoir in order to either prevent deterioration of the reservoir
due to erosion or to restore the reservoir capacity in case of silting up (e.g. by excavation of
sediments). Both processes, erosion and silting up, are unfavorable for the use of the reservoir:
erosion since it endangers the structure of the dam itself, siltation since it reduces the available
storage capacity considerably. Since a reservoir is a three dimensional body the required
knowledge of the reservoir state can be gained only with the aid of monitoring techniques
showing a high resolution in space. Under such conditions remote sensing techniques are very
useful, in this case multi frequency echo sounding taken from a boat in a multi-temporal fashion,
which allows detection of changes of the sediment conditions. In the example presented here,
both, erosion and silting up of a reservoir occurs. Fig. 5.28 shows the reservoir Lake Kemnade
in the Ruhr River valley in Germany. Twenty-five cross-sections can be seen, which are
observed periodically by echo sounder from a boat.
Fig. 5.27 Monthly runoff, Tano River basin (16 000 km2), Ghana, West Africa:
observed, computed from observed rainfall and from Meteosat IR data.
Fig. 5.29 shows a cross-section in the upper region of the lake eight years after construction.
We observe siltation in the center and the right-hand part of the cross-section, while there is
some minor erosion in the left part. While siltation is dominant in the upper region and the
center part of the lake, erosion is the dominant process near the dam and close to the weir.
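The change in storage volume between two such echo-sounder surveys can be estimated from the cross-sections by the average-end-area method; the section spacing and bed-area changes below are illustrative:

```python
def volume_change_m3(area_changes_m2, spacing_m):
    """Average-end-area integration of bed-area change along the reservoir axis."""
    total = 0.0
    for a1, a2 in zip(area_changes_m2, area_changes_m2[1:]):
        total += 0.5 * (a1 + a2) * spacing_m
    return total

# Bed-area change at four sections 200 m apart:
# positive = siltation, negative = erosion.
deposits = [12.0, 20.0, 8.0, -4.0]
print(volume_change_m3(deposits, spacing_m=200.0))   # net siltation in m3
```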
5.6.2.3 Flood forecasting and control:
In recent years we have observed increasing damage and loss of life due to severe floods on all
continents of the Earth. This shows that the problem of flood alleviation still needs increasing
attention. Flood warning on the basis of flood forecasts is one way to reduce the problem;
another and better way is the reduction of floods with the aid of flood protection reservoirs.
For both purposes, flood warning and the operation of flood protection reservoirs, it is
necessary to have real-time flood forecasts available. The sooner these forecasts are available,
the more useful they are. In order to gain forecast lead time it is advisable to compute forecast
flood hydrographs on the basis of rainfall observed in real time. Since the variability of rainfall
in time and space is very high it is advisable to monitor rainfall with the aid of remote sensing
devices having a high resolution in time and space. For this purpose ground-based weather radar
operating on the basis of active microwave information is most useful.
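Weather radar measures reflectivity rather than rain rate, so an empirical Z-R relation is applied to the echoes; a minimal sketch using the classic Marshall-Palmer relation Z = 200 R^1.6 (the parameter values vary with rainfall type):

```python
def rain_rate_mm_per_h(dbz, a=200.0, b=1.6):
    """Convert radar reflectivity in dBZ to rain rate in mm/h via Z = a * R**b."""
    z = 10.0 ** (dbz / 10.0)        # dBZ -> linear reflectivity (mm^6/m^3)
    return (z / a) ** (1.0 / b)

# A 40 dBZ echo corresponds to roughly 11.5 mm/h of rain:
print(round(rain_rate_mm_per_h(40.0), 1))
```

Rain rates computed cell by cell on the radar grid provide the spatially distributed rainfall input needed for a distributed rainfall-runoff model.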
Fig. 5.28 Lake Kemnade with cross-sections
Fig. 5.29 Reservoir cross-section with sediments (Lake Kemnade, Ruhr River, Germany)
Fig. 5.30 shows schematically the acquisition of rainfall information by radar and its
transformation into real time flood forecasts, which in turn may be used for the computation of
an optimum real-time reservoir operation strategy. Fig. 5.31 shows two consecutive isohyet
maps of the Günz River drainage basin in Germany, observed by ground-based weather radar.
This information has a
high resolution in time and space which can be used in order to compute a forecast flood
hydrograph in real time with the aid of a distributed system rainfall runoff model. With the aid
of observed (by radar) and forecasted rainfall it is possible to compute real time forecast flood
hydrographs. Fig. 5.32 shows such flood forecasts for different probabilities of non-exceedance
in comparison to the (later) observed flood hydrograph. Although such forecasts are by no
means perfect they are still useful for the computation of the optimum reservoir operating
strategy of flood protection reservoirs as shown in Fig. 5.33.
Fig. 5.30 Reservoir operation based on real time flood forecasts with the aid of radar
rainfall measurements
Fig. 5.31 Isohyets for the River Günz drainage basin, Germany, obtained from two
consecutive radar measurements
5.6.2.4 Hydropower scheduling:
In many regions of the world river runoff occurring in spring and summer originates from
snowmelt in mountainous regions. Thus the hydropower production during spring and summer
depends to a great extent on the quantity of snow which fell in the mountains during winter and
early spring. If therefore the quantity of snow and its water equivalent is known early during
the year it is possible to make forecasts of the expected runoff in the following months. If the
reservoirs feeding the hydropower plants are large enough, it is possible to optimize hydropower
production by scheduling the releases from the reservoirs to the power plants accordingly. This
technique was used very early, i.e. in the late seventies and early eighties, in Norway
(Østrem et al., 1981). Since most of the high mountain basins in Norway are not forested,
variations in the snow cover can easily be monitored with the aid of satellite data (e.g. NOAA).
During the main snowmelt period, May to July, a forecast of expected river flows to be used in
the power plants is of high interest for proper management of the plants.
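The link between satellite-mapped snow cover and the expected inflow to the reservoirs can be sketched with a simple degree-day melt model in the spirit of the Snowmelt Runoff Model; the basin area, degree-day factor and input values below are illustrative:

```python
DEGREE_DAY_FACTOR = 4.0   # mm of melt per positive degree-day (typical 3-6)
BASIN_AREA_KM2 = 500.0

def daily_melt_volume_m3(mean_temp_c, snow_cover_fraction):
    """Melt depth = factor * positive degree-days, applied only to the
    snow-covered fraction of the basin (mapped from e.g. NOAA imagery)."""
    melt_mm = DEGREE_DAY_FACTOR * max(mean_temp_c, 0.0) * snow_cover_fraction
    return melt_mm / 1000.0 * BASIN_AREA_KM2 * 1e6   # mm over km2 -> m3

# Forecast for a warm day with 60 % of the basin still snow-covered:
volume = daily_melt_volume_m3(mean_temp_c=5.0, snow_cover_fraction=0.6)
```

As the satellite shows the snow line retreating, the snow-covered fraction shrinks and the forecast inflow declines accordingly, which is the signal used for scheduling releases.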
5.6.2.5 Irrigation scheduling:
The allocation of water to the various farmers within an irrigation scheme is usually regulated
by certain fixed rules. These rules may allocate certain quantities to certain farmers or allocate
water in proportion to the irrigated area. Such rigid rules may be sub-optimal since they cannot
be adapted to the actual, real-time water demand of the crops according to their present state. It
is better to allocate water either to match crop water requirements or to maximize effectiveness.
In order to allocate water to match crop water requirements it is necessary to know the actual
water demand of the crop in real time. As long as the water supply meets the demand usually
no problems occur. If, however, crop water stress occurs, the water allocation is certainly not
optimal. In order to improve the situation in real time it is necessary to detect crop water stress.
In order to do this, crop water stress indices have to be defined, and the major unknown
parameters in such an index should be detectable with the aid of remote sensing data. The
evapotranspiration of crops under stress differs from that of crops under normal conditions,
and this difference can be detected with the aid of, e.g., thermal infrared data.
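One widely used index of this kind is the empirical Crop Water Stress Index (CWSI), which scales the canopy-air temperature difference (the canopy temperature being obtainable from thermal infrared data) between a well-watered and a non-transpiring baseline; the baseline values below are illustrative:

```python
def cwsi(canopy_temp_c, air_temp_c, lower_baseline=-2.0, upper_baseline=5.0):
    """0 = no stress (well watered), 1 = fully stressed (no transpiration)."""
    dt = canopy_temp_c - air_temp_c
    index = (dt - lower_baseline) / (upper_baseline - lower_baseline)
    return min(max(index, 0.0), 1.0)   # clamp to the [0, 1] range

# A canopy 1.5 C warmer than the air sits halfway between the baselines:
print(cwsi(canopy_temp_c=31.5, air_temp_c=30.0))   # -> 0.5
```

In practice the baselines also depend on vapour pressure deficit, so they would be fitted per crop and climate rather than fixed as here.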
Fig. 5.32 Flood forecast based on radar rainfall measurement and rainfall forecast
5.6.2.6 Groundwater exploration for water supply purposes:
Direct groundwater exploration or observation with the aid of remote sensing techniques is not
feasible, because most remote sensing techniques, with the exception of airborne geophysics
and radar, have no penetrating capability beyond the uppermost layer, i.e. less than 1 m.
Therefore, the use of remote sensing techniques in groundwater exploration is limited to being
a powerful additional tool to the standard geophysical methods. The general application of
remote sensing in hydrogeology thus lies in the domain of image interpretation, i.e. qualitative
information, which is nevertheless very useful and may enable the groundwater explorer to
reduce the use of the very expensive conventional techniques considerably. This qualitative or
indirect information, which can be obtained from remote sensing sources, includes e.g. (a)
likely areas for the existence of groundwater, (b) indicators of the existence of groundwater,
(c) indicators of regions of groundwater recharge and discharge and (d) areas where wells
might be drilled.
These indicators are usually based on geological and geomorphological structures or on
multi-temporal observations of surface water and of transpiring vegetation. Landsat visible and
infrared data are preferred for these purposes, but also other sensors including microwave
sensors are used. In the thermal infrared band temperature changes in multi-temporal imagery
may provide information on groundwater, e.g. areas containing groundwater being warmer than
the environment in certain seasons of the year. Shallow groundwater can be inferred by soil
moisture measurements and by changes in vegetation types and patterns. Groundwater recharge
and discharge areas within drainage basins can be inferred from soils, vegetation and shallow
or perched groundwater. Lineaments detected by Landsat or SPOT imagery are straight to
slightly curving lines formed in many different types of landscape. Many linear features that
are not continuous may be extended or joined in image analysis. It is assumed that lineaments
mark the location of joints and faults, which again are indicators of potential groundwater
resources. Also soil type and changes in vegetation types and patterns in the area may give
certain indications of the potential availability of groundwater. It should be stated, however,
that in the field of groundwater exploration remote sensing information can only add to the
conventional exploration techniques, but certainly cannot replace them.
Fig. 5.33 Optimal release policy for two parallel reservoirs. Flood of February 1970