Ch4-Data Sources

The document discusses the fundamentals of Geographic Information Systems (GIS), focusing on data sources, input, quality, and standards. It categorizes spatial data into primary and secondary sources, detailing methods for data collection, encoding, and editing, as well as the importance of data quality and error detection. Additionally, it emphasizes the need for maintaining up-to-date spatial databases and adhering to data standards for effective GIS analysis.


Fundamentals of Geographical Information System
BGE II/II
Chapter 4 Data Sources [4 hrs]
1. Sources of Spatial Data
2. Spatial Data Input
3. Data Quality and Standards
4. Major Data Feeds
5. Data Formats
6. Metadata

Asst. Prof., Er. Bikash Sherchan


Sources of Spatial Data
  Spatial data – Various sources
- Two categories
- Primary and Secondary
  Data collection – most time-consuming and expensive, yet
important, task in GIS
 Primary Sources:
 Collected from scratch
 Using spatial data acquisition techniques
 In-Situ or Remote sensing
 In-Situ data – Ground based, human being or special
instruments (e.g.; gauge/sensors/receivers)
- precipitation, temperature, wind, etc.
- social, economic and demographic data
- important characteristics/instruments – known
location/GPS
Sources of Spatial Data
 Primary Sources:
 Remote sensing
 Usually not fit for immediate use – many sources of errors
and distortions exist
 Large portion of GIS data – extracted by analyzing remotely
sensed data, e.g.; land use, land cover, building footprints,
transportation and utility networks, DTM, etc.
 Although primary source – usually comes from other sources
 Remote sensing instruments – cameras, multispectral and
hyperspectral scanners, thermal-infrared detectors, RADAR
and LiDAR sensors
 Airplanes, helicopters and UAVs
  Sound navigation and ranging (SONAR) – ships and submarines for
bathymetric survey
Sources of Spatial Data
 Secondary Sources:
 Huge numbers of historical and current maps, aerial
photographs, diagrams and other types of geospatial
information – in hard-copy format
 Valuable resource – need to be handled carefully
- often one-of-a-kind
- originals are very fragile
 Development in techniques to turn historical
maps/photographs to digital data,
- Can be related to GIS with other geospatial
information
- Digitization
Sources of Spatial Data
 Secondary Sources:
 Digitizing – 3 methods
 Digitizing tables or tablets
with hand-held cursor or
electronic pen
 Heads-up on-screen
digitization with cursor or
electronic pen
 Raster scanning

Spatial Data Input and Editing
 Data encoding – process of getting data into
the computer (Heywood et al., 2011)
 Fundamental process in almost every GIS
project, e.g.;
 Archeology - Encoding aerial photographs of
ancient remains to integrate with newly
collected field data
 Planner – Digitize outlines of new buildings or
roads and plot on existing topographical data
 Ecology – Add new remotely sensed data to a GIS
to examine changes in habitats;
 and many more
Spatial Data Input and Editing
 Data input – normally before being
structured or analyzed
 Data input – Depends on the characteristics
of data and the way they are to be
modelled
 Once fed into GIS, always need to be
corrected and manipulated to make sure
that they can be structured according to the
required data model

Spatial Data Input and Editing
 Activities addressed at this stage are:
  Re-formatting of data (e.g.; conversion of postal
code to grid reference)
 Re-projection of data from different sources to
a common projection
 Generalization of complex data to provide a
simpler data set
 Matching and joining of adjacent sheets once
the data are in digital form

Spatial Data Input and Editing
 Summary of Data Encoding Methods

Spatial Data Input and Editing
 Data Streaming (Heywood et al., 2011):
 Range of methods to get data into GIS
(keyboard entry, digitizing, scanning and
electronic data transfer)
  Data editing and manipulation include re-projection,
transformation, edge-matching (geo-referencing), etc.
  The process of encoding and editing is known as
data streaming

Spatial Data Input and Editing
Possible encoding methods for different data sources
(Heywood et al., 2011)

Spatial Data Input and Editing
 Data Editing (Heywood et al., 2011):
  Problems during data encoding: unlikely to
get error-free data
  Errors may be derived from the original
source or
  May be introduced during the encoding
process
  Errors may be in co-ordinate data or
  Inaccuracy and uncertainty in attribute data
Spatial Data Input and Editing
 Better to intercept errors before they go
on to propagate to higher levels of
information
 The process is also known as data cleaning
  Data editing – covered under the following headings:
 Detection and correction of errors
 Re-projection, Transformation and
generalization
 Edge matching and rubber sheeting and
 Updating the spatial databases
Spatial Data Input and Editing
 Detecting and Correcting Errors
 Errors in input data – derived from 3 main
sources
 Errors in source data - erroneous in paper map,
printing errors, etc. – difficult to identify
  Errors during encoding – typing mistakes, encoding the
wrong line, mistaking folds and stains for
geographical features
 Errors propagated during transfer/conversion –
conversion between different formats required by
different conversion packages – lead to loss of data

Spatial Data Input and Editing
 Detecting and Correcting Errors
 Errors in attribute data – relatively easy to identify –
manual comparison with the original
 Example: road coded as river
 Various other methods as well, e.g.;
  Impossible values – values falling outside the valid range
are incorrect
 Extreme values – elevation values greater than the height
of Mt. Everest
 Internal consistencies – totals, means, min and max shall
comply with the original data
 Scattergram – errors in variables in correlation with each
other can be detected by scatterplot
 Trend Surface – values that depart significantly from the
general trend identified and corrected
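
The impossible-value and extreme-value checks listed above can be sketched in a few lines of Python. This is only an illustration: the helper name `flag_suspect_elevations`, the sample readings and the lower bound of -500 m are assumptions, and 8849 m is used as an approximate height of Mt. Everest.

```python
# Hypothetical helper: flag elevation values that cannot be right.
EVEREST_M = 8849.0  # approximate height of Mt. Everest in metres

def flag_suspect_elevations(values, low=-500.0, high=EVEREST_M):
    """Return (index, value) pairs that fall outside [low, high]."""
    return [(i, v) for i, v in enumerate(values) if v < low or v > high]

# Assumed sample: one mistyped value and one "no data" code left in the column.
elevations = [1350.0, 1402.5, 14025.0, -9999.0, 1388.0]
suspects = flag_suspect_elevations(elevations)
print(suspects)
```

A check like this only narrows down where to look; the flagged records still need to be compared manually with the original source.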
Spatial Data Input and Editing
 Errors in spatial data are more difficult to
detect compared to errors in attribute data
 Errors in spatial data – different forms
based on data models (vector or raster) and
method of data capture
 Rain gauge station may be wrongly located
  Land use/land cover boundary may be wrongly
delineated
 Railway line has been erroneously digitized
as road, etc.
Spatial Data Input and Editing
 Common errors in spatial data (Vector)

Spatial Data Input and Editing
 Problems in raster data - Like in vector data,
missing entities and noise
  Difficult to collect data – restricted areas,
obstacles (environmental or cultural), e.g.;
airports, rainy days, snow-covered areas, etc.
 Noise – in the data itself or introduced
during processing
 Scattered pixels – characteristics of which do
not conform to the neighboring pixels
 Can be removed by filtering
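
One possible way to remove scattered pixels is a 3x3 majority filter. The slides do not name a specific filter, so this choice, the function name `majority_filter` and the small sample grid are assumptions made for illustration.

```python
from collections import Counter

def majority_filter(grid):
    """Replace each cell by the most common value in its 3x3 neighbourhood."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            # Collect the neighbourhood, clipped at the raster edges.
            window = [grid[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            value, count = Counter(window).most_common(1)[0]
            # Only overwrite when a strict majority of the window agrees.
            if count > len(window) / 2:
                out[r][c] = value
    return out

# Assumed sample: class 1 with a single noisy pixel (7) above a strip of class 2.
noisy = [
    [1, 1, 1, 1],
    [1, 7, 1, 1],
    [1, 1, 1, 1],
    [2, 2, 2, 2],
]
cleaned = majority_filter(noisy)
```

The strict-majority rule is a design choice: it removes the isolated pixel but leaves the genuine boundary between the two classes untouched.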
Spatial Data Input and Editing
 Re-projection, transformation and
generalization
 Once encoded and edited, it is required to
process geometrically – common reference frame
 Scale and resolution of data source – also
important – to be taken into account when
combining data from different sources
 Grid systems may have different origins, units of
measurement and orientations – necessary to
transform onto common grid system
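
Differences in origin, units and orientation can be handled together by a scale, a rotation and a shift. The sketch below assumes a simple two-dimensional similarity transformation; the function name `to_common_grid` and all numeric values are hypothetical.

```python
import math

def to_common_grid(x, y, scale, rotation_deg, dx, dy):
    """Scale, rotate (counter-clockwise) and shift a point (x, y)."""
    t = math.radians(rotation_deg)
    xs, ys = x * scale, y * scale              # change of units
    xr = xs * math.cos(t) - ys * math.sin(t)   # change of orientation
    yr = xs * math.sin(t) + ys * math.cos(t)
    return xr + dx, yr + dy                    # change of origin

# Assumed example: local grid in feet, rotated 90 degrees, origin offset.
FEET_TO_METRES = 0.3048
x_m, y_m = to_common_grid(100.0, 0.0, FEET_TO_METRES, 90.0, 5000.0, 2000.0)
print(round(x_m, 2), round(y_m, 2))
```

In practice the scale, rotation and shift are usually estimated from control points known in both grid systems rather than typed in by hand.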

Spatial Data Input and Editing
 Re-projection, transformation and
generalization
 Data from large scale map – generalized
 Generalized map – comparable to small-scale maps
 Saves processing time and disk space
 Several routines available in GIS packages
 Simplest technique – delete points along a line at a
fixed interval, e.g.; every third point
 Original shape of the feature may be lost
 Raster data generalization – aggregate or
amalgamate cells with same attribute values
- more appropriate approach - filtering
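
The simplest line generalization technique mentioned above, deleting every third point, might look like this. The function name `drop_every_nth` and the zig-zag sample line are assumptions; keeping both end points is an added safeguard, not something the slides prescribe.

```python
def drop_every_nth(points, n=3):
    """Delete every n-th vertex of a line, always keeping both end points."""
    last = len(points) - 1
    return [p for i, p in enumerate(points)
            if i == 0 or i == last or (i + 1) % n != 0]

# Assumed sample: a zig-zag line with seven vertices.
line = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0), (5, 1), (6, 0)]
simplified = drop_every_nth(line)
print(simplified)
```

The third and sixth vertices are dropped regardless of how much they contribute to the shape, which is why the original form of the feature may be lost.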
Spatial Data Input and Editing
 Edge matching and rubber sheeting
 Dealing with multiple map sheets – mismatch
between adjacent maps – need to be resolved
 Each map sheet digitized separately -> adjacent
sheets joined after editing, re-projection,
transformation and generalization
 This process is known as edge matching
  First, mismatches at sheet boundaries are resolved
- such problems are seen in maps derived from
multiple satellite images
  Second, in the case of vector data, topology
must be reconstructed
Spatial Data Input and Editing
 Edge matching and rubber sheeting
 In certain data – internal distortions within
individual map sheets
 In aerial photographs – movement of aircraft and
distortion of camera lens
 Rectified through Rubber Sheeting (Conflation)
 Stretching the map in various directions as if it is
drawn on a rubber sheet
 Accurate points are fixed while the others having
wrong co-ordinates are stretched to fit with the
control points
 Can be used for reprojection
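
The idea of fixing accurate points and stretching the rest can be sketched with inverse-distance weighting of the control-point displacements. This weighting scheme is an assumption chosen for brevity; GIS packages may use other methods. The function name `rubber_sheet` and the control points are hypothetical.

```python
import math

def rubber_sheet(point, controls, power=2.0):
    """Shift `point` by the inverse-distance-weighted control displacements.

    `controls` is a list of ((x_src, y_src), (x_true, y_true)) pairs.
    """
    x, y = point
    wsum = dxsum = dysum = 0.0
    for (sx, sy), (tx, ty) in controls:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return (tx, ty)  # a control point maps exactly onto its true position
        w = 1.0 / d ** power
        wsum += w
        dxsum += w * (tx - sx)
        dysum += w * (ty - sy)
    return (x + dxsum / wsum, y + dysum / wsum)

controls = [((0.0, 0.0), (0.0, 0.0)),     # accurate point, stays fixed
            ((10.0, 0.0), (11.0, 0.0))]   # distorted point, true position 1 unit east
moved = rubber_sheet((5.0, 0.0), controls)
print(moved)
```

A point halfway between the two controls receives half of the correction, so the sheet is stretched smoothly rather than torn.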
Spatial Data Input and Editing
 Updating and maintaining spatial databases
 Important to keep data up-to-date
  Dynamic world: places and things change over
time
  Spatial data can go out of date – need regular
updating

Data Quality and Standards
Data Quality
 Measure how good data are
 Describes overall fitness/suitability of data
for specific purpose
 Examine issues such as error, accuracy,
precision and bias
 Resolution and generalization of source data
 Also deal with completeness, compatibility
and consistency, and applicability for
analysis
Data Quality and Standards
Data Quality - Errors
 Flaws in data
 Physical difference with real world
 Single, definable departures from reality or
 Persistent, widespread deviations
  Single error – a coordinate pair indicating a bank ATM
incorrectly entered
  Systematic error – coordinates of all ATMs
entered as (y,x) instead of (x,y)
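
A swapped (y,x) entry can often be caught by testing coordinates against the known extent of the study area. The sketch below assumes longitude/latitude coordinates; the function name `looks_swapped`, the bounding box and the ATM coordinates are hypothetical.

```python
def looks_swapped(x, y, bounds):
    """True if (x, y) is outside `bounds` but (y, x) falls inside it."""
    xmin, ymin, xmax, ymax = bounds
    inside = xmin <= x <= xmax and ymin <= y <= ymax
    swapped_inside = xmin <= y <= xmax and ymin <= x <= ymax
    return (not inside) and swapped_inside

# Assumed study area as (min lon, min lat, max lon, max lat).
STUDY_AREA = (80.0, 26.0, 88.5, 30.5)
atms = [(85.32, 27.71), (27.70, 85.31), (83.98, 28.21)]
flags = [looks_swapped(x, y, STUDY_AREA) for x, y in atms]
print(flags)
```

If every record is flagged, the error is systematic and can be corrected in one step; if only one record is flagged, it is a single entry mistake.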
Data Quality and Standards
Data Quality - Errors
 Sources of Error in GIS
 Spatial and attribute errors can occur at any
stage in GIS project
 The best way to detect them is to observe them
within the context of a typical GIS project
 Errors from Conceptualization
 Originate from how we perceive, understand and
model a reality – conceptual errors
  Perception of reality influences the definition of
reality and the use of spatial data
Data Quality and Standards
Data Quality - Errors
 Errors from Conceptualization
 Inconsistencies among data collected by
different surveyors
 Use of different spatial data model for
representation of reality (raster, vector, TIN,
etc.)
 All of these have limitations – portraying reality
 Errors in Source Data
 Variety of data sources: Survey data, remotely
sensed data, map data, etc.
 All are likely to include errors
Data Quality and Standards
Data Quality - Errors
 Errors in Source Data
 Human mistakes in device operation or recording
observations
 Technical problems with the device/equipment
 Examples:
 Recording features incorrectly
 GPS receiver or leveling machine malfunctioning
 Wrong spatial referencing
 Mistakes in interpretation and classification
 Cloud and shadows obscure interesting details
 Generalization
Data Quality and Standards
Data Quality - Errors
 Errors in Data Encoding
 Probably the greatest source of error in most GIS
 Digitizing (both manual and automatic)
 Source map error or operational error
 Requires correct registration of original map
document
 Cell size determined by the resolution of the
machine
 Always require editing and cleaning

Data Quality and Standards
Data Quality - Errors
 Errors in Data Editing and Conversion
  Last line of defence against errors before the data
are used for analysis
 Impossible to spot and remove all errors
 Many problems can be eliminated by careful
examination of the data
 Vector GIS contain routines to check and build
topology, e.g.; open polygons, dangling lines
(overshoots), etc. – automated procedures
  In raster GIS – noise may appear as
randomly scattered cells – removed by filtering
Data Quality and Standards
Data Quality - Errors
 Errors in Data Editing and Conversion
 Vector to raster conversion – size of the raster
and method of rasterization matters
  Introduces positional error and, in some cases, attribute
uncertainty
 Smaller cell size – greater precision – reduce
classification error (a form of attribute error)
 Positional and attribute errors - generalization
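
The link between cell size and positional error can be illustrated by snapping a point to the centre of its raster cell. This sketch assumes a grid with its origin at (0, 0) and square cells; the function names and the sample point are hypothetical.

```python
import math

def snap_to_cell_centre(x, y, cell):
    """Return the centre of the raster cell (origin at 0, 0) containing (x, y)."""
    col, row = math.floor(x / cell), math.floor(y / cell)
    return ((col + 0.5) * cell, (row + 0.5) * cell)

def displacement(x, y, cell):
    """Distance between a point and the centre of the cell that represents it."""
    cx, cy = snap_to_cell_centre(x, y, cell)
    return math.hypot(cx - x, cy - y)

# Assumed sample point, rasterized at three cell sizes (same units as the point).
point = (103.0, 47.0)
errors = {cell: displacement(*point, cell) for cell in (100.0, 10.0, 1.0)}
for cell, err in errors.items():
    print(cell, round(err, 3))
```

The displacement can never exceed half the cell diagonal, so a smaller cell size bounds the positional error more tightly, at the cost of a larger raster.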

Data Quality and Standards
Data Quality - Errors
  Errors in Data Editing and Conversion
[Figures: classification error; loss of connectivity; loss of information]
Data Quality and Standards
Data Quality - Errors
 Errors in Data Processing and Analysis
 Inappropriate phrasing of spatial queries
 Overlaying maps having different coordinate
systems
 Combining maps having attributes measured in
incompatible units
 Combining maps from different source (widely
different map scales)
 Classification of data, aggregation or
disaggregation of area data
Data Quality and Standards
Data Quality - Accuracy
 Extent to which an estimated value
approaches its true value (Aronoff, 1991)
 Data is accurate – true representation of
reality
 Impossible for spatial data to be 100%
accurate
 Accuracy within a specified tolerance is
possible
 Location of an ATM may be accurate within
10 m radius
Data Quality and Standards
Data Quality - Precision
 Precision – exactness of the measurements
 Also refers to number of decimal places,
e.g.; measurement of temperature at 1
degree interval or half degree
  Which one is more precise?
  Coordinates measured to 12 decimal
places or coordinates expressed to 3
decimal places?
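
To get a feel for what the number of decimal places means on the ground, the sketch below assumes coordinates in decimal degrees and takes one degree of latitude as roughly 111 km. Both the constant and the function name `ground_resolution` are illustrative assumptions.

```python
METRES_PER_DEGREE = 111_000.0  # rough length of one degree of latitude

def ground_resolution(decimal_places):
    """Approximate ground distance represented by the last decimal place."""
    return METRES_PER_DEGREE * 10 ** (-decimal_places)

for places in (3, 6, 12):
    print(places, ground_resolution(places))
```

Twelve decimal places are far more precise than three, but the extra digits say nothing about accuracy if the instrument or source map cannot resolve such small distances.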

Data Quality and Standards
Data Quality – Accuracy vs Precision
  A data set can be highly accurate but not precise, or
vice versa
  A – high accuracy, low
precision
  B – low accuracy, high
precision
  C – low accuracy, low
precision
  D – high accuracy, high
precision
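
The distinction can be made concrete with repeated measurements of a known value: the mean error indicates accuracy and the spread of the readings indicates precision. In this sketch precision is taken to mean repeatability, and the function name, the true value and the readings are all assumptions.

```python
import statistics

def accuracy_and_precision(measured, true_value):
    """Return (bias, spread): mean error and standard deviation of readings."""
    bias = statistics.mean(measured) - true_value
    spread = statistics.pstdev(measured)
    return bias, spread

TRUE_X = 100.0
accurate_not_precise = [95.0, 105.0, 98.0, 102.0]    # centred on truth, scattered
precise_not_accurate = [109.9, 110.0, 110.1, 110.0]  # tight cluster, offset
a = accuracy_and_precision(accurate_not_precise, TRUE_X)
b = accuracy_and_precision(precise_not_accurate, TRUE_X)
print(a)
print(b)
```

The first set has almost no bias but a large spread (case A), while the second has a small spread but a bias of about 10 units (case B).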
Data Quality and Standards
Data Standards
 Spatial data standards – methods for
structuring, describing, and delivering
spatially referenced data
 Categorized into four areas: media
standards, format standards, accuracy
standards and documentation standards
 All are important- latter 2 more complex
than the first two

Data Quality and Standards
Data Standards
 Media standards -
 Physical form in which data are transferred
 Specific formats for CD-ROM, magnetic tape,
optical or solid state storage or some proprietary
drive or other media type
  Standardized formats are specified by the
International Organization for Standardization (ISO)

Data Quality and Standards
Data Standards
 Format standards -
 Specify data components and structures
 Establishes number of files used to store a
spatial data set including the basic components
to be contained in each file
 Order, size, range of values for data element
contained in each file are defined
 Information such as spacing, variable types and
file encoding may be included
 Aid in transferring data between different
computer hardware and software
Data Quality and Standards
Data Standards
 Accuracy standards -
 Document the quality of the positional and
attribute values
 Knowledge on data quality is crucial for the
effective use of GIS
  Field sampling – expensive, requires additional
funds for collecting additional data
  Time and funding limits – temptation to use
lower-quality data
 Accuracy standards ensure – spatial data quality
in a well-defined, established manner
Data Quality and Standards
Data Standards
 Documentation standards -
 Define spatial data description
 Agreed-upon way of describing – source,
development, and form of spatial data
 Ensure complete description of the data origin,
methods of development, accuracy and delivery
formats
  Allows users to maintain the data and to assess the
appropriateness of the data for an intended task

Data Quality and Standards
National and International Standards
 Standards organizations at national and
international levels
 Define and maintain geospatial standards
 The Federal Geographic Data Committee
(FGDC) – USA
  The International Organization for Standardization
(ISO)
  The Open Geospatial Consortium (OGC),
e.g.; Web Map Service (WMS) standard
Metadata: Data Documentation
 Special type of non-geometric data
 Simply defined as ‘data about data’
 Essential for the efficient use of the spatial
data
 Describes - content, quality, methods,
developer, coordinate system, extent,
structure, spatial accuracy, attributes and
authority
 Allow users to evaluate data in terms of
suitability for their intended use
 Provides record of changes or modifications
that have been made
Metadata: Data Documentation
 Some are derived automatically by the
software, e.g.; length, area, extent of data,
count, etc.
 Some others shall be collected explicitly,
e.g.; owner name, quality, and original
source, etc.
 Explicitly collected metadata can be
entered in the same way as other
attributes
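
Metadata that software can derive automatically, such as count, extent, length and area, follow directly from the geometry. The sketch below assumes a simple polygon given as a list of planar (x, y) vertices; the function name `derive_metadata` and the sample parcel are hypothetical.

```python
import math

def derive_metadata(ring):
    """Compute count, extent, perimeter and area for a closed polygon ring."""
    xs, ys = [p[0] for p in ring], [p[1] for p in ring]
    closed = ring + [ring[0]]              # repeat the first vertex to close the ring
    edges = list(zip(closed, closed[1:]))
    perimeter = sum(math.dist(a, b) for a, b in edges)
    # Shoelace formula; abs() makes the result independent of vertex order.
    area = abs(sum(a[0] * b[1] - b[0] * a[1] for a, b in edges)) / 2.0
    return {
        "vertex_count": len(ring),
        "extent": (min(xs), min(ys), max(xs), max(ys)),
        "perimeter": perimeter,
        "area": area,
    }

# Assumed sample: a 4 x 3 rectangular parcel.
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
meta = derive_metadata(parcel)
print(meta)
```

Items such as owner name or original source cannot be computed this way and have to be entered explicitly.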

Metadata: Data Documentation
 Created data travels through a network –
transformed – modified – used for many
different applications
 Retransmitted to another user and then to
another and so on.
 It is important to document any changes
made to any dataset by updating its
associated metadata
 Standard methods to be established for
reporting metadata
Metadata: Data Documentation
  In the US, the Federal Geographic Data Committee
(FGDC) defined the Content Standard for
Digital Geospatial Metadata (CSDGM)
  There are 10 basic types of information in
the CSDGM:
1) Identification, describing the data set,
2) Data quality,
3) Spatial data organization,
4) Spatial reference coordinate system,
5) Entity and attribute,
6) Distribution and options for obtaining the
data set,
7) Currency of metadata and responsible party,
8) Citation,
9) Time period information, used with other
sections to provide temporal information, and
10) Contact organization or person.
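
The ten types of information can serve as a checklist when a metadata record is compiled. In the sketch below the section keys are shortened, hypothetical names that mirror the list above, not the official CSDGM element tags; the sample record is also invented.

```python
# Hypothetical short keys, one per information type listed above.
CSDGM_SECTIONS = [
    "identification", "data_quality", "spatial_data_organization",
    "spatial_reference", "entity_and_attribute", "distribution",
    "metadata_reference", "citation", "time_period", "contact",
]

def missing_sections(record):
    """Return the sections that are absent or empty in a metadata record."""
    return [s for s in CSDGM_SECTIONS if not record.get(s)]

# Assumed sample record with only three sections filled in.
record = {
    "identification": {"title": "Road centrelines"},
    "data_quality": {"positional_accuracy_m": 10},
    "contact": {"organization": "Survey office"},
}
gaps = missing_sections(record)
print(gaps)
```

A check of this kind only confirms that a section exists; whether its content is adequate still has to be judged against the standard itself.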
