Wild Fires Typically Depicted With Polygons Showing Burned vs. Not Burned Or, Bird Distribution Indicating Presence or Absence of Birds

This is a presentation by Cornell University on the logistic regression algorithm. I feel it is very useful for people who are trying to understand regression.

Uploaded by

smarttag99

Logistic Regression

• Often, the spatial phenomenon under investigation can only be described by a categorical variable.
– Wildfires are typically depicted with polygons showing burned vs. not burned
– Or, bird distribution indicating presence or absence of birds
• The previous regression technique is not suitable because the dependent variable is neither interval nor ratio
• Logistic regression treats the distribution in a probabilistic manner; that is, the occurrence of the study phenomenon is evaluated in terms of probability
Logistic Regression
• If $P_a$ is the probability that a phenomenon is present and $P_b$ is the probability that it is absent, then

$$P_a + P_b = 1$$

$$U_a = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \varepsilon$$

$$P_a = \frac{\exp(U_a)}{1 + \exp(U_a)}$$

$U_a$ is the utility function of event a, expressed as a linear combination of a number of explanatory variables $X_1, X_2, \dots, X_n$, and $\beta_n$ is the estimated parameter of variable $X_n$
Logistic Regression
• A greater value of Ua implies a greater
probability for the event to take place. When
Ua approaches infinity, Pa approaches 1,
indicating a high likelihood for the event to
occur. When Ua approaches negative infinity,
Pa approaches 0.

• When Ua equals zero, the probability is .50, implying a 50/50 chance for the event to occur.
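The limiting behavior described on this slide can be checked numerically. The following is a minimal sketch, assuming the usual logistic form P_a = exp(U_a) / (1 + exp(U_a)); the function names `utility` and `probability` are illustrative and do not come from the presentation.

```python
import math

def utility(intercept, betas, xs):
    """Linear utility U_a = beta_0 + sum(beta_i * x_i) (error term omitted)."""
    return intercept + sum(b * x for b, x in zip(betas, xs))

def probability(u):
    """Logistic transform P_a = exp(U_a) / (1 + exp(U_a)).

    Both branches are algebraically the same expression; splitting on the
    sign of u avoids overflow in math.exp for large positive u.
    """
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)

# Print P_a over a range of utilities to see the S-shaped response.
for u in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(f"U_a = {u:6.1f}  ->  P_a = {probability(u):.4f}")
```

A utility of zero gives exactly one half, and the probabilities for U_a and -U_a always sum to one, which mirrors the Pa + Pb = 1 identity.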
Logistic Regression Example

• Example from Chou

• Fires in San Jacinto Ranger District of the San
Bernardino National Forest were examined to
map the distribution of fire occurrence
probability. The basic model consisted of eight
independent variables
– Area, perimeter, vegetation, proximity to buildings,
proximity to campgrounds, proximity to roads,
maximum temperature in July, and annual precipitation
Variables in Fire Distribution Study
• X1 Area: area of geographic unit
X2 Perimeter: perimeter of geographic unit
X3 Vegetation: vegetation computed by rotation period
X4 Building: proximity to structures
X5 Campground: proximity to campgrounds
X6 Road: proximity to roads
X7 Temperature: maximum temperature in July
X8 Precipitation: annual precipitation
• The dependent variable is a code indicating whether or not a geographic unit is burned. Area and perimeter provide general geometric characteristics. Vegetation, precipitation, and temperature represent environmental factors, while building, campground, and road represent human-related factors
Results of Logistic Regression
• The model indicates that perimeter, vegetation, campground, road, and temperature are variables to be included in the model. Other variables are not included as they are not statistically different from 0.

Variable   Coefficient   Chi-square   P-Value
X0         -6.3246       31.13        0
X1         0             1.42         0.234
X2         -0.0002       8.13         0.0043
X3         1.5577        43.65        0
X4         -1.1451       1.93         0.1648
X5         -294.58       4.61         0.0318
X6         -0.5244       4.46         0.0348
X7         0.179         28.19        0
X8         0.0023        0.21         0.6493

Log Likelihood: -1366
PCE: 60
Critical chi-square: 3.84 for alpha = .05
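The variable selection on this slide can be reproduced by comparing each chi-square statistic with the critical value for one degree of freedom at alpha = .05, which is 3.84. The sketch below uses the chi-square values reported for X1 through X8; the dictionary layout and variable names are illustrative assumptions, and the intercept X0 is left out because it is not a candidate predictor.

```python
# Critical chi-square value for 1 degree of freedom at alpha = .05.
CRITICAL_CHI_SQUARE = 3.84

# Chi-square statistics as reported for the basic model.
chi_square = {
    "X1": 1.42,   # Area
    "X2": 8.13,   # Perimeter
    "X3": 43.65,  # Vegetation
    "X4": 1.93,   # Building
    "X5": 4.61,   # Campground
    "X6": 4.46,   # Road
    "X7": 28.19,  # Temperature
    "X8": 0.21,   # Precipitation
}

# A variable is retained when its statistic exceeds the critical value.
significant = [name for name, stat in chi_square.items()
               if stat > CRITICAL_CHI_SQUARE]
print(significant)
```

The retained variables are perimeter, vegetation, campground, road, and temperature, which agrees with the list given in the slide text.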
Results of Logistic Regression

• Percentage-correctly-estimated (PCE)
index shows the maximum level of
estimation accuracy of a model.
• In this example, PCE is 60%, not much
better than a random 50/50 chance.
• Therefore, another parameter was
evaluated…
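As a rough illustration of how a PCE-style index could be computed, the sketch below counts the share of units whose predicted class matches the observed class. The slides do not give the exact formula, so the 0.5 cutoff, the function name `pce`, and the sample data are all assumptions made for illustration.

```python
def pce(probabilities, observed, cutoff=0.5):
    """Percentage of units whose predicted class matches the observed class.

    A unit is predicted as burned (1) when its estimated probability is at
    least `cutoff`; the 0.5 default is an assumption, not taken from the study.
    """
    predicted = [1 if p >= cutoff else 0 for p in probabilities]
    correct = sum(1 for p, o in zip(predicted, observed) if p == o)
    return 100.0 * correct / len(observed)

# Hypothetical estimated probabilities and observed burn codes for five units.
probs = [0.9, 0.2, 0.6, 0.4, 0.7]
burned = [1, 0, 0, 0, 1]
print(pce(probs, burned))
```

With a balanced binary outcome, assigning classes at random would be right about half the time, which is why a PCE of 60% is described as only slightly better than chance.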
Alternative Model
• Included an additional variable to determine whether
it makes any significant difference in model
performance
– New variable represents neighborhood effects, or conditions
of the surrounding geographic units
– Assumes that fire occurrence probability is not only affected
by the environmental and human-related variables listed in
the basic model, but by the distribution of fire occurrence
probability of adjacent units
– The new spatial term X9 is defined by the percentage of
neighboring units that were burned during the study period
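The spatial term can be sketched as follows: for each geographic unit, take the percentage of its adjacent units that burned. The adjacency table, unit labels, and function name are hypothetical; the study's actual neighborhood definition is not given in the slides.

```python
def burned_neighbor_share(unit, neighbors, burned):
    """Percentage of the units adjacent to `unit` that were burned."""
    adjacent = neighbors[unit]
    if not adjacent:
        return 0.0  # isolated unit: no neighborhood effect (an assumption)
    return 100.0 * sum(1 for n in adjacent if burned[n]) / len(adjacent)

# Hypothetical adjacency between four geographic units.
neighbors = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C"],
}
# Hypothetical burn status during the study period.
burned = {"A": True, "B": False, "C": True, "D": False}

x9 = {unit: burned_neighbor_share(unit, neighbors, burned) for unit in neighbors}
print(x9)
```

Because each unit's X9 depends on the outcomes of its neighbors, the term carries the spatial dependence that the basic model ignores.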
New Results
• Results from the new study are quite different
• Only two variables are statistically significant: vegetation and neighborhood effects
• Vegetation appears to be the determining environmental variable in the distribution of wildfires in the study area
• Finally, wildfires are influenced by neighborhood conditions

Variable   Coefficient   Chi-square   P-Value
X1         0             1.03         0.3106
X2         -0.0003       0.97         0.3249
X3         -1.6738       6.88         0.0087
X4         -0.8416       0.19         0.6669
X5         -42.28        0            0.9701
X6         1.0241        3            0.0831
X7         -0.1121       1            0.3168
X8         -0.0127       0.55         0.4597
X9         17.951        2359.3       0

Log Likelihood: -164.788
PCE: 97
Critical chi-square: 3.84 for alpha = .05
Testing Statistical Significance
• Did the neighborhood effects significantly change the model? This requires the chi-square test of the likelihood ratio:

$$\lambda = \frac{L_0}{L_1}$$

• Where $L_0$ denotes the likelihood of the basic model and $L_1$ denotes the likelihood of the study model

$$\log \lambda = \log L_0 - \log L_1 = -1366.197 - (-167.914) = -1198.283$$

$$-2 \log \lambda = 2396.566$$

• Statistical testing suggests that the neighborhood variable significantly improved the performance of the model
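The arithmetic of the likelihood ratio test can be verified with a few lines. This sketch uses the log-likelihood values shown on the slide (-1366.197 for the basic model and -167.914 for the study model) and assumes one degree of freedom, because the study model adds a single variable; the function name `chi_square_p_value_1df` is illustrative.

```python
import math

def chi_square_p_value_1df(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For 1 df, P(X > s) = erfc(sqrt(s / 2)), so only the standard library is needed.
    """
    return math.erfc(math.sqrt(statistic / 2.0))

log_l0 = -1366.197  # log likelihood of the basic model
log_l1 = -167.914   # log likelihood of the study model (with neighborhood term)

log_lambda = log_l0 - log_l1      # log of the likelihood ratio L0 / L1
statistic = -2.0 * log_lambda     # test statistic, -2 log(lambda)

CRITICAL_CHI_SQUARE = 3.84        # 1 degree of freedom, alpha = .05
print(f"-2 log(lambda) = {statistic:.3f}")
print("significant:", statistic > CRITICAL_CHI_SQUARE)
print("p-value:", chi_square_p_value_1df(statistic))
```

The statistic is several hundred times larger than the critical value, so the p-value underflows to zero in floating point.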
Procedure for Regression
Analysis (Barber, p. 448)
• Specify the variables in the model and the exact form of
the relationship between them
• Collect data
• Estimate the parameters of the model
• Statistically test the utility of the developed model, and
check whether the assumptions of the simple linear
regression model are satisfied
• Use the model for prediction
Example of Data Manipulation and Programming in ArcView
Manipulating Yield Data with DataManipulation.ave
Spatial Prediction of Landslide Hazard Using Logistic Regression and GIS
Art Lembo
620 Presentation
Based on paper by Gorsevski, Gessler, and Foltz
Introduction

• Landslides are natural geologic processes that cause billions of dollars in damage and thousands of deaths each year
• 95% of landslides occur in developing countries
Causes of Landslides

• Human activities, such as deforestation and urban expansion, accelerate the process of landslides
• Roads and harvest activities in timberlands
increase the occurrence of landslides
• In undisturbed forest, soil erosion is generally
negligible
Clearwater National Forest

• 1995-1996
– Major landslides occurred during the winter following
heavy rains, snowmelt, and high river flow
– Over 900 landslides were recorded on the unstable
slopes of the forest
– Landslide occurrence was widely distributed and
included artificial slopes such as road cuts and fills, or
natural slopes in clearcut areas
Landslide Data

• Within the large remote area, a DEM was used to generate quantitative topographic attributes
– Slope, elevation, aspect, profile curvature, tangent curvature, plan curvature, flow path, and contributing area
• Photo interpretation and field inventory identified landslide areas
Considerations in Creating
Hazard Models
• Datasets combined and stored in a GIS database
• Hazard Model assumptions
– Strength of a model depends on the quality of the data
collected
– Data driven models are not appropriate to extrapolate to
neighboring areas
– Climatic conditions may change so that the past is not an
indicator of the future

• Uncertainty exists when a hazard map is derived from a statistically based model
Models Used in Study

• Logistic regression was used to correlate the environmental attributes with the landslide distribution
• Because of this uncertainty, a Receiver Operating Characteristic (ROC) curve was used; it plots the proportion of true positives against the proportion of false positives at each level of the decision criterion
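A ROC curve can be built by sweeping the decision criterion over the predicted probabilities and recording the false positive and true positive proportions at each level. The following is a minimal sketch with made-up scores and labels; it is not the procedure used in the study, and the function name `roc_points` is illustrative.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    Thresholds are the distinct scores taken from highest to lowest; a cell is
    classified as a landslide when its score is at least the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # criterion so strict that nothing is flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

# Hypothetical predicted probabilities and observed landslide presence (1) or absence (0).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

for fpr, tpr in roc_points(scores, labels):
    print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}")
```

Each point corresponds to one candidate cut-off, so a decision maker can choose the cut-off that gives an acceptable trade-off between missed landslides and false alarms.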
Assessing Landslide Hazard
• Field inspection using a check list to identify sites susceptible
to landsliding
• Projection of future patterns of instability from analysis of
landslide inventories
• Multivariate analysis of factors characterizing observed sites
of slope instability
• Stability ranking based on criteria such as slope, land forms,
or geologic structure
• Failure probability analysis based on slope stability models
with stochastic hydrologic simulation
Preparing the Data
• Primary and secondary attributes are derived from a 30 m DEM, reducing the high cost of collecting the data
• Landslides assessed through aerial reconnaissance
• Landslide hazard areas are then identified based on spatial correlation between the attributes derived from the DEM
• ROC curves used for decision making
Data Sampling
• 15% of non-landslide cells were randomly sampled to represent the absence of landslides
– Multivariate subset was derived from the coverages where landslides were absent
• The landslide coverage was a point data set that sampled grid cells where landslides were present
• Both samples were joined together, with the dependent variable coded as a binary response (present or absent)
• Final output stored in ASCII and used in SAS
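The sampling scheme can be sketched as follows: keep every landslide cell, draw a random 15% of the non-landslide cells, and join the two with a binary response. The cell identifiers, the fixed seed, and the function name `build_sample` are hypothetical; the study itself prepared the data in a GIS and analyzed it in SAS.

```python
import random

def build_sample(landslide_cells, non_landslide_cells, fraction=0.15, seed=0):
    """Join all landslide cells (response 1) with a random share of the
    non-landslide cells (response 0)."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    k = round(fraction * len(non_landslide_cells))
    absent = rng.sample(non_landslide_cells, k)
    rows = [(cell, 1) for cell in landslide_cells]
    rows += [(cell, 0) for cell in absent]
    return rows

# Hypothetical grid-cell identifiers.
landslide_cells = [101, 205, 333]
non_landslide_cells = list(range(1000, 1100))  # 100 cells without landslides

rows = build_sample(landslide_cells, non_landslide_cells)
print(len(rows), "rows;", sum(y for _, y in rows), "with landslides present")
```

Sampling only a fraction of the absence cells keeps the data set manageable when non-landslide cells vastly outnumber landslide cells.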
Statistical Analysis
• A normal plot of the data was used to determine whether the data followed a normal distribution
– The plot showed that data points do not fall along a straight line; the data are not multivariate normal
• Logistic regression is used when the predictor variables are not normally distributed and some predictor variables are categorical
• Factor analysis was applied to determine the number of underlying variables
– Only significantly loaded variables were considered
Statistical Analysis
• The form of the logistic regression model is defined
as:

• Where x is the data vector for a randomly selected experimental unit and y is the value of the binary outcome variable. Maximum likelihood was used to estimate B for the predictive equation
• Variables not significant at the .1 level were eliminated
Logit Results

• Logit showed that the most important variables contributing to slope instability were Flow Path and mean slope of upland area

• log (p/(1-p)) = -2.2642 + FACTOR8 * 0.4969 + FLPATH * 0.6039

or
p = exp(-2.2642 + FACTOR8 * 0.4969 + FLPATH * 0.6039) / (1 + exp(-2.2642 + FACTOR8 * 0.4969 + FLPATH * 0.6039))

p – probability of landslide hazard
FACTOR8 – factor with underlying characteristics of aspect
FLPATH – maximum distance of water to the point in the catchment
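The fitted equation can be evaluated directly for any pair of predictor scores. The sketch below uses the coefficients from the slide; the input values passed to the function are hypothetical, since the slides do not state the scale on which FACTOR8 and FLPATH were measured.

```python
import math

# Coefficients from the fitted logit model on the slide.
INTERCEPT = -2.2642
B_FACTOR8 = 0.4969
B_FLPATH = 0.6039

def landslide_probability(factor8, flpath):
    """p = exp(z) / (1 + exp(z)), with z the linear predictor of the logit model."""
    z = INTERCEPT + B_FACTOR8 * factor8 + B_FLPATH * flpath
    return math.exp(z) / (1.0 + math.exp(z))

# Hypothetical predictor scores, chosen only to show the direction of the effect.
for factor8, flpath in [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]:
    p = landslide_probability(factor8, flpath)
    print(f"FACTOR8 = {factor8}, FLPATH = {flpath}  ->  p = {p:.4f}")
```

Since both coefficients are positive, raising either score raises the estimated probability, and the response is nonlinear in the predictors.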
Logit Results
• Both coefficients of the logit model are positive, so higher scores increase the probability of landslide hazard.
• Logit model assumes a nonlinear relationship between the probability and the explanatory variables
• The hazard map based on the ROC curve technique groups the hazard into two classes, Low Hazard and High Hazard; a second map shows five classes of probability of landslide hazard
Final Results
• 59.1% of the landslides and 69.8% of non-landslides were correctly determined
• Model can be applied to large geographic areas
• ROC curves are incorporated as a sophisticated tool for decision makers for the spatial prediction of landslide hazard
[Figure: a) Cut-off based on ROC curve technique; b) Probability of landslide hazard]
