Chapter 351
Curve Fitting – General
Introduction
Curve fitting refers to finding an appropriate mathematical model that expresses the relationship between a
dependent variable Y and a single independent variable X and estimating the values of its parameters using
nonlinear regression. An introduction to curve fitting and nonlinear regression can be found in the chapter entitled
Curve Fitting, so these details will not be repeated here. Here are some examples of the curve fitting that can be
accomplished with this procedure.
This program is a general-purpose curve fitting procedure that provides many capabilities that have not been easily available before. It is preprogrammed to fit over forty common mathematical models, including growth models such as linear growth and Michaelis-Menten. It also fits many approximating models such as regular polynomials, piecewise polynomials, and polynomial ratios. In addition to these preprogrammed models, it also fits models that
you write yourself.
This routine includes several innovative features. First, it can fit curves to several batches of data simultaneously.
Second, it compares fitted models across groups using graphics and numerical tests such as an approximate F-test
for curve coincidence and a computer-intensive randomization test that compares curve coincidence and
individual parameter values. Third, this procedure computes bootstrap confidence intervals for parameter values,
predicted means, and predicted values using the latest computer-intensive bootstrapping technology.
1. Linear: Y=A+BX
This common model is usually fit using standard linear regression techniques. We include it here to allow for
various special forms made by transforming X and Y.
Plot of Y = 1+X
2. Quadratic: Y=A+BX+CX^2
The quadratic or second-order polynomial model results in the familiar parabola.
Plot of Y = 1+X+X^2
3. Cubic: Y=A+BX+CX^2+DX^3
This is the cubic or third-order polynomial model.
Plot of Y = 1+X+X^2+X^3
4. PolyRatio(1,1): Y=(A+BX)/(1+CX)
The ratio of first-order polynomials model is a slight extension of the Michaelis-Menten model. It may be used to
approximate many more complicated models.
Plot of Y = (5+X)/(1+2*X) Plot of Y = (1+X)/(1-X)
5. PolyRatio(2,2): Y=(A+BX+CX^2)/(1+DX+EX^2)
The ratio of second-order polynomials model may be used to approximate many complicated models.
Plot of Y = (1+X-X^2)/(1-X+X^2) Plot of Y = (1+X+X^2)/(5-X+X^2)
6. PolyRatio(3,3): Y=(A+BX+CX^2+DX^3)/(1+EX+FX^2+GX^3)
The ratio of third-order polynomials model may be used to approximate many complicated models. However,
care must be used when estimating such high-degree models.
Plot of Y = (1+X+X^2+X^3)/(1-X+X^2-X^3) Plot of Y = (1+2*X+X^2+X^3)/(1+X+8*X^2+X^3)
7. PolyRatio(4,4): Y=(A+BX+CX^2+DX^3+EX^4)/(1+FX+GX^2+HX^3+IX^4)
The ratio of fourth-order polynomials model may be used to approximate many complicated models. However, care must be used when estimating such high-degree models.
8. Michaelis-Menten: Y=AX/(B+X)
This is a popular growth model.
Plot of Y = X/(1+X)
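For readers who want to reproduce this kind of fit outside of NCSS, here is a minimal sketch using Python's scipy.optimize.curve_fit; the data values are made up for illustration:

```python
import numpy as np
from scipy.optimize import curve_fit

def michaelis_menten(x, a, b):
    return a * x / (b + x)          # Y = AX / (B + X)

x = np.array([1.0, 2.0, 4.0, 8.0, 16.0, 32.0])
y = np.array([2.1, 3.5, 5.2, 7.0, 8.4, 9.6])

# p0 supplies starting values, analogous to the Start entries in this procedure.
popt, pcov = curve_fit(michaelis_menten, x, y, p0=[10.0, 5.0])
print("A, B estimates:", popt)
print("Asymptotic standard errors:", np.sqrt(np.diag(pcov)))
```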
9. Reciprocal: Y=1/(A+BX)
This model, known as the reciprocal or Shinozaki and Kira model, is mentioned in Ratkowsky (1989, page 89)
and Seber (1989, page 362).
Plot of Y = 1/(1+X) Plot of Y = 1/(4+2*X^2)
Parameter Identities
A=(a1+a3)/2 B=(b1+b3)/2 C=(b2-b1)/2
D=J1 E=(b3-b2)/2 F=J2
a1=A+CD+EF b1=B-C-E J1=D
a2=A-CD-EF b2=B+C-E J2=F
a3=A-CD+EF b3=B+C+E
Plot of Y = Quadratic-Quadratic
Custom Models
You are not limited to the preset models that are shown above. You can enter your own custom model using
standard mathematical notation. The only difference between using a preset model and using your own model is
that with a preset model the starting values of the search algorithm are chosen based on the model. When using a
custom model, you will have to set your own starting values based on the data you are trying to fit. When you do
not specify starting values, the program uses all zeros, which may or may not lead to a reasonable solution.
Confidence Intervals
Two methods are used to calculate confidence intervals of the regression parameters and predicted values. The first method is based on the usual assumptions of normality and constant variance of the residuals. When the data follow these assumptions, standard expressions for the confidence intervals, based on the Student’s t distribution, are used. Unfortunately, nonlinear regression datasets rarely follow these assumptions.
The second method is called the bootstrap method. This is a modern, computer-intensive method that has become practical only in recent years as extensive computing power has become widely available.
Modified Residuals
Davison and Hinkley (1999) page 279 recommend the use of a special rescaling of the residuals when
bootstrapping to keep results unbiased. Because of the high amount of computing involved in bootstrapping, these
modified residuals are calculated using
$$ e_j^* = \frac{e_j}{\sqrt{1 - \tfrac{1}{N}}} - \bar{e} $$
where
$$ \bar{e} = \frac{1}{N}\sum_{j=1}^{N} e_j $$
Note that this is a different rescaling from the one Davison and Hinkley recommended. We have used this rescaling
because it is much quicker to calculate.
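A minimal sketch of this rescaling in Python, assuming `residuals` holds the e_j values from the fitted model:

```python
import numpy as np

def modified_residuals(residuals):
    """Rescale and center the residuals before bootstrap resampling,
    following the quick approximation described above."""
    e = np.asarray(residuals, dtype=float)
    n = e.size
    return e / np.sqrt(1.0 - 1.0 / n) - e.mean()   # e*_j = e_j / sqrt(1 - 1/N) - e-bar
```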
Hypothesis Testing
When curves are fit to two or more groups, it is often of interest to test whether certain regression parameters are equal and whether the fitted curves coincide. Although some approximate results have been obtained using indicator variables, these are asymptotic results, and little is known about their appropriateness in small samples. We provide a test of the hypothesis that all group curves coincide using an F-test that compares the residual sum of squares obtained when the grouping is ignored with the total of the residual sums of squares obtained when each group is fit separately. This test is routinely used in the analysis of variance associated with linear models, and its application to nonlinear models has occasionally been suggested. However, it is based on naive assumptions that seldom hold.
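As a rough illustration of how such an F-ratio can be formed from the two residual sums of squares (a sketch only; the degrees of freedom shown are the standard extra-sum-of-squares choices and are an assumption, not a statement of NCSS's exact computation):

```python
from scipy.stats import f as f_dist

def coincidence_f_test(sse_combined, sse_by_group, n_obs, n_params, n_groups):
    """Approximate F-test that all group curves coincide.

    sse_combined : residual SS when the grouping is ignored (one common curve).
    sse_by_group : list of residual SS from fitting each group separately.
    """
    sse_separate = sum(sse_by_group)
    df_num = n_params * (n_groups - 1)       # extra parameters used by the separate curves
    df_den = n_obs - n_params * n_groups     # error df for the separate-curves model
    f_ratio = ((sse_combined - sse_separate) / df_num) / (sse_separate / df_den)
    return f_ratio, f_dist.sf(f_ratio, df_num, df_den)   # F-ratio and probability level
```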
Because fast computing has become widely available in recent years, a second method of hypothesis testing, called the randomization test, is now practical. This test will be discussed next.
Randomization Test
Randomization testing is discussed by Edgington (1987). The details of the randomization test are simple: all possible permutations of the group variable are investigated, while the dependent and independent variables are left in their original order. For each permutation, the difference between the estimated group parameters is calculated. The number of permutations with a difference whose magnitude is greater than or equal to that of the actual sample is counted. Dividing this count by the number of permutations gives the significance level of the test.
The randomization test is suggested because an exact test is achieved without making unrealistic assumptions about the data such as constant variance, normality, or model accuracy. The test was not used in the past because the amount of computation was prohibitive. In fact, the randomization test was originally proposed by Fisher, and he chose his F-test because its distribution closely approximated the randomization distribution.
The only assumption that a randomization test makes is that the data values are exchangeable under the null
hypothesis.
For even moderate sample sizes, the total number of permutations is in the trillions, so a Monte Carlo approach is
used in which the permutations are found by random selection rather than enumeration. Using this approach, a
reasonable approximation to the test’s probability level may be found by considering only a few thousand
permutations rather than the trillions needed for complete enumeration. Edgington suggests that at least 1000
permutations be computed. We suggest that this be increased to 10000 for important results.
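A minimal Monte Carlo sketch of this idea for two groups follows. The function `fit_params` is a hypothetical stand-in for the nonlinear fit, returning the fitted parameter of interest, and the statistic compared is the absolute difference between the two group estimates:

```python
import numpy as np

def randomization_test(x, y, group, fit_params, n_perm=10000, seed=1):
    """Monte Carlo randomization test that a fitted parameter is equal in two groups.

    Group labels are permuted while x and y stay in their original order.
    """
    rng = np.random.default_rng(seed)
    x, y, group = np.asarray(x), np.asarray(y), np.asarray(group)
    labels = np.unique(group)

    def statistic(g):
        p0 = fit_params(x[g == labels[0]], y[g == labels[0]])
        p1 = fit_params(x[g == labels[1]], y[g == labels[1]])
        return abs(p0 - p1)

    observed = statistic(group)
    count = sum(statistic(rng.permutation(group)) >= observed for _ in range(n_perm))
    return count / n_perm        # Monte Carlo probability level
```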
The program tests two types of hypotheses using randomization tests. The first is that each of the estimated model parameters is equal across groups. The second is that the individual fitted curves coincide across all groups.
Data Structure
The data are entered in two variables: one dependent variable and one independent variable. Additionally, you
may specify a frequency variable containing the observation count for each row and a group variable that is used
to partition the data into independent groups.
Missing Values
Rows with missing values in the variables being analyzed are ignored in the calculations. When only the value of the
dependent variable is missing, predicted values are generated.
Procedure Options
This section describes the options available in this procedure.
Variables Tab
This panel specifies the variables used in the analysis.
Variables
Y (Dependent) Variable
Specifies a single dependent (Y) variable from the current dataset. This variable is being predicted using the
(preset or custom) model you specify. The actual values fed into the algorithm depend on which transformation (if
any) is selected for this variable.
Y Transformation
Specifies a power transformation of the dependent variable. Available transformations are
Y’=1/(Y*Y), Y’=1/Y, Y’=1/SQRT(Y), Y’=LN(Y), Y’=SQRT(Y), Y’=Y (none), and Y’=Y*Y
Care must be taken so that you do not apply a transformation that omits much of your data. For example, you
cannot take the square root of a negative number, so if you apply this transformation to negative values, those
observations will be treated as missing values and ignored. Similarly, you cannot have a zero in the denominator
of a quotient and you cannot take the logarithm of a number less than or equal to zero.
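As a hypothetical illustration of how such a transformation can create missing values (a sketch of the idea only; NCSS does this handling internally):

```python
import numpy as np

y = np.array([4.0, 0.0, -2.0, 9.0])

# Suppress the warnings so the invalid cases simply become NaN or -inf.
with np.errstate(invalid="ignore", divide="ignore"):
    y_log = np.log(y)        # LN of zero or a negative value is undefined

# Rows whose transformed value is not finite would be treated as missing and ignored.
usable = np.isfinite(y_log)
print(y_log[usable])         # only the rows with Y > 0 remain
```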
X (Independent) Variable
Specify the independent (X) variable. This variable is used to predict the dependent variable using the model you
have specified. This variable is referred to as ‘X’ in the Preset and Custom model statements. The actual values
used depend on which transformation (if any) is selected for this variable.
X Transformation
Specifies a power transformation of the independent variable. Available transformations are
X’=1/(X*X), X’=1/X, X’=1/SQRT(X), X’=LN(X), X’=SQRT(X), X’=X (none), and X’=X*X
Care must be taken so that you do not apply a transformation that omits much of your data. For example, you
cannot take the square root of a negative number, so if you apply this transformation to negative values, those
observations will be treated as missing values and ignored. Similarly, you cannot have a zero in the denominator
of a quotient and you cannot take the logarithm of a number less than or equal to zero.
Frequency Variable
An optional column containing a set of counts (frequencies). Normally, each row represents one observation. On
occasion, however, each row of data may represent more than one observation. This variable contains the number
of observations that a row represents. Rows with zero or negative frequencies are ignored.
Group Variable
This optional variable divides the observations into groups. When specified, a separate analysis is generated for
each unique value of this variable. Use the Value Label option under the Format tab to specify the way in which
the group values are displayed.
Model
Preset Model
Select the model that you want to fit. Select ‘Custom’ to use a model you have entered in the ‘Custom Model’
box. Whenever possible, use one of the preset models since reasonable starting values for the parameters will be
calculated for you. The minimum, maximum, and starting values of each letter in the preset model are defined in
the corresponding MIN START MAX box on the Options panel. The preset models available are
0 Custom Use the custom model
1 Y=A+BX Simple Linear
2 Y=A+BX+CX^2 Quadratic
3 Y=A+BX+CX^2+DX^3 Cubic
4 Y=(A+BX)/(1+CX) PolyRatio(1,1)
5 Y=(A+BX+CX^2)/(1+DX+EX^2) PolyRatio(2,2)
6 Y=(A+BX+CX^2+DX^3)/(1+EX+FX^2+GX^3) PolyRatio(3,3)
7 Y=(A+BX+CX^2+DX^3+EX^4)/(1+FX+GX^2+HX^3+IX^4) PolyRatio(4,4)
8 Y=AX/(B+X) Michaelis-Menten
9 Y=1/(A+BX) Reciprocal
10 Y=(A+BX)^(-1/C) Bleasdale-Nelder
11 Y=1/(A+BX^C) Farazdaghi and Harris
12 Y=1/(A+BX+CX^2) Holliday
13 Y=EXP(A(X-B)) Exponential
14 Y=A(1-EXP(-B(X-C))) Monomolecular
15 Y=A/(1+B(EXP(-CX))) Three Parameter Logistic
16 Y=D+(A-D)/(1+B(EXP(-CX))) Four Parameter Logistic
17 Y=A(EXP(-EXP(-B(X-C)))) Gompertz
18 Y=A-(A-B)EXP(-(C|X|)^D) Weibull
19 Y=A-(A-B)/(1+(C|X|)^D) Morgan-Mercer-Floding
20 Y=A(1+(B-1)EXP(-C(X-D)))^(1/(1-B)) Richards
21 Y=B(LN(|X|-A)) Logarithmic
22 Y=A(1-B^X) Power
23 Y=AX^(BX^C) Power^Power
24 Y=A(EXP(-BX))+C(EXP(-DX)) Sum of Exponentials
25 Y=A(X^B)EXP(-CX) Exponential Type 1
26 Y=(A+BX)EXP(-CX)+D Exponential Type 2
27 Y=A+B(EXP(-C(X-D)^2)) Normal
28 Y=A+(B/X)EXP(-C(LN(|X|)-D)^2) Lognormal
29 Y=A(EXP(-BX)) Exponential
30 Y=AX/(B+X) + CX/(D+X) Michaelis-Menten(2)
31 Y=AX/(B+X) + CX/(D+X) + EX/(F+X) Michaelis-Menten(3)
32 Y=A + BX + C(X-D)SIGN(X-D) Linear-Linear
33 Y=A+BX+CX^2+(X-D)SIGN(X-D)[C(X+D)+E] Linear-Quadratic
34 Y=A+BX+CX^2+(X-D)SIGN(X-D)[E(X+D)+F] Quadratic-Linear
35 Y=A+BX+CX^2+(X-D)SIGN(X-D)[E(X+D)+F] Quadratic-Quadratic
36 Y=A+BX+C(X-D)SIGN(X-D)+E(X-F)SIGN(X-F) Linear-Linear-Linear
37 Y=EXP((A/B)(1-EXP(BX))) Gompertz 2
38 Y=AX^C/(B^C+X^C) Hill
Custom Model
This box is only used when the Preset Model option is set to ‘Custom Model’. When used, it contains the
regression model written in standard mathematical notation.
Use ‘X’ to represent the independent variable specified in the X Variable box, not its variable name. Hence, if
your independent variable is HEAT, you would enter A+B*LN(X), not A+B*LN(HEAT).
Use the letters (case ignored) A,B,C,... (except X and Y) to represent the parameters to be estimated from the data.
The letters used must be specified in one of the Parameter boxes listed under the Search tab. Note that you do not
include a ‘Y=’ in the expression. That is, you would enter A+B*X, not Y=A+B*X.
Expression Syntax
Construct the expression using standard mathematical syntax. Possible symbols and functions are
Symbols
+ add
- subtract
* multiply
/ divide
^ exponent (X^2 = X*X)
() parentheses
< less than.
> greater than
= equals
<= less than or equal
>= greater than or equal
<> not equal
Functions
(a logic b) Indicator function. If true, result is 1; otherwise, result is 0. Logic values are <, >, =,
<>, <=, and >=. The symbols a and b are replaced by numbers or letters.
ABS(X) Absolute value of X.
ARCOSH(X) Arc cosh of X.
ARSINH(X) Arc sinh of X.
ARTANH(X) Arc tanh of X.
ASN(X) Arc sine of X.
ATN(X) Arc tangent of X.
COS(X) Cosine of X.
COSH(X) Hyperbolic cosine of X.
ERF(X) The error function of X.
EXP(X) Exponential of X.
INT(X) Integer part of X.
LN(X) Log base e of X.
LOG(X) Log base 10 of X.
LOGGAMMA(X) Log of the gamma function.
NORMDENS(X) Normal density.
NORMPROB(X) Normal CDF (probability).
NORMVALUE(X) Inverse normal CDF.
SGN(X) Sign of X which is -1 if X<0, 0 if X=0, and 1 if X>0.
SIN(X) Sine of X.
SINH(X) Hyperbolic sine of X.
SQR(X) Square root of X.
TAN(X) Tangent of X.
TANH(X) Hyperbolic tangent of X.
TNH(X) Hyperbolic tangent of X.
TRIGAMMA(X) Trigamma function.
Independent Variable
Use ‘X’ in your expression to represent the independent variable you have specified.
Parameters
The letters of the alphabet (except X and Y) may be used to represent the parameters. Parameters can be only one
character long and case is ignored. Each parameter must be defined in the Parameter fields below.
Numbers
You can enter numbers in standard format such as 23.456 and 254.43, or you can use scientific notation such as
1E-5 (which is 0.00001) and 1E5 (which is 100000).
Examples
Standard mathematical syntax is used. This is discussed in detail in the Transformation section. Examples of valid
expressions are:
A+B*X
C+D*X+E*X*X or G+H*X+B*X^2
A*EXP(B*X)
(X<=5)*A+(X>5)*B+C
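For instance, the last expression above uses indicator terms to switch between two levels. Written as a hypothetical Python function (for illustration only, since in NCSS you simply type the expression), it would be:

```python
import numpy as np

def custom_model(x, a, b, c):
    # (X<=5)*A + (X>5)*B + C : each comparison is 1 when true and 0 when false,
    # so the curve equals A + C for X <= 5 and B + C for X > 5.
    x = np.asarray(x, dtype=float)
    return (x <= 5) * a + (x > 5) * b + c
```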
Bias Correction
This option controls whether a bias-correction factor is applied when the dependent variable has been
transformed. Check it to correct the predicted values for the transformation bias. Uncheck it to leave the predicted
values unchanged. See the Introduction to Curve Fitting chapter for a discussion of the amount of bias that may
occur and the bias correction procedures used.
Model Parameters
The following options control the nonlinear regression algorithm.
Parameter
Enter a letter (other than X and Y) used in the Model. Note that the case of the character is ignored. Each letter
used in a Model (either Preset or Custom) must be defined in this section by entering its letter, bounds, and
starting value.
For example, suppose the model is A + B*X + C*X^2. The parameters in this expression are A, B, and C. Each
must be defined here.
Min Start Max
Enter the minimum, starting value, and maximum of this parameter by entering three numbers separated by
blanks or commas. You may enter ‘?’ as the starting value to instruct the program to pick one for you (in which case
a zero is often used). The program searches for the best value between the minimum and the maximum values,
beginning with the starting value.
Make sure that the starting values you supply are possible. For example, if the model includes the phrase 1/B,
don’t start with B=0. Before taking a lot of time trying to find a starting value, make a few trial runs using starting
values of 0.0, 0.1, and 1.0. Often, one of these values will work.
Examples
-1000 1 1000 which means starting value = 1, lower bound = -1000, and upper bound = 1000.
-1 ? 1E9 which means starting value is unspecified, lower bound = -1, and upper bound = 1000000000.
• Minimum
This is the smallest value that the parameter can take on. The algorithm searches for a value between this and
the maximum. If you want to search in an unlimited range, enter a large negative number such as -1E9, which
is -1000000000.
Since this is a search algorithm, the narrower the range that you search in, the quicker it will converge.
Care should be taken to specify minima and maxima that keep calculations in range. Suppose, for example,
that your equation includes the expression LOG(B*X) and that values of X are positive. Since you cannot
take the logarithm of zero or a negative number, you should set the minimum of B as a small positive number,
ensuring that the estimation procedure will not fail because of impossible calculations.
• Starting Value
Enter a starting value for this parameter or enter ‘?’ to have the system estimate a starting value for you.
When using a custom model, a ‘?’ is replaced by zero.
• Maximum
This is the largest value that the parameter can take on. The algorithm searches for a value between the
minimum and this value, beginning at the Starting Value. If you want to search in an unlimited range, enter a
large positive number such as 1E9, which is 1000000000.
Since this is a search algorithm, the narrower the range that you search in, the quicker the process will
converge.
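To make the idea concrete, here is a hypothetical sketch of the same Min/Start/Max logic using scipy.optimize.least_squares, where B is bounded away from zero so that the denominator B + X can never be zero (the data and bounds are illustrative assumptions):

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(params, x, y):
    a, b = params
    return y - a * x / (b + x)                    # Michaelis-Menten residuals

x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
y = np.array([2.0, 3.4, 5.1, 6.9, 8.2])

# Min / Start / Max for A: -1000, 1, 1000
# Min / Start / Max for B:  1e-6, 1, 1000   (kept positive so the denominator stays valid)
fit = least_squares(residuals, x0=[1.0, 1.0],
                    bounds=([-1000.0, 1e-6], [1000.0, 1000.0]),
                    args=(x, y))
print("A, B estimates:", fit.x)
```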
Resampling
Bootstrap Confidence Intervals
This option causes bootstrap confidence intervals and associated bootstrap reports and plots to be generated using
resampling simulation as specified under the Resampling tab.
Bootstrapping may be time consuming when the bootstrap sample size is large. A reasonable strategy is to keep
this option unchecked until you have considered all other reports. Then run this option with a bootstrap size of
100 or 1000 to obtain an idea of the time needed to complete the simulation.
Randomization Hypothesis Tests
This option causes hypothesis tests and associated reports to be generated using Monte Carlo simulation as specified
under the Resampling tab.
Randomization tests may be time consuming when the Monte Carlo sample size is large. A reasonable strategy is
to keep this option unchecked until you have run and considered all other reports. Then run this option with a
Monte Carlo size of 100, then 1000, and then 10000 to obtain an idea of the time needed to complete the
simulation.
Options Tab
The following options control the nonlinear regression algorithm.
Options
Lambda
This is the starting value of the lambda parameter as defined in Marquardt’s procedure. We recommend that you
do not change this value unless you are very familiar with both your model and the Marquardt nonlinear
regression procedure. Changing this value will influence the speed at which the algorithm converges.
Nash Phi
Nash supplies a factor he calls phi for modifying lambda. When the residual sum of squares is large, increasing
this value may speed convergence.
Lambda Inc
This is a factor used for increasing lambda when necessary. It influences the rate at which the algorithm
converges.
Lambda Dec
This is a factor used for decreasing lambda when necessary. It also influences the rate at which the algorithm
converges.
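As a schematic illustration of how lambda, Lambda Inc, and Lambda Dec interact (a hypothetical sketch of the usual Levenberg-Marquardt damping logic, not NCSS's actual code; the factors 10 and 0.1 are illustrative defaults):

```python
def marquardt_step_control(lam, sse_new, sse_old, lambda_inc=10.0, lambda_dec=0.1):
    """Decide whether to accept a trial step and how to adjust the damping parameter.

    A step that reduces the residual sum of squares is accepted and lambda is
    decreased (behaving more like Gauss-Newton); otherwise the step is rejected
    and lambda is increased (behaving more like steepest descent).
    """
    if sse_new < sse_old:
        return lam * lambda_dec, True     # accept the step, relax the damping
    return lam * lambda_inc, False        # reject the step, increase the damping
```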
Max Iterations
This sets the maximum number of iterations before the program aborts. If the starting values you have supplied
are not appropriate or the model does not fit the data, the algorithm may diverge. Setting this value to an
appropriate number (say 50) causes the algorithm to abort after this many iterations.
Zero
This is the value used as zero by the nonlinear algorithm. Because of rounding error, values lower than this value
are reset to zero. If unexpected results are obtained, you might try using a smaller value, such as 1E-16. Note that
1E-5 is an abbreviation for the number 0.00001.
Reports Tab
This section controls which reports and plots are displayed.
Select Reports
Combined Summary Report ... Residual Report
These options specify which reports are displayed.
Predicted Values
Predict Y at these X Values
Enter an optional list of X values at which to report the predicted value of Y and corresponding confidence
interval. You can enter a single number or a list of numbers. The list can be separated with commas or spaces.
The list can also be of the form ‘XX:YY(ZZ)’ which means XX to YY by ZZ.
Examples
10
10 20 30 40 50
0:90(10) which means 0 10 20 30 40 50 60 70 80 90
100:950(200) which means 100 300 500 700 900
1000:5000(500) which means 1000 1500 2000 2500 3000 3500 4000 4500 5000
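A hypothetical helper that expands this notation might look like the following (purely illustrative; NCSS parses the list for you):

```python
import re

def expand_x_list(text):
    """Expand entries such as '0:90(10)' into 0, 10, ..., 90; plain numbers pass through."""
    values = []
    for token in re.split(r"[,\s]+", text.strip()):
        m = re.fullmatch(r"([\d.+-]+):([\d.+-]+)\(([\d.+-]+)\)", token)
        if m:
            start, stop, step = map(float, m.groups())
            v = start
            while v <= stop + 1e-12:        # XX to YY by ZZ
                values.append(v)
                v += step
        elif token:
            values.append(float(token))
    return values

print(expand_x_list("100:950(200)"))        # [100.0, 300.0, 500.0, 700.0, 900.0]
```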
Report Options
Alpha Level
Enter the value of alpha for the confidence limits. Usually, this number will range from 0.1 to 0.001. A common
choice for alpha is 0.05. You should determine a value appropriate for your needs.
Precision
Specify the precision of numbers in the report. Single precision will display seven-place accuracy, while
double precision will display thirteen-place accuracy. Note that all reports are formatted for single precision only.
Variable Names
Specify whether to use variable names or (the longer) variable labels in report headings.
Value Labels
Value Labels may be used with the Group Variable to make reports more legible by assigning meaningful labels
to numbers and codes.
• Data Values
All data are displayed in their original format, regardless of whether a value label has been set or not.
• Value Labels
All values of variables that have a value label variable designated are converted to their corresponding value
label when they are output. This does not modify their value during computation.
• Both
Both data value and value label are displayed.
Example
A variable named GENDER (used as a grouping variable) contains 1's and 2's. By specifying a value label for
GENDER, the printout will display Male instead of 1 and Female instead of 2 on the reports. This option specifies
whether (and how) to use the value labels.
Plots Tab
This section controls the plot(s) showing the data with the fitted function line as well as the residual plots.
Select Plots
Combined Function Plot: Y ... Probability Plot: Trans(Y)
These options specify which plots are displayed.
Resampling Tab
The following options control the bootstrapping and randomization tests.
C.I. Method
This option specifies the method used to calculate the bootstrap confidence intervals. The reflection method is
recommended.
• Percentile
The confidence limits are the corresponding percentiles of the bootstrap values.
• Reflection
The confidence limits are formed by reflecting the percentile limits. If X0 is the original value of the
parameter estimate and XL and XU are the percentile confidence limits, the Reflection interval is (2 X0 - XU,
2 X0 - XL).
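A minimal sketch of the two interval types, given an array of bootstrap estimates and the original estimate (an illustration of the formulas above, not NCSS's internal code):

```python
import numpy as np

def bootstrap_ci(boot_estimates, original, conf=0.95, method="reflection"):
    """Percentile or reflection bootstrap confidence interval."""
    alpha = 1.0 - conf
    xl, xu = np.percentile(boot_estimates, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    if method == "percentile":
        return xl, xu
    # Reflection: reflect the percentile limits about the original estimate.
    return 2 * original - xu, 2 * original - xl
```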
Bootstrap Confidence Coefficients
These are the confidence coefficients of the bootstrap confidence intervals. Since bootstrapping calculations may
take several minutes, it may be useful to obtain confidence intervals using several different confidence
coefficients.
All values must be between 0.50 and 1.00. You may enter several values, separated by blanks or commas. A
separate confidence interval is given for each value entered.
Examples
0.90 0.95 0.99
0.90:0.99(0.01)
0.90
Storage Tab
The predicted values and residuals may be stored on the current database for further analysis. This group of
options lets you designate which statistics (if any) should be stored and which variables should receive these
statistics. The selected statistics are automatically stored to the current database while the program is executing.
Note that existing data is replaced. Be careful that you do not specify variables that contain important data.
Storage Variables
Store Predicted Values, Residuals, Lower Prediction Limit, and Upper Prediction Limit
The predicted (Yhat) values, residuals (Y-Yhat), lower 100(1-alpha) prediction limits, and upper 100(1-alpha)
prediction limits may be stored in the columns specified here.
This report displays a summary of the results for each group and then for the case in which all groups are
combined into one group.
Group Name (Type)
This column, headed by the name of the Group Variable, lists the group value that is displayed on this line. Note
that the Value Labels option may be used to give more meaningful names to these values.
Count
This is the number of observations used by the nonlinear regression algorithm.
Iter’s
This is the number of iterations used by the nonlinear regression algorithm to find the estimates. You should note
whether the maximum number of iterations has been reached (in which case the algorithm did not converge).
R2
This is the pseudo R-squared value. A value near one indicates that the model fits the data well. A
value near zero indicates that the model does not fit the data well.
A, B, ...
The final values of the estimated parameters are displayed so that you may compare them across groups.
This report displays goodness of fit results for each group and then for the case in which all groups are combined
into one dataset. The final row of the report, labeled ‘Ignored’, gives the goodness of fit statistics for the model in
which a separate curve is fit for each group.
Group Name (Type)
This column, headed by the name of the Group Variable, lists the group value that is displayed on this line.
Count
This is the number of observations used by the nonlinear regression algorithm.
Iter’s
This is the number of iterations used by the nonlinear regression algorithm to find the estimates. You should note
whether the maximum number of iterations has been reached (in which case the algorithm did not converge).
R2
This is the pseudo R-squared value. A value near one indicates that the model fits the data well. A value near zero indicates that the model does not fit the data well.
Error DF
The degrees of freedom are the number of observations minus the number of parameters fit.
Sum Squares Error
This is the sum of the squared residuals for this group.
Mean Square Error
This is a rough estimate of the variance of the residuals for this group.
This report displays an F-test of whether all of the group curves are equal. This test compares the residual sum of squares obtained when the grouping is ignored with the total of the residual sums of squares obtained when each group is fit separately. This test is routinely used in the analysis of variance associated with linear models, and its application to nonlinear models has occasionally been suggested. However, it is based on normality assumptions that seldom hold. When testing curve coincidence is important, we suggest you use a randomization test.
Curves Tested
This column indicates the term presented on this row.
DF
The degrees of freedom of this term.
Mean Square
The mean square associated with this term.
F Ratio
The F-ratio for testing the hypothesis that all curves coincide.
F-Test Prob Level
This is the probability level of the F-ratio. When this value is less than 0.05 (a common value for alpha), the test is
‘significant’, meaning that the hypothesis of equal curves is rejected. If this value is larger than the nominal level (0.05), the null hypothesis cannot be rejected; there is not enough evidence to reject it.
This report displays the results of a randomization test whose null hypothesis is that all the group curves
coincide. When more than two groups are present, a separate test is provided for each pair of groups, plus a
combined test of the equality of all groups.
Curves Tested
This column indicates the groups whose equality is being tested on this row.
Randomization Prob Level
This is the two-sided probability level of the randomization test. When this value is less than 0.05, the test is
‘significant’ meaning that the null hypothesis of equal curves is rejected. If this value is larger than the nominal
level (0.05), there is not enough evidence in the data to reject the null hypothesis of equality.
(Note: because this is a Monte Carlo test, your results may vary from those displayed here.)
Monte Carlo Samples
The number of Monte Carlo samples.
Number of Points Compared Along the Curve
The number of values along the X axis at which a comparison between curves is made. Of course, the more X values used, the more accurate (and time consuming) the test will be.
This report displays the results of randomization tests about the equality of each parameter across groups. When
more than two groups are present, a separate test is provided for each pair of groups, plus a combined test of
parameter equality of all groups.
Curves Compared
This column indicates the groups being tested on this row.
Parameter Test
This column indicates the model parameter whose equality is being tested.
Randomization Prob Level
This is the two-sided probability level of the randomization test. When this value is less than 0.05, the test is
‘significant’ meaning that the null hypothesis of equal parameter values across groups is rejected. If this value is
larger than the nominal level (0.05), there is not enough evidence in the data to reject the null hypothesis of
equality.
(Note: because this is a Monte Carlo test, your results may vary from those displayed here.)
Monte Carlo Samples
The number of Monte Carlo samples.
Number of Points Compared Along the Curve
The number of values along the X axis at which a comparison between curves is made. Of course, the more X values used, the more accurate (and time consuming) the test will be.
This plot displays all of the data and fitted curves, allowing you to quickly assess the quality of the results.
This report displays the progress of the search algorithm in its search for a solution. It allows you to assess
whether the algorithm had indeed converged or whether the program should be re-run with the Maximum
Iterations increased or the model changed.
Note that if over ten iterations were needed, the program does not display every iteration.
Estimated Model
(10.7279796046503)*(x)/((4.95940560324547)+(x))
This report displays the details of the estimation of the model parameters.
Parameter Name
The name of the parameter whose results are shown on this line.
Parameter Estimate
The estimated value of this parameter.
Asymptotic Standard Error
An estimate of the standard error of the parameter based on asymptotic (large sample) results.
Lower 95% C.L.
The lower value of a 95% confidence limit for this parameter. This is a large sample (at least 25 observations for
each parameter) confidence limit. In most cases, the bootstrap confidence interval will be more accurate.
Upper 95% C.L.
The upper value of a 95% confidence limit for this parameter. This is a large sample (at least 25 observations for
each parameter) confidence limit. In most cases, the bootstrap confidence interval will be more accurate.
Iterations
The number of iterations that were completed before the nonlinear algorithm terminated. If the number of
iterations is equal to the Maximum Iterations that you set, the algorithm did not converge, but was aborted.
R-Squared
There is no direct R-squared defined for nonlinear regression. This is a pseudo R-squared constructed to
approximate the usual R-squared value used in multiple regression. We use the following generalization of the
usual R-squared formula:
R-Squared = (ModelSS - MeanSS) / (TotalSS - MeanSS)
where MeanSS is the sum of squares due to the mean, ModelSS is the sum of squares due to the model, and
TotalSS is the total (uncorrected) sum of squares of Y (the dependent variable).
This version of R-squared tells you how well the model performs after removing the influence of the mean of Y.
Since many nonlinear models do not explicitly include a parameter for the mean of Y, this R-squared may be
negative (in which case we set it to zero) or difficult to interpret. However, if you think of it as a direct extension
of the R-squared that you use in multiple regression, it will serve well for comparative purposes.
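In code, this quantity could be computed from the observed and predicted values roughly as follows (a sketch; here ModelSS is taken as the total sum of squares minus the residual sum of squares, which is one common way of defining the model sum of squares):

```python
import numpy as np

def pseudo_r_squared(y, yhat):
    """Pseudo R-squared = (ModelSS - MeanSS) / (TotalSS - MeanSS), floored at zero."""
    y, yhat = np.asarray(y, dtype=float), np.asarray(yhat, dtype=float)
    mean_ss = y.size * y.mean() ** 2            # sum of squares due to the mean of Y
    total_ss = np.sum(y ** 2)                   # uncorrected total sum of squares
    error_ss = np.sum((y - yhat) ** 2)          # residual sum of squares
    model_ss = total_ss - error_ss              # sum of squares attributed to the model
    r2 = (model_ss - mean_ss) / (total_ss - mean_ss)
    return max(r2, 0.0)
```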
Random Seed
This is the value of the random seed that was used when running the bootstrap confidence intervals and
randomization tests. If you want to duplicate your results exactly, enter this random seed into the Random Seed box under the Resampling tab.
Estimated Model
This is the model that was estimated with the parameters replaced with their estimated values. This expression
may be copied and pasted as a variable transformation in the spreadsheet. This will allow you to predict for
additional values of X. Note that to ensure accuracy, the parameter estimates are always given to double-precision
accuracy.
Source
The labels of the various sources of variation.
DF
The degrees of freedom.
Sum of Squares
The sum of squares associated with this term. Note that these sums of squares are based on Y, the dependent
variable. Individual terms are defined as follows:
Mean The sum of squares associated with the mean of Y. This may or may not be a part of the
model. It is presented since it is the amount used to adjust the other sums of squares.
Model The sum of squares associated with the model.
Model (Adjusted) The model sum of squares minus the mean sum of squares.
Error The sum of the squared residuals. This is often called the sum of squares error or just
“SSE.”
Total The sum of the squared Y values.
Total (Adjusted) The sum of the squared Y values minus the mean sum of squares.
Mean Square
The sum of squares divided by the degrees of freedom. The Mean Square for Error is an estimate of the
underlying variation in the data.
Sampling Method = Observation, Confidence Limit Type = Reflection, Number of Samples = 3000.
This report provides bootstrap estimates and confidence intervals for the parameters, predicted means, and
predicted values. Note that bootstrap confidence intervals and prediction intervals are provided for each of the X
(Temp) value requested. Details of the bootstrap method were presented earlier in this chapter.
Original Value
This is the parameter estimate obtained from the complete sample without bootstrapping.
Bootstrap Mean
This is the average of the parameter estimates of the bootstrap samples.
Bias (BM - OV)
This is an estimate of the bias in the original estimate. It is computed by subtracting the original value from the
bootstrap mean.
Bias Corrected
This is an estimate of the parameter that has been corrected for its bias. The correction is made by subtracting the
estimated bias from the original parameter estimate.
Standard Error
This is the bootstrap method’s estimate of the standard error of the parameter estimate. It is simply the standard
deviation of the parameter estimate computed from the bootstrap estimates.
Conf. Level
This is the confidence coefficient of the bootstrap confidence interval given to the right.
Bootstrap Confidence Limits - Lower and Upper
These are the limits of the bootstrap confidence interval with the confidence coefficient given to the left. These
limits are computed using the confidence interval method (percentile or reflection) designated on the Bootstrap
panel.
Note that to be accurate, these intervals must be based on over a thousand bootstrap samples and the original
sample must be representative of the population.
This report displays the asymptotic correlations of the parameter estimates. When these correlations are high
(absolute value greater than 0.98), the precision of the parameter estimates is suspect.
This section shows the predicted mean values and asymptotic (large sample) prediction intervals for the X values
that were specified. Note that these are prediction limits for a new value, not confidence limits for the mean of the
values.
This section shows the values of the predicted values, prediction limits, and residuals. If you have observations in
which the independent variable is given, but the dependent (Y) variable is blank, a predicted value and prediction
limits will be generated and displayed in this report.
This plot displays the data along with the estimated function. It is useful in deciding if the fit is adequate and the
prediction limits are appropriate.
This is a scatter plot of the residuals versus the independent variable, X. The preferred pattern is a rectangular
shape or point cloud. Any nonrandom pattern may require a redefinition of the model.
If the residuals are normally distributed, the data points of the normal probability plot will fall along a straight
line. Major deviations from this ideal picture reflect departures from normality. Stragglers at either end of the
normal probability plot indicate outliers, curvature at both ends of the plot indicates long or short distributional
tails, convex or concave curvature indicates a lack of symmetry, and gaps or plateaus or segmentation in the
normal probability plot may require a closer examination of the data or model. We do not recommend that you
use this diagnostic with small sample sizes.